Concentus
Apply window and compute the MDCT for all sub-frames and
all channels in a frame
OPT: This is the kernel you really want to optimize. It gets used a lot by the prefilter and by the PLC.
non-pointer case
only needed in one place
Decoder state
Scratch space used by the decoder. It is actually a variable-sized
field that resulted in a variable-sized struct. There are 6 distinct regions inside.
I have laid them out into separate variables here,
but these were the original definitions:
val32 decode_mem[], Size = channels*(DECODE_BUFFER_SIZE+mode.overlap)
val16 lpc[], Size = channels*LPC_ORDER
val16 oldEBands[], Size = 2*mode.nbEBands
val16 oldLogE[], Size = 2*mode.nbEBands
val16 oldLogE2[], Size = 2*mode.nbEBands
val16 backgroundLogE[], Size = 2*mode.nbEBands
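The six region sizes listed above can be computed from a handful of parameters. The following is only a sketch: the struct and function names are hypothetical, and the parameter values used in any call are up to the caller, not taken from the library.

```c
#include <stddef.h>

/* Hypothetical container for the six region lengths, in elements (not bytes). */
typedef struct {
    size_t decode_mem;
    size_t lpc;
    size_t old_ebands;
    size_t old_log_e;
    size_t old_log_e2;
    size_t background_log_e;
} DecoderRegionSizes;

/* Applies the size formulas from the original definitions listed above. */
DecoderRegionSizes decoder_region_sizes(size_t channels, size_t decode_buffer_size,
                                        size_t overlap, size_t lpc_order,
                                        size_t nb_ebands)
{
    DecoderRegionSizes s;
    s.decode_mem = channels * (decode_buffer_size + overlap);
    s.lpc = channels * lpc_order;
    s.old_ebands = 2 * nb_ebands;
    s.old_log_e = 2 * nb_ebands;
    s.old_log_e2 = 2 * nb_ebands;
    s.background_log_e = 2 * nb_ebands;
    return s;
}
```

Keeping the regions as separate fields mirrors the choice described above of splitting the variable-sized scratch area into separate variables.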
The original C++ defined in_mem as a single float[1] which was the "caboose"
to the overall encoder struct, containing 5 separate variable-sized buffer
spaces of heterogeneous datatypes. I have laid them out into separate variables here,
but these were the original definitions:
val32 in_mem[], Size = channels*mode.overlap
val32 prefilter_mem[], Size = channels*COMBFILTER_MAXPERIOD
val16 oldBandE[], Size = channels*mode.nbEBands
val16 oldLogE[], Size = channels*mode.nbEBands
val16 oldLogE2[], Size = channels*mode.nbEBands
Definition for each "pseudo-critical band"
Number of lines in allocVectors
Number of bits in each band for several rates
Takes the pitch vector and the decoded residual vector, computes the gain
that will give ||p+g*y||=1 and mixes the residual with the pitch.
Decode pulse vector and combine the result with the pitch vector to produce
the final normalised signal in the current band.
For performance reasons, do not use this generic class if possible
This simulates a C++ style pointer as far as can be implemented in C#. It represents a handle
to an array of objects, along with a base offset that represents the address.
When you are programming in debug mode, this class also enforces memory boundaries,
tracks uninitialized values, and also records all statistics of accesses to its base array.
Returns the value currently under the pointer, and returns a new pointer with +1 offset.
This method is not very efficient because it creates new pointers; this is because we must preserve
the pass-by-value nature of C++ pointers when they are used as arguments to functions
Copies the contents of this pointer, starting at its current address, into the space of another pointer.
!!! IMPORTANT !!! REMEMBER THAT C++ memcpy is (DEST, SOURCE, LENGTH) !!!!
IN C# IT IS (SOURCE, DEST, LENGTH). DON'T GET SCOOPED LIKE I DID
Copies the contents of this pointer, starting at its current address, into an array.
!!! IMPORTANT !!! REMEMBER THAT C++ memcpy is (DEST, SOURCE, LENGTH) !!!!
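To make the argument-order warning concrete, here is a minimal C sketch (the helper name copy_ints is hypothetical). In C, memcpy takes the destination first; C#'s Array.Copy takes the source first.

```c
#include <stddef.h>
#include <string.h>

/* Copies `count` ints from src to dst. Note the C order: destination first. */
void copy_ints(int *dst, const int *src, size_t count)
{
    memcpy(dst, src, count * sizeof *src);
}
```

When porting a call such as memcpy(a, b, n), the equivalent C# call therefore swaps the first two arguments.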
Loads N values from a source array into this pointer's space
Assigns a certain value to a range of spaces in this array
The value to set
The number of values to write
Assigns a certain value to a range of spaces in this array
The value to set
The number of values to write
Moves regions of memory within the bounds of this pointer's array.
Extra checks are done to ensure that the data is not corrupted if the copy
regions overlap
The offset to send this pointer's data to
The number of values to copy
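The overlap-safe move described above corresponds to C's memmove. A minimal sketch, with a hypothetical helper name and offsets expressed in elements:

```c
#include <stddef.h>
#include <string.h>

/* Moves `count` ints inside one array; memmove is safe when the regions overlap. */
void move_ints(int *base, size_t src_offset, size_t dst_offset, size_t count)
{
    memmove(base + dst_offset, base + src_offset, count * sizeof *base);
}
```

Using memcpy here would be undefined behaviour in C whenever the source and destination ranges overlap, which is why the extra checks matter.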
This is a helper class which contains static methods that involve pointers
Allocates a new array and returns a pointer to it
Creates a pointer to an existing array
The number of bits to use for the range-coded part of unsigned integers.
The resolution of fractional-precision bit usage measurements, i.e.,
Normalizes the contents of val and rng so that rng lies entirely in the high-order symbol.
The probability of having a "one" is 1/(1<<_logp).
Outputs a symbol, with a carry bit.
If there is a potential to propagate a carry over several symbols, they are
buffered until it can be determined whether or not an actual carry will
occur.
If the counter for the buffered symbols overflows, then the stream becomes
undecodable.
This gives a theoretical limit of a few billion symbols in a single packet on
32-bit systems.
The alternative is to truncate the range in order to force a carry, but
requires similar carry tracking in the decoder, needlessly slowing it down.
Returns the number of bits "used" by the encoded or decoded symbols so far.
This same number can be computed in either the encoder or the decoder, and is
suitable for making coding decisions.
This will always be slightly larger than the exact value (e.g., all
rounding error is in the positive direction).
The number of bits.
This is a faster version of ec_tell_frac() that takes advantage
of the low (1/8 bit) resolution to use just a linear function
followed by a lookup to determine the exact transition thresholds.
Integer log in base2. Undefined for zero and negative numbers
Integer log in base2. Defined for zero, but not for negative numbers
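The difference between the two integer logarithms can be sketched as follows. The function names are hypothetical and the loop is a plain reference version, not the library's implementation.

```c
#include <stdint.h>

/* floor(log2(x)). The caller must ensure x > 0; the result is meaningless otherwise. */
int ilog2_u32(uint32_t x)
{
    int n = 0;
    while (x >>= 1) {
        n++;
    }
    return n;
}

/* Same as above, but defined as 0 for zero (and, in this sketch, for negative input). */
int zlog2_i32(int32_t x)
{
    return x <= 0 ? 0 : ilog2_u32((uint32_t)x);
}
```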
Multiplies two 16-bit fractional values. Bit-exactness of this macro is important
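As an illustration of a 16-bit fractional multiply, the sketch below assumes Q15 operands and a Q15 result (the Q format and the function name are assumptions, not stated above). It truncates rather than rounds, and relies on arithmetic right shift for negative products, which is implementation-defined in C but universal in practice.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q15: widen to 32 bits, multiply, then drop 15 fractional bits. */
int16_t mult16_16_q15(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 15);
}
```

For example, 0.5 in Q15 is 16384, and 0.5 * 0.5 gives 8192, which is 0.25 in Q15.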
Compute floor(sqrt(_val)) with exact arithmetic.
This has been tested on all possible 32-bit inputs.
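An exact integer square root can be computed without floating point. The sketch below uses the classic digit-by-digit (bitwise) method; it is not necessarily the algorithm the library uses, and the function name is hypothetical.

```c
#include <stdint.h>

/* Returns floor(sqrt(val)) using only integer operations. */
uint32_t isqrt32(uint32_t val)
{
    uint32_t result = 0;
    uint32_t bit = 1u << 30; /* highest power of four that fits in 32 bits */

    /* Start from the largest power of four not exceeding val. */
    while (bit > val) {
        bit >>= 2;
    }
    while (bit != 0) {
        if (val >= result + bit) {
            val -= result + bit;
            result = (result >> 1) + bit;
        } else {
            result >>= 1;
        }
        bit >>= 2;
    }
    return result;
}
```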
Sqrt approximation (QX input, QX/2 output)
Reciprocal approximation (Q15 input, Q16 output)
Reciprocal sqrt approximation in the range [0.25,1) (Q16 in, Q14 out)
Base-2 logarithm approximation (log2(x)). (Q14 input, Q10 output)
Base-2 exponential approximation (2^x). (Q10 input, Q16 output)
Rotate a32 right by 'rot' bits. Negative rot values result in rotating
left. Output is 32bit int.
Rotate a32 right by 'rot' bits. Negative rot values result in rotating
left. Output is 32bit uint.
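The rotate semantics described above (right for positive counts, left for negative ones) can be sketched like this. The function name is hypothetical and the sketch assumes the magnitude of rot is below 32.

```c
#include <stdint.h>

/* Rotates a32 right by rot bits; a negative rot rotates left. Assumes |rot| < 32. */
int32_t ror32(int32_t a32, int rot)
{
    uint32_t x = (uint32_t)a32;
    if (rot == 0) {
        return a32;
    }
    if (rot < 0) {
        unsigned m = (unsigned)(-rot);
        return (int32_t)((x << m) | (x >> (32 - m)));
    }
    unsigned r = (unsigned)rot;
    return (int32_t)((x >> r) | (x << (32 - r)));
}
```

The zero case is handled separately because shifting a 32-bit value by 32 is undefined in C.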
((a32 >> 16) * (b32 >> 16))
Adds two signed 32-bit values in a way that can overflow, while not relying on undefined behaviour
(just standard two's complement implementation-specific behaviour)
Subtracts two signed 32-bit values in a way that can overflow, while not relying on undefined behaviour
(just standard two's complement implementation-specific behaviour)
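One common way to get wrapping signed arithmetic without undefined behaviour is to do the operation on unsigned values and convert back. A minimal sketch with hypothetical function names:

```c
#include <stdint.h>

/* Wrapping add: unsigned arithmetic is well defined modulo 2^32. */
int32_t add32_ovflw(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Wrapping subtract, same idea. */
int32_t sub32_ovflw(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a - (uint32_t)b);
}
```

The final conversion back to a signed type is the implementation-specific step mentioned above; on two's complement targets it simply reinterprets the bits.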
Multiply-accumulate macros that allow overflow in the addition (ie, no asserts in debug mode)
(a32 * (int)((short)(b32))) >> 16. The output has to be a 32-bit int.
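The expression above multiplies a 32-bit value by the low 16 bits of another and keeps the top part of the product. A sketch using a 64-bit intermediate so the product cannot overflow (the function name is hypothetical; arithmetic right shift of negative values is assumed):

```c
#include <stdint.h>

/* (a32 * (int16)b32) >> 16, computed through a 64-bit intermediate. */
int32_t smulwb(int32_t a32, int32_t b32)
{
    return (int32_t)(((int64_t)a32 * (int16_t)b32) >> 16);
}
```

Only the bottom 16 bits of b32 take part, so a value such as 0x00010002 behaves like 2.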
Add with saturation for positive input values
Add with saturation for positive input values
Add with saturation for positive input values
Add with saturation for positive input values
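For two non-negative 32-bit inputs, overflow shows up as the sign bit being set in the wrapped sum, so saturation reduces to one test. A sketch with a hypothetical function name:

```c
#include <stdint.h>

/* Saturating add, valid only when both inputs are non-negative. */
int32_t add_pos_sat32(int32_t a, int32_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    return (sum & 0x80000000u) ? INT32_MAX : (int32_t)sum;
}
```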
saturates before shifting
Macro to convert floating-point constants to fixed-point by applying a scalar factor
Because of limitations of the C# JIT, this macro is actually evaluated at runtime and therefore should not be used if you want to maximize performance
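Converting a floating-point constant to fixed point amounts to scaling by 2^Q and rounding. The sketch below assumes a non-negative constant and a hypothetical function name; it is meant to show the arithmetic, not the library's exact macro.

```c
#include <stdint.h>

/* Scales c by 2^q and rounds to nearest (for non-negative c). */
int32_t fix_const(double c, int q)
{
    return (int32_t)(c * (double)((int64_t)1 << q) + 0.5);
}
```

In C a compiler can fold such a call on literals at compile time, which is the optimization the note above says is not available for the C# version.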
PSEUDO-RANDOM GENERATOR
Make sure to store the result as the seed for the next call (also in between
frames), otherwise result won't be random at all. When only using some of the
bits, take the most significant bits by right-shifting.
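The usage pattern described above (feed each result back in as the next seed, and take the high bits when fewer bits are needed) can be sketched with a linear congruential generator. The multiplier and increment below are the ones I recall from the reference SILK code; treat them, and the function names, as assumptions.

```c
#include <stdint.h>

/* One step of a 32-bit linear congruential generator (constants assumed). */
uint32_t lcg_next(uint32_t seed)
{
    return 907633515u + seed * 196314165u;
}

/* Takes the `bits` most significant bits of a state value (1 <= bits <= 32). */
uint32_t lcg_top_bits(uint32_t state, int bits)
{
    return state >> (32 - bits);
}
```

A caller would keep one state variable, assign state = lcg_next(state) on every draw, and persist it between frames.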
silk_SMMUL: Signed top word multiply.
Divide two int32 values and return result as int32 in a given Q-domain
I numerator (Q0)
I denominator (Q0)
I Q-domain of result (>= 0)
O returns a good approximation of "(a32 << Qres) / b32"
Invert int32 value and return result as int32 in a given Q-domain
I denominator (Q0)
I Q-domain of result (> 0)
a good approximation of "(1 << Qres) / b32"
a32 + ((b32 * (int)((short)(c32))) >> 16). The output has to be a 32-bit int.
(a32 * (b32 >> 16)) >> 16
(int)((short)(a32)) * (b32 >> 16)
a32 + (int)((short)(b32)) * (c32 >> 16)
a64 + (b32 * c32)
(a32 * b32) >> 16
a32 + ((b32 * c32) >> 16)
Get number of leading zeros and fractional part (the bits right after the leading one)
input
number of leading zeros
the 7 bits right after the leading one
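The leading-zero count and the 7-bit fraction can be extracted with one count and one shift. This is a reference sketch with hypothetical names, using a simple loop where a real implementation would likely use a hardware instruction.

```c
#include <stdint.h>

/* Counts leading zero bits of a 32-bit value; returns 32 for zero. */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Outputs the leading-zero count and the 7 bits that follow the leading one. */
void clz_frac(uint32_t in, int *lz, int *frac_q7)
{
    int zeros = clz32(in);
    int shift = 24 - zeros; /* moves the 7 bits after the leading one to bits 6..0 */

    *lz = zeros;
    if (shift >= 0) {
        *frac_q7 = (int)((in >> shift) & 0x7f);
    } else {
        *frac_q7 = (int)((in << -shift) & 0x7f);
    }
}
```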
Approximation of square root.
Accuracy: +/- 10% for output values > 15
+/- 2.5% for output values > 120
Approximation of 128 * log2() (very close inverse of silk_log2lin())
Convert input to a log scale
(I) input in linear scale
Approximation of 2^() (very close inverse of silk_lin2log())
Convert input to a linear scale
input on log scale
Linearized value
Interpolate two vectors
(O) interpolated vector [MAX_LPC_ORDER]
(I) first vector [MAX_LPC_ORDER]
(I) second vector [MAX_LPC_ORDER]
(I) interp. factor, weight on 2nd vector
(I) number of parameters
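A fixed-point interpolation between two vectors might look like the sketch below. It assumes the interpolation factor is in Q2 (0 selects the first vector, 4 the second); that format and the function name are assumptions, since the parameter list above does not state them.

```c
#include <stdint.h>

/* xi = x0 + (x1 - x0) * ifact_q2 / 4, element by element, for d parameters. */
void interpolate_q2(int16_t *xi, const int16_t *x0, const int16_t *x1,
                    int ifact_q2, int d)
{
    for (int i = 0; i < d; i++) {
        int32_t diff = (int32_t)x1[i] - (int32_t)x0[i];
        xi[i] = (int16_t)(x0[i] + ((diff * ifact_q2) >> 2));
    }
}
```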
Inner product with bit-shift
I input vector 1
I input vector 2
I number of bits to shift
I vector lengths
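An inner product with a bit-shift scales each product down before accumulating, which keeps the running sum within 32 bits. A sketch assuming 16-bit inputs and a per-term shift (both assumptions, as is the function name):

```c
#include <stdint.h>

/* Sums (a[i] * b[i]) >> scale over len elements. */
int32_t inner_prod_scaled(const int16_t *a, const int16_t *b, int scale, int len)
{
    int32_t sum = 0;
    for (int i = 0; i < len; i++) {
        sum += ((int32_t)a[i] * (int32_t)b[i]) >> scale;
    }
    return sum;
}
```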
returns the value that has fewer higher-order bits, ignoring sign bit (? I think?)
Counts leading zeroes
returns inverse base-2 log of a value
Arbitrary-rate audio resampler originally implemented for the Speex codec.
typedef int (* resampler_basic_func)(SpeexResamplerState*, int , Pointer<short>, int *, Pointer<short>, Pointer<int>);
Create a new resampler with integer input and output rates (in hertz).
The number of channels to be processed
Input sampling rate, in hertz
Output sampling rate, in hertz
Resampling quality, from 0 to 10
Create a new resampler with fractional input/output rates. The sampling
rate ratio is an arbitrary rational number with both the numerator and
denominator being 32-bit integers.
The number of channels to be processed
Numerator of sampling rate ratio
Denominator of sampling rate ratio
Input sample rate rounded to the nearest integer (in hz)
Output sample rate rounded to the nearest integer (in hz)
Resampling quality, from 0 to 10
A newly created resampler
Make sure that the first samples to go out of the resamplers don't have
leading zeros. This is only useful before starting to use a newly created
resampler. It is recommended to use that when resampling an audio file, as
it will generate a file with the same length. For real-time processing,
it is probably easier not to use this call (so that the output duration
is the same for the first frame).
Clears the resampler buffers so a new (unrelated) stream can be processed.
Sets the input and output rates
Input sampling rate, in hertz
Output sampling rate, in hertz
Get the current input/output sampling rates (integer value).
(Output) Sampling rate of input
(Output) Sampling rate of output
Sets the input/output sampling rates and resampling ratio (fractional values in Hz supported)
Numerator of the sampling rate ratio
Denominator of the sampling rate ratio
Input sampling rate rounded to the nearest integer (in Hz)
Output sampling rate rounded to the nearest integer (in Hz)
Gets the current resampling ratio. This will be reduced to lowest terms
(Output) numerator of the sampling rate ratio
(Output) denominator of the sampling rate ratio
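Reducing a rate ratio means dividing the numerator and denominator by their greatest common divisor. A small sketch with hypothetical names:

```c
#include <stdint.h>

/* Euclid's algorithm. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Divides both terms by their greatest common divisor, in place. */
void reduce_ratio(uint32_t *num, uint32_t *den)
{
    uint32_t g = gcd_u32(*num, *den);
    if (g > 1) {
        *num /= g;
        *den /= g;
    }
}
```

For instance, converting 48000 Hz to 44100 Hz gives the ratio 160/147.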
Gets or sets the resampling quality between 0 and 10, where 0 has poor
quality and 10 has very high quality.
Gets or sets the input stride
Gets or sets the output stride
Get the latency introduced by the resampler measured in input samples.
Gets the latency introduced by the resampler measured in output samples.
Gets the latency introduced by the resampler.
The Opus decoder structure.
Opus is a stateful codec with overlapping blocks and as a result Opus
packets are not coded independently of each other. Packets must be
passed into the decoder serially and in the correct order for a correct
decode. Lost packets can be replaced with loss concealment by calling
the decoder with a null reference and zero length for the missing packet.
A single codec state may only be accessed from a single thread at
a time and any required locking must be performed by the caller. Separate
streams must be decoded with separate decoder states and can be decoded
in parallel.
Decodes an Opus packet, putting the decoded audio into a floating-point buffer.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120ms; 5760 for 48khz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
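The buffer sizing rules above are simple arithmetic on the sample rate. A sketch with hypothetical helper names:

```c
/* Samples per channel in the longest possible packet (120 ms). */
int max_frame_samples(int sample_rate_hz)
{
    return sample_rate_hz * 120 / 1000;
}

/* Nonzero if frame_size is a whole number of 10 ms blocks at this rate. */
int is_multiple_of_10ms(int frame_size, int sample_rate_hz)
{
    return frame_size > 0 && (frame_size * 100) % sample_rate_hz == 0;
}
```

At 48 kHz the first helper gives 5760, so a stereo output buffer that is always large enough holds 5760 * 2 values.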
Decodes an Opus packet, putting the decoded audio into an int16 buffer.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120ms; 5760 for 48khz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
Resets all buffers and prepares this decoder to process a fresh (unrelated) stream
Gets the version string of the library backing this implementation.
An arbitrary version string.
Gets the encoded bandwidth of the last packet decoded. This may be lower than the actual decoding sample rate,
and is only an indicator of the encoded audio's quality
Returns the final range of the entropy coder. If you need this then I also assume you know what it's for.
Gets or sets the gain (Q8) to use in decoding
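In the Opus API the decoder gain is expressed in decibels with 8 fractional bits, so a value in dB is multiplied by 256. The helper below is a hypothetical convenience for that conversion, assuming this port follows the same convention.

```c
/* Converts a gain in dB to Q8 (1/256 dB steps), rounding to nearest. */
int gain_db_to_q8(double gain_db)
{
    double scaled = gain_db * 256.0;
    return (int)(scaled >= 0 ? scaled + 0.5 : scaled - 0.5);
}
```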
Gets the duration of the last packet, in PCM samples per channel
Gets the number of channels that this decoder decodes to. Always constant for the lifetime of the decoder.
Gets the last estimated pitch value of the decoded audio
Gets the sample rate that this decoder decodes to. Always constant for the lifetime of the decoder
Represents an Opus encoder for a 1- or 2-channel audio stream.
May be backed either by managed code or a native adapter layer,
depending on your platform and performance requirements.
Encodes an Opus frame.
Input signal (Interleaved if stereo). Length should be at least frame_size * channels
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
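The permitted frame sizes listed above correspond to durations of 2.5, 5, 10, 20, 40, and 60 ms. A validity check can be written without floating point, as in this sketch (the function name is hypothetical):

```c
/* Nonzero if frame_size samples per channel is 2.5, 5, 10, 20, 40, or 60 ms. */
int is_valid_opus_frame_size(int frame_size, int sample_rate_hz)
{
    return 400 * frame_size == sample_rate_hz        /* 2.5 ms */
        || 200 * frame_size == sample_rate_hz        /* 5 ms   */
        || 100 * frame_size == sample_rate_hz        /* 10 ms  */
        || 50 * frame_size == sample_rate_hz         /* 20 ms  */
        || 25 * frame_size == sample_rate_hz         /* 40 ms  */
        || 50 * frame_size == 3 * sample_rate_hz;    /* 60 ms  */
}
```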
Encodes an Opus frame using floating point input.
Input signal in float format (Interleaved if stereo). Length should be at least frame_size * channels.
Value should be normalized to the +/- 1.0 range. Samples with a range beyond +/-1.0 will be clipped.
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
Resets the state of this encoder, usually to prepare it for processing
a new audio stream without reallocating.
Gets the version string of the library backing this implementation.
An arbitrary version string.
Gets or sets the application (or signal type) of the input signal. This hints
to the encoder what type of details we want to preserve in the encoding.
This cannot be changed after the encoder has started
Gets or sets the bitrate for encoder, in bits per second. Valid bitrates are between 6K (6144) and 510K (522240)
Gets or sets the maximum number of channels to be encoded. This can be used to force a downmix from stereo to mono if stereo
separation is not important
Gets or sets the maximum bandwidth to be used by the encoder. This can be used if
high-frequency audio is not important to your application (e.g. telephony)
Gets or sets the "preferred" encoded bandwidth. This does not affect the sample rate of the input audio,
only the encoding cutoffs
Gets or sets a flag to enable Discontinuous Transmission mode. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, the encoder detects silence and background noise
and reduces the number of output packets, with up to 600ms in between separate packet transmissions.
Gets or sets the encoder complexity, between 0 and 10
Gets or sets a flag to enable Forward Error Correction. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, lost packets can be partially recovered
by decoding data stored in the following packet.
Gets or sets the expected amount of packet loss in the transmission medium, from 0 to 100.
Only applies if UseInbandFEC is also enabled, and the encoder is in SILK mode.
Gets or sets a flag to enable Variable Bitrate encoding. This is recommended as it generally improves audio quality
with little impact on average bitrate
Gets or sets a flag to enable constrained VBR. This only applies when the encoder is in CELT mode (i.e. high bitrates)
Gets or sets a hint to the encoder for what type of audio is being processed, voice or music.
This is not set by the encoder itself i.e. it's not the result of any actual signal analysis.
Gets the number of samples of audio that are being stored in a buffer and are therefore contributing to latency.
Gets the encoder's input sample rate. This is fixed for the lifetime of the encoder.
Gets the number of channels that this encoder expects in its input. Always constant for the lifetime of the encoder.
Returns the final range of the entropy coder. If you need this then I also assume you know what it's for.
Gets or sets the bit resolution of the input audio signal. Though the encoder always uses 16-bit internally, this can help
it make better decisions about bandwidth and cutoff values
Gets or sets a fixed length for each encoded frame. Typically, the encoder just chooses a frame duration based on the input length
and the current internal mode. This can be used to enforce an exact length if it is required by your application (e.g. monotonous transmission)
Sets a user-forced mode for the encoder. There are three modes, SILK, HYBRID, and CELT. Silk can only encode below 40Kbit/s and is best suited
for speech. Silk also has modes such as FEC which may be desirable. Celt sounds better at higher bandwidth and is comparable to AAC. It also performs somewhat faster.
Hybrid is used to create a smooth transition between the two modes. Note that this value may not always be honored due to other factors such
as frame size and bitrate.
Gets or sets a flag to disable almost all use of prediction, which makes frames almost completely independent of each other at some cost in quality
The Opus multistream decoder structure.
Multistream decoding is an aggregate of several internal decoders and extra logic to parse multiple frames from
single packets and map them to the correct channels. The behavior of a multistream decoder is functionally the
same as a single decoder in most other respects.
Decodes a multichannel Opus packet, putting the decoded audio into a floating-point buffer.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
The number of samples (per channel) of available space in the output PCM buf.
It should contain at least enough space to contain (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
Decodes a multichannel Opus packet, putting the decoded audio into an int16 buffer.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
The number of samples (per channel) of available space in the output PCM buf.
It should contain at least enough space to contain (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
Resets all buffers and prepares this decoder to process a fresh (unrelated) stream
Gets the version string of the library backing this implementation.
An arbitrary version string.
Gets the encoded bandwidth of the last packet decoded. This may be lower than the actual decoding sample rate,
and is only an indicator of the encoded audio's quality
Returns the final range of the entropy coder. If you need this then I also assume you know what it's for.
Gets or sets the gain (Q8) to use in decoding
Gets the duration of the last packet, in PCM samples per channel
Gets the sample rate that this decoder decodes to. Always constant for the lifetime of the decoder
Gets the number of channels of the input data. Always constant for the lifetime of the decoder
The Opus multistream encoder structure.
Multistream encoding is an aggregate of several internal encoders and extra logic to pack multiple frames into
single packets and map them to the correct channels. The behavior of a multistream encoder is functionally the
same as a single encoder in most other respects.
Encodes a multistream Opus frame.
Input signal, interleaved to the total number of surround channels, according to Vorbis channel layouts.
Length should be at least (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
Encodes a multistream Opus frame.
Input signal, interleaved to the total number of surround channels, according to Vorbis channel layouts.
Length should be at least (# of samples) * (# of channels) for a given single frame size (maximum 120ms).
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
Resets the state of this encoder, usually to prepare it for processing
a new audio stream without reallocating.
Gets the version string of the library backing this implementation.
An arbitrary version string.
Gets or sets the application (or signal type) of the input signal. This hints
to the encoder what type of details we want to preserve in the encoding.
This cannot be changed after the encoder has started
Gets or sets the "preferred" encoded bandwidth. This does not affect the sample rate of the input audio,
only the encoding cutoffs
Gets or sets the bitrate for encoder, in bits per second. Valid bitrates are between 6K (6144) and 510K (522240)
Gets or sets the encoder complexity, between 0 and 10
Gets the number of channels that this encoder expects in its input. Always constant for the lifetime of the encoder.
Gets or sets a fixed length for each encoded frame. Typically, the encoder just chooses a frame duration based on the input length
and the current internal mode. This can be used to enforce an exact length if it is required by your application (e.g. monotonous transmission)
Returns the final range of the entropy coder. If you need this then I also assume you know what it's for.
Gets or sets a user-forced mode for the encoder. There are three modes, SILK, HYBRID, and CELT. Silk can only encode below 40Kbit/s and is best suited
for speech. Silk also has modes such as FEC which may be desirable. Celt sounds better at higher bandwidth and is comparable to AAC. It also performs somewhat faster.
Hybrid is used to create a smooth transition between the two modes. Note that this value may not always be honored due to other factors such
as frame size and bitrate.
Gets the number of samples of audio that are being stored in a buffer and are therefore contributing to latency.
Gets or sets the bit resolution of the input audio signal. Though the encoder always uses 16-bit internally, this can help
it make better decisions about bandwidth and cutoff values
Gets or sets the maximum bandwidth to be used by the encoder. This can be used if
high-frequency audio is not important to your application (e.g. telephony)
Gets or sets the expected amount of packet loss in the transmission medium, from 0 to 100.
Only applies if UseInbandFEC is also enabled, and the encoder is in SILK mode.
Gets or sets a flag to disable almost all use of prediction, which makes frames almost completely independent of each other at some cost in quality
Gets the encoder's input sample rate. This is fixed for the lifetime of the encoder.
Gets or sets a hint to the encoder for what type of audio is being processed, voice or music.
This is not set by the encoder itself i.e. it's not the result of any actual signal analysis.
Gets or sets a flag to enable constrained VBR. This only applies when the encoder is in CELT mode (i.e. high bitrates)
Gets or sets a flag to enable Discontinuous Transmission mode. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, the encoder detects silence and background noise
and reduces the number of output packets, with up to 600ms in between separate packet transmissions.
Gets or sets a flag to enable Forward Error Correction. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, lost packets can be partially recovered
by decoding data stored in the following packet.
Gets or sets a flag to enable Variable Bitrate encoding. This is recommended as it generally improves audio quality
with little impact on average bitrate
Represents an audio resampler which can process single-channel or interleaved-channel inputs
in either int16 or float32 formats.
Get the latency introduced by the resampler measured in input samples.
Gets or sets the input stride.
Gets the latency introduced by the resampler measured in output samples.
Gets the latency introduced by the resampler.
Gets or sets the output stride.
Gets or sets the resampling quality between 0 and 10, where 0 has poor
quality and 10 has very high quality.
Gets the current resampling ratio. This will be reduced to lowest terms
(Output) numerator of the sampling rate ratio
(Output) denominator of the sampling rate ratio
Get the current input/output sampling rates (integer value).
(Output) Sampling rate of input
(Output) Sampling rate of output
Clears the resampler buffers so a new (unrelated) stream can be processed.
Make sure that the first samples to go out of the resamplers don't have
leading zeros. This is only useful before starting to use a newly created
resampler. It is recommended to use that when resampling an audio file, as
it will generate a file with the same length. For real-time processing,
it is probably easier not to use this call (so that the output duration
is the same for the first frame).
Resample a float32 sample array. The input and output buffers must *not* overlap
The index of the channel to process (for multichannel input, 0 otherwise)
Input buffer
Number of input samples in the input buffer. After this function returns, this value
will be set to the number of input samples actually processed
Output buffer
Size of the output buffer. After this function returns, this value will be set to the number
of output samples actually produced
Resample an int16 sample array. The input and output buffers must *not* overlap
The index of the channel to process (for multichannel input, 0 otherwise)
Input buffer
Number of input samples in the input buffer. After this function returns, this value
will be set to the number of input samples actually processed
Output buffer
Size of the output buffer. After this function returns, this value will be set to the number
of output samples actually produced
Resamples an interleaved float32 array. The stride is automatically determined by the number of channels of the resampler.
Input buffer
The number of samples *PER-CHANNEL* in the input buffer. After this function returns, this
value will be set to the number of input samples actually processed
Output buffer
The size of the output buffer in samples-per-channel. After this function returns, this value
will be set to the number of samples per channel actually produced
Resamples an interleaved int16 array. The stride is automatically determined by the number of channels of the resampler.
Input buffer
The number of samples *PER-CHANNEL* in the input buffer. After this function returns, this
value will be set to the number of input samples actually processed
Output buffer
The size of the output buffer in samples-per-channel. After this function returns, this value
will be set to the number of samples per channel actually produced
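The per-channel counting used by the interleaved variants can be sketched as follows. This is a minimal illustration of the buffer layout only, with hypothetical helper names; it does not perform any resampling.

```python
def interleave(channels):
    """Interleave equal-length per-channel lists as [L0, R0, L1, R1, ...]."""
    n = len(channels[0])
    if any(len(c) != n for c in channels):
        raise ValueError("all channels must have the same length")
    return [channels[ch][i] for i in range(n) for ch in range(len(channels))]

def samples_per_channel(interleaved_length, num_channels):
    """Convert a total interleaved element count to a per-channel count."""
    if interleaved_length % num_channels != 0:
        raise ValueError("buffer length is not a multiple of the channel count")
    return interleaved_length // num_channels

stereo = interleave([[1, 2, 3], [10, 20, 30]])
# The length to pass to an interleaved call is 3 (per channel), not 6.
print(stereo, samples_per_channel(len(stereo), 2))
```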
Sets the input/output sampling rates and resampling ratio (fractional values in Hz supported)
Numerator of the sampling rate ratio
Denominator of the sampling rate ratio
Input sampling rate rounded to the nearest integer (in Hz)
Output sampling rate rounded to the nearest integer (in Hz)
Sets the input and output rates
Input sampling rate, in hertz
Output sampling rate, in hertz
Represents the status of loading a native library.
The library may or may not be available.
The library is available and ready to invoke.
The library is not available on this system.
Global helpers for handling platform and OS-specific tasks regarding native P/Invoke libraries
(mostly to make sure that the right one for the current platform actually gets invoked).
Gets information about the current runtime OS and processor, in parity with .NET's Runtime Identifier (RID) system.
The current OS and architecture.
Given a native developer-provided library name, such as "mynativelib",
search the current runtime directory + /runtimes/{runtime ID}/native for files like "mynativelib.dll" / "mynativelib.so",
matching the given library name and current runtime OS and architecture, and then prepare that library file
in such a way that future P/Invoke calls to that library should succeed and should invoke the correct
architecture-specific code.
The library name to prepare (without platform-specific extensions such as ".dll")
A logger to log the result of the operation
Whether the runtime believes the given library is now available for loading or not.
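The search location described above can be sketched as a list of candidate paths. This is only an illustration under stated assumptions: the helper name is hypothetical, the extension list contains just the two extensions named in the text, and the actual probing order and file-name variants used by the library are not specified here.

```python
import posixpath

# Assumed extension list; the text only names ".dll" and ".so" explicitly.
ASSUMED_EXTENSIONS = (".dll", ".so")

def candidate_library_paths(base_dir, runtime_id, library_name):
    """Build {base_dir}/runtimes/{runtime_id}/native/{library_name}{ext} candidates."""
    native_dir = posixpath.join(base_dir, "runtimes", runtime_id, "native")
    return [posixpath.join(native_dir, library_name + ext) for ext in ASSUMED_EXTENSIONS]

for path in candidate_library_paths("/app", "linux-x64", "mynativelib"):
    print(path)
```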
Gets the runtime ID string for a given architecture, e.g. "arm64"
Gets the runtime ID string for a given operating system, e.g. "osx"
Given a runtime ID, such as "android-arm64", get the list of all inherited runtime IDs in descending order of specificity,
not including the requested ID.
Parses the output from .NET Core's RuntimeInformation.RuntimeIdentifier into a struct.
The runtime identifier.
A parsed identifier struct.
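One plausible way to split a runtime identifier into its OS and architecture parts is sketched below. It is an assumption-laden illustration, not the library's parser: the function name is hypothetical, and the architecture names are the platform identifiers documented for the architecture enumeration. Splitting on the last hyphen keeps multi-part OS names such as "linux-musl" intact.

```python
# Architecture identifiers as documented for the architecture enumeration.
KNOWN_ARCHITECTURES = {
    "any", "x86", "x64", "arm", "arm64", "armel", "armv6",
    "mips64", "ppc64le", "riscv64", "s390x", "loongarch64",
}

def split_runtime_id(rid):
    """Return (os_part, arch_part); arch_part is None when no architecture suffix is found."""
    rid = rid.strip().lower()
    head, sep, tail = rid.rpartition("-")
    if sep and tail in KNOWN_ARCHITECTURES:
        return head, tail
    return rid, None

print(split_runtime_id("android-arm64"))
print(split_runtime_id("linux-musl-x64"))
```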
Attempts to parse a runtime OS identifier string (e.g. "win10", "osx") into a structured
operating system enum. Returns the enum's error-case value if parsing failed.
The OS identifier string (should be lowercase but not strictly necessary)
A parsed OS enumeration
Attempts to parse a runtime OS identifier string (e.g. "win10", "osx") into a structured
operating system enum. Returns the enum's error-case value if parsing failed.
The OS identifier string (should be lowercase but not strictly necessary)
A parsed OS enumeration
Attempts to parse a runtime platform architecture string (e.g. "x64", "arm") into a structured
architecture enum. Returns the enum's error-case value if parsing failed.
The architecture identifier string (should be lowercase but not strictly necessary)
A parsed architecture enumeration
Attempts to parse a runtime platform architecture string (e.g. "x64", "arm") into a structured
architecture enum. Returns the enum's error-case value if parsing failed.
The architecture identifier string (should be lowercase but not strictly necessary)
A parsed architecture enumeration
Attempts to load the given library using kernel hooks for the current runtime operating system.
The name of the library to open, e.g. "libc"
The currently running platform
A logger
The availability of the given library after the probe attempt (it may load a locally provided or system-installed version of the requested library).
Represents a tuple combination of operating system + processor architecture.
An operating system specifier, e.g. Linux
A processor architecture specifier, e.g. i386
Constructs a new operating system + processor architecture tuple.
An operating system specifier.
A processor architecture specifier.
An enumerated value for the current platform's processor architecture.
Error case.
"any" platform identifier
"x86" platform identifier
"x64" platform identifier
"arm" platform identifier (Implies hard float support)
"arm64" platform identifier, also called AArch64
"armel" platform identifier (ARM v5 or older)
"armv6" platform identifier
"mips64" platform identifier
"ppc64le" platform identifier
"riscv64" platform identifier
"s390x" platform identifier
"loongarch64" platform identifier
Nobody supports this
An enumerated value for the current platform's operating system.
Error case.
"any" OS identifier
"win" OS identifier
"linux" OS identifier
"osx" OS identifier
"ios" OS identifier
"iossimulator" OS identifier
"android" OS identifier
"freebsd" OS identifier
"illumos" OS identifier
"linux-bionic" OS identifier
"linux-musl" OS identifier
"maccatalyst" OS identifier
"solaris" OS identifier
"tvos" OS identifier
"tvossimulator" OS identifier
"unix" OS identifier
"browser" OS identifier
"wasi" OS identifier
Central factory class for creating Opus encoder / decoder structs.
Using these methods allows the runtime to decide the most appropriate
implementation for your platform based on what is available (generally,
this means using a P/Invoke native adapter if native libopus is present)
Creates an IOpusEncoder appropriate for the current platform.
This could potentially involve a native code layer.
The input sample rate. Must be a valid Opus samplerate (8K, 12K, 16K, 24K, 48K)
The number of channels of input (1 or 2)
The hint for the type of audio or application this encoder will be used for.
An optional logger for debugging messages about native library bindings.
A newly created opus encoder.
Creates an IOpusDecoder appropriate for the current platform.
This could potentially involve a native code layer.
The output sample rate to decode to.
Doesn't have to be the same sample rate the audio was encoded at.
Must be a valid Opus samplerate (8K, 12K, 16K, 24K, 48K)
The number of channels to decode to (1 or 2).
Doesn't have to be the same channel count the audio was encoded at.
An optional logger for debugging messages about native library bindings.
A newly created opus decoder.
Creates a multichannel Opus encoder using the "new API". This constructor allows you to use predefined Vorbis channel mappings, or specify your own.
The sample rate of the input
The total number of channels to encode (1 - 255)
The mapping family to use. 0 = mono/stereo, 1 = use Vorbis mappings, 255 = use raw channel mapping
The number of streams to encode
The number of coupled streams
A channel mapping describing which streams go to which channels: see
The application to use for the encoders
An optional logger for debugging messages about native library bindings.
A newly created opus multistream encoder.
Creates a new multichannel decoder
The sample rate to decode to.
The total number of channels being decoded.
The number of streams being decoded.
The number of coupled streams being decoded.
A channel mapping describing which streams go to which channels: see
An optional logger for debugging messages about native library bindings.
A newly created opus multistream decoder.
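To make the channel mapping more concrete, the sketch below resolves one mapping entry to a stream, following the convention of the reference Opus multistream API: indices below twice the coupled-stream count address the left or right channel of a coupled (stereo) stream, higher indices address mono streams, and 255 means silence. The function name is hypothetical, and the example mapping is the Vorbis-order 5.1 layout.

```python
def resolve_channel(index, streams, coupled_streams):
    """Map one channel-mapping entry to (stream_number, role), or None for silence."""
    if index == 255:
        return None  # special value: this output channel is silent
    if index < 0 or index >= streams + coupled_streams:
        raise ValueError("mapping index out of range")
    if index < 2 * coupled_streams:
        return index // 2, "left" if index % 2 == 0 else "right"
    return index - coupled_streams, "mono"

# 5.1 surround: 4 streams, 2 of them coupled
mapping = [0, 4, 1, 2, 3, 5]
for channel, index in enumerate(mapping):
    print(channel, resolve_channel(index, streams=4, coupled_streams=2))
```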
Gets or sets a global flag that determines whether the codec factory should attempt to
use a native opus.dll or libopus implementation. True by default, but you can override
the value to false if the library probe causes problems.
The type of signal being handled (either short or float) - changes based on which API is used
Returns the version number of this library
The type of signal being handled (either short or float)
Best for most VoIP/videoconference applications where listening quality and intelligibility matter most
Best for broadcast/high-fidelity application where the decoded audio should be as close as possible to the input
Only use when lowest-achievable latency is what matters most. Voice-optimized modes cannot be used.
These are the actual Encoder CTL ID numbers.
They should not be used directly by applications.
In general, SETs should be even and GETs should be odd.
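The even/odd convention can be checked against a few request numbers from the reference Opus headers. The table below is a small excerpt chosen for illustration, not a complete list, and the convention holds only "in general", as the text says.

```python
# A few CTL request numbers as defined in the reference opus_defines.h.
CTL_REQUESTS = {
    "OPUS_SET_APPLICATION_REQUEST": 4000,
    "OPUS_GET_APPLICATION_REQUEST": 4001,
    "OPUS_SET_BITRATE_REQUEST": 4002,
    "OPUS_GET_BITRATE_REQUEST": 4003,
    "OPUS_SET_VBR_REQUEST": 4006,
    "OPUS_GET_VBR_REQUEST": 4007,
}

def follows_convention(name, request_id):
    """True when a SET request is even or a GET request is odd."""
    if name.startswith("OPUS_SET_"):
        return request_id % 2 == 0
    if name.startswith("OPUS_GET_"):
        return request_id % 2 == 1
    return True  # other requests are not covered by the convention

print(all(follows_convention(n, i) for n, i in CTL_REQUESTS.items()))
```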
Resets the codec state to be equivalent to a freshly initialized state.
This should be called when switching streams in order to prevent
the back to back decoding from giving different results from
one at a time decoding.
Note that since most API-level errors are detected and thrown as
OpusExceptions, direct use of this class is not usually needed
unless you need to interop with existing C-style error handlers.
No error
-1: One or more invalid/out of range arguments
-2: Not enough bytes allocated in the buffer
-3: An internal error was detected
-4: The compressed data passed is corrupted
-5: Invalid/unsupported request number
-6: An encoder or decoder structure is invalid or already freed
-7: Memory allocation has failed (This is typically not possible in the C# implementation).
-8: Used in rare cases where Concentus throws an error that is not covered by the
original Opus spec.
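When interoperating with C-style handlers, a lookup table along these lines may be convenient. The names for codes 0 through -7 are those of the reference Opus C API; the name used for -8 is a placeholder invented for this sketch, since the text does not give one.

```python
OPUS_ERRORS = {
    0: ("OPUS_OK", "No error"),
    -1: ("OPUS_BAD_ARG", "One or more invalid/out of range arguments"),
    -2: ("OPUS_BUFFER_TOO_SMALL", "Not enough bytes allocated in the buffer"),
    -3: ("OPUS_INTERNAL_ERROR", "An internal error was detected"),
    -4: ("OPUS_INVALID_PACKET", "The compressed data passed is corrupted"),
    -5: ("OPUS_UNIMPLEMENTED", "Invalid/unsupported request number"),
    -6: ("OPUS_INVALID_STATE", "An encoder or decoder structure is invalid or already freed"),
    -7: ("OPUS_ALLOC_FAIL", "Memory allocation has failed"),
    # Placeholder name (assumption): error not covered by the original Opus spec.
    -8: ("CONCENTUS_OTHER", "Error not covered by the original Opus spec"),
}

def describe_error(code):
    """Format a raw error code for logging; unknown codes are reported as such."""
    name, text = OPUS_ERRORS.get(code, ("UNKNOWN", "Unrecognized error code"))
    return f"{name} ({code}): {text}"

print(describe_error(-4))
```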
Select frame size from the argument (default)
Use 2.5 ms frames
Use 5 ms frames
Use 10 ms frames
Use 20 ms frames
Use 40 ms frames
Use 60 ms frames
Do not use - not fully implemented. Optimize the frame size dynamically.
Signal being encoded is voice
Signal being encoded is music
multi-layer perceptron processor
Auto/default setting
Maximum bitrate
An exception type which wraps a raw Opus error code.
Gets the raw Opus error code as defined in the C spec. These codes can
be found in the enumeration.
Creates a new, empty exception.
This constructor is discouraged as it does not set the raw error code.
Creates a new exception with a custom error message.
This constructor is discouraged as it does not set the raw error code.
so it reports
Creates a new exception with a custom error message.
This constructor is discouraged as it does not set the raw error code.
so it reports
Creates a new exception with a custom error message and matching Opus error code.
The entire error message string.
The raw error code that can be passed to other C-style error handlers
if necessary (it is not used to format the error string).
state object for multi-layer perceptron
The Opus decoder structure.
Opus is a stateful codec with overlapping blocks and as a result Opus
packets are not coded independently of each other. Packets must be
passed into the decoder serially and in the correct order for a correct
decode. Lost packets can be replaced with loss concealment by calling
the decoder with a null reference and zero length for the missing packet.
A single codec state may only be accessed from a single thread at
a time and any required locking must be performed by the caller. Separate
streams must be decoded with separate decoder states and can be decoded
in parallel.
Sampling rate (at the API level)
OPUS_DECODER_RESET_START
Allocates and initializes a decoder state.
Internally Opus stores data at 48000 Hz, so that should be the default
value for Fs. However, the decoder can efficiently decode to buffers
at 8, 12, 16, and 24 kHz so if for some reason the caller cannot use
data at the full sample rate, or knows the compressed data doesn't
use the full frequency range, it can request decoding at a reduced
rate. Likewise, the decoder is capable of filling in either mono or
interleaved stereo pcm buffers, at the caller's request.
Sample rate to decode at (Hz). This must be one of 8000, 12000, 16000, 24000, or 48000.
Number of channels (1 or 2) to decode
The created decoder
Initializes a previously allocated decoder state.
The state must be at least the size returned by opus_decoder_get_size().
This is intended for applications which use their own allocator instead of malloc. @see opus_decoder_create,opus_decoder_get_size
To reset a previously initialized state, use the #OPUS_RESET_STATE CTL.
@param [in] st OpusDecoder*: Decoder state.
@param [in] Fs opus_int32: Sampling rate to decode to (Hz).
This must be one of 8000, 12000, 16000,
24000, or 48000.
@param [in] channels int: Number of channels (1 or 2) to decode
@retval #OPUS_OK Success or @ref opus_errorcodes
Decodes an Opus packet.
The input payload. This may be NULL if the previous packet was lost in transit (when PLC is enabled)
The offset to use when reading the input payload. Usually 0
The number of bytes in the payload
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The offset to use when writing to the output buffer
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120 ms; 5760 for 48 kHz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
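The 5760 figure follows from the 120 ms maximum packet duration at 48 kHz. A small sketch of the sizing arithmetic, using hypothetical helper names, also shows that the output array itself must hold that many samples for every channel:

```python
MAX_PACKET_DURATION_MS = 120  # maximum duration of a single Opus packet

def max_samples_per_channel(sample_rate):
    """Largest per-channel sample count a single packet can decode to."""
    return sample_rate * MAX_PACKET_DURATION_MS // 1000

def min_safe_buffer_length(sample_rate, channels):
    """Total number of array elements needed to hold any decoded packet."""
    return max_samples_per_channel(sample_rate) * channels

print(max_samples_per_channel(48000))    # 5760
print(min_safe_buffer_length(48000, 2))  # 11520
```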
Decodes an Opus packet.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120 ms; 5760 for 48 kHz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
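The order of calls implied by the FEC description can be sketched without touching the decoder itself. The planner below is a hypothetical helper that assumes the receiver can look one packet ahead (for example through a jitter buffer): a lost packet followed by a received one is recovered with decode_fec set on the following packet, which is then decoded again normally; a lost packet with nothing after it falls back to concealment with an empty payload.

```python
def plan_decode_calls(packets):
    """packets: list of payloads, with None marking a lost packet.

    Returns a list of (kind, payload_index, decode_fec) tuples, where
    payload_index is None when the decoder should be given no payload.
    """
    calls = []
    for i, packet in enumerate(packets):
        if packet is not None:
            calls.append(("normal", i, False))
            continue
        following = packets[i + 1] if i + 1 < len(packets) else None
        if following is not None:
            # Recreate lost packet i from FEC data carried in packet i + 1.
            calls.append(("fec", i + 1, True))
        else:
            # Nothing to recover from: ask the decoder for loss concealment.
            calls.append(("plc", None, False))
    return calls

# Packet 1 is lost, so packet 2 is decoded twice: first with FEC, then normally.
print(plan_decode_calls([b"p0", None, b"p2"]))
```

In both recovery cases the frame size passed to the decoder should match the duration of the missing audio, as described above.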
Decodes an Opus packet, putting the output data into a floating-point buffer.
The input payload. This may be NULL if the previous packet was lost in transit (when PLC is enabled)
The offset to use when reading the input payload. Usually 0
The number of bytes in the payload
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The offset to use when writing to the output buffer
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120 ms; 5760 for 48 kHz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet. In that case,
the length of frame_size must be EXACTLY the length of the audio that was lost, or else the decoder will be in an inconsistent state.
The number of decoded samples (per channel)
Decodes an Opus packet.
The input payload. This may be empty if the previous packet was lost in transit (when PLC is enabled)
A buffer to put the output PCM. The output size is (# of samples) * (# of channels).
You can use the OpusPacketInfo helpers to get a hint of the frame size before you decode the packet if you need
exact sizing. Otherwise, the minimum safe buffer size is 5760 samples
The number of samples (per channel) of available space in the output PCM buf.
If this is less than the maximum packet duration (120 ms; 5760 for 48 kHz), this function will
not be capable of decoding some packets. In the case of PLC (data == NULL) or FEC (decode_fec == true),
then frame_size needs to be exactly the duration of the audio that is missing, otherwise the decoder will
not be in an optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size *must*
be a multiple of 10 ms.
Indicates that we want to recreate the PREVIOUS (lost) packet using FEC data from THIS packet. Using this packet
recovery scheme, you will actually decode this packet twice, first with decode_fec TRUE and then again with FALSE. If FEC data is not
available in this packet, the decoder will simply generate a best-effort recreation of the lost packet.
The number of decoded samples
Gets the encoded bandwidth of the last packet decoded. This may be lower than the actual decoding sample rate,
and is only an indicator of the encoded audio's quality
Gets the sample rate that this decoder decodes to. Always constant for the lifetime of the decoder
Gets the number of channels that this decoder decodes to. Always constant for the lifetime of the decoder.
Gets the last estimated pitch value of the decoded audio
Gets or sets the gain (Q8) to use in decoding
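For the Q8 gain, the reference libopus definition of the decoder gain CTL treats the value as decibels multiplied by 256, applied as a linear factor of 10^(x / (20 * 256)), with a 16-bit signed range. Assuming this port follows the same convention, the conversion can be sketched as below; the helper names are hypothetical.

```python
def db_to_q8(gain_db):
    """Convert a gain in dB to Q8 units (dB * 256), checking the 16-bit range."""
    q8 = round(gain_db * 256)
    if not -32768 <= q8 <= 32767:
        raise ValueError("gain out of range for a Q8 value")
    return q8

def q8_to_linear(gain_q8):
    """Linear amplitude factor corresponding to a Q8 dB gain."""
    return 10 ** (gain_q8 / (20.0 * 256))

print(db_to_q8(6.0))       # 1536
print(q8_to_linear(1536))  # roughly 2x amplitude
```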
Gets the duration of the last packet, in PCM samples per channel
Resets all buffers and prepares this decoder to process a fresh (unrelated) stream
The Opus encoder structure
OPUS_ENCODER_RESET_START
Allocates and initializes an encoder state.
Note that regardless of the sampling rate and number of channels selected, the Opus encoder
can switch to a lower audio bandwidth or number of channels if the bitrate
selected is too low. This also means that it is safe to always use 48 kHz stereo input
and let the encoder optimize the encoding. The decoder will not be constrained later on
by the mode that you select here for the encoder.
Sampling rate of input signal (Hz). This must be one of 8000, 12000, 16000, 24000, or 48000.
Number of channels (1 or 2) in input signal
There are three coding modes:
OPUS_APPLICATION_VOIP gives best quality at a given bitrate for voice
signals. It enhances the input signal by high-pass filtering and
emphasizing formants and harmonics. Optionally it includes in-band
forward error correction to protect against packet loss. Use this
mode for typical VoIP applications. Because of the enhancement,
even at high bitrates the output may sound different from the input.
OPUS_APPLICATION_AUDIO gives best quality at a given bitrate for most
non-voice signals like music. Use this mode for music and mixed
(music/voice) content, broadcast, and applications requiring less
than 15 ms of coding delay.
OPUS_APPLICATION_RESTRICTED_LOWDELAY configures low-delay mode that
disables the speech-optimized mode in exchange for slightly reduced delay.
This mode can only be set on a newly initialized or freshly reset encoder
because it changes the codec delay.
The created encoder
The storage type of analysis_pcm, either short or float
Encodes an Opus frame.
Input signal (Interleaved if stereo). Length should be at least frame_size * channels
Offset to use when reading the in_pcm buffer
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
The offset to use when writing to the output data buffer
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
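The permitted frame sizes are the frame durations of 2.5, 5, 10, 20, 40, and 60 ms expressed in samples per channel. A short sketch of that arithmetic, with a hypothetical helper name, lets the same list be derived for other sample rates:

```python
# Frame durations in tenths of a millisecond, to keep the arithmetic in integers.
FRAME_DURATIONS_TENTHS_MS = (25, 50, 100, 200, 400, 600)

def valid_frame_sizes(sample_rate):
    """Per-channel sample counts for each frame duration at the given rate."""
    return [sample_rate * d // 10000 for d in FRAME_DURATIONS_TENTHS_MS]

print(valid_frame_sizes(48000))  # [120, 240, 480, 960, 1920, 2880]
print(valid_frame_sizes(16000))  # [40, 80, 160, 320, 640, 960]
```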
Encodes an Opus frame.
Input signal (Interleaved if stereo). Length should be at least frame_size * channels
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
Encodes an Opus frame using floating point input.
Input signal in float format (Interleaved if stereo). Length should be at least frame_size * channels.
Value should be normalized to the +/- 1.0 range. Samples with a range beyond +/-1.0 will be clipped.
Offset to use when reading from in_pcm buffer
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
Offset to use when writing into output data buffer
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
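When the source audio is 16-bit PCM, it has to be scaled into the +/- 1.0 range before it is handed to the floating-point entry point. The sketch below uses division by 32768, which is a common convention and an assumption here, and also shows the clipping behaviour described for out-of-range samples. The helper names are hypothetical.

```python
def int16_to_float(samples):
    """Scale 16-bit PCM samples into the [-1.0, 1.0) range."""
    return [s / 32768.0 for s in samples]

def clip_unit(samples):
    """Clamp float samples to [-1.0, 1.0], mirroring the documented clipping."""
    return [max(-1.0, min(1.0, s)) for s in samples]

print(int16_to_float([0, 16384, -32768]))  # [0.0, 0.5, -1.0]
print(clip_unit([1.5, -2.0, 0.25]))        # [1.0, -1.0, 0.25]
```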
Encodes an Opus frame using floating point input.
Input signal in float format (Interleaved if stereo). Length should be at least frame_size * channels.
Value should be normalized to the +/- 1.0 range. Samples with a range beyond +/-1.0 will be clipped.
The number of samples per channel in the input signal.
The frame size must be a valid Opus frame size for the given sample rate.
For example, at 48 kHz the permitted values are 120, 240, 480, 960, 1920, and 2880. Passing in a duration of less than 10 ms
(480 samples at 48 kHz) will prevent the encoder from using FEC, DTX, or hybrid modes.
Destination buffer for the output payload. This must contain at least max_data_bytes
The maximum amount of space allocated for the output payload. This may be used to impose
an upper limit on the instant bitrate, but should not be used as the only bitrate control (use the Bitrate parameter for that)
The length of the encoded packet, in bytes. This value will always be less than or equal to 1275, the maximum Opus packet size.
Gets or sets the application (or signal type) of the input signal. This hints
to the encoder what type of details we want to preserve in the encoding.
This cannot be changed after the encoder has started
Gets or sets the bitrate for encoder, in bits per second. Valid bitrates are between 6K (6144) and 510K (522240)
Gets or sets the maximum number of channels to be encoded. This can be used to force a downmix from stereo to mono if stereo
separation is not important
Gets or sets the maximum bandwidth to be used by the encoder. This can be used if
high-frequency audio is not important to your application (e.g. telephony)
Gets or sets the "preferred" encoded bandwidth. This does not affect the sample rate of the input audio,
only the encoding cutoffs
Gets or sets a flag to enable Discontinuous Transmission mode. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, the encoder detects silence and background noise
and reduces the number of output packets, with up to 600ms in between separate packet transmissions.
Gets or sets the encoder complexity, between 0 and 10
Gets or sets a flag to enable Forward Error Correction. This mode is only available in the SILK encoder
(Bitrate < 40Kbit/s and/or ForceMode == SILK). When enabled, lost packets can be partially recovered
by decoding data stored in the following packet.
Gets or sets the expected amount of packet loss in the transmission medium, from 0 to 100.
Only applies if UseInbandFEC is also enabled, and the encoder is in SILK mode.
Gets or sets a flag to enable Variable Bitrate encoding. This is recommended as it generally improves audio quality
with little impact on average bitrate
Gets or sets a flag to enable constrained VBR. This only applies when the encoder is in CELT mode (i.e. high bitrates)
Gets or sets a hint to the encoder for what type of audio is being processed, voice or music
Gets the number of samples of audio that are being stored in a buffer and are therefore contributing to latency.
Gets the encoder's input sample rate. This is fixed for the lifetime of the encoder.
Gets the number of channels that this encoder expects in its input. Always constant for the lifetime of the encoder.
Returns the final range of the entropy coder
Gets or sets the bit resolution of the input audio signal. Though the encoder always uses 16-bit internally, this can help
it make better decisions about bandwidth and cutoff values
Gets or sets a fixed length for each encoded frame. Typically, the encoder just chooses a frame duration based on the input length
and the current internal mode. This can be used to enforce an exact length if it is required by your application (e.g. monotonous transmission)
Gets or sets a user-forced mode for the encoder. There are three modes, SILK, HYBRID, and CELT. Silk can only encode below 40Kbit/s and is best suited
for speech. Silk also has modes such as FEC which may be desirable. Celt sounds better at higher bandwidth and is comparable to AAC. It also performs somewhat faster.
Hybrid is used to create a smooth transition between the two modes. Note that this value may not always be honored due to other factors such
as frame size and bitrate.
Gets or sets a value indicating that this stream is a low-frequency channel. This is used when encoding 5.1 surround audio.
Gets or sets a flag to disable almost all use of prediction, making frames almost completely independent of each other at the cost of reduced quality
Gets or sets a value indicating whether neural net analysis functions should be enabled, increasing encode quality
at the expense of speed.
EXPERIMENTAL!!! Gets or sets the constant quality encoding parameter. This is a new feature intended to approximate
"Constant Quality VBR" that other codecs such as MP3Lame provide, to let you encode mixed speech and music
(such as a podcast) in the same Opus stream without changing encoder params.
The quality ranges from 0 (lowest) to 10 (highest). A setting of "null" means to use the regular Opus bitrate modes.
EXPERIMENTAL. Returns the probability that the current signal is music, according to the built-in analysis.
Only meaningful if EnableAnalysis is true and quality is above 7 or so
A managed implementation of the Opus multistream decoder.
Creates a new multichannel decoder
A channel mapping describing which streams go to which channels: see
Gets the internal decoder state of one of the multichannel stream's decoders, indicated by stream ID.
The stream ID to fetch.
The decoder for that stream ID.
A managed implementation of the Opus multistream encoder.
Creates a new multichannel Opus encoder using the "old API".
The sample rate of the input signal
The number of channels to encode (1 - 255)
The number of streams to encode
The number of coupled streams
A raw mapping between input and output channels
The application to use for the encoder
Creates a multichannel Opus encoder using the "new API". This constructor allows you to use predefined Vorbis channel mappings, or specify your own.
The sample rate of the input
The total number of channels to encode (1 - 255)
The mapping family to use. 0 = mono/stereo, 1 = use Vorbis mappings, 255 = use raw channel mapping
The number of streams to encode
The number of coupled streams
A raw mapping of input/output channels
The application to use for the encoders
Gets the internal encoder state of one of the multichannel stream's encoders, indicated by stream ID.
The stream ID to fetch.
The encoder for that stream ID.
Contains the parsed information from a single Opus packet, such as the bandwidth,
number of samples, encoder mode, channel count, etc.
The Table of Contents byte for this packet. Contains info about modes, frame length, etc.
The list of subframes in this packet
The index of the start of the payload within the packet
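The layout of the Table of Contents byte is defined by the Opus specification (RFC 6716): the top five bits hold the configuration number, the next bit is the stereo flag, and the low two bits are the frame-count code. The following sketch unpacks those fields; the function names are hypothetical and independent of the packet info type described here.

```python
def parse_toc(toc):
    """Split a TOC byte into (config, stereo, frame_count_code)."""
    if not 0 <= toc <= 255:
        raise ValueError("TOC must be a single byte")
    config = toc >> 3               # 5 bits: mode, bandwidth and frame size
    stereo = bool((toc >> 2) & 1)   # 1 bit: mono (False) or stereo (True)
    code = toc & 0x03               # 2 bits: how frames are packed
    return config, stereo, code

def same_configuration(toc_a, toc_b):
    """True when two TOC bytes agree in their top six bits (config + stereo)."""
    return (toc_a & 0xFC) == (toc_b & 0xFC)

print(parse_toc(0xFC))  # (31, True, 0)
print(parse_toc(0x4B))  # (9, False, 3)
```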
Parse an opus packet into a packetinfo object containing one or more frames.
Opus decode will perform this operation internally so most applications do
not need to use this function.
The packet data to be parsed
The index of the beginning of the packet in the data array (usually 0)
The packet's length
A parsed packet info struct
Parse an opus packet into a packetinfo object containing one or more frames.
Opus_decode will perform this operation internally so most applications do
not need to use this function.
The packet data to be parsed
A parsed packet info struct
Gets the number of samples per frame from an Opus packet.
Sampling rate in Hz. This must be a multiple of 400, or inaccurate results will be returned.
Number of samples per frame
Gets the number of samples per frame from an Opus packet.
Opus packet. This must contain at least one byte of data
Sampling rate in Hz. This must be a multiple of 400, or inaccurate results will be returned.
Number of samples per frame
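The requirement that the sampling rate be a multiple of 400 comes from the integer divisions involved. The sketch below mirrors the logic of the reference C implementation's samples-per-frame calculation, working only from the first byte of the packet; the function name is an assumption for this example.

```python
def samples_per_frame(toc, sample_rate):
    """Per-channel samples in one frame, derived from the TOC byte."""
    if toc & 0x80:
        # CELT-only configurations: 2.5, 5, 10 or 20 ms
        shift = (toc >> 3) & 0x03
        return (sample_rate << shift) // 400
    if (toc & 0x60) == 0x60:
        # Hybrid configurations: 10 or 20 ms
        return sample_rate // 50 if toc & 0x08 else sample_rate // 100
    # SILK-only configurations: 10, 20, 40 or 60 ms
    shift = (toc >> 3) & 0x03
    if shift == 3:
        return sample_rate * 60 // 1000
    return (sample_rate << shift) // 100

print(samples_per_frame(0x80, 48000))  # 120  (CELT, 2.5 ms)
print(samples_per_frame(0x68, 48000))  # 960  (hybrid, 20 ms)
print(samples_per_frame(0x18, 48000))  # 2880 (SILK, 60 ms)
```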
Gets the encoded bandwidth of an Opus packet. Note that you are not forced to decode at this bandwidth
An OpusBandwidth value
Gets the encoded bandwidth of an Opus packet. Note that you are not forced to decode at this bandwidth
An Opus packet (must be at least 1 byte).
An OpusBandwidth value
Gets the number of encoded channels of an Opus packet. Note that you are not forced to decode with this channel count.
The number of channels
Gets the number of encoded channels of an Opus packet. Note that you are not forced to decode with this channel count.
An opus packet (must be at least 1 byte)
The number of channels
Gets the number of frames in an Opus packet.
An Opus packet
The number of frames in the packet
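The frame count is encoded in the low two bits of the first byte, with an explicit count in the second byte for code 3, as laid out in RFC 6716. A minimal sketch under that specification follows; the function name is hypothetical, and raising an exception is a simplification of how a real parser would report errors.

```python
def count_frames(packet):
    """Number of frames in an Opus packet, read from its first one or two bytes."""
    if len(packet) < 1:
        raise ValueError("packet must contain at least one byte")
    code = packet[0] & 0x03
    if code == 0:
        return 1            # one frame
    if code in (1, 2):
        return 2            # two frames (equal or different sizes)
    if len(packet) < 2:
        raise ValueError("code 3 packet is missing its frame count byte")
    return packet[1] & 0x3F  # low six bits of the second byte

print(count_frames(bytes([0x00])))        # 1
print(count_frames(bytes([0x03, 0x05])))  # 5
```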
Gets the number of samples of an Opus packet.
An Opus packet
The decoder's sampling rate in Hz. This must be a multiple of 400
The size of the PCM samples that this packet will be decoded to at the specified sample rate
Gets the number of samples of an Opus packet.
Your current decoder state
An Opus packet
The start offset in the array for reading the packet from
The packet's length
The size of the PCM samples that this packet will be decoded to by the specified decoder
Gets the number of samples of an Opus packet.
Your current decoder state
An Opus packet
The number of PCM samples that this packet will decode to with the specified decoder
Gets the mode that was used to encode this packet.
Normally there is nothing you can really do with this, other than debugging.
An Opus packet
The OpusMode used by the encoder
Gets the mode that was used to encode this packet.
Normally there is nothing you can really do with this, other than debugging.
The OpusMode used by the encoder
(Re)initializes a previously allocated repacketizer state.
The state must be at least the size returned by opus_repacketizer_get_size().
This can be used for applications which use their own allocator instead of
malloc().
It must also be called to reset the queue of packets waiting to be
repacketized, which is necessary if the maximum packet duration of 120 ms
is reached or if you wish to submit packets with a different Opus
configuration (coding mode, audio bandwidth, frame size, or channel count).
Failure to do so will prevent a new packet from being added with
opus_repacketizer_cat().
@see opus_repacketizer_create
@see opus_repacketizer_get_size
@see opus_repacketizer_cat
@param rp OpusRepacketizer*: The repacketizer state to
(re)initialize.
Creates a new repacketizer
opus_repacketizer_cat. Add a packet to the current repacketizer state.
This packet must match the configuration of any packets already submitted
for repacketization since the last call to opus_repacketizer_init().
This means that it must have the same coding mode, audio bandwidth, frame
size, and channel count.
This can be checked in advance by examining the top 6 bits of the first
byte of the packet, and ensuring they match the top 6 bits of the first
byte of any previously submitted packet.
The total duration of audio in the repacketizer state also must not exceed
120 ms, the maximum duration of a single packet, after adding this packet.
The contents of the current repacketizer state can be extracted into new
packets using opus_repacketizer_out() or opus_repacketizer_out_range().
In order to add a packet with a different configuration or to add more
audio beyond 120 ms, you must clear the repacketizer state by calling
opus_repacketizer_init().
If a packet is too large to add to the current repacketizer state, no part
of it is added, even if it contains multiple frames, some of which might
fit.
If you wish to be able to add parts of such packets, you should first use
another repacketizer to split the packet into pieces and add them
individually.
@see opus_repacketizer_out_range
@see opus_repacketizer_out
@see opus_repacketizer_init
@param rp OpusRepacketizer*: The repacketizer state to which to
add the packet.
@param[in] data const unsigned char*: The packet data.
The application must ensure this pointer remains valid
until the next call to opus_repacketizer_init() or
opus_repacketizer_destroy().
@param len opus_int32: The number of bytes in the packet data.
@returns An error code indicating whether or not the operation succeeded.
@retval #OPUS_OK The packet's contents have been added to the repacketizer
state.
@retval #OPUS_INVALID_PACKET The packet did not have a valid TOC sequence,
the packet's TOC sequence was not compatible
with previously submitted packets (because
the coding mode, audio bandwidth, frame size,
or channel count did not match), or adding
this packet would increase the total amount of
audio stored in the repacketizer state to more
than 120 ms.
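The two acceptance conditions described above (matching top 6 bits of the TOC byte, and at most 120 ms in total) can be checked before submitting a packet. This is a minimal sketch only. `can_append` and its parameters are hypothetical, and durations are assumed to be tracked by the caller in milliseconds.

```python
MAX_PACKET_MS = 120  # maximum duration of a single Opus packet

def can_append(first_toc, queued_ms, new_toc, new_ms):
    """True if a packet with TOC byte new_toc and duration new_ms could be
    added to a repacketizer whose first queued packet had TOC first_toc
    (None when the queue is empty) and which already holds queued_ms."""
    if first_toc is not None:
        # Top 6 bits carry coding mode, bandwidth, frame size and channel
        # count; the low 2 bits (frame-count code) are allowed to differ.
        if (first_toc & 0xFC) != (new_toc & 0xFC):
            return False
    return queued_ms + new_ms <= MAX_PACKET_MS
```

A packet rejected by this check would need the state to be reset with opus_repacketizer_init() before it can be submitted.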
Return the total number of frames contained in packet data submitted to
the repacketizer state so far via opus_repacketizer_cat() since the last
call to opus_repacketizer_init() or opus_repacketizer_create().
This defines the valid range of packets that can be extracted with
opus_repacketizer_out_range() or opus_repacketizer_out().
@param rp OpusRepacketizer*: The repacketizer state containing the
frames.
@returns The total number of frames contained in the packet data submitted
to the repacketizer state.
Construct a new packet from data previously submitted to the repacketizer
state via opus_repacketizer_cat().
@param rp OpusRepacketizer*: The repacketizer state from which to
construct the new packet.
@param begin int: The index of the first frame in the current
repacketizer state to include in the output.
@param end int: One past the index of the last frame in the
current repacketizer state to include in the
output.
@param[out] data const unsigned char*: The buffer in which to
store the output packet.
@param maxlen opus_int32: The maximum number of bytes to store in
the output buffer. In order to guarantee
success, this should be at least 1276 for a
single frame, or for multiple frames,
1277*(end-begin). However, 1*(end-begin) plus
the size of all packet data submitted to
the repacketizer since the last call to
opus_repacketizer_init() or
opus_repacketizer_create() is also
sufficient, and possibly much smaller.
@returns The total size of the output packet on success, or an error code
on failure.
@retval #OPUS_BAD_ARG [begin,end) was an invalid range of
frames (begin < 0, begin >= end, or end >
opus_repacketizer_get_nb_frames()).
@retval #OPUS_BUFFER_TOO_SMALL \a maxlen was insufficient to contain the
complete output packet.
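The two buffer-size bounds given for maxlen can be computed side by side, and either one is sufficient. This sketch only restates the arithmetic from the description. The helper name `out_range_maxlen` and the `submitted_bytes` argument (total bytes passed to opus_repacketizer_cat() since the last reset) are hypothetical.

```python
def out_range_maxlen(begin: int, end: int, submitted_bytes: int) -> int:
    """Smallest buffer size that either documented bound shows to be enough."""
    frames = end - begin
    if begin < 0 or frames <= 0:
        raise ValueError("[begin, end) is not a valid frame range")
    # Bound 1: worst case per frame, independent of the data submitted.
    guaranteed = 1276 if frames == 1 else 1277 * frames
    # Bound 2: one byte of overhead per frame plus everything submitted.
    data_based = 1 * frames + submitted_bytes
    return min(guaranteed, data_based)
```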
Construct a new packet from data previously submitted to the repacketizer
state via opus_repacketizer_cat().
This is a convenience routine that returns all the data submitted so far
in a single packet.
It is equivalent to calling
@code
opus_repacketizer_out_range(rp, 0, opus_repacketizer_get_nb_frames(rp),
data, maxlen)
@endcode
@param rp OpusRepacketizer*: The repacketizer state from which to
construct the new packet.
@param[out] data const unsigned char*: The buffer in which to
store the output packet.
@param maxlen opus_int32: The maximum number of bytes to store in
the output buffer. In order to guarantee
success, this should be at least
1277*opus_repacketizer_get_nb_frames(rp).
However, 1*opus_repacketizer_get_nb_frames(rp)
plus the size of all packet data
submitted to the repacketizer since the
last call to opus_repacketizer_init() or
opus_repacketizer_create() is also
sufficient, and possibly much smaller.
@returns The total size of the output packet on success, or an error code
on failure.
@retval #OPUS_BUFFER_TOO_SMALL \a maxlen was insufficient to contain the
complete output packet.
Pads a given Opus packet to a larger size (possibly changing the TOC sequence).
@param[in,out] data const unsigned char*: The buffer containing the
packet to pad.
@param len opus_int32: The size of the packet.
This must be at least 1.
@param new_len opus_int32: The desired size of the packet after padding.
This must be at least as large as len.
@returns an error code
@retval #OPUS_OK on success.
@retval #OPUS_BAD_ARG \a len was less than 1 or new_len was less than len.
@retval #OPUS_INVALID_PACKET \a data did not contain a valid Opus packet.
Remove all padding from a given Opus packet and rewrite the TOC sequence to
minimize space usage.
@param[in,out] data const unsigned char*: The buffer containing the
packet to strip.
@param len opus_int32: The size of the packet.
This must be at least 1.
@returns The new size of the output packet on success, or an error code
on failure.
@retval #OPUS_BAD_ARG \a len was less than 1.
@retval #OPUS_INVALID_PACKET \a data did not contain a valid Opus packet.
Pads a given Opus multi-stream packet to a larger size (possibly changing the TOC sequence).
@param[in,out] data const unsigned char*: The buffer containing the
packet to pad.
@param len opus_int32: The size of the packet.
This must be at least 1.
@param new_len opus_int32: The desired size of the packet after padding.
This must be at least as large as len.
@param nb_streams opus_int32: The number of streams (not channels) in the packet.
This must be at least 1.
@returns an error code
@retval #OPUS_OK on success.
@retval #OPUS_BAD_ARG \a len was less than 1.
@retval #OPUS_INVALID_PACKET \a data did not contain a valid Opus packet.
Remove all padding from a given Opus multi-stream packet and rewrite the TOC sequence to
minimize space usage.
@param[in,out] data const unsigned char*: The buffer containing the
packet to strip.
@param len opus_int32: The size of the packet.
This must be at least 1.
@param nb_streams opus_int32: The number of streams (not channels) in the packet.
This must be at least 1.
@returns The new size of the output packet on success, or an error code
on failure.
@retval #OPUS_BAD_ARG \a len was less than 1.
@retval #OPUS_INVALID_PACKET \a data did not contain a valid Opus packet.
Probability of having speech for time i to DETECT_SIZE-1 (and music before).
pspeech[0] is the probability that all frames in the window are speech.
Probability of having music for time i to DETECT_SIZE-1 (and speech before).
pmusic[0] is the probability that all frames in the window are music.
Central factory class for creating resamplers.
Using these methods allows the runtime to decide the most appropriate
implementation for your platform based on what is available. Native
interop for resamplers is not yet implemented, but may be in the future.
Create a new resampler with integer input and output rates (in hertz).
The number of channels to be processed
Input sampling rate, in hertz
Output sampling rate, in hertz
Resampling quality, from 0 to 10
An optional logger for the operation
A newly created resampler
Create a new resampler with fractional input/output rates. The sampling
rate ratio is an arbitrary rational number with both the numerator and
denominator being 32-bit integers.
The number of channels to be processed
Numerator of sampling rate ratio
Denominator of sampling rate ratio
Input sample rate rounded to the nearest integer (in Hz)
Output sample rate rounded to the nearest integer (in Hz)
Resampling quality, from 0 to 10
An optional logger for the operation
A newly created resampler
Chirp (bw expand) LP AR filter (Fixed point implementation)
I/O AR filter to be expanded (without leading 1)
I length of ar
I chirp factor (typically in range (0..1) )
Chirp (bw expand) LP AR filter (Fixed point implementation)
I/O AR filter to be expanded (without leading 1)
I length of ar
I chirp factor (typically in range (0..1) )
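Chirping scales the k-th filter coefficient by the chirp factor raised to the k-th power, which pulls the filter's poles toward the origin and widens its resonances. The sketch below is a floating-point illustration only. The fixed-point routines described here operate in place on Q-format integers, and the name `bandwidth_expand` is hypothetical.

```python
def bandwidth_expand(ar, chirp):
    """Return ar with coefficient k (1-based) multiplied by chirp**k."""
    out = []
    factor = chirp
    for coef in ar:
        out.append(coef * factor)
        factor *= chirp  # next coefficient gets one more power of chirp
    return out
```

For example, `bandwidth_expand([1.0, 1.0, 1.0], 0.5)` gives `[0.5, 0.25, 0.125]`.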
Comfort noise generation and estimation
Generates excitation for CNG LPC synthesis
O CNG excitation signal Q10
I Random samples buffer Q10
I Gain to apply
I Length
I/O Seed to random index generator
Resets CNG state
I/O Decoder state
Updates CNG estimate, and applies the CNG when packet was lost
I/O Decoder state
I/O Decoder control
I/O Signal
I Length of residual
Encodes signs of excitation
I/O Compressor data structure
I pulse signal
I length of input
I Signal type
I Quantization offset type
I Sum of absolute pulses per block [MAX_NB_SHELL_BLOCKS]
Decodes signs of excitation
I/O Compressor data structure
I/O pulse signal
I length of input
I Signal type
I Quantization offset type
I Sum of absolute pulses per block [MAX_NB_SHELL_BLOCKS]
Reset decoder state
I/O State
Returns error code
Init or Reset encoder
I/O State
O Encoder Status
O Returns error code
Read control structure from encoder
I State
O Encoder Status
Returns error code
Encode frame with Silk
Note: if prefillFlag is set, the input must contain 10 ms of audio, irrespective of what
encControl.payloadSize_ms is set to
I/O State
I Control status
I Speech sample input vector
I Number of samples in input vector
I/O Compressor data structure
I/O Number of bytes in payload (input: Max bytes)
I Flag to indicate prefilling buffers; no coding
Returns error code
Encode side-information parameters to payload
I/O Encoder state
I/O Compressor data structure
I Frame number
I Flag indicating LBRR data is being encoded
I The type of conditional coding to use
(O)
(I)
I max value for sum of pulses
I number of output values
return ok
(O)
(I)
I max value for sum of pulses
I number of output values
return ok
Encode quantization indices of excitation
I/O compressor data structure
I Signal type
I quantOffsetType
I quantization indices
I Frame length
Represents error messages from a silk encoder/decoder
Second order ARMA filter, alternative implementation
I input signal
I MA coefficients [3]
I AR coefficients [2]
I/O State vector [2]
O output signal
I signal length (must be even)
I Operate on interleaved signal if > 1
Split signal into two decimated bands using first-order allpass filters
I Input signal [N]
I/O State vector [2]
O Low band [N/2]
O High band [N/2]
I Number of input samples
Chirp (bandwidth expand) LP AR filter
I/O AR filter to be expanded (without leading 1)
I Length of ar
I Chirp factor in Q16
Elliptic/Cauer filters designed with 0.1 dB passband ripple,
80 dB minimum stopband attenuation, and
normalized cutoff frequencies from 0.95 down to 0.35 in steps of 0.15.
Helper function, interpolates the filter taps
order [TRANSITION_NB]
order [TRANSITION_NA]
LPC analysis filter
NB! State is kept internally and the
filter always starts with zero state;
the first d output samples are set to zero.
O Output signal
I Input signal
I MA prediction coefficients, Q12 [order]
I Signal length
I Filter order
Compute inverse of LPC prediction gain, and
test if LPC coefficients are stable (all poles within unit circle)
Prediction coefficients, order [2][SILK_MAX_ORDER_LPC]
Prediction order
inverse prediction gain in energy domain, Q30
For input in Q12 domain
Prediction coefficients, Q12 [order]
I Prediction order
inverse prediction gain in energy domain, Q30
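One standard way to obtain both results at once is the step-down (backward Levinson) recursion: the synthesis filter is stable exactly when every reflection coefficient has magnitude below 1, and the inverse prediction gain is the product of (1 - k^2) over those coefficients. The sketch below is a floating-point illustration under that assumption. It does not reproduce the Q12/Q30 fixed-point arithmetic, and the function name is hypothetical.

```python
def inverse_prediction_gain(a):
    """a holds prediction coefficients, so the analysis filter is
    A(z) = 1 - sum(a[i] * z**-(i + 1)).
    Returns the inverse prediction gain, or 0.0 if 1/A(z) is unstable."""
    c = [-x for x in a]           # coefficients of A(z) after the leading 1
    inv_gain = 1.0
    for m in range(len(c), 0, -1):
        k = c[m - 1]              # reflection coefficient of order m
        if abs(k) >= 1.0:
            return 0.0            # a pole on or outside the unit circle
        scale = 1.0 - k * k
        inv_gain *= scale
        # Step down to the order m-1 polynomial.
        c = [(c[i] - k * c[m - 2 - i]) / scale for i in range(m - 1)]
    return inv_gain
```

Returning zero for an unstable filter is a convention chosen for this sketch.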
Finds linear prediction coefficients and weights
[SilkConstants.LTP_ORDER]
[SilkConstants.LTP_ORDER]
Gain scalar quantization with hysteresis, uniform on log scale
O gain indices [MAX_NB_SUBFR]
I/O gains (quantized out) [MAX_NB_SUBFR]
I/O last index in previous frame. [Porting note] original implementation passed this as an int8*
I first gain is delta coded if 1
I number of subframes
Gains scalar dequantization, uniform on log scale
O quantized gains [MAX_NB_SUBFR]
I gain indices [MAX_NB_SUBFR]
I/O last index in previous frame [Porting note] original implementation passed this as an int8*
I first gain is delta coded if 1
I number of subframes
Compute unique identifier of gain indices vector
I gain indices [MAX_NB_SUBFR]
I number of subframes
unique identifier of gains
High-pass filter with cutoff frequency adaptation based on pitch lag statistics
I/O Encoder states
Normalized line spectrum frequency processor
Number of binary divisions, when not in low complexity mode
Compute quantization errors for an LPC_order element input vector for a VQ codebook
(O) Quantization errors [K]
(I) Input vectors to be quantized [LPC_order]
(I) Codebook vectors [K*LPC_order]
(I) Number of codebook vectors
(I) Number of LPCs
Laroia low complexity NLSF weights
(O) Pointer to input vector weights [D]
(I) Pointer to input vector [D]
(I) Input vector dimension (even)
Returns RD value in Q30
(O) Output [ order ]
(I) Quantization indices [ order ]
(I) Backward predictor coefs [ order ]
(I) Quantization step size
(I) Number of input values
Unpack predictor values and indices for entropy coding tables
(O) Indices to entropy tables [ LPC_ORDER ]
(O) LSF predictor [ LPC_ORDER ]
(I) Codebook object
(I) Index of vector in first LSF codebook
NLSF stabilizer, for a single input data vector
(I/O) Unstable/stabilized normalized LSF vector in Q15 [L]
(I) Min distance vector, NDeltaMin_Q15[L] must be >= 1 [L+1]
(I) Number of NLSF parameters in the input vector
NLSF vector decoder
(O) Quantized NLSF vector [ LPC_ORDER ]
(I) Codebook path vector [ LPC_ORDER + 1 ]
(I) Codebook object
Delayed-decision quantizer for NLSF residuals
(O) Quantization indices [ order ]
(I) Input [ order ]
(I) Weights [ order ]
(I) Backward predictor coefs [ order ]
(I) Indices to entropy coding tables [ order ]
(I) Rates []
(I) Quantization step size
(I) Inverse quantization step size
(I) R/D tradeoff
(I) Number of input values
RD value in Q25
Fixme: Optimize this method!
NLSF vector encoder
(I) Codebook path vector [ LPC_ORDER + 1 ]
(I/O) Quantized NLSF vector [ LPC_ORDER ]
(I) Codebook object
(I) NLSF weight vector [ LPC_ORDER ]
(I) Rate weight for the RD optimization
(I) Max survivors after first stage
(I) Signal type: 0/1/2
RD value in Q25
helper function for NLSF2A(..)
(O) intermediate polynomial, QA [dd+1]
(I) vector of interleaved 2*cos(LSFs), QA [d]
(I) polynomial order (= 1/2 * filter order)
compute whitening filter coefficients from normalized line spectral frequencies
(O) monic whitening filter coefficients in Q12, [ d ]
(I) normalized line spectral frequencies in Q15, [ d ]
(I) filter order (should be even)
Helper function for A2NLSF(..) Transforms polynomials from cos(n*f) to cos(f)^n
(I/O) Polynomial
(I) Polynomial order (= filter order / 2 )
Helper function for A2NLSF(..) Polynomial evaluation
(I) Polynomial, Q16
(I) Evaluation point, Q12
(I) Order
the polynomial evaluation, in Q16
Compute Normalized Line Spectral Frequencies (NLSFs) from whitening filter coefficients
If not all roots are found, the a_Q16 coefficients are bandwidth expanded until convergence.
(O) Normalized Line Spectral Frequencies in Q15 (0..2^15-1) [d]
(I/O) Monic whitening filter coefficients in Q16 [d]
(I) Filter order (must be even)
Limit, stabilize, convert and quantize NLSFs
I/O Encoder state
O Prediction coefficients [ 2 ][MAX_LPC_ORDER]
I/O Normalized LSFs (quant out) (0 - (2^15-1)) [MAX_LPC_ORDER]
I Previous Normalized LSFs (0 - (2^15-1)) [MAX_LPC_ORDER]
Routines for managing packet loss concealment
O
O
O
O
I
I
I
I
Simple way to map [8000, 12000, 16000, 24000, 48000] to [0, 1, 2, 3, 4]
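One branch-free way to perform this mapping uses only shifts and comparisons. The expression below is written from memory of the reference SILK resampler and should be treated as an assumption rather than as the code this comment documents. A plain lookup table would work just as well.

```python
def rate_id(rate_hz: int) -> int:
    """Map 8000, 12000, 16000, 24000, 48000 Hz to 0, 1, 2, 3, 4."""
    above_16k = int(rate_hz > 16000)
    above_24k = int(rate_hz > 24000)
    return (((rate_hz >> 12) - above_16k) >> above_24k) - 1
```

The result is only meaningful for the five listed rates. Other inputs are not validated.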
Initialize/reset the resampler state for a given pair of input/output sampling rates
I/O Resampler state
I Input sampling rate (Hz)
I Output sampling rate (Hz)
I If 1: encoder; if 0: decoder
Resampler: convert from one sampling rate to another
Input and output sampling rate are at most 48000 Hz
I/O Resampler state
O Output signal
I Input signal
I Number of input samples
Downsample by a factor 2
I/O State vector [ 2 ]
O Output signal [ floor(len/2) ]
I Input signal [ len ]
I Number of input samples
Downsample by a factor 2/3, low quality
I/O State vector [ 6 ]
O Output signal [ floor(2*inLen/3) ]
I Input signal [ inLen ]
I Number of input samples
Second order AR filter with single delay elements
I/O State vector [ 2 ]
O Output signal
I Input signal
I AR coefficients, Q14
I Signal length
Resample with a 2nd order AR filter followed by FIR interpolation
I/O Resampler state
O Output signal
I Input signal
I Number of input samples
Upsample using a combination of allpass-based 2x upsampling and FIR interpolation
I/O Resampler state
O Output signal
I Input signal
I Number of input samples
Upsample by a factor 2, high quality
Uses 2nd order allpass filters for the 2x upsampling, followed by a
notch filter just above Nyquist.
I/O Resampler state [ 6 ]
O Output signal [ 2 * len ]
I Input signal [ len ]
I Number of input samples
shell coder; pulse-subframe length is hardcoded
O combined pulses vector [len]
I input vector [2 * len]
I number of OUTPUT samples
O combined pulses vector [len]
I input vector [2 * len]
I number of OUTPUT samples
O pulse amplitude of first child subframe
O pulse amplitude of second child subframe
I/O Compressor data structure
I pulse amplitude of current subframe
I table of shell cdfs
Shell encoder, operates on one shell code frame of 16 pulses
I/O compressor data structure
I data: nonnegative pulse amplitudes
Approximate sigmoid function
(I/O) Unsorted / Sorted vector
(O) Index vector for the sorted elements
(I) Vector length
(I) Number of correctly sorted positions
Insertion sort (fast for already almost sorted arrays):
Best case: O(n) for an already sorted array
Worst case: O(n^2) for an inversely sorted array
(I/O) Unsorted / Sorted vector
(I) Vector length
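The best and worst cases quoted above follow directly from the inner loop, which runs zero times per element on sorted input and up to i times per element on reversed input. A minimal in-place sketch follows. The function name is hypothetical.

```python
def insertion_sort_increasing(a):
    """Sort list a in place into increasing order and return it."""
    for i in range(1, len(a)):
        value = a[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and a[j] > value:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = value
    return a
```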
Decode mid/side predictors
I/O Compressor data structure
O Predictors
Decode mid-only flag
I/O Compressor data structure
O Flag that only mid channel has been coded
Entropy code the mid/side quantization indices
I/O Compressor data structure
I Quantization indices [ 2 ][ 3 ]
Entropy code the mid-only flag
I/O Compressor data structure
Find least-squares prediction gain for one signal based on another and quantize it
O Ratio of residual and mid energies
I Basis signal
I Target signal
I/O Smoothed mid, residual norms
I Number of samples
I Smoothing coefficient
O Returns predictor in Q13
Convert Left/Right stereo signal to adaptive Mid/Side representation
I/O State
I/O Left input signal, becomes mid signal
I/O Right input signal, becomes side signal
O Quantization indices [ 2 ][ 3 ]
O Flag: only mid signal coded
O Bitrates for mid and side signals
I Total bitrate
I Speech activity level in previous frame
I Last frame before a stereo-to-mono transition
I Sample rate (kHz)
I Number of samples
Convert adaptive Mid/Side representation to Left/Right stereo signal
I/O State
I/O Mid input signal, becomes left signal
I/O Side input signal, becomes right signal
I Predictors
I Sample rate (kHz)
I Number of samples
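Leaving out the adaptive prediction and quantization, the underlying mid/side transform and its inverse are a sum and a difference. The sketch below uses floats so that the round trip is exact. It is a simplification of the routines described above, not a port of them, and both function names are hypothetical.

```python
def lr_to_ms(left, right):
    """Plain (non-adaptive) left/right to mid/side conversion."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_to_lr(mid, side):
    """Inverse conversion: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```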
Quantize mid/side predictors
I/O Predictors (out: quantized)
O Quantization indices [ 2 ][ 3 ]
Struct for CNG
Structure for controlling decoder operation and reading decoder status
Structure for controlling encoder operation
Checks this encoder control struct and returns error code, if any
Structure containing NLSF codebook
Quantization step size
Inverse quantization step size
POINTER
POINTER
POINTER to Backward predictor coefs [ order ]
POINTER to Indices to entropy coding tables [ order ]
POINTER
POINTER
POINTER
Struct for Packet Loss Concealment
Overwrites this struct with values from another one. Equivalent to C struct assignment this = other
Decoder state
Init Decoder State
Resets CNG state
Resets PLC state
Encoder state
Control encoder
I Control structure
I Target max bitrate (bps)
I Flag to allow switching audio bandwidth
I Channel number
I
I
I
O
I
Control internal sampling rate
I Control structure
Decoder super struct
Decoder control
Encoder Super Struct
Initialize Silk Encoder state
I/O Pointer to Silk FIX encoder state
Variable cut-off low-pass filter state
Low pass filter state
Counter which is mapped to a cut-off frequency
Operating mode, <0: switch down, >0: switch up; 0: do nothing
Noise shaping quantization state
Buffer for quantized output signal
Prefilter state
POINTER
Noise shaping analysis state
VAD state
Analysis filterbank state: 0-8 kHz
Analysis filterbank state: 0-4 kHz
Analysis filterbank state: 0-2 kHz
Subframe energies
Smoothed energy level in each band
State of differentiator in the lowest band
Noise energy level in each band
Inverse noise energy level in each band
Noise level estimator bias/offset
Frame counter used in the initial phase
Struct for TOC (Table of Contents)
Voice activity for packet
Voice activity for each frame in packet
Flag indicating if packet contains in-band FEC
Compute number of bits to right shift the sum of squares of a vector
of int16s to make it fit in an int32
O Energy of x, after shifting to the right
O Number of bits right shift applied to energy
I Input vector
I Length of input vector
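The contract is that the true energy is approximately the returned energy shifted left by the returned shift. The sketch below illustrates that contract using Python's unbounded integers and picks the smallest shift that fits. The fixed-point routine has to shift while accumulating and may choose a larger shift to leave headroom, so its outputs can differ. The function name mirrors the description but is hypothetical.

```python
INT32_MAX = 2**31 - 1

def sum_sqr_shift(x):
    """Return (energy, shift) with energy == sum(v*v for v in x) >> shift
    and energy <= INT32_MAX."""
    total = sum(v * v for v in x)
    shift = 0
    while (total >> shift) > INT32_MAX:
        shift += 1
    return total >> shift, shift
```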
Zero-index variant
Compute number of bits to right shift the sum of squares of a vector
of int16s to make it fit in an int32
O Energy of x, after shifting to the right
O Number of bits right shift applied to energy
I Input vector
I Length of input vector
Cosine approximation table for LSF conversion
Q12 values (even)
Voice Activity Detection module for silk codec
Weighting factors for tilt measure
Initialization of the Silk VAD
O Pointer to Silk VAD state. Cannot be nullptr
0 if success
Get the speech activity level in Q8
I/O Encoder state
I PCM input
0 if success
Noise level estimation
I subband energies [VAD_N_BANDS]
I/O Pointer to Silk VAD state