This makes more sense, as users usually set the -cutoff option
to low-pass filter the signal. The encoder will still overshoot
slightly when encoding normal coefficients, but that is expected.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Also disable the mmx/iwht optimization when the bitexact flag is set.
With synthetically coded coefficients (i.e. those that lead to a
residual well outside the [-255,255] range), our optimizations will
overflow. It doesn't make sense to fix the overflows, since they can
only occur on synthetic input, not on real fwht-generated input. Thus,
add a bitexact flag that disables this optimization.
File libopenh264enc.c has been modified so that the encoder uses av_log()
to log messages (error, warning, info, etc.) instead of logging them
directly to stderr. At the time the encoder is created, the current
ffmpeg log level is mapped to an equivalent libopenh264 log level. This
log level, and a message logging function that invokes av_log() to
actually log messages, are then set on the encoder.
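Roughly, the mapping and callback could look like the sketch below;
the WELS_LOG_* constants come from OpenH264's codec_app_def.h, but the
exact names and callback shape used by the patch are assumptions here.

    /* Sketch only, not the patch itself.  ctx is assumed to be the
     * AVCodecContext so av_log() can pick up its AVClass. */
    #include <wels/codec_api.h>
    #include "libavutil/log.h"

    static int av_to_wels_log_level(int av_level)
    {
        if (av_level >= AV_LOG_DEBUG)   return WELS_LOG_DETAIL;
        if (av_level >= AV_LOG_VERBOSE) return WELS_LOG_DEBUG;
        if (av_level >= AV_LOG_INFO)    return WELS_LOG_INFO;
        if (av_level >= AV_LOG_WARNING) return WELS_LOG_WARNING;
        return WELS_LOG_ERROR;
    }

    static void wels_to_av_log(void *ctx, int wels_level, const char *msg)
    {
        int av_level;
        switch (wels_level) {
        case WELS_LOG_ERROR:   av_level = AV_LOG_ERROR;   break;
        case WELS_LOG_WARNING: av_level = AV_LOG_WARNING; break;
        case WELS_LOG_INFO:    av_level = AV_LOG_INFO;    break;
        default:               av_level = AV_LOG_DEBUG;   break;
        }
        av_log(ctx, av_level, "%s\n", msg);
    }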
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit changes a few things about the noise substitution
logic:
- Brings back the quantization factor (reduced to 3) during
scalefactor index calculations.
- Rejects any zeroed bands. They should be inaudible and it's
a waste to transmit the scalefactor indices for them.
- Uses swb_offsets instead of incrementing a 'start' with every
window group size.
- Rejects all PNS during short windows.
Overall improves quality. There was a plan to use the lfg system
to create the random numbers instead of using whatever the decoder
uses but for now this works fine. Entropy is far from important here.
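Stripped down, the rejection logic above reads roughly like this
(illustrative names, not the encoder's actual structures):

    /* Minimal sketch of the band rejection described above. */
    static int pns_band_usable(const int *swb_offset, int swb,
                               float band_energy, int is_short_window)
    {
        if (is_short_window)
            return 0;        /* reject all PNS during short windows */
        if (band_energy == 0.0f)
            return 0;        /* zeroed band: inaudible, not worth the
                              * scalefactor indices it would cost */
        /* swb_offset[swb] gives the band's start directly, so no running
         * 'start' counter has to be advanced per window group size. */
        (void)swb_offset; (void)swb;
        return 1;
    }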
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit '2268db2cd052674fde55c7d48b7a5098ce89b4ba':
lavu: Drop the {minus,plus}1 suffix from AVComponentDescriptor fields
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Based on a patch by wm4.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Do not make many assumptions about the dimensions of the slices; just
try to decode additional lines if there is enough data left.
Decodes all the samples kindly provided by ultramage.
This commit once again improves the PNS implementation by scaling the
thresholds with frequency. The thresholds get looser as the frequency
increases since higher frequencies are basically noise to human ears.
Also, this introduces quantization error correction for PNS. If the
error is too large, no PNS will be used. The energy_ratio is used
to regulate the actual encoded PNS energy: if the generated PNS
energy is higher than the energy from the psy system, energy_ratio
is used to correct it so that, once requantized and transmitted,
the value in the decoder will hopefully be closer to what the
encoder has.
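In rough outline (illustrative names and a made-up scaling curve, not
the actual thresholds):

    /* Loosen the PNS threshold as frequency rises; higher bands are
     * basically noise to human ears. The scaling curve is hypothetical. */
    static float pns_threshold(float base_thr, float band_center_hz,
                               float samplerate)
    {
        float rel = band_center_hz / (samplerate * 0.5f);   /* 0..1 */
        return base_thr * (1.0f + rel);
    }

    /* Correct the encoded PNS energy when the synthesized noise comes
     * out hotter than what the psy system asked for. */
    static float pns_corrected_energy(float pns_energy, float psy_energy)
    {
        float energy_ratio = 1.0f;
        if (pns_energy > psy_energy && pns_energy > 0.0f)
            energy_ratio = psy_energy / pns_energy;
        return pns_energy * energy_ratio;
    }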
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This was an oversight when the IS system was being first implemented.
The ener01 part was largely a result of trial and error, and the fact
that the sum of coef0 and coef1 could be zero was overlooked. Once
ener01 becomes zero it's used to divide the left channel energy,
which doesn't turn out so well as it fills IS[] with -nan's and
inf's, which in turn confuse quantize_band_cost().
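In essence the fix boils down to guarding the division; a sketch with
illustrative names, not the actual IS search code:

    #include <math.h>

    static void is_band_downmix(const float *coef0, const float *coef1,
                                int len, float *IS)
    {
        float ener0 = 0.0f, ener01 = 0.0f;
        for (int i = 0; i < len; i++) {
            ener0  += coef0[i] * coef0[i];
            ener01 += (coef0[i] + coef1[i]) * (coef0[i] + coef1[i]);
        }
        /* Without this check, ener01 == 0 fills IS[] with nan/inf,
         * which then confuses quantize_band_cost(). */
        if (ener01 <= 0.0f) {
            for (int i = 0; i < len; i++)
                IS[i] = 0.0f;
            return;
        }
        for (int i = 0; i < len; i++)
            IS[i] = (coef0[i] + coef1[i]) * sqrtf(ener0 / ener01);
    }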
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit rewrites the PNS implementation and significantly
improves sonic quality.
The previous implementation marked an incredibly large number
of SFBs to predict when there was no need for this, and this
resulted in quite a large number of artifacts. Also, the
quantization was incorrect (av_clip(4+log2f(...))), which
led to 3x the intensity for PNS values, leading to even more
artifacts.
This commit rewrites the PNS search function and introduces
a major change: the PNS values are synthesized and are compared
to the current coefficients in addition to passing through
the revised checks to see whether PNS can be used.
This decreases distortions and makes the current PNS implementation
mainly focused on replacing any low-power non-zero bands as well
as adding any zeroed bands back.
The current encoder's performance is good enough (especially with
IS) that PNS isn't really required except to fill in the occasional
few bands and to extend any zeroed high frequencies, so this
combination, which is already enabled by default, works to get
as much quality as it can within the bits allowed.
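The core of the new check, reduced to a sketch (the real search also
consults the psy model and band types; rand() is just a stand-in for
the decoder-style RNG mentioned earlier):

    #include <math.h>
    #include <stdlib.h>

    /* Synthesize candidate noise for a band (len <= 1024) and compare
     * it to the actual coefficients before allowing PNS. */
    static int pns_acceptable(const float *coefs, int len,
                              float band_energy, float max_dist)
    {
        float synth[1024];
        float energy = 0.0f, dist = 0.0f;

        for (int i = 0; i < len; i++) {
            synth[i] = (float)rand() / RAND_MAX - 0.5f;
            energy  += synth[i] * synth[i];
        }
        if (energy > 0.0f) {
            float scale = sqrtf(band_energy / energy);
            for (int i = 0; i < len; i++)
                synth[i] *= scale;
        }
        for (int i = 0; i < len; i++) {
            float d = coefs[i] - synth[i];
            dist += d * d;
        }
        return dist < max_dist;
    }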
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since PNS generates coefficients it doesn't make sense to send
the predicted ones as well. Also, the specifications explicitly
state that right channel IS predictors must be disabled.
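Conceptually this is just clearing the per-band prediction flags; a
sketch with a stand-in struct (the band type values match the AAC
spec, the rest is illustrative):

    enum { NOISE_BT = 13, INTENSITY_BT2 = 14, INTENSITY_BT = 15 };

    struct band {
        int band_type;
        int prediction_used;    /* per-sfb prediction flag */
    };

    /* Do not keep predictors for noise bands, and clear the right
     * channel's predictors for IS bands. */
    static void drop_redundant_predictors(struct band *right, int num_bands)
    {
        for (int i = 0; i < num_bands; i++) {
            if (right[i].band_type == NOISE_BT     ||
                right[i].band_type == INTENSITY_BT ||
                right[i].band_type == INTENSITY_BT2)
                right[i].prediction_used = 0;
        }
    }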
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It's better to trust that the coefficients generated will be
closer than the coefficients derived, and the new PNS implementation
makes sure that this happens.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The specifications explicitly state to use roundf() which
also rounds half-integer values away from zero.
This does fix a few IS artifacts.
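For reference, the half-integer values are exactly where roundf() and
round-to-nearest-even disagree:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        /* roundf(): halfway cases go away from zero;
         * rintf() under the default rounding mode goes to even. */
        printf("%g %g\n", roundf(0.5f),  rintf(0.5f));    /*  1 vs  0 */
        printf("%g %g\n", roundf(-2.5f), rintf(-2.5f));   /* -3 vs -2 */
        return 0;
    }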
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit '4e649debcf7f71d35c6b38cdb7ee715eba95d64a':
Postpone API-incompatible changes until the next bump
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '069713aa4b137781e270768d803b1f7456daa724':
lavc: Drop deprecated thread opaque and codec pkt
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '9f90b24877016e7140b9b14e4b1acee663bb6d8a':
lavc: Drop deprecated get_buffer related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '01bcc2d5c23fa757d163530abb396fd02f1be7c8':
lavc: Drop deprecated destruct_packet related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'dc70c19476e76f1118df73b5d97cc76f0e5f6f6c':
lavc: Drop deprecated request_channels related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This commit improves the TNS implementation to the point where it's
actually usable and very rarely results in nastiness (at all bitrates
except extremely low ones it increases quality and prevents some of
the coder's distortions from being audible).
Also adds double filter support, which is only used if the energy
difference between the top and bottom of the SFBs is above the
thresholds defined in the header file. Looking at the bitstream
that fdk_aac generates, it sometimes uses a double filter despite
the specs stating that a single filter should be enough for almost
all cases and purposes.
Unlike FAAC or fdk_aac, we sometimes use a reverse filter in case
the energy difference isn't enough to use a double filter. This
actually works better.
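The double/reverse filter decision, boiled down to a sketch (the
threshold values and the exact conditions here are placeholders, not
the constants from the header):

    #include <math.h>

    #define DOUBLE_FILTER_THR_HI 10.0f   /* placeholder */
    #define DOUBLE_FILTER_THR_LO  0.1f   /* placeholder */

    /* energy_top/energy_bottom: energies at the top and bottom of the
     * filtered SFB range.  Returns 2 for a double filter, -1 for a
     * single reverse filter, 1 for a normal single filter. */
    static int tns_filter_layout(float energy_top, float energy_bottom)
    {
        float ratio = energy_top / fmaxf(energy_bottom, 1e-9f);
        if (ratio > DOUBLE_FILTER_THR_HI || ratio < DOUBLE_FILTER_THR_LO)
            return 2;    /* energies differ enough: two filters */
        if (ratio < 1.0f)
            return -1;   /* not enough for a double filter: reverse */
        return 1;
    }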
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a flag to use the pure coefficients instead
of the processed ones (sce->coeffs). This is needed because
IS will apply the changes to the coefficients immediately
before the adjust_common_prediction function and it doesn't
make sense to measure stereo channel coefficient difference
when one channel's coefficients are all zero.
Therefore add a flag to use pure coefficients in that case.
TNS is the only thing touching the coefficients before IS, so common
window prediction will not take that into account, but the effect of
the TNS filter per coefficient can be small (a few percent), so to
some approximation it's fine to just ignore that.
Also fixed a small error which doesn't alter the results
that much. pow(sqrt(number), 3.0/4.0) == pow(number, 3.0/8.0) !=
pow(number, 3.0/4.0).
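For clarity, the exponent identity the fix relies on, checked
numerically:

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double x = 7.3;
        printf("%f\n", pow(sqrt(x), 3.0 / 4.0));   /* == pow(x, 3.0/8.0) */
        printf("%f\n", pow(x, 3.0 / 8.0));
        printf("%f\n", pow(x, 3.0 / 4.0));         /* not the same thing */
        return 0;
    }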
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
32bit is not sufficient for all cases
Fixes: signal_sigabrt_7ffff6ac8cc9_686_cov_1897408623_microsoft_new_way_to_shove_mpeg2_in_asf.dvr_ms
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit reorders the coding tools such that they're doing what
the decoder does in reverse order. The very first thing the decoder
does is to decode M/S stereo if that's signalled, then prediction,
IS, and finally TNS and PNS in another function.
adjust_frame_information()'s application of IS and M/S was taken
out into two separate functions since prediction doesn't expect
to get the raw coefficients but rather the coefficients at that
part of the encoding process.
The results show a much better PSNR when any combination of
Intensity Stereo, Mid/Side stereo and Prediction is used, which
is a sign of an increased encoder efficiency as well as the fact
that the decoder gets what it expects.
Otherwise, with only IS, PNS or prediction there are neither
regressions nor improvements except in the case of IS, which
now by itself (or with PNS) is less prone to artifacts. Enabling
M/S (using stereo_mode) as well will also reduce stereo artifacts
induced by IS, so in the very near future M/S may be enabled
by default.
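In outline the new order looks roughly like the sketch below; the
function names are shorthand for the encoder's own search/apply
helpers, not their exact names or signatures.

    typedef struct ChannelElement ChannelElement;

    static void search_for_tns(ChannelElement *cpe)          { (void)cpe; }
    static void search_for_pns(ChannelElement *cpe)          { (void)cpe; }
    static void search_for_is(ChannelElement *cpe)           { (void)cpe; }
    static void apply_intensity_stereo(ChannelElement *cpe)  { (void)cpe; }
    static void search_for_pred(ChannelElement *cpe)         { (void)cpe; }
    static void apply_mid_side(ChannelElement *cpe)          { (void)cpe; }

    /* The decoder undoes M/S first, then prediction, then IS, and
     * finally TNS/PNS, so the encoder now runs the tools the other
     * way around: */
    static void encode_coding_tools(ChannelElement *cpe)
    {
        search_for_tns(cpe);
        search_for_pns(cpe);
        search_for_is(cpe);
        apply_intensity_stereo(cpe);  /* split out of adjust_frame_information() */
        search_for_pred(cpe);
        apply_mid_side(cpe);          /* likewise split out */
    }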
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
If the selected coder isn't twoloop, this commit temporarily
disables IS and PNS.
The problem is that encode_window_bands_info() gets confused
and sets invalid band_types for non-marked (normal) bands.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the changes made a few weeks ago (which were done more than a
month ago), the quality and stability of intensity stereo have been
notably good. There were some requests and wishes to have it on by
default, and therefore it has been enabled. Should any regressions
arise, changes will preferably be made to keep it operating rather
than just disabling it by default again.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It has been in the current encoder in its current implementation
for quite some time now, so enable it by default. Will increase
quality at all bitrates, especially at low ones.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Purely a cosmetic change; most of the zeroing of encoder resources
should happen at the top of the main loop.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit reworks the TNS implementation to a hybrid between what
the specifications say, what the decoder does and what's the best
thing to do.
The filter application function was copied from the decoder and
modified such that it applies the inverse AR filter to the
coefficients. The LPC coefficients themselves are fed into the
same quantization expression that the specifications say should
be used; however, no further processing is done. Instead, they're
converted to the form that the decoder expects them to be in
and are sent off to the compute_lpc_coeffs function exactly the
way the decoder does it. This function does all conversions and will
return the exact coefficients that the decoder will generate, which
are then applied to the coefficients.
Having the exact same coefficients on both the encoder and decoder
is a must, since otherwise the entire range of SFBs over which the
filter is applied will be attenuated.
Despite this major rework, TNS might not work fine on some audio
types at very low bitrates (e.g. sub 90kbps) as it can attenuate
some coefficients too much. Users are advised to experiment with
TNS at higher bitrates if they wish to use this tool or simply
wait for the implementation to be improved.
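For illustration, applying the inverse (analysis) filter in place
might look like this; not the encoder's actual function, and the sign
convention depends on how the requantized lpc[] is stored:

    /* In-place FIR (inverse AR) filter over one band of coefficients.
     * Walking backwards keeps the not-yet-filtered inputs intact. */
    static void tns_analysis_filter(float *coef, int len,
                                    const float *lpc, int order)
    {
        for (int i = len - 1; i >= 0; i--) {
            float sum = coef[i];
            for (int j = 1; j <= order && i - j >= 0; j++)
                sum += lpc[j - 1] * coef[i - j];
            coef[i] = sum;
        }
    }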
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Turns out autocorrelating more than 750 coefficients at once
will cause a segfault, despite there being enough space in the
buffer to hold an entire frame of samples.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a function to get the reflection coefficients on
floating point samples. It's functionally identical to
ff_lpc_calc_ref_coefs() except it works on float samples and will
return the global prediction gain. The more optimized Welch window
implementation works only on int32_t samples, so a slower generic
expression was used.
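In spirit the helper does something like the sketch below (plain
autocorrelation plus a Levinson recursion on doubles); purely
illustrative, the real function reuses lavc's LPC code and also
applies the window.

    #include <math.h>

    #define MAX_ORDER 32

    /* samples: n float samples; ref: out, 'order' reflection
     * coefficients (order <= MAX_ORDER).  Returns the prediction gain. */
    static float ref_coefs_float(const float *samples, int n, int order,
                                 float *ref)
    {
        double autoc[MAX_ORDER + 1] = { 0 };
        double lpc[MAX_ORDER] = { 0 };
        double err;

        for (int lag = 0; lag <= order; lag++)
            for (int i = lag; i < n; i++)
                autoc[lag] += (double)samples[i] * samples[i - lag];

        if (autoc[0] <= 0.0)
            return 0.0f;

        err = autoc[0];
        for (int i = 0; i < order; i++) {
            double k = -autoc[i + 1];
            for (int j = 0; j < i; j++)
                k -= lpc[j] * autoc[i - j];
            k /= err;
            ref[i] = (float)k;

            lpc[i] = k;
            for (int j = 0; j < i / 2; j++) {
                double tmp = lpc[j];
                lpc[j]         += k * lpc[i - 1 - j];
                lpc[i - 1 - j] += k * tmp;
            }
            if (i & 1)
                lpc[i >> 1] += k * lpc[i >> 1];

            err *= 1.0 - k * k;
        }
        /* Prediction gain: total energy over residual energy. */
        return (float)(autoc[0] / fmax(err, 1e-12));
    }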
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Not needed anymore, it was only used by the AAC TNS
encoder and was replaced with a more suitable function
in the following commit.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Needed for following commits. Contains the starting sfb for
every samplerate and window type.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Fixes out of array access
Fixes: 87196d8bbc633629fc9dd851fce73e70/asan_heap-oob_26f6853_862_cov_585961513_sonic3dblast_intro-partial.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Despite '417792' being reported in the binary decoder, the buffer at
encoding time needs to be bigger to avoid running out of space due to
interlace handling.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This was copied from the decoder, but is unneeded for the encoder.
tns_max_bands is unused and set to zero, which zeroed out start, end
and size, and thus no filter was actually applied.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the coefficients are stepped up to order + 1 it was possible
that it went over TNS_MAX_ORDER. Also just return in case the only
coefficient is less than the threshold.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The encoder-side filter isn't that important. The PSNR
shouldn't change so the FATE test should still be fine.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The order should never go above TNS_MAX_ORDER (and thus cause
the context to be reinitialized) but this is just in case.
Also fix a comparison, since the coefficients are zero-indexed.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It also made no sense to actually make the filter span the entire
window including the first band of the next window.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Pulses are already on the way, so expect to see the list
gone in the near future.
TNS is already of sufficiently high quality to be enabled
by default (but isn't yet, so you too can help by testing!).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit abandons the way the specifications state to
quantize the coefficients, makes use of the new LPC float
functions and is much better.
The original way of converting non-normalized float samples
to int32_t, which our LPC system expects, was wrong, and it was
wrong to assume the coefficients that are generated are also
valid. It was essentially a full garbage-in, garbage-out
system and it definitely shows when looking at spectrals
and listening. The high frequencies were very overattenuated.
The new LPC function performs the analysis directly.
The specifications state to quantize the coefficients into
four-bit index values using an asin() function, which of course
had to have ugly ternary operators because the function turns
negative if the coefficients are negative, which when encoding
caused an invalid bitstream to be generated.
This commit deviates from that by using the TNS tables directly,
which are fairly small since you have at most 4 bits for index
values. The LPC values are directly quantized against the tables
and are then used to perform filtering after the requantization,
which simply fetches the array values.
The end result is that TNS works much better now and doesn't
attenuate anything but the actual signal, i.e. TNS removes
quantization errors and does its job correctly now.
It might be enabled by default soon since it doesn't hurt and
helps reduce nastiness at low bitrates.
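The table-driven quantization amounts to a nearest-value lookup; a
sketch with placeholder table contents (not the decoder's real
tables):

    #include <math.h>

    static const float tns_table[8] = {     /* placeholder values */
        -0.9f, -0.7f, -0.4f, -0.2f, 0.0f, 0.4f, 0.7f, 0.9f
    };

    /* Pick the closest table entry; the index goes into the bitstream. */
    static int quantize_tns_coef(float c)
    {
        int best = 0;
        float best_dist = fabsf(c - tns_table[0]);
        for (int i = 1; i < 8; i++) {
            float d = fabsf(c - tns_table[i]);
            if (d < best_dist) {
                best_dist = d;
                best = i;
            }
        }
        return best;
    }

    /* The filter then simply fetches the requantized value back: */
    static float requantize_tns_coef(int idx)
    {
        return tns_table[idx];
    }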
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit removes the array which was made redundant with
the last commit. The current prediction system gets the
quantization error directly (and without the single-frame delay)
in the search_for_pred function.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>