* commit '4e649debcf7f71d35c6b38cdb7ee715eba95d64a':
Postpone API-incompatible changes until the next bump
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '069713aa4b137781e270768d803b1f7456daa724':
lavc: Drop deprecated thread opaque and codec pkt
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '9f90b24877016e7140b9b14e4b1acee663bb6d8a':
lavc: Drop deprecated get_buffer related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '01bcc2d5c23fa757d163530abb396fd02f1be7c8':
lavc: Drop deprecated destruct_packet related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'dc70c19476e76f1118df73b5d97cc76f0e5f6f6c':
lavc: Drop deprecated request_channels related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This commit improves the TNS implementation to the point where it's
actually usable and very rarely results in nastyness (in all bitrates
except extremely low bitrates it's increasing the quality and prevents
some distortions from the coder being audiable).
Also adds a double filter support which is only used if the energy
difference between the top and bottom of the SFBs is above the
thresholds defined in the header file. Looking at the bitstream
that fdk_aac generates it sometimes used a double filter despite
the specs stating that a single filter should be enough for almost
all cases and purposes.
Unlike FAAC or fdk_aac we sometimes use a reverse filter in case
the energy difference isn't enought to use a double filter. This
actually works better.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a flag to use the pure coefficients instead
of the processed ones (sce->coeffs). This is needed because
IS will apply the changes to the coefficients immediately
before the adjust_common_prediction function and it doesn't
make sense to measure stereo channel coefficient difference
when one of the channels coefficients are all zero.
Therefore add a flag to use pure coefficients in that case.
TNS is the only thing touching the coefficients before IS
so common window prediction will not take that into account
but the effect of the TNS filter per coefficient can be small
(a few percent) so to some approximation it's fine to just
ignore that.
Also fixed a small error which doesn't alter the results
that much. pow(sqrt(number), 3.0/4.0) == pow(number, 3.0/8.0) !=
pow(number, 3.0/4.0).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
32bit is not sufficient for all cases
Fixes: signal_sigabrt_7ffff6ac8cc9_686_cov_1897408623_microsoft_new_way_to_shove_mpeg2_in_asf.dvr_ms
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit reorders the coding tools such that they're doing what
the decoder does in reverse order. The very first thing the decoder
does is to decode M/S stereo if that's signalled, then prediction,
IS, and finally TNS and PNS in another function.
adjust_frame_information()'s application of IS and M/S was taken
out into two separate functions since prediction doesn't expect
to get the raw coefficients but rathe the coefficients at that
part of the encoding process.
The results show a much better PSNR when any combination of
Intensity Stereo, Mid/Side stereo and Prediction is used, which
is a sign of an increased encoder efficiency as well as the fact
that the decoder gets what it expects.
Otherwise, with only IS, PNS or prediction there are neither
regressions nor improvements except in the case of IS, which
now by itself (or with PNS) is less prone to artifacts. Enabling
M/S (using stereo_mode) as well will also reduce stereo artifacts
induced by IS, so in the very near future M/S may be enabled
by default.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
If the selected coder isn't twoloop, this commit temporarily
disables IS and PNS.
The problem is in the encode_window_bands_info() being confused
and setting invalid band_types for non-marked (normal) bands.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the changes made a few week ago (which were done more than a
month ago) the quality and stability of intensity stereo has been
notably good. There were some requests and wishes to have in on by
default and therefore it has been enabled. Should any regressions
arise changes will be made to preferably keep it operating rather
than just disabling it by default again.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It has been in the current encoder in its current implementation
for quite some time now, so enable it by default. Will increase
quality at all bitrates, especially at low ones.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Purely a cosmetic change, most of the zeroing of encoder resources
should happen at the top of the main loop.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit reworks the TNS implementation to a hybrid between what
the specifications say, what the decoder does and what's the best
thing to do.
The filter application function was copied from the decoder and
modified such that it applies the inverse AR filter to the
coefficients. The LPC coefficients themselves are fed into the
same quantization expression that the specifications say should
be used however further processing is not done, instead they're
converted to the form that the decoder expects them to be in
and are sent off to the compute_lpc_coeffs function exactly the
way the decoder does. This function does all conversions and will
return the exact coefficients that the decoder will generate, which
are then applied to the coefficients.
Having the exact same coefficients on both the encoder and decoder
is a must since otherwise the entire sfb's over which the filter
is applied will be attenuated.
Despite this major rework, TNS might not work fine on some audio
types at very low bitrates (e.g. sub 90kbps) as it can attenuate
some coefficients too much. Users are advised to experiment with
TNS at higher bitrates if they wish to use this tool or simply
wait for the implementation to be improved.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Turns out autocorrelating more than 750 coefficients at once
will cause a segfault, despite there being enough space to
hold an entire frame of samples into the buffer.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a function to get the reflection coefficients on
floating point samples. It's functionally identical to
ff_lpc_calc_ref_coefs() except it works on float samples and will
return the global prediction gain. The Welch window implementation
which is more optimized works only on int32_t samples so a slower
generic expression was used.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Not needed anymore, it was only used by the AAC TNS
encoder and was replaced with a more suitable function
in the following commit.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Needed for following commits. Contains the starting sfb for
every samplerate and window type.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Fixes out of array access
Fixes: 87196d8bbc633629fc9dd851fce73e70/asan_heap-oob_26f6853_862_cov_585961513_sonic3dblast_intro-partial.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Despite '417792' being reported in the binary decoder, the buffer at
encoding time needs to be bigger to avoid running out of space due to
interlace handling.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This was copied from the decoder, but is unneeded for the encoder.
tns_max_bands is unused and set to zero which zeroed out start, end
and size and thus no filter was actually applied.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the coefficients are stepped up to order + 1 it was possible
that it went over TNS_MAX_ORDER. Also just return in case the only
coefficient is less than the threshold.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The encoder-side filter isn't that important. The PSNR
shouldn't change so the FATE test should still be fine.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The order should never go above TNS_MAX_ORDER (and thus cause
the context to be reinitialized) but this is just in case.
Also fix a comparison, since the coefficients are zero-indexed.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It also made no sense to actually make the filter span the entire
window including the first band of the next window.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Pulses are already on the way so expect to see the list
gone in the close future.
TNS is already of sufficiently high quality to be enabled
by default (but isn't yet, so you too can help by testing!).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit abandons the way the specifications state to
quantize the coefficients, makes use of the new LPC float
functions and is much better.
The original way of converting non-normalized float samples
to int32_t which out LPC system expects was wrong and it was
wrong to assume the coefficients that are generated are also
valid. It was essentially a full garbage-in, garbage-out
system and it definitely shows when looking at spectrals
and listening. The high frequencies were very overattenuated.
The new LPC function performs the analysis directly.
The specifications state to quantize the coefficients into
four bit index values using an asin() function which of course
had to have ugly ternary operators because the function turns
negative if the coefficients are negative which when encoding
causes invalid bitstream to get generated.
This deviates from this by using the direct TNS tables, which
are fairly small since you only have 4 bits at most for index
values. The LPC values are directly quantized against the tables
and are then used to perform filtering after the requantization,
which simply fetches the array values.
The end result is that TNS works much better now and doesn't
attenuate anything but the actual signal, e.g. TNS removes
quantization errors and does it's job correctly now.
It might be enabled by default soon since it doesn't hurt and
helps reduce nastyness at low bitrates.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit removes the array which was made redundant with
the last commit. The current prediction system gets the
quantization error directly (and without the single-frame delay)
in the search_for_pred function.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit completely alters the algorithm of prediction.
The original commit which introduced prediction was completely
incorrect to even remotely care about what the actual coefficients
contain or whether any options were enabled. Not my actual fault.
This commit treats prediction the way the decoder does and expects
to do: like lossy encryption. Everything related to prediction now
happens at the very end but just before quantization and encoding
of coefficients. On the decoder side, prediction happens before
anything has had a chance to even access the coefficients.
Also the original implementation had problems because it actually
touched the band_type of special bands which already had their
scalefactor indices marked and it's a wonder the asserion wasn't
triggered when transmitting those.
Overall, this now drastically increases audio quality and you should
think about enabling it if you don't plan on playing anything encoded
on really old low power ultra-embedded devices since they might not
support decoding of prediction or AAC-Main. Though the specifications
were written ages ago and as times change so do the FLOPS.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This was missed when the original commits were done. FF_PROFILE_UNKNOWN
is what's in avctx->profile when no audio profile is specified.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit simply duplicates the functionality of ff_lpc_calc_coefs()
for the case of a Levinson-Durbin LPC with the only difference being
that floating point samples are accepted and the resulting coefficients
are raw and unquantized.
The motivation behind doing this is the fact that the AAC encoder
requires LPC in TNS and LTP and converting non-normalized floating
point coefficients to int32_t using SWR and again back for the LPC
coefficients was very impractical.
The current LPC interfaces were designed for int32_t in mind possibly
because FLAC and ALAC use this type for most internal operations.
The mathematics in case of floats remains of course identical.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit simply moves the TNS tables to a more appropriate
aactab.h since then they can be accessed by both the decoder
and encoder.
The encoder _shouldn't_ normally need the tables since the
specs describe a specific quantization process, but the exact
reason for this can be seen in the TNS commit following.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
At least for vdpau, the hwaccel init code tries to check the video
profile and ensure that there is a matching vdpau profile available.
If it can't find a match, it will fail to initialise.
In the case of wmv3/vc1, I observed initialisation to fail all the
time. It turns out that this is due to the hwaccel being initialised
very early in the codec init, before the profile has been extracted
and set.
Conceptually, it's a simple fix to reorder the init code, but it gets
messy really fast because ff_get_format(), which is what implicitly
trigger hwaccel init, is called multiple times through various shared
init calls from h263, etc. It's incredibly hard to prove to my own
satisfaction that it's safe to move the vc1 specific init code
ahead of this generic code, but all the vc1 fate tests pass, and I've
visually inspected a couple of samples and things seem correct.
Signed-off-by: Philip Langdale <philipl@overt.org>
The amv one probably looks suspicious, but since it's an intra-only
codec, I couldn't possibly imagine what it would use the edge for,
and the vsynth fate result doesn't change, so it's probably OK.
The current algorithm is just "try all the combinations, and pick the best".
It's not very fast either, probably due to a lot of copying, but will do for
an initial implementation.
Signed-off-by: Donny Yang <work@kota.moe>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Assert on `avctx->codec->encode2` to avoid a SEGFAULT on the subsequent
function call.
avcodec_encode_video2() uses a similar assertion.
Calling the wrong function on a stream is a serious inconsistency
which could at other places be potentially dangerous and exploitable,
it is thus safer to stop execution and not continue with such
inconsistency after returning an error.
Commit-message-extended-by commiter
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Use a comment to list the reused tables, since it's more flexible than a
table name to keep information like this. The list will expand in later
commits.
* commit '167ea1fbf15ecefa30729f9b8d091ed431bf43bd':
xavs: Do not try to set the bitrate tolerance without a bitrate
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'f9ab4fe1f7c1e9d410ca5ee2c9ff8d2892aad068':
h264: Discard currently unsupported registered sei
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Including libavutil/internal.h breaks compilation of doc/print_options.c
with MSVC due to linking avpriv_strtod/avpriv_snprintf.
This reverts part of commit 095347f.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Fixes -Wunused-function warnings when compiling with --disable-yasm on x86.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit removes the last thing a Windows environment can
complain about the AAC encoder code. Leftover from an old revision.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Thanks to @nevcairiel for pointing this one out.
Another thing which stopped msvc from compiling.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
When the encoder is ran without specifying -profile:a
the default avctx->profile value is -99 (FF_PROFILE_UKNOWN),
which used to be treated as AAC-LC.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Hotfix to deal with msvc. Sane compilers lack POSIX ffs().
It only saves a single bit or so and isn't worth it that much.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit finalizes AAC-Main profile encoding support
by implementing all mandatory and optional tools available
in the specifications and current decoders.
The AAC-Main profile reqires that prediction support be
present (although decoders don't require it to be enabled)
for an encoder to be deemed capable of AAC-Main encoding,
as well as TNS, PNS and IS, all of which were implemented
with previous commits or earlier of this year.
Users are encouraged to test the new functionality using either
-profile:a aac_main or -aac_pred 1, the former of which will enable
the prediction option by default and the latter will change the
profile to AAC-Main. No other options shall be changed by enabling
either, it's currently up to the users to decide what's best.
The current implementation works best using M/S and/or IS,
so users are also welcome to enable both options and any
other options (TNS, PNS) for maximum quality.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit implements temporal noise shaping support in the
encoder, along with an -aac_tns option to toggle it on or off
(off by default for now). TNS will increase audio quality
and reduce quantization noise by applying a multitap FIR filter
across allowed coefficients and transmit side information to the
decoder so it could create an inverse filter.
Users are encouraged to test the new functionality by enabling
-aac_tns 1 during encoding.
No major bugs are observable at this time so after a while if no
new problems appear and if the current implementation is deemed
of high enough quality and stability it will be enabled by default,
possibly at the same time the encoder has its experimental flag
removed and becomes the standard aac encoder in ffmpeg.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit permits for the use of the Main profile
in encoding. The functionality of that profile will
be added in the commits following. By itself, this
commit does not alter anything.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit moves the intensity stereo implementation
out from aaccoder and into a separate file. This was
possible using the previous commits.
This commit also drastically improves the IS implementation
by making it phase invariant e.g. it will always choose the
best possible phase regardless of whether M/S coding is on
or most of the coefficients have identical phases.
This also increases the quality and reduces any distortions
introduced by enablind intensity stereo.
Users are encouraged to test it out using the -aac_is 1
parameter as it has always been.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit updates the function definitions in the aaccoder_mips.c
file. This was broken around a month or so ago with the addition
of the rounding argument.
The previous commit in this series also introduced a separate array
to put the quantization error in, this also needed to be updated,
albeit non-functional, in the MIPS optimized aaccoder file.
Credits for the rounding goes to Claudio Freire.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit moves the quantizer to a separate header file.
This allows the quantizer to be used from a separate files outside
of aaccoder without having to put another function pointer and will
result in a slight speedup as the compiler can do more optimizations.
This is required for commits following.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit only creates and initializes an LTP
context which is needed for upcoming commits (TNS).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit simply populates the table pointer which is needed
for upcoming commits (TNS, prediction, etc.). Copied from
the decoder.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit moves the resetting of special bands (above RESERVED_BT)
to the main frame encoding function rather than the way it was done
previously in their corresponding search_for_... functions.
The reason why special bands need to be reset is that while normal
bands get chosen for every frame by the coder (twoloop by default)
the coders do not touch any special sfbs and will therefore
make them persist throughout the file.
If we zero them out any bands left unmarked will be chosen by
the second part of the coder (the trellis function in aaccoder.c).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit only changes the coding style to a saner way
of accessing coefficients (makes more sense to get the
memory address of a coefficients and start from there
rather than adding arbitrary numbers to offset a pointer).
Some compilers might detect an out of bounds access easier.
Also the way M/S and IS coefficients are calculated has been
changed, but should still have the same result (with the exception
that IS now applies from the normal coefficients rather than the
pristine ones, this is needed for upcoming commits).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Fixes a -Wunused-variable while compiling with --disable-yasm on x86
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Invalid buffer ids are defined by VA_INVALID_ID. Use that through out
vaapi_*.c support files now that we have private data initialized and
managed by libavcodec. Previously, the only requirement for the public
vaapi_context struct was to be zero-initialized.
This fixes support for 3rdparty VA drivers that strictly conform to
the API whereby an invalid buffer id is VA_INVALID_ID and the first
valid buffer id can actually be zero.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Move libavcodec managed objects from the public struct vaapi_context
to a new privately owned FFVAContext. This is done so that to clean up
and streamline the public structure, but also to prepare for new codec
support, thus requiring new internal data to be added in there.
The AVCodecContext.hwaccel_context, that holds the public vaapi_context,
shall no longer be accessed from within vaapi_*.c codec support files.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Deprecate older VA pixel formats (MOCO, IDCT) as it is now very unlikely
to ever be useful in the future. Only keep plain AV_PIX_FMT_VAAPI format
that is aliased to the older VLD variant.
This is an API change.
Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Currently, when forcing an I frame, via API, or via the ffmpeg cli,
using -force_key_frames, we still let x264 decide what sort of
keyframe to user. In some cases, it is useful to be able to force
an IDR frame, e.g. for cutting streams.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* commit '7bf9647264308d2df74b2b50669f2d02a7ecc90b':
vp7: bound checking in vp7_decode_frame_header
Only partially merged, see 46f72ea507
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'f34b152eb7b7e8d2aee57c710a072cf74173fbe1':
libfdk-aacdec: Clean up properly if the init fails
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '87de6ddb7b7674e329d5c96677bd8685bc7f7855':
libfdk-aacdec: Bump the max number of channels to 8
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
The h264 decoder reports 4.1 as its maximum level, but it will decode
5.1 4K video just fine. In practice, the published level limits in
vdpau do not communicate anything that's actually useful.
Previously most of the error paths leaked.
Also add FF_CODEC_CAP_INIT_THREADSAFE while adding caps_internal;
this decoder wrapper doesn't have any static data that is initialized.
Signed-off-by: Martin Storsjö <martin@martin.st>
For ADTS streams, the output format (number of channels, frame size)
can change at any point (with the latest version of fdk-aac, the decoder
seems to change format after a handful of frames, not outputting the
right format immediately, for cases that worked fine with the earlier
version of the lib).
Previously, the decoder decoded straight into the output frame once the
number of channels and frame size was known. This obviously does not
work if the number of channels or frame size changes.
The alternative would be to allocate the AVFrame with the maximum number
of channels and frame size, and change them afterward decoding into it,
but that may cause confusion to users e.g. of the get_buffer callback.
This solution should be more robust.
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
In the latest version of fdk-aac, the decoder can output up to 8
channels; take this into account when preallocating buffers that
need to fit the output from any packet.
CC: libav-stable@libav.org
Signed-off-by: Martin Storsjö <martin@martin.st>
The value of wrap_flag in the Text Wrap Box specifies if the text is to
be wrapped or not. Uses 'end of line wrap' amongst the wrap styles
supported by ASS if the text is to be wrapped, i.e; fill as much text
in a line as possible, then break to next line.
The 3GPP spec has no provision for smart wrapping.
Signed-off-by: Niklesh <niklesh.lalwani@iitb.ac.in>