CID 1260 (as evidenced by incorrect decoding of a sample from ticket
4876) seems to use incorrect weight tables. It appears those tables
were not zigzag-scanned.
Apply zigzag on weight tables for new CIDs 1258, 1259, and 1260, and
fix an incorrect chroma table for CID 1256.
Fixes last issue from ticket #4876.
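A minimal sketch of the permutation involved, assuming a raw 64-entry
weight table and a standard 8x8 zigzag scan such as ff_zigzag_direct;
the function name is illustrative, not the actual dnxhddata code:

    #include <stdint.h>

    static void zigzag_weights(uint8_t dst[64], const uint8_t src[64],
                               const uint8_t scan[64])
    {
        /* reorder the weights so that entry i matches scan position i */
        for (int i = 0; i < 64; i++)
            dst[i] = src[scan[i]];
    }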
Found-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This bit is 1 in some samples, and seems to coincide with interlaced
mbs and CID 1260. The 2008 specs do not mention it, and maintain that
qscale is 11 bits. This looks oversized, but may help with larger bit
depths.
Currently, this bit leads to an obviously incorrect qscale value,
meaning the syntax is shifted by 1. Reading 11 bits also leads to
obviously incorrect decoding: qscale seems to be 10 bits.
However, as most profiles still have an 11-bit qscale, the feature is
restricted to the CID 1260 profile.
The encoder writes 12 bits of syntax, with the first and last bits
always 0, which is now somewhat inconsistent with the decoder, but
ends up with the same effect (progressive + reserved bit).
Partially fixes ticket #4876.
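As a hedged illustration of the decoder-side read (get_bits() and
skip_bits1() are the real bitreader calls; the rest is a sketch, not
the actual dnxhddec.c code):

    #include "get_bits.h"

    static int read_qscale(GetBitContext *gb, int cid)
    {
        if (cid == 1260) {
            int qscale = get_bits(gb, 10); /* CID 1260: qscale appears to be 10 bits */
            skip_bits1(gb);                /* plus the reserved/interlace-related bit */
            return qscale;
        }
        return get_bits(gb, 11);           /* other profiles keep the 11-bit field */
    }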
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch tweaks search_for_pns to be both more aggressive and more
careful when applying PNS. On the one hand, it will again try to use
PNS on zero (or effectively zero) bands; for this, both zeroes and
band_type have to be checked (some ZERO bands aren't marked in
zeroes). On the other hand, a more accurate rate-distortion measure
avoids using PNS where it would cause audible distortion.
Also fixed a small bug in the computation of freq
that caused PNS usage on low-frequency bands during
8-short windows. This allows re-enabling PNS during
8-short.
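For reference, the zero-band test amounts to something like the
following sketch (indexing follows the usual w*16+g convention; this
is illustrative, not the literal aaccoder.c code):

    #include "aac.h"

    /* a band is effectively zero if it is flagged in zeroes or already
     * coded as ZERO_BT; both have to be checked */
    static int band_is_zero(const SingleChannelElement *sce, int w, int g)
    {
        return sce->zeroes[w*16 + g] || sce->band_type[w*16 + g] == ZERO_BT;
    }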
AVBufferRef.data and AVPacket.data don't need to have the same value.
AVPacket.data could point anywhere into the buffer. Likewise, the sizes
don't need to be the same.
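In other words (a minimal illustration, not code from this patch):

    #include "libavcodec/avcodec.h"
    #include "libavutil/avassert.h"

    static void check_pkt_buf_relation(const AVPacket *pkt)
    {
        /* pkt->data may point anywhere inside the referenced buffer,
         * and pkt->size may be smaller than the buffer's size */
        if (pkt->buf)
            av_assert0(pkt->data >= pkt->buf->data &&
                       pkt->data + pkt->size <= pkt->buf->data + pkt->buf->size);
    }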
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This works only for extradata sizes up to 128 bytes. Additionally, I
could never actually see it doing anything. The new code using
MMAL_BUFFER_HEADER_FLAG_CONFIG now takes care of this.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
We can send mp4-style data directly. But for some reason, this requires
sending the extradata as buffer with MMAL_BUFFER_HEADER_FLAG_CONFIG
set. Reuse the infrastructure for sending AVPackets to do this.
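Roughly, the idea looks like this (mmal_port_send_buffer() and the
MMAL_BUFFER_HEADER_FLAG_* flags are the real MMAL API; the variable
names and the omitted error handling are not the actual mmaldec.c
code):

    #include <stdint.h>
    #include <string.h>
    #include <interface/mmal/mmal.h>

    /* buffer is assumed to come from the input port's pool and to be
     * large enough to hold the extradata */
    static MMAL_STATUS_T send_extradata(MMAL_COMPONENT_T *decoder,
                                        MMAL_BUFFER_HEADER_T *buffer,
                                        const uint8_t *extradata, int size)
    {
        memcpy(buffer->data, extradata, size);
        buffer->length = size;
        buffer->flags  = MMAL_BUFFER_HEADER_FLAG_CONFIG |
                         MMAL_BUFFER_HEADER_FLAG_FRAME_END;
        return mmal_port_send_buffer(decoder->input[0], buffer);
    }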
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Commit 7404f3bdb switched bitrate to 64 bits.
This triggers -Wabsolute-value on clang, e.g.
http://fate.ffmpeg.org/log.cgi?time=20150917122742&log=compile&slot=x86_64-darwin-clang-3.7-O3.
Therefore, usage of abs() is changed to llabs(), which is available on all supported platforms.
Unfortunately, LLONG_MAX is not always available, so INT64_MAX is used instead.
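A standalone example of the pattern (not the actual code being fixed):

    #include <stdint.h>
    #include <stdlib.h>

    static int64_t closest_bitrate(int64_t target, const int64_t *candidates, int n)
    {
        int64_t best     = 0;
        int64_t best_err = INT64_MAX;                     /* INT64_MAX instead of LLONG_MAX */
        for (int i = 0; i < n; i++) {
            int64_t err = llabs(target - candidates[i]);  /* llabs(), not abs() */
            if (err < best_err) {
                best_err = err;
                best     = candidates[i];
            }
        }
        return best;
    }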
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This patch modifies the encode frame function to retry encoding the
frame when the resulting bit count is too far off target, adjusting
lambda only in small, incremental steps. It also makes the logic more
conservative - otherwise it will contend with bit reservoir-related
variations in bit allocation, and result in artifacts when frames have
to be truncated (usually at high bit rates transitioning from low
complexity to high complexity).
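The shape of the retry loop, as a hedged sketch (the step factors, the
tolerance and the callback are placeholders, not the actual aacenc.c
code):

    static int encode_with_retry(float *lambda, int target_bits,
                                 int (*encode_frame)(float lambda, void *opaque),
                                 void *opaque)
    {
        int frame_bits = 0;
        for (int i = 0; i < 10; i++) {                 /* small, bounded retry count */
            frame_bits = encode_frame(*lambda, opaque);
            if (frame_bits > target_bits * 6 / 5)
                *lambda *= 1.05f;                      /* too many bits: quantize more coarsely */
            else if (frame_bits < target_bits * 4 / 5)
                *lambda *= 0.95f;                      /* too few bits: spend more */
            else
                break;                                 /* close enough to the target */
        }
        return frame_bits;
    }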
Reduces the number of times the vbv retry code is used and should have
no effect on quality.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If cmd_pos is broken, this would just keep accumulating packets in the
reassembly buffer, until it fails and flushes the buffer on overflow.
Since packets are usually rather small, this will take a lot of subtitle
packets. The perceived effect is that subtitles are not displayed
anymore after the faulty packet was passed to the decoder.
I'm not terribly sure about this, but on the other hand this code is
active only when fragmented packets need to be reassembled.
Fixes sample file in trac issue #4872.
Assuming the first and second packets are partial, this would append the
reassembly buffer (ctx->buf) to itself with the second
append_to_cached_buf() call, because buf is set to ctx->buf.
I do not know of a valid sample file which triggers this, and do not
know if packets can be split into more than 2 sub-packets, but it triggered
with a (differently) broken sample file in trac issue #4872.
With the move of some functions into templates
in aaccoder_twoloop.h and aaccoder_trellis.h,
make checkheaders started failing. Add them to SKIPHEADERS, as they
should be.
This resolves implementation-defined behavior, and also silences -Wabsolute-value in clang 3.5+.
Moreover, the generated asm is identical to before modulo nop padding.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit 'e3d4784eb31b3ea4a97f2d4c698a75fab9bf3d86':
d3d11va: WindowsPhone requires a mutex around ID3D11VideoContext
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '777885983533235ccda5145f96317fc8cd0a18ab':
dcadec: set channel layout in a separate function
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '971177f751a6e2931232accceab438bce277bde8':
dcadec: scan for extensions in a separate function
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This patch refactors the AAC coders to reuse code
between the MIPS port and the regular, portable C code.
There were two main functions that had to use
hand-optimized versions of quantization code:
- search_for_quantizers_twoloop
- codebook_trellis_rate
Those two were split into their own template header
files so they can be inlined inside both the MIPS port
and the generic code. In each context, they'll link
to their specialized implementations, and thus be
optimized by the compiler.
I believe this approach is better than maintaining several copies of
each function. As past experience has proven, having to keep those in
sync was error prone. This way, they will remain in sync by default.
Also, an implementation of the dequantized output
argument for the optimized quantize_and_encode
functions is included in the patch. While the current
implementation of search_for_pred still isn't using
it, future iterations of main prediction probably will.
It should not imply any measurable performance hit while
not being used.
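Schematically, the arrangement looks like this (the file names are the
real ones, the reduced contents are only illustrative):

    /* aaccoder_twoloop.h: one shared, static definition; the including
     * translation unit provides the types and the quantize/cost helpers */
    static void search_for_quantizers_twoloop(AVCodecContext *avctx,
                                              AACEncContext *s,
                                              SingleChannelElement *sce,
                                              const float lambda)
    {
        /* generic two-loop search, calling whichever quantize/cost
         * helpers are visible in the including translation unit */
    }

    /* aaccoder.c: include the template after the portable helpers */
    #include "aaccoder_twoloop.h"

    /* mips/aaccoder_mips.c: include the same template after the
     * hand-optimized MIPS helpers, so the compiler inlines those instead */
    #include "aaccoder_twoloop.h"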
Failing earlier causes the context to be insufficiently initialized,
which can break decoding of future frames with threads.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '95a41311ac3a44773cc4dc407408aca35b1f8e26':
jpeg2000: Factor out band stepsize initialization
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '41bcc3d15204f290400ba02e4e8f87fc07bcc00e':
jpeg2000: Split codeblock decoding from the main tile decoding
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '5d14cf199990cd378904a2618b5c72c4b02290f6':
mpegvideo: Make sure mpegutils.h is included where needed
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '525f58977c93e189fda49a5c4928feaf4d89fac6':
mpegvideo: Move macros to more appropriate headers
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This fixes cases where the shifted number is 64, but we shifted non-
zero numbers away in the shift. The change makes behaviour consistent
with libvpx.
Code in aaccoder_mips.c was not synced with changes in aaccoder.c for
some time.
That was the cause of some fate-aac test failures.
This patch fixes the problems.
Optimizations disabled in 933309a are enabled again.
Signed-off-by: Nedeljko Babic <nedeljko.babic@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
"int" is useful in testing because provides accurate results across
different plaftforms, so remove it from the scheduled FF_API_UNUSED_MEMBERS
deprecation.
Deprecate the now unused option, but temporarily retain the capability
to disable the now default behaviour.
Mention this change in the AVPacket documentation.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This also drops setting the frame->pts field. This is usually not set by
decoders, so this would be an inconsistency that's at worst a danger to
the API user.
It appears the buffer->dts field is normally not set by the MMAL
decoder, so don't use it. If it's ever going to be set by MMAL, we
don't know whether the value will be what we want.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The generic code in utils.c sets the AVFrame.pkt_dts field from the
packet the frame was supposedly decoded from. This does not have to be
true for a fully asynchronous decoder like mmaldec. It could be overwritten with an
incorrect value. Even if the decoder doesn't determine the DTS (but sets
it to AV_NOPTS_VALUE), it's impossible to determine a correct value in
utils.c.
Decoders can now be marked with FF_CODEC_CAP_SETS_PKT_DTS, in which case
utils.c won't overwrite the field. The decoders are expected to set this
field (even if they only set it to AV_NOPTS_VALUE).
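A decoder opts in roughly like this (sketched against the AVCodec
layout of the time; unrelated fields are omitted):

    #include "avcodec.h"
    #include "internal.h"

    AVCodec ff_h264_mmal_decoder = {
        .name          = "h264_mmal",
        /* ... other fields as before ... */
        .caps_internal = FF_CODEC_CAP_SETS_PKT_DTS, /* utils.c leaves frame->pkt_dts alone */
    };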
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
This MMAL feature fills in missing timestamps from the framerate set on
the input port. This is generally unwanted, since libavcodec decoders
merely pass through timestamps without ever "fixing" them. The framerate
is also unknown, and even the timebase doesn't have to be set.
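Assuming the feature in question is the
MMAL_PARAMETER_VIDEO_INTERPOLATE_TIMESTAMPS port parameter, disabling
it would look roughly like this (error handling omitted):

    #include <interface/mmal/mmal.h>
    #include <interface/mmal/util/mmal_util_params.h>

    static MMAL_STATUS_T disable_ts_interpolation(MMAL_PORT_T *input_port)
    {
        /* stop MMAL from inventing timestamps from the port framerate */
        return mmal_port_parameter_set_boolean(input_port,
                                               MMAL_PARAMETER_VIDEO_INTERPOLATE_TIMESTAMPS,
                                               MMAL_FALSE);
    }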
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Don't try to do a blocking wait for MMAL output if we haven't even sent
a single real packet, but only flush packets. Obviously we can't expect
to get anything back.
Additionally, don't send a flush packet to MMAL in the same case. It
appears the MMAL decoder will sometimes hang in mmal_vc_port_disable()
(called from ffmmal_close_decoder()), waiting for a reply from the GPU
which never arrives. Either MMAL disallows sending flush packets without
preceding real data, or it's a MMAL bug.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
I can't come up with a nice way to handle this. It's hard to keep the
lock-stepped input/output in this case. You can't predict whether the
MMAL decoder will output a picture (because it's asynchronous), so
you have to assume in general that any packet could produce 0 or 1
frames. You can't continue to write input packets to the decoder,
because then you might get too many output frames, which you can't
get rid of because the lavc decoding API does not allow the decoder
to return an output frame without consuming an input frame (except
when flushing).
The ideal fix is a M:N decoding API (preferably asynchronous), which
would make this code potentially much cleaner. For now, this hack
will do.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Since TNS was fixed with the recent commits, retweak the values so
it's used more frequently.
It's still not enabled by default, though it's possible that it will
be enabled by default in the near future.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit was made possible by the earlier commits, since the new
quantization method basically means we're always working with unsigned
values. The specifications say to use compression when the first 2
bits are identical, but they don't mention whether this should happen
before or after the conversion to signed values; in fact, they say
nothing about the conversion to signed values at all.
With this commit, coefficient compression almost always happens, which
saves a lot of space, especially at extremely low bitrates, and
doesn't change the quality at all.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This finally (and again) gets rid of basically everything the
specifications say about how TNS should be done. The main problem used
to be that a single filter was used for all coefficients, which,
despite being explicitly recommended by the specifications, usually
sounds wrong; it is therefore a corner case in the current TNS
implementation.
This commit also changes the coefficient bit size, as it's apparently
better to use lower precision when the windows are eight short ones.
Looking at the bitstream, this is apparently what fdk_aac uses, and it
makes sense. The filter order when 8 SHORT windows happen also
matters: 7 was too much and, according to PSNR, worse, while 5 is just
about right.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit '5788623d29c3e806a7879210986110aced758dc2':
jpeg2000: Split codeblock decoding from the main tile decoding
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'db53a2306f62f05faa67e6f3c60ee55a9b8e4776':
jpeg2000: Do not warn about known and skippable markers
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This makes more sense, as users usually set the -cutoff option to
low-pass filter the signal. The encoder will still overshoot slightly
when encoding normal coefficients, but that's normal.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Kevin Wheatley <kevin.j.wheatley@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Also disable the mmx/iwht optimization when the bitexact flag is set.
With synthetically coded coefficients (i.e. those that lead to a
residual well outside the [-255,255] range), our optimizations will
overflow. It doesn't make sense to fix the overflows, since they can
only occur on synthetic input, not on real fwht-generated input. Thus,
add a bitexact flag that disables this optimization.
File libopenh264enc.c has been modified so that the encoder uses av_log()
to log messages (error, warning, info, etc.) instead of logging them
directly to stderr. At the time the encoder is created, the current
ffmpeg log level is mapped to an equivalent libopenh264 log level. This
log level, and a message logging function that invokes av_log() to
actually log messages, are then set on the encoder.
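The mapping is roughly of this shape (the WELS_LOG_* and AV_LOG_*
constants are real; the exact thresholds used in libopenh264enc.c may
differ):

    #include <wels/codec_app_def.h>
    #include "libavutil/log.h"

    static int av_log_level_to_wels(int level)
    {
        if (level >= AV_LOG_DEBUG)   return WELS_LOG_DETAIL;
        if (level >= AV_LOG_INFO)    return WELS_LOG_INFO;
        if (level >= AV_LOG_WARNING) return WELS_LOG_WARNING;
        return WELS_LOG_ERROR;
    }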
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit changes a few things about the noise substitution
logic:
- Brings back the quantization factor (reduced to 3) during
scalefactor index calculations.
- Rejects any zeroed bands. They should be inaudible and it's
a waste to transmit the scalefactor indices for them.
- Uses swb_offsets instead of incrementing a 'start' with every
window group size.
- Rejects all PNS during short windows.
Overall improves quality. There was a plan to use the lfg system
to create the random numbers instead of using whatever the decoder
uses but for now this works fine. Entropy is far from important here.
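Two of the checks above, sketched against the aac.h structures (the
reduced logic is illustrative, not the literal search_for_pns code):

    #include "aac.h"

    static int pns_reject_band(const SingleChannelElement *sce,
                               float band_energy, float zero_thr)
    {
        if (sce->ics.num_windows == 8)   /* reject all PNS during short windows */
            return 1;
        if (band_energy < zero_thr)      /* zeroed band: inaudible, not worth an sfb index */
            return 1;
        return 0;
    }

    /* band start comes straight from the table instead of a running counter */
    static int pns_band_start(const SingleChannelElement *sce, int g)
    {
        return sce->ics.swb_offset[g];
    }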
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit '2268db2cd052674fde55c7d48b7a5098ce89b4ba':
lavu: Drop the {minus,plus}1 suffix from AVComponentDescriptor fields
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Based on a patch by wm4.
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Do not make assumptions about the dimensions of the slices; just try
to decode additional lines if there is enough data left.
Decodes all the samples kindly provided by ultramage.
This commit once again improves the PNS implementation by scaling the
thresholds with frequency. The thresholds get looser as the frequency
increases since higher frequencies are basically noise to human ears.
Also, this introduces quantization error correction for PNS. Should
the error be too large, no PNS will be used. The energy_ratio is used
to regulate the actual encoded PNS energy: if the generated PNS energy
is higher than the energy from the psy system, energy_ratio is used to
correct it so that, once requantized and transmitted, the value in the
decoder will hopefully be closer to what the encoder has.
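The frequency scaling has roughly this shape (the constant is made up
for illustration; the real factors in aaccoder.c differ):

    /* thresholds get looser as the band centre frequency rises */
    static float pns_threshold(float base_thr, float band_centre_hz)
    {
        return base_thr * (1.0f + band_centre_hz / 4000.0f);
    }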
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This was an oversight when the IS system was first being implemented.
The ener01 part was largely a result of trial and error, and the fact
that the sum of coef0 and coef1 could be zero was overlooked. Once
ener01 turns to zero, it's used to divide the left channel energy,
which doesn't turn out so well, as it fills IS[] with NaNs and infs
which in turn confuse quantize_band_cost.
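A standalone illustration of the guard this implies (not the literal
aaccoder.c code):

    #include <math.h>

    /* returns a negative value to signal "skip this band" */
    static float is_ratio(float ener0, float ener1)
    {
        float ener01 = ener0 + ener1;
        if (ener01 <= 0.0f)   /* both bands silent: dividing would give NaN/inf */
            return -1.0f;
        return sqrtf(ener0 / ener01);
    }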
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit rewrites the PNS implementation and significantly
improves sonic quality.
The previous implementation marked an incredibly large number of SFBs
to predict when there was no need for this, which resulted in quite a
lot of artifacts. Also, the quantization was incorrect
(av_clip(4+log2f(...))), which led to 3x the intensity for PNS values,
leading to even more artifacts.
This commit rewrites the PNS search function and introduces
a major change: the PNS values are synthesized and are compared
to the current coefficients in addition to passing through
the revised checks to see whether PNS can be used.
This decreases distortions and makes the current PNS implementation
mainly focused on replacing any low-power non-zero bands as well
as adding any zeroed bands back.
The current encoder's performance is good enough (especially with IS)
that PNS isn't really required except to fill in the occasional few
bands and to extend any zeroed high frequencies, so this combination,
which is already enabled by default, works to get as much quality as
it can within the bits allowed.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since PNS generates coefficients, it doesn't make sense to send the
predicted ones as well. Also, the specifications explicitly state to
disable the right channel IS predictors.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It's better to trust that the coefficients generated will be
closer than the coefficients derived, and the new PNS implementation
makes sure that this happens.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The specifications explicitly state to use roundf() which
also rounds half-integer values away from zero.
This does fix a few IS artifacts.
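For comparison (a small standalone demonstration, not code from this
patch): roundf() rounds halves away from zero, while the default
rounding mode used by nearbyintf() rounds them to even.

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        printf("%.1f %.1f\n", roundf(2.5f), roundf(-2.5f));         /* 3.0 -3.0 */
        printf("%.1f %.1f\n", nearbyintf(2.5f), nearbyintf(-2.5f)); /* 2.0 -2.0 */
        return 0;
    }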
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
* commit '4e649debcf7f71d35c6b38cdb7ee715eba95d64a':
Postpone API-incompatible changes until the next bump
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '069713aa4b137781e270768d803b1f7456daa724':
lavc: Drop deprecated thread opaque and codec pkt
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '9f90b24877016e7140b9b14e4b1acee663bb6d8a':
lavc: Drop deprecated get_buffer related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '01bcc2d5c23fa757d163530abb396fd02f1be7c8':
lavc: Drop deprecated destruct_packet related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit 'dc70c19476e76f1118df73b5d97cc76f0e5f6f6c':
lavc: Drop deprecated request_channels related functions
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
This commit improves the TNS implementation to the point where it's
actually usable and very rarely results in nastiness (at all bitrates
except extremely low ones it increases quality and prevents some
distortions from the coder from being audible).
It also adds double filter support, which is only used if the energy
difference between the top and bottom of the SFBs is above the
thresholds defined in the header file. Looking at the bitstream that
fdk_aac generates, it sometimes uses a double filter despite the specs
stating that a single filter should be enough for almost all cases and
purposes.
Unlike FAAC or fdk_aac, we sometimes use a reverse filter when the
energy difference isn't enough to use a double filter. This actually
works better.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a flag to use the pure coefficients instead
of the processed ones (sce->coeffs). This is needed because
IS will apply its changes to the coefficients immediately before the
adjust_common_prediction function, and it doesn't make sense to
measure the stereo channel coefficient difference when one channel's
coefficients are all zero. Therefore, add a flag to use the pure
coefficients in that case.
TNS is the only thing touching the coefficients before IS, so common
window prediction will not take that into account, but the effect of
the TNS filter per coefficient can be small (a few percent), so to
some approximation it's fine to just ignore that.
Also fixed a small error which doesn't alter the results
that much. pow(sqrt(number), 3.0/4.0) == pow(number, 3.0/8.0) !=
pow(number, 3.0/4.0).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
32bit is not sufficient for all cases
Fixes: signal_sigabrt_7ffff6ac8cc9_686_cov_1897408623_microsoft_new_way_to_shove_mpeg2_in_asf.dvr_ms
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This commit reorders the coding tools such that they're doing what
the decoder does in reverse order. The very first thing the decoder
does is to decode M/S stereo if that's signalled, then prediction,
IS, and finally TNS and PNS in another function.
adjust_frame_information()'s application of IS and M/S was taken out
into two separate functions, since prediction doesn't expect to get
the raw coefficients but rather the coefficients at that part of the
encoding process.
The results show a much better PSNR when any combination of
Intensity Stereo, Mid/Side stereo and Prediction is used, which
is a sign of an increased encoder efficiency as well as the fact
that the decoder gets what it expects.
Otherwise, with only IS, PNS or prediction there are neither
regressions nor improvements except in the case of IS, which
now by itself (or with PNS) is less prone to artifacts. Enabling
M/S (using stereo_mode) as well will also reduce stereo artifacts
induced by IS, so in the very near future M/S may be enabled
by default.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
If the selected coder isn't twoloop, this commit temporarily
disables IS and PNS.
The problem is that encode_window_bands_info() gets confused and sets
invalid band_types for non-marked (normal) bands.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the changes made a few weeks ago (which were done more than a
month ago), the quality and stability of intensity stereo have been
notably good. There were some requests and wishes to have it on by
default, and it has therefore been enabled. Should any regressions
arise, changes will be made to preferably keep it operating rather
than just disabling it by default again.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It has been in the current encoder in its current implementation
for quite some time now, so enable it by default. Will increase
quality at all bitrates, especially at low ones.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Purely a cosmetic change, most of the zeroing of encoder resources
should happen at the top of the main loop.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit reworks the TNS implementation to a hybrid between what
the specifications say, what the decoder does and what's the best
thing to do.
The filter application function was copied from the decoder and
modified such that it applies the inverse AR filter to the
coefficients. The LPC coefficients themselves are fed into the
same quantization expression that the specifications say should
be used; however, no further processing is done. Instead, they're
converted to the form that the decoder expects them to be in and are
sent off to the compute_lpc_coeffs function exactly the way the
decoder does it. This function does all conversions and will return
the exact coefficients that the decoder will generate, which are then
applied to the coefficients.
Having the exact same coefficients on both the encoder and decoder is
a must, since otherwise the entire sfbs over which the filter is
applied will be attenuated.
Despite this major rework, TNS might not work fine on some audio
types at very low bitrates (e.g. sub 90kbps) as it can attenuate
some coefficients too much. Users are advised to experiment with
TNS at higher bitrates if they wish to use this tool or simply
wait for the implementation to be improved.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It turns out that autocorrelating more than 750 coefficients at once
will cause a segfault, despite there being enough space in the buffer
to hold an entire frame of samples.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit adds a function to get the reflection coefficients on
floating point samples. It's functionally identical to
ff_lpc_calc_ref_coefs() except it works on float samples and will
return the global prediction gain. The more optimized Welch window
implementation works only on int32_t samples, so a slower generic
expression was used.
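The generic expression amounts to a plain Welch window applied to the
float samples, something like this sketch (not the literal lpc.c
code; n > 1 assumed):

    static void welch_window_float(const float *in, float *out, int n)
    {
        double c = (n - 1) / 2.0;
        for (int i = 0; i < n; i++) {
            double w = (i - c) / c;
            out[i] = in[i] * (1.0 - w * w);   /* Welch: 1 - ((i - c)/c)^2 */
        }
    }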
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Not needed anymore, it was only used by the AAC TNS
encoder and was replaced with a more suitable function
in the following commit.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Needed for following commits. Contains the starting sfb for
every samplerate and window type.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Fixes out of array access
Fixes: 87196d8bbc633629fc9dd851fce73e70/asan_heap-oob_26f6853_862_cov_585961513_sonic3dblast_intro-partial.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Despite '417792' being reported in the binary decoder, the buffer at
encoding time needs to be bigger to avoid running out of space due to
interlace handling.
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
This was copied from the decoder, but is unneeded for the encoder.
tns_max_bands is unused and set to zero, which zeroed out start, end
and size, and thus no filter was actually applied.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Since the coefficients are stepped up to order + 1 it was possible
that it went over TNS_MAX_ORDER. Also just return in case the only
coefficient is less than the threshold.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The encoder-side filter isn't that important. The PSNR
shouldn't change so the FATE test should still be fine.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The order should never go above TNS_MAX_ORDER (and thus cause
the context to be reinitialized) but this is just in case.
Also fix a comparison, since the coefficients are zero-indexed.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
It also made no sense to actually make the filter span the entire
window including the first band of the next window.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Pulses are already on the way, so expect to see the list gone in the
near future.
TNS is already of sufficiently high quality to be enabled
by default (but isn't yet, so you too can help by testing!).
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit abandons the way the specifications state to
quantize the coefficients, makes use of the new LPC float
functions and is much better.
The original way of converting non-normalized float samples to the
int32_t which our LPC system expects was wrong, and it was wrong to
assume that the coefficients generated from them were valid. It was
essentially a full garbage-in, garbage-out system, and it definitely
shows when looking at spectrals and listening. The high frequencies
were heavily overattenuated.
The new LPC function performs the analysis directly.
The specifications say to quantize the coefficients into four-bit
index values using an asin() function, which of course required ugly
ternary operators because the function turns negative when the
coefficients are negative, which when encoding caused an invalid
bitstream to be generated.
This commit deviates from that by using the TNS tables directly, which
are fairly small since index values have at most 4 bits. The LPC
values are quantized directly against the tables and are then used to
perform filtering after the requantization, which simply fetches the
array values.
The end result is that TNS works much better now and doesn't attenuate
anything but the actual signal, i.e. TNS removes quantization errors
and does its job correctly now.
It might be enabled by default soon since it doesn't hurt and helps
reduce nastiness at low bitrates.
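"Quantized directly against the tables" means simply picking the
nearest table entry, roughly as in this sketch (the table argument
stands for the TNS tables in aactab.h; this is not the literal encoder
code):

    #include <math.h>

    static int quantize_tns_coef(float coef, const float *table, int table_size)
    {
        int   best      = 0;
        float best_dist = fabsf(coef - table[0]);
        for (int i = 1; i < table_size; i++) {
            float dist = fabsf(coef - table[i]);
            if (dist < best_dist) {
                best_dist = dist;
                best      = i;
            }
        }
        return best;   /* the index is transmitted; filtering uses table[best] */
    }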
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit removes the array which was made redundant with
the last commit. The current prediction system gets the
quantization error directly (and without the single-frame delay)
in the search_for_pred function.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit completely alters the algorithm of prediction.
The original commit which introduced prediction was completely
incorrect to even remotely care about what the actual coefficients
contain or whether any options were enabled. Not my actual fault.
This commit treats prediction the way the decoder does and expects
to do: like lossy encryption. Everything related to prediction now
happens at the very end but just before quantization and encoding
of coefficients. On the decoder side, prediction happens before
anything has had a chance to even access the coefficients.
Also, the original implementation had problems because it actually
touched the band_type of special bands which already had their
scalefactor indices marked, and it's a wonder the assertion wasn't
triggered when transmitting those.
Overall, this now drastically increases audio quality and you should
think about enabling it if you don't plan on playing anything encoded
on really old low power ultra-embedded devices since they might not
support decoding of prediction or AAC-Main. Though the specifications
were written ages ago and as times change so do the FLOPS.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This was missed when the original commits were done. FF_PROFILE_UNKNOWN
is what's in avctx->profile when no audio profile is specified.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit simply duplicates the functionality of ff_lpc_calc_coefs()
for the case of a Levinson-Durbin LPC with the only difference being
that floating point samples are accepted and the resulting coefficients
are raw and unquantized.
The motivation behind this is that the AAC encoder requires LPC for
TNS and LTP, and converting non-normalized floating point coefficients
to int32_t using SWR and back again for the LPC coefficients was very
impractical.
The current LPC interfaces were designed with int32_t in mind,
possibly because FLAC and ALAC use this type for most internal
operations.
The mathematics in the float case remains of course identical.
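The float path boils down to a plain autocorrelation feeding the usual
Levinson-Durbin recursion; as a sketch (not the literal lpc.c code):

    static void autocorrelate_float(const float *samples, int len, int lag,
                                    double *autoc)
    {
        for (int j = 0; j <= lag; j++) {
            double sum = 0.0;
            for (int i = j; i < len; i++)
                sum += (double)samples[i] * samples[i - j];
            autoc[j] = sum;
        }
    }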
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
This commit simply moves the TNS tables to a more appropriate
aactab.h since then they can be accessed by both the decoder
and encoder.
The encoder _shouldn't_ normally need the tables, since the specs
describe a specific quantization process, but the exact reason for
this can be seen in the following TNS commit.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
At least for vdpau, the hwaccel init code tries to check the video
profile and ensure that there is a matching vdpau profile available.
If it can't find a match, it will fail to initialise.
In the case of wmv3/vc1, I observed initialisation to fail all the
time. It turns out that this is due to the hwaccel being initialised
very early in the codec init, before the profile has been extracted
and set.
Conceptually, it's a simple fix to reorder the init code, but it gets
messy really fast because ff_get_format(), which is what implicitly
triggers hwaccel init, is called multiple times through various shared
init calls from h263, etc. It's incredibly hard to prove to my own
satisfaction that it's safe to move the vc1 specific init code
ahead of this generic code, but all the vc1 fate tests pass, and I've
visually inspected a couple of samples and things seem correct.
Signed-off-by: Philip Langdale <philipl@overt.org>
The amv one probably looks suspicious, but since it's an intra-only
codec, I couldn't possibly imagine what it would use the edge for,
and the vsynth fate result doesn't change, so it's probably OK.
The current algorithm is just "try all the combinations, and pick the best".
It's not very fast either, probably due to a lot of copying, but will do for
an initial implementation.
Signed-off-by: Donny Yang <work@kota.moe>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Assert on `avctx->codec->encode2` to avoid a SEGFAULT on the subsequent
function call.
avcodec_encode_video2() uses a similar assertion.
Calling the wrong function on a stream is a serious inconsistency
which could, at other places, be potentially dangerous and exploitable;
it is thus safer to stop execution rather than continue with such an
inconsistency after returning an error.
Commit-message-extended-by: committer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>