* commit '688417399c69aadd4c287bdb0dec82ef8799011c':
hevcdsp: split the pred functions by width
Not merged, FFmpeg HEVC DSP has diverged substantially from Libav.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '818bfe7f0a3ff243deb63c4b146de2563f38ffd4':
hevcdsp: split the epel functions by width
Not merged, FFmpeg HEVC DSP has diverged substantially from Libav.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
* commit '1f821750f0b8d0c87cbf88a28ad699b92db5ec88':
hevcdsp: split the qpel functions by width instead of by the subpixel fraction
Not merged, FFmpeg HEVC DSP has diverged substantially from Libav.
Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
Also correct the check to reject log < 7, because UPDATE_CACHE only
guarantees 25 meaningful bits.
This fixes undefined behavior:
runtime error: shift exponent is negative
Testing with START/STOP timers in get_ue_golomb, one for the first
branch (A) and one for the second (B), shows that there is practically no
slowdown, e.g. for the cavs decoder:
With the check in the B branch:
629 decicycles in get_ue_golomb B, 4194260 runs, 44 skips
433 decicycles in get_ue_golomb A,268434102 runs, 1354 skips
Without the check:
624 decicycles in get_ue_golomb B, 4194273 runs, 31 skips
433 decicycles in get_ue_golomb A,268434203 runs, 1253 skips
Since the B branch is executed far less often than the A branch, this
change is negligible, even more so for the h264 decoder, where the ratio
B/A is a lot smaller.
Fixes: mozilla bug 1230239
Fixes: fbeb8b2c7c996e9b91c6b1af319d7ebc/asan_heap-oob_195450f_2743_e8856ece4579ea486670be2b236099a0.bit
Found-by: Tyson Smith
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
PSNR doesn't change as expected. The AAC spec doesn't really say
anything about how exactly to generate noise.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
The show option did not take clipping into account, so the borders on
the clipped side wouldn't show up. Fix it.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This should fix this test failing on kfreebsd, a regression since
6e5dbe7, which decreased the CMP_TARGET by 1.
Reviewed-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
Similar to 33fefdb44.
Fix trac ticket #4921.
Signed-off-by: Nicolas George <george@nsup.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Having a configure option with the same name as a MIPS ISA is confusing,
so better to remove it. This option was being used to add some
optimizations to a specific core (i6400). We will add the optimizations
just when the i6400 core has been detected, in a later patch.
Signed-off-by: Vicente Olivert Riera <Vincent.Riera@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The limit is a conservative guess, the spec does not seem to specify a limit
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Relying on AVPixFmtDescriptor.nb_components is cleaner and faster than
checking data and linesize for every possible plane.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The transpose_4x4H is wrong which cost me much time to find this bug. The orders of r2 and r3 are wrong,
this bug waste me much time while I make aarch64 arm instruction which used the function.
This also removes a #ifdef and special case for the fixed point case
Reviewed-by: Andreas Cadhalpun <andreas.cadhalpun@googlemail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>