Runtime of all IDCTs together goes from 3327 to 2473 cycles (intra, i.e.
~35% faster) or from 2312 to 1448 cycles (inter, i.e. ~60% faster). Total
decode time of ped1080p.webm goes from 8.086sec to 7.974sec (1.4% faster).
Sub-IDCTs will follow later. ped1080.webm goes from 9.295s to 8.191s
(13.5% faster). The IDCT itself goes from 4372 (intra) or 4337 (inter)
to 403 (intra) or 329 (inter) cycles for the DC-only form, 23755 (intra)
or 23723 (inter) to 3497 (intra) or 3607 (inter) cycles for the no-DC
form, which averages from 23393 (intra) or 16612 (inter) to 3449 (intra)
or 2392 (inter) for all 32x32s together, i.e. about ~7x faster (all
tests done on ped1080p.webm).
The function macro always sets .align 2 before declaring the
function label (since 5c5e1ea3) and always sets the section to
.text (since 278caa6a).
The .align 5 before certain functions, added in fc252eba, were added
before .text and .align were added to the function macro and thus
became useless/unused when the function macro got them.
This restores the original intention, to align the loop entry
points.
Signed-off-by: Martin Storsjö <martin@martin.st>
This file no longer uses the pld instruction at all, all such uses
have been split into hpeldsp_arm.S.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes use of uninitialized memory
Partly fixes; msan_uninit-mem_7fb7d24780d0_2744_R03T.CAK
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: msan_uninit-mem_7f6e13c220d0_8489_short.flac
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a03a642d5ceb5f2f7c6ebbf56ff365dfbcdb65eb':
h264: do not use 422 functions for monochrome
See: 07abf13da4a7c3d23ce6bc6542d72e6252161736
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9eef9eb3014b2ed9c3ff4aac510a9f04edb555cf':
h264: check that execute_decode_slices() is not called too many times
Conflicts:
libavcodec/h264.c
The check is replaced by an assert() as the mb index should not ever go out
of bounds.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9a026c72982faf20e1c8dfbe48f0b312cdea69c8':
h264: rebuild the default ref list if the reference count changes
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '50079a6aa93291e6dc9d9fb8d33da83f79e9311d':
lavc: do not leak the internal frame if opening the codec fails
Conflicts:
libavcodec/utils.c
See: 8b285f03f70e884312c6c4e00a1377cfd85a3a7a
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8058284ce09030b47512746d726fb2ad3ae8a20f':
lavc: add 422/444 YUV with alpha to align_dimensions()
Conflicts:
libavcodec/utils.c
Only cosmetical changes happen as these formats already where in the list before
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2f97094608cfd2665660f7a26a3291559b186752':
lagarith: do not call simd functions on unaligned lines
Conflicts:
libavcodec/lagarith.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
There is no point in delaying the check and it avoids bugs with a
half-initialized context.
Fixes invalid reads.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
They end up overwriting past the line end.
Partially based on a patch by Michael Niedermayer <michaelni@gmx.at>
Bug-Id: vlc/9700
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The decoder currently sets CODEC_FLAG_EMU_EDGE and relies on
get_buffer2() to always provide buffers with linesize == 2 * width.
This is wrong, since we place no such restriction on get_buffer2()
implementations.
Fix this by decoding into internal buffers and copying them to output
frames. Since this is a very obscure decoder, the performance hit should
not be an issue.
The decoder currently sets CODEC_FLAG_EMU_EDGE and relies on
get_buffer2() to always provide buffers with linesize == 2 * width.
This is wrong, since we place no such restriction on get_buffer2()
implementations.
Fix this by decoding into internal buffers and copying them to output
frames. Since this is a very obscure decoder, the performance hit should
not be an issue.
Fixes qp fields becoming out of range
Fixes: asan_static-oob_e393a3_6998_WPP_A_ericsson_MAIN10_2.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This matches how its done for SPS/PPS.
An alternative to this is to check it when its used.
Fixes null pointer dereference
Fixes: signal_sigsegv_e30a43_1437_CIP_A_Panasonic_3.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7840c40445c9f52aeccba96de3d27613398bfbf2':
(e)ac3dec: set AV_FRAME_DATA_MATRIXENCODING side data.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '13345fc1f86fc3615789e196d5a339c1c27c9068':
(e)ac3: parse and store the Dolby Surround, Surround EX and Headphone mode flags.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e92123093dfdca0ef6608998240e2f9345d63bff':
mlpdec: set AV_FRAME_DATA_MATRIXENCODING side data.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4b7f1a7ced0e98f2cc698d896f7ebab8d30eaa09':
mlp: Parse TrueHD decoder channel modifiers and set the AVMatrixEncoding for each substream.
Conflicts:
libavcodec/mlp_parser.h
libavcodec/mlpdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>