* commit 'cab8c5f8e140c96ba3725ab709d823abfd1e31a5':
h264: do not reinitialize the global cabac tables at each slice header
See: 1e2e2c8095
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '71cabb521ac397db3903011d2de7afd3e0fc7ab6':
h264: do not discard NAL_SEI when skipping frames
Conflicts:
libavcodec/h264.c
See: 7d75fb381b
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c4e43560fe6677e9d60bfb3cffc41c7324e92a0b':
h264data: Move some tables to the only place they are used
Conflicts:
libavcodec/h264data.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This reverts commit bf36dc50ea, reversing
changes made to b7fc2693c7.
Conflicts:
libavcodec/h264.c
Keeping support for the old VDPAU API has been requested by our VDPAU maintainer
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This matches the matroska defintion of stereo_mode, with
no metadata written if no info exist in sei
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes inconsistency that leads to out of array accesses with threads
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This fixes returning pictures with corrupted data pointers.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Just like get_buffer, get_format should not be called from a different
thread if thread_safe_callbacks is not set.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* commit '85deb51a01f1ecc5ac5faa52ad8ea141c384e23a':
h264: Only initialize dsputil if error resilience is enabled
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e8cafd2773bc56455c8816593cbd9368f2d69a80':
h264: Clear the mb members via memset instead of using dsputil
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
It is only used for error resilience. This allows building the
h264 decoder without dsputil, if error resilience is disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
The EC code does not support fields currently thus it makes no
sense to wait for these cases (which also the check doesnt handle
correctly)
Fixes Ticket 2454
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes inconsistency ultimately leading to an out of array read
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
hwaccel: fix use with frame based multithreading
Conflicts:
libavcodec/h263dec.c
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Allows use of AVHWAccel based decoders with frame based multithreading.
The decoders will be forced into an non-concurrent mode by delaying
ff_thread_finish_setup() calls after decoding of the current frame
is finished.
This wastes memory by unnecessarily using multiple threads and thus
copies of the decoder context but allows seamless switching between
hardware accelerated and frame threaded software decoding when the
hardware decoder does not support the stream.
* qatar/master:
h264: Make it possible to compile without error_resilience
Conflicts:
configure
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Error resilience is enabled by the h264 decoder, unless explicitly
disabled. --disable-everything --enable-decoder=h264 will produce
a h264 decoder with error resilience enabled, while
--disable-everything --enable-decoder=h264 --disable-error-resilience
will produce a h264 decoder with error resilience disabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '23e85be58fc64b2e804e68b0034a08a6d257e523':
h264: add a parameter to the CHROMA444 macro.
h264: add a parameter to the CHROMA422 macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cavlc.c
libavcodec/h264_loopfilter.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '6d2b6f21eb45ffbda1103c772060303648714832':
h264: add a parameter to the CABAC macro.
h264: add a parameter to the FIELD_OR_MBAFF_PICTURE macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7fa00653a550c0d24b3951c0f9fed6350ecf5ce4':
h264: add a parameter to the FIELD_PICTURE macro.
h264: add a parameter to the FRAME_MBAFF macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_loopfilter.c
libavcodec/h264_refs.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'da6be8fcec16a94d8084bda8bb8a0a411a96bcf7':
h264: add a parameter to the MB_FIELD macro.
h264: add a parameter to the MB_MBAFF macro.
Conflicts:
libavcodec/h264.c
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
AVCodecContext.bits_per_raw_sample is updated from the previous thread
in the generic update function before the codec specific update_thread
function is called. The check for reinitialization of dsp functions uses
bits_per_raw_sample. When called from update_thread_context it will be
already at the current value and the dsp functions aren't updated if
only the bit depth changes.
* commit 'c3ebfcd6e1327ca7bbcaee822e593c2da6cfd352':
mpegvideo: allocate hwaccel privdata after the frame buffer
h264: allocate hwaccel privdata after the frame buffer
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This ensures the hwaccel privdata does not leak when a frame buffer could
not be allocated (and toggle the assert when the frame is re-used).
Having no frame buffer available is quite common when using the DXVA2
hwaccel in situations where the DXVA2 renderer is being re-allocated, for
example when moving between displays.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
dct_bits is never set except in h264, where it is never used, thus
remove it. Then, remove all functions that were set based on non-zero
(32) values for dct_bits. Lastly, merge 9-14 bpp functions for get_pixels
and draw_edge, which only care about pixel storage unit size, not actual
bits used (i.e. they don't clip).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a4472ac01e86f9fae5adb9034f2777b86a9c5480':
Add informative messages to av_log_ask_for_sample calls lacking them
anm: Get rid of some very silly goto statements
Conflicts:
libavformat/anm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '555000c7d5c1e13043a948ebc48d2939b0ba6536':
h264: check that DPB is allocated before accessing it in flush_dpb()
vf_hqdn3d: fix uninitialized variable use
vf_gradfun: fix uninitialized variable use
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7b89cd20d844cbe763ca34e63e99d110043cf241':
eamad: allocate a dummy reference frame when the real one is missing
Replace remaining includes of audioconvert.h with channel_layout.h
Replace some forgotten instances of PIX_FMT_* with AV_PIX_FMT_*.
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'bcd0a7137e4aca0f6f598593b90ca8f338444c51':
configure: Add missing h264chroma dependencies to vp5, vp6
Add missing error_resilience includes to files that use ER
Conflicts:
configure
libavcodec/mpeg12.c
libavcodec/mpeg4videodec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This makes the decoder independent of mpegvideo.
This copy of the draw_horiz_band code is simplified compared to
the "generic" mpegvideo one which still has a number of special
cases for different codecs.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '5da51284937649a8ebb84fa951c235438fcbf8ae':
cavs: Add a dependency on h264chroma
lavc: Split out ff_hwaccel_pixfmt_list_420[] over individual codecs
Conflicts:
libavcodec/h263dec.c
libavcodec/h264.c
libavcodec/mpeg12.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c10da30d8426a1f681d99a780b6e311f7fb4e5c5':
shorten: set invalid channels count to 0
vorbisdec: check memory allocations
h264: check for luma and chroma bit dept being equal
Conflicts:
libavcodec/shorten.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Martin Storsjö <martin@martin.st>
The decoder assumes a single bit depth for all the planes
while the specification allows different bit depths for luma
and chroma.
Avoid the possible problems described in CVE-2013-2277
CC: libav-stable@libav.org
The code is located in mpegvideo, and it's likely that in a minimal
config, we don't want to include debug info anyway.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '668e16a0dd1ff56d4beeff5c658d8a2a08dbfac8':
h264: on reference overflow, reset the reference count to 0, not 1.
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e671d3ad6cd7fe1d02e9b35b889a25d8c059fce9':
h264: do not copy ref count/ref2frm when updating per-frame context
flvdec: Check the return value of a malloc
Conflicts:
libavformat/flvdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This is a regression introduced from the h264/mpegvideo split
Fixes out of array reads
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Martin Storsjö <martin@martin.st>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Some applications do not like that.
Fixes VDA
Reduces noise for VDPAU
Tested-by: Guillaume POIRIER <poirierg@gmail.com>
Tested-by: Carl Eugen Hoyos <cehoyos@ag.or.at>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Writing into uninitialized hw surfaces is not supported and triggers an assert inside avpriv_color_frame
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Init code in that if statement goes down from 26716 cycles to 26047
cycles, i.e. the removal of the clear_blocks and smaller memcpy()
together save around 670 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '35685a3c2a1ec09f3c62dcfc4368fe9e92bcddf6':
dsputil: Move ff_shrink* function declarations to separate header
dsputil: Move ff_svq3 function declarations to a separate header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f81c37e40fe3236d54da12aef9cdba48ba70ec31':
vf_delogo: fix an uninitialized read.
h264: remove obsolete comment.
mpegvideo: remove some unused variables from Picture.
utvideoenc/v410enc: do not set AVFrame.reference.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The existing checks are insufficient to detect a pixel format
changes in case of some damaged streams.
Fixes inconsistency and later out of array accesses
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '96753bd00d6d4046db6818c0aadc21bf2a11d77b':
dsputil: x86: Correct the number of registers used in put_no_rnd_pixels16_l2
dsputil: add missing HAVE_YASM guard
hwaccel: do not offer unsupported pixel formats
vdpau: add missing pixel format for H.264
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Without any correctly decoded slices, there can be no frame.
Fixes out of array reads
Found-by: Rafaël Carré
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
ref_list is constructed from other fields per slice when needed, so do
not copy it for both frame and slice threading.
default_ref_list is constructed per frame and still needs to be copied
to per-slice contexts for slice threading, but a copy is not needed for
frame threading.
Fixes out of array reads
Regression probably since allowing pixel format changes or a related commit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
If the motion vector is at a subpixel position, we need 3 pixels below
the motion vector's wholepel position available, not 2, since the MC
filter is a sixtap filter for the hpel position, and then a bilin filter
for the qpel position.
This patch fixes highly irreproducible (0.1%) fate failures in frame 2
and 4 of h264-conformance-cama2_vtc_b (e.g. first P-frame, first field,
last line of MB x=40,y=2 and second field and last lines of MBs x=39-40,
y=3). These used pre-loopfilter instead of post-loopfilter data because
the await_progress() waited for one line too little in that field, and
the motion vector of these particular MBs happened to align exactly to a
position where that demonstrates the bug.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
If the motion vector is at a subpixel position, we need 3 pixels below
the motion vector's wholepel position available, not 2, since the MC
filter is a sixtap filter for the hpel position, and then a bilin filter
for the qpel position.
This patch fixes highly irreproducible (0.1%) fate failures in frame 2
and 4 of h264-conformance-cama2_vtc_b (e.g. first P-frame, first field,
last line of MB x=40,y=2 and second field and last lines of MBs x=39-40,
y=3). These used pre-loopfilter instead of post-loopfilter data because
the await_progress() waited for one line too little in that field, and
the motion vector of these particular MBs happened to align exactly to a
position where that demonstrates the bug.
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit 'd8c772de53d29afb1bada88afa859fce8489c668':
nutdec: Always return a value from nut_read_timestamp()
configure: Make warnings from -Wreturn-type fatal errors
x86: ABS2: port to cpuflags
vdpau: Remove av_unused attribute from function declaration
h264: fix ff_generate_sliding_window_mmcos() prototype.
Conflicts:
configure
libavformat/nutdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Clobbering these tables will temporarily clobber the template used
as a basis for other threads to start decoding from. If the other
decoding thread updates from the template right at that moment,
subsequent threads will get invalid (or, usually, none at all) mmco
tables. This leads to invalid reference lists and subsequent decode
failures.
Therefore, instead, decode the mmco tables only for the first slice in
a field or frame. For other slices, decode the bits and ensure they
are identical to the mmco tables in the first slice, but don't ever
clobber the context state. This prevents other threads from using a
clobbered/invalid template as starting point for decoding, and thus
fixes decoding in these cases.
This fixes occasional (~1%) failures of h264-conformance-mr1_bt_a with
frame-multithreading enabled.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Clobbering these tables will temporarily clobber the template used
as a basis for other threads to start decoding from. If the other
decoding thread updates from the template right at that moment,
subsequent threads will get invalid (or, usually, none at all) mmco
tables. This leads to invalid reference lists and subsequent decode
failures.
Therefore, instead, decode the mmco tables only for the first slice in
a field or frame. For other slices, decode the bits and ensure they
are identical to the mmco tables in the first slice, but don't ever
clobber the context state. This prevents other threads from using a
clobbered/invalid template as starting point for decoding, and thus
fixes decoding in these cases.
This fixes occasional (~1%) failures of h264-conformance-mr1_bt_a with
frame-multithreading enabled.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Fixes null pointer dereference later, since if this function failed,
a positive return value was returned to the caller.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Martin Storsjö <martin@martin.st>