AVCodecContext.bits_per_raw_sample is updated from the previous thread
in the generic update function before the codec specific update_thread
function is called. The check for reinitialization of dsp functions uses
bits_per_raw_sample. When called from update_thread_context it will be
already at the current value and the dsp functions aren't updated if
only the bit depth changes.
* commit 'c3ebfcd6e1327ca7bbcaee822e593c2da6cfd352':
mpegvideo: allocate hwaccel privdata after the frame buffer
h264: allocate hwaccel privdata after the frame buffer
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This ensures the hwaccel privdata does not leak when a frame buffer could
not be allocated (and toggle the assert when the frame is re-used).
Having no frame buffer available is quite common when using the DXVA2
hwaccel in situations where the DXVA2 renderer is being re-allocated, for
example when moving between displays.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
dct_bits is never set except in h264, where it is never used, thus
remove it. Then, remove all functions that were set based on non-zero
(32) values for dct_bits. Lastly, merge 9-14 bpp functions for get_pixels
and draw_edge, which only care about pixel storage unit size, not actual
bits used (i.e. they don't clip).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a4472ac01e86f9fae5adb9034f2777b86a9c5480':
Add informative messages to av_log_ask_for_sample calls lacking them
anm: Get rid of some very silly goto statements
Conflicts:
libavformat/anm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '555000c7d5c1e13043a948ebc48d2939b0ba6536':
h264: check that DPB is allocated before accessing it in flush_dpb()
vf_hqdn3d: fix uninitialized variable use
vf_gradfun: fix uninitialized variable use
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '7b89cd20d844cbe763ca34e63e99d110043cf241':
eamad: allocate a dummy reference frame when the real one is missing
Replace remaining includes of audioconvert.h with channel_layout.h
Replace some forgotten instances of PIX_FMT_* with AV_PIX_FMT_*.
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'bcd0a7137e4aca0f6f598593b90ca8f338444c51':
configure: Add missing h264chroma dependencies to vp5, vp6
Add missing error_resilience includes to files that use ER
Conflicts:
configure
libavcodec/mpeg12.c
libavcodec/mpeg4videodec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This makes the decoder independent of mpegvideo.
This copy of the draw_horiz_band code is simplified compared to
the "generic" mpegvideo one which still has a number of special
cases for different codecs.
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '5da51284937649a8ebb84fa951c235438fcbf8ae':
cavs: Add a dependency on h264chroma
lavc: Split out ff_hwaccel_pixfmt_list_420[] over individual codecs
Conflicts:
libavcodec/h263dec.c
libavcodec/h264.c
libavcodec/mpeg12.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'c10da30d8426a1f681d99a780b6e311f7fb4e5c5':
shorten: set invalid channels count to 0
vorbisdec: check memory allocations
h264: check for luma and chroma bit dept being equal
Conflicts:
libavcodec/shorten.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Martin Storsjö <martin@martin.st>
The decoder assumes a single bit depth for all the planes
while the specification allows different bit depths for luma
and chroma.
Avoid the possible problems described in CVE-2013-2277
CC: libav-stable@libav.org
The code is located in mpegvideo, and it's likely that in a minimal
config, we don't want to include debug info anyway.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '668e16a0dd1ff56d4beeff5c658d8a2a08dbfac8':
h264: on reference overflow, reset the reference count to 0, not 1.
Conflicts:
libavcodec/h264.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e671d3ad6cd7fe1d02e9b35b889a25d8c059fce9':
h264: do not copy ref count/ref2frm when updating per-frame context
flvdec: Check the return value of a malloc
Conflicts:
libavformat/flvdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This is a regression introduced from the h264/mpegvideo split
Fixes out of array reads
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Martin Storsjö <martin@martin.st>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Some applications do not like that.
Fixes VDA
Reduces noise for VDPAU
Tested-by: Guillaume POIRIER <poirierg@gmail.com>
Tested-by: Carl Eugen Hoyos <cehoyos@ag.or.at>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Writing into uninitialized hw surfaces is not supported and triggers an assert inside avpriv_color_frame
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Init code in that if statement goes down from 26716 cycles to 26047
cycles, i.e. the removal of the clear_blocks and smaller memcpy()
together save around 670 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '35685a3c2a1ec09f3c62dcfc4368fe9e92bcddf6':
dsputil: Move ff_shrink* function declarations to separate header
dsputil: Move ff_svq3 function declarations to a separate header
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'f81c37e40fe3236d54da12aef9cdba48ba70ec31':
vf_delogo: fix an uninitialized read.
h264: remove obsolete comment.
mpegvideo: remove some unused variables from Picture.
utvideoenc/v410enc: do not set AVFrame.reference.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The existing checks are insufficient to detect a pixel format
changes in case of some damaged streams.
Fixes inconsistency and later out of array accesses
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '96753bd00d6d4046db6818c0aadc21bf2a11d77b':
dsputil: x86: Correct the number of registers used in put_no_rnd_pixels16_l2
dsputil: add missing HAVE_YASM guard
hwaccel: do not offer unsupported pixel formats
vdpau: add missing pixel format for H.264
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The sh4 optimizations are removed, because the code is
100% identical to the C code, so it is unlikely to
provide any real practical benefit.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Without any correctly decoded slices, there can be no frame.
Fixes out of array reads
Found-by: Rafaël Carré
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
ref_list is constructed from other fields per slice when needed, so do
not copy it for both frame and slice threading.
default_ref_list is constructed per frame and still needs to be copied
to per-slice contexts for slice threading, but a copy is not needed for
frame threading.