Avoid searching for the lowest bulk cost for each pixel that isn't a repeat/skip. Instead store the lowest cost as we go along each pixel, and use it as needed.
Signed-off-by: Malcolm Bechard <malcolm.bechard@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also read data size for raw compressions too and
make sure its value is sane.
Remove code that fills missing blocks with zeroes.
It is marginally useful and make implementation
of actually useful features harder.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
This is a regression introduced from the h264/mpegvideo split
Fixes out of array reads
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This fixes crashes in chromium on win64 on machines with AVX
(crashes that apparently aren't triggered by fate).
Signed-off-by: Martin Storsjö <martin@martin.st>
* commit '8a11ce43d08352f7a290355ebb5b29c495ad9609':
build: Ensure that output directories for header objects are created
h264: Get rid of unnecessary casts
Conflicts:
common.mak
Merged-by: Michael Niedermayer <michaelni@gmx.at>
change the treatment of the strip y coordinates which previously did
not follow the description (nor did it behave like the binary decoder
on files with absolute strip offsets).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The new code is also faster and more robust.
As for the performance:
old decoder + conversion to rgb: fps = 2618
old decoder, without converting to rgb: fps = 4012
new decoder, producing rgb: fps = 4502
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This gets rid of a number of warnings about casts discarding
qualifiers from the pointer target, present since 7ebfb466a.
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Martin Storsjö <martin@martin.st>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows more transparent mixing of get_bits and whole-byte access
without having to touch get_bits internals.
Signed-off-by: Martin Storsjö <martin@martin.st>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-alpha and alpha-Y planes are cleared in the idct_put/add()
calls. For the alpha U/V planes, we only care about the DC for entropy
context prediction purposes, the rest of the data is unused.
Signed-off-by: Martin Storsjö <martin@martin.st>
The non-intra-pcm branch in hl_decode_mb (simple, 8bpp) goes from 700
to 672 cycles, and the complete loop of decode_mb_cabac and hl_decode_mb
(in the decode_slice loop) goes from 1759 to 1733 cycles on the clip
tested (cathedral), i.e. almost 30 cycles per mb faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The residual block data of 16x16 blocks was ignored for b-frames, which
leads to easy-to-identify artifacts. After this patch, the artifacts are
gone. Sample video: svq3_watermark.mov. (Fate results unaffected.)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Reference:
commit 3615e2be84
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 22:02:57 2003 +0000
h263_h_loop_filter_mmx
Originally committed as revision 2553 to svn://svn.ffmpeg.org/ffmpeg/trunk
commit 359f98ded9
Author: Michael Niedermayer <michaelni@gmx.at>
Date: Tue Dec 2 20:28:10 2003 +0000
h263_v_loop_filter_mmx
Originally committed as revision 2552 to svn://svn.ffmpeg.org/ffmpeg/trunk
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
x86: dsputil: Fix h263 loop filter link error in some configurations
Conflicts:
libavcodec/x86/dsputil.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Not all hwaccels implement all codecs, so using one single list for
multiple such codecs means some codecs will be represented in the list,
even though they don't actually handle that codec. Copying specific
lists in each codec fixes that.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This was caused by unconditionally referencing a conditionally compiled
table. Now the code is also compiled conditionally.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This avoids SIMD-optimized functions having to sign-extend their
line size argument manually to be able to do pointer arithmetic.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
The symbol "ff_h263_loop_filter_strength" is defined in h263.c, but
the h263 loopfilter functions (in the .asm file) are not optimized
out (even though their function pointers are never assigned).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Instead, only extend edges on-demand when the motion vector actually
crosses the visible decoded area using ff_emulated_edge_mc(). This
changes decoding time for cathedral from 8.722sec to 8.706sec, i.e.
0.2% faster overall. More generally (VP8 uses this also), low-motion
content gets significant speed improvements, whereas high-motion content
tends to decode in approximately the same time.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8837f4396a1a458a0efb07fe7daba7b847755a7a':
libopencore-amrwb: Make AMR-WB ifdeffery more precise
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '56632fef65c0cb6946ed3648ded3d7b82e5c5c17':
libopencore-amrnb: cosmetics: Group all encoder-related code together
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Instead, keep them in the bitstream buffer until we read them verbatim,
this saves a memcpy() and a subsequent clearing of the target buffer.
decode_cabac+decode_mb for a sample file (CAPM3_Sony_D.jsv) goes from
6121.4 to 6095.5 cycles, i.e. 26 cycles faster.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Some applications do not like that.
Fixes VDA
Reduces noise for VDPAU
Tested-by: Guillaume POIRIER <poirierg@gmail.com>
Tested-by: Carl Eugen Hoyos <cehoyos@ag.or.at>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The library might provide an encoder in the future, so it's better to
check for the presence of the decoder rather than just the library.
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The user can provide a password even when the stream
is not encrypted, so check the value of s->format
instead of s->pass in ttafilter_init().
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Writing into uninitialized hw surfaces is not supported and triggers an assert inside avpriv_color_frame
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd2a25c4032ce6ceabb0f51b5c1e6ca865395a793':
get_buffer(): do not initialize the data.
Conflicts:
libavcodec/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Most of the changes are just trivial are just trivial replacements of
fields from MpegEncContext with equivalent fields in H264Context.
Everything in h264* other than h264.c are those trivial changes.
The nontrivial parts are:
1) extracting a simplified version of the frame management code from
mpegvideo.c. We don't need last/next_picture anymore, since h264 uses
its own more complex system already and those were set only to appease
the mpegvideo parts.
2) some tables that need to be allocated/freed in appropriate places.
3) hwaccels -- mostly trivial replacements.
for dxva, the draw_horiz_band() call is moved from
ff_dxva2_common_end_frame() to per-codec end_frame() callbacks,
because it's now different for h264 and MpegEncContext-based
decoders.
4) svq3 -- it does not use h264 complex reference system, so I just
added some very simplistic frame management instead and dropped the
use of ff_h264_frame_start(). Because of this I also had to move some
initialization code to svq3.
Additional fixes for chroma format and bit depth changes by
Janne Grunau <janne-libav@jannau.net>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
* qatar/master:
doc/platform: Fix 10l typo
dsputil: Move STRIDE_ALIGN macro to the only place it is used
Conflicts:
libavcodec/dsputil.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Matroska specification lists support for BlockAdditional element
which is not supported by ffmpeg's matroska parser. This patch
adds grammar definitions for parsing that element (and few other
related elements) and then puts the data in AVPacket.side_data
with new AVPacketSideDataType AV_PKT_DATA_MATROSKA_BLOCKADDITIONAL.
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Init code in that if statement goes down from 26716 cycles to 26047
cycles, i.e. the removal of the clear_blocks and smaller memcpy()
together save around 670 cycles.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '1647da89dd8ac09a55c111589f7a30d7e6b87d90':
lavr: make sure that the mix function is reset even if no mixing will be done
lavr: print out the mix matrix in ff_audio_mix_set_matrix()
ws-snd1: decode directly to the user-provided AVFrame
wmavoice: decode directly to the user-provided AVFrame
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '205a95f7b5178362874bc1e65eae9866723491c1':
wmaenc: alloc/free coded_frame instead of keeping it in the WMACodecContext
wma: decode directly to the user-provided AVFrame
wmapro: decode directly to the user-provided AVFrame
wavpack: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/wavpack.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ee6ca11b657515ad736ec0d2b8635e098d0a2680':
vorbis: decode directly to the user-provided AVFrame
vmdaudio: decode directly to the user-provided AVFrame
twinvq: decode directly to the user-provided AVFrame
tta: decode directly to the user-provided AVFrame
truespeech: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/tta.c
libavcodec/twinvq.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4a2b26fc1b1ad123eba473a20e270f2b0ba92bca':
tak: decode directly to the user-provided AVFrame
smackaud: decode directly to the user-provided AVFrame
sipr: decode directly to the user-provided AVFrame
shorten: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/shorten.c
libavcodec/takdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5d5c248c3df30fa91a8dde639618c985b9a11c53':
s302m: decode directly to the user-provided AVFrame
ra288: decode directly to the user-provided AVFrame
ra144: decode directly to the user-provided AVFrame
ralf: decode directly to the user-provided AVFrame
qdm2: decode directly to the user-provided AVFrame
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '1b9b6d6e5ea556b6d307f9d473f54f6406fdc3c8':
qcelp: decode directly to the user-provided AVFrame
pcm-bluray: decode directly to the user-provided AVFrame
nellymoser: decode directly to the user-provided AVFrame
mpc7/8: decode directly to the user-provided AVFrame
mpegaudio: decode directly to the user-provided AVFrame
mlp/truehd: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/mpc7.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '86bfcfcf2364bc837b7bb582c66a8a15a332414f':
mace: decode directly to the user-provided AVFrame
libspeex: decode directly to the user-provided AVFrame
libopus: decode directly to the user-provided AVFrame
libopencore-amr: decode directly to the user-provided AVFrame
libgsm: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/libopusdec.c
libavcodec/mace.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a8ea936a0a00570f61a16a588821b52f6a3115c2':
libilbc: decode directly to the user-provided AVFrame
dpcm: decode directly to the user-provided AVFrame
imc/iac: decode directly to the user-provided AVFrame
gsm: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/dpcm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'cb7b47a61dba0b9329ecede5dd3211dc0662dc05':
g726: decode directly to the user-provided AVFrame
g723.1: decode directly to the user-provided AVFrame
g722: decode directly to the user-provided AVFrame
flac: decode directly to the user-provided AVFrame
cinaudio: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/g723_1.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '182821cff43f5f977004d105b86c47ceb20d00d6':
dca: decode directly to the user-provided AVFrame
cook: decode directly to the user-provided AVFrame
comfortnoise: decode directly to the user-provided AVFrame
bmvaudio: decode directly to the user-provided AVFrame
pcm: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/pcm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5cc0bd2cb47cbb1040f2bb0ded8d72a442c79b20':
binkaudio: decode directly to the user-provided AVFrame
atrac3: decode directly to the user-provided AVFrame
atrac1: decode directly to the user-provided AVFrame
ape: decode directly to the user-provided AVFrame
amrwb: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/amrwbdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e3db34291f4401a16f6ac92721617a9f33cd4c31':
amrnb: decode directly to the user-provided AVFrame
als: decode directly to the user-provided AVFrame
alac: decode directly to the user-provided AVFrame
adxenc: alloc/free coded_frame instead of keeping it in the ADXContext
adx: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/alsdec.c
libavcodec/amrnbdec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e57daa876bf0cf50782550e366e589441cd8c2bd':
adpcm: decode directly to the user-provided AVFrame
ac3: decode directly to the user-provided AVFrame
aac: decode directly to the user-provided AVFrame
8svx: decode directly to the user-provided AVFrame
Conflicts:
libavcodec/8svx.c
libavcodec/ac3dec.c
libavcodec/adpcm.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '304b806cb524fb040f8e09a241040f1af2cb820b':
build: Make library minor version visible in the Makefile
x86: mpeg4qpel: Make movsxifnidn do the right thing
Merged-by: Michael Niedermayer <michaelni@gmx.at>
These functions are mostly H264-specific (the only other user I can
spot is bink), and this allows us to special-case some functionality
for H264. Also remove the 16-bit-coeff with >8bpp versions (unused)
and merge the duplicate 32-bit-coeff for >8bpp (identical).
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes integer overflow and out of array accesses
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also use the resulting 16bpp functions for anything >8 and <=16, not just
9 and 10. This fixes 12 and 14bpp H264 support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This allows more transparent mixing of get_bits and whole-byte access
without having to touch get_bits internals.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '36fab50e90d15352e403e4cc210890810f2fb4e2':
asfdec: silence a warning
mss4, ra288: Remove unused DSPContext local codec context members
Merged-by: Michael Niedermayer <michaelni@gmx.at>
get_pix_fmt_score() returns a score representing the amount
of loss when converting a pixel format
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>