* commit '92e598a57a7ce4b8ac9ea56274af39f5fd888311':
prores: Drop DSP infrastructure for prores encoder bits
Conflicts:
libavcodec/Makefile
libavcodec/proresdsp.c
libavcodec/proresenc_kostya.c
Note, these changes only affect one of the 2 prores encoders we have
If someone wants to add optimizations to the affected encoder, or needs/wants
this infrastructure, then iam happy to revert this
Merged-by: Michael Niedermayer <michaelni@gmx.at>
AAC LOAS can have new audio config objects in the stream itself.
Make sure the decoder reconfigures itself when the first one arrives
midstream.
Bug-Id: 644
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.
Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.
The decode_hf implementations have following timings:
For x86 Arrandale:
C SSE SSE2 SSE4
win32: 260 162 119 104
win64: 242 N/A 89 72
The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
Based on a patch from Christophe Gisquet.
Unrolling of the m == 0 case avoids a possible use of the uninitilized
value sum when s->predictor_history is not set. I failed to find a
sample for it. It also reduced the cycle count from 220 to 150 on
sandy bridge, x86_64 linux, gcc 4.8.2 compared to his patch.
Timings for Arrandale:
C SSE
win32: 2108 334
win64: 1152 322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
The scaling factor is constant so it is faster to scale the
FIR coefficients in the tables during compilation.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
There's an SSE2 version as well, and x86_64 guarantees that
instruction set is present.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also fail if AV_EF_EXPLODE is set.
We do not fail by default, but rather return some image as it may be usefull to the
end user to see what is on the image, for example text could be read quite fine and
objects recognized.
Possibly fixes Ticket3424
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* qatar/master:
Use av_frame_copy() to simplify code where appropriate.
Conflicts:
libavfilter/vf_copy.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8feac29cc46270cc89d6016340e7bac780877131':
lavc: use AVFrame API properly in ff_reget_buffer()
Merged-by: Michael Niedermayer <michaelni@gmx.at>
We need the emulation to support the cases where the first
argument is the same as the fourth. To achieve this a fifth
argument working as a temporary may be needed.
Emulation that doesn't obey the original instruction semantics
can't be in x86inc.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8eeacf31c5ea37baf6b222dc38d20cf4fd33c455':
hevc: Do not left shift a negative value in hevc_loop_filter_chroma
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ff486c0f7f6b2ace3f0238660bc06cc35b389676':
hevc: Do not right shift a negative value in get_pcm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '50c988aa6d6c6f0ceb8f922bcea34800b56b85d9':
hevc: Drop unnecessary shifts in deblocking_filter_CTB
Merged-by: Michael Niedermayer <michaelni@gmx.at>
First pixel was computed based on invalid address read, and then
corrected by the following memcpy. After the commit, it's not computed
anymore, and memcpy fills the appropriate area.
Fixes Ticket #3387
Fixes out of array read
Fixes: 1cb91c36c4e55463f14aacb9bdf55b38-asan_heap-oob_106cbce_5617_cov_11212800_h264_mmx_chroma_intra_lf.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
libvorbis: Give consistent names to all functions, structs, and defines
Conflicts:
libavcodec/libvorbisenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9c029f67ca82147ddfa83a1546ee1e109e11fbd4':
aarch64: use EXTERN_ASM consistently for exported symbols
Merged-by: Michael Niedermayer <michaelni@gmx.at>
And use the value from the specification.
Sample-Id: 00000451-google
Found-by: Mateusz j00ru Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit '017a06a9ee86b047079166c2694c9c655ff03356':
x86: dsputil: Use correct file name as multiple inclusion guard
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ba42c852477e87f6e47a5587e8f7829c46c52032':
bit_depth_template: Use file name as multiple inclusion guard
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Makes fate-h264 pass under valgrind --undef-value-errors=yes with
-cpuflags none. {avg,put}_h264_chroma_mc8_8 approximately 5% faster,
{avg,put}_h264_chroma_mc4_8 2% faster both on x86 and arm.
Fixes out of array read
Fixes: 08e48e9daae7d8f8ab6dbe3919e797e5-asan_heap-oob_157461c_5295_cov_1266798650_firefing.mpg
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '175e5063320f585118a5461f15dbacf2ce17e97d':
hevc: Mention the missing SPS in the error message
Merged-by: Michael Niedermayer <michaelni@gmx.at>
EXR files have, like tiffs, multiple channels and layers. I have a
patch for exr.c that allows you to select the layer you want to process
thru a -layer flag. It is not bulletproof but works for all layers that
have 3 channels in them (normals, motion vectors, etc).
The calling convention for ffmpeg is:
ffmpeg -layer Diffuse -i myexr.%d.exr test.mov
Here's an exr image with multiple layers:
http://www.datafilehost.com/d/e45d9a1c
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: d4476f68ca1c1c57afbc45806f581963-asan_heap-oob_2266b27_8607_cov_4044577381_snow_chroma_bug.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4d7ab5cfebef91820af2933ef2f622ea598e6b53':
doxygen: Add a number of missing function parameter descriptions
Conflicts:
libavformat/avformat.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '614b9e4db8f3d7c23fc0410fc04745a727a82f4e':
h264: use avpriv_request_sample for chroma_format_idc
Conflicts:
libavcodec/h264_ps.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: 5f9698e86d92f19bb08d54ff0d57027f-signal_sigsegv_b30756_3795_cov_2693691257_ansi256.ans
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Results are from a Win64 build running on an AMD FX 6300
1121 decicycles in ttafilter_process_dec_c, 16777112 runs, 104 skips
522 decicycles in ff_ttafilter_process_dec_ssse3, 16777149 runs, 67 skips
477 decicycles in ff_ttafilter_process_dec_sse4, 16777156 runs, 60 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The advantage of this is that the is32x32 division branch in
decode_coeffs_b is removed from the inner loop to outside the block
coef decoding loop in decode_coeffs. Also, it allows us to merge the
txsz branches from the block coef decoding loop, the context merge
and the context split.
Fixes out of array read
Fixes: caa65cc01655505705129b677189f036-signal_sigsegv_fdcc43_2681_cov_3043376737_PPH422I5_Panasonic_A.264
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array access
Fixes: 14a74a0a2dc67ede543f0e35d834fbbe-asan_heap-oob_49572c_556_cov_215466444_44_001_engine_room.mov
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
g2meet: validate bpp and bitmasks in the display info
Conflicts:
libavcodec/g2meet.c
See: ae95b2f810
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array access
Fixes: abd3c041acbcb816be113455d138166b-asan_heap-oob_b11634_3707_cov_1707137151_als_05_2ch48k16b.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes compilation with flac decoder disabled and encoder enabled
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9cd4bc41760f8ad879e248920eacbe1e7757152c':
ac3dec: set AV_FRAME_DATA_DOWNMIX_INFO side data.
Conflicts:
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes inconsistency and out of array accesses
Fixes: 10cdd7e63e7f66e3e66273939e0863dd-asan_heap-oob_1a4ff32_7078_cov_4056274555_mov_h264_aac__mp4box_frag.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
h264_parser: use enum values in h264_find_frame_end()
Conflicts:
libavcodec/h264_parser.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
I found the optimisation of 2 samples per iteration obscured the
underlying algorithm. I had to write it out on paper and translate into
a mathematical sum to see that the two samples are unconnected. I hope
that if anyone else is struggling to understand the code that this will
be useful.
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>