Old Intel GPUs expect the reference frame index to the actual surface,
instead of the index into RefFrameList as specified by the spec.
This workaround should be set when using one of the "ClearVideo" decoder
devices.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The latest H.264 DXVA specification states that the index in this
structure should refer to a valid entry in the RefFrameList of the picture
parameter structure, and not to the actual surface index.
Fixes H.264 DXVA2 decoding on recent Intel GPUs (tested on Sandy and Ivy)
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b66382101cff33e2ce66500327a90d0a105eedeb':
dxva2: Increase maximum number of slices for mpeg2
See: bceeccc648
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'd48430c367947a64647c6959cf472f2c01778b17':
build: Let the SVQ3 decoder depend on the H.264 decoder
Conflicts:
libavcodec/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Some content requires an higher number of slices in order to
render properly.
Rise the number to 1024 and warn if ever it exceeds.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Also undo the changes to ra144enc.c from previous commits.
Should fix ticket #3429
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The SVQ3 decoder reuses large parts of the H.264 decoder so it
makes no sense to enable the former but not the latter.
Also drop unnecessary h263.o object from SVQ3 decoder object list.
* commit '3741aa37c2a0d0717faff74a5c4cc357d16f6d1d':
x86: cabac: Use correct #includes to make header compile standalone
Merged-by: Michael Niedermayer <michaelni@gmx.at>
c3390fd56c made use of the DSP function
but did not complement it with a call to emms, which is done here before
computations involving floats are performed.
Fixes ticket #3429, which affected MMX/MMXExt machines.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'eeaf4f3b87815cbae4c12856cfaafb3a2dae8e0c':
av_vdpau_get_profile: mask out H.264 intra profile flag
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Should fix compilation failures with --disable-yasm on some compilers
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5397386effba2e53e4ff82852a86f6be4d59e9c1':
mathops: move macro to the only place it is used
Merged-by: Michael Niedermayer <michaelni@gmx.at>
With cli usage the decoder might have not set the colorspace during
encoder init, manual colorspace override might be needed in such
cases.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This applies commit 5de64bb3 (the source of the above commit message)
to libutvideoenc as well.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: 387713a12dc5cfa27fcb4178084ce1ea-asan_stack-oob_131176a_1182_cov_3861068719_CAINIT_C_SHARP_3.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array accesses
This should not affect any release
Fixes: 8ab69af9e5a7a7e20fe04cdd25c0d6e7-asan_heap-oob_e72b82_5505_cov_2278389485_g2m4.wmv
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This reverts the changes 6467209836
and 68c3ed936a did to the SSE2 version,
which generated a hit of about 5 cycles.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Sandy Bridge Win64:
180 cycles on ff_synth_filter_inner_sse2
150 cycles on ff_synth_filter_inner_avx
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Avoid a division by 0 in ff_mpeg4_set_one_direct_mv.
Sample-Id: 00000168-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
* commit 'd1f9563d502037239185c11578cc614bdf0c5870':
pthread_frame: flush all threads on flush, not just the first one
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '2f02bbcca050936686482453078e83dc25493da0':
build: Let the ffvhuff decoder/encoder depend on the huffyuv decoder/encoder
Conflicts:
configure
libavcodec/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '34150be515cd9c43b0b679806b8d01774960af78':
build: Let the iac decoder depend on the imc decoder
Conflicts:
configure
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8e0cf39faf02536dca08f4fe628a66d1ae022fde':
build: Let all MJPEG-related decoders depend on the MJPEG decoder
Conflicts:
configure
libavcodec/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '0a36988e48dd581d29e77f768f987738bdf365f0':
build: Let AMV decoder depend on the SP5X decoder
Conflicts:
configure
libavcodec/Makefile
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '17a63ff0cd187b9e50e4a47862750295976853b1':
h264: update flag name in ff_h264_decode_ref_pic_list_reordering()
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'e70ab7c1f5005041bba0e4efc1165410f83495b2':
h264: add MVCD to the list of High profiles in SPS
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The new function has the ability to allocate the structure, allowing it to grow
without needing major bumps
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
This structure is used in the interface between libs and thus cannot have
fields added in the middle without major bump
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
avcodec_flush_buffers() must release all internally held references
according to its documentation, for which all the threads need to be
flushed.
CC:libav-stable@libav.org
Bug-Id: vlc/9665
These codecs compile all of the MJPEG code anyway, so there is little
point in not enabling the MJPEG decoder directly. This also simplifies
the dependency declarations for the MJPEG codec family.
This codec compiles all of the SP5X code anyway, so there is little
point in not enabling the decoder directly. This also simplifies the
dependency declaration for the AMV decoder.
Timings for Arrandale:
C SSE
win32: 2108 334
win64: 1152 322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '92e598a57a7ce4b8ac9ea56274af39f5fd888311':
prores: Drop DSP infrastructure for prores encoder bits
Conflicts:
libavcodec/Makefile
libavcodec/proresdsp.c
libavcodec/proresenc_kostya.c
Note, these changes only affect one of the 2 prores encoders we have
If someone wants to add optimizations to the affected encoder, or needs/wants
this infrastructure, then iam happy to revert this
Merged-by: Michael Niedermayer <michaelni@gmx.at>
AAC LOAS can have new audio config objects in the stream itself.
Make sure the decoder reconfigures itself when the first one arrives
midstream.
Bug-Id: 644
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The vector dequantization has a test in a loop preventing effective SIMD
implementation. By moving it out of the loop, this loop can be DSPized.
Therefore, modify the current DSP implementation. In particular, the
DSP implementation no longer has to handle null loop sizes.
The decode_hf implementations have following timings:
For x86 Arrandale:
C SSE SSE2 SSE4
win32: 260 162 119 104
win64: 242 N/A 89 72
The arm NEON optimizations follow in a later patch as external asm. The
now unused check for the y modifier in arm inline asm is removed from
configure.
Based on a patch from Christophe Gisquet.
Unrolling of the m == 0 case avoids a possible use of the uninitilized
value sum when s->predictor_history is not set. I failed to find a
sample for it. It also reduced the cycle count from 220 to 150 on
sandy bridge, x86_64 linux, gcc 4.8.2 compared to his patch.
Timings for Arrandale:
C SSE
win32: 2108 334
win64: 1152 322
Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.
Unrolling for ARCH_X86_64 is a 20 cycles gain.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
The scaling factor is constant so it is faster to scale the
FIR coefficients in the tables during compilation.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
There's an SSE2 version as well, and x86_64 guarantees that
instruction set is present.
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: "Ronald S. Bultje" <rsbultje@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also fail if AV_EF_EXPLODE is set.
We do not fail by default, but rather return some image as it may be usefull to the
end user to see what is on the image, for example text could be read quite fine and
objects recognized.
Possibly fixes Ticket3424
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Framerate is now a sane rational instead of an integer, and
inputDepth is changed to what it actually is.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* qatar/master:
Use av_frame_copy() to simplify code where appropriate.
Conflicts:
libavfilter/vf_copy.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8feac29cc46270cc89d6016340e7bac780877131':
lavc: use AVFrame API properly in ff_reget_buffer()
Merged-by: Michael Niedermayer <michaelni@gmx.at>
We need the emulation to support the cases where the first
argument is the same as the fourth. To achieve this a fifth
argument working as a temporary may be needed.
Emulation that doesn't obey the original instruction semantics
can't be in x86inc.
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '8eeacf31c5ea37baf6b222dc38d20cf4fd33c455':
hevc: Do not left shift a negative value in hevc_loop_filter_chroma
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ff486c0f7f6b2ace3f0238660bc06cc35b389676':
hevc: Do not right shift a negative value in get_pcm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '50c988aa6d6c6f0ceb8f922bcea34800b56b85d9':
hevc: Drop unnecessary shifts in deblocking_filter_CTB
Merged-by: Michael Niedermayer <michaelni@gmx.at>
First pixel was computed based on invalid address read, and then
corrected by the following memcpy. After the commit, it's not computed
anymore, and memcpy fills the appropriate area.
Fixes Ticket #3387
Fixes out of array read
Fixes: 1cb91c36c4e55463f14aacb9bdf55b38-asan_heap-oob_106cbce_5617_cov_11212800_h264_mmx_chroma_intra_lf.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
libvorbis: Give consistent names to all functions, structs, and defines
Conflicts:
libavcodec/libvorbisenc.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9c029f67ca82147ddfa83a1546ee1e109e11fbd4':
aarch64: use EXTERN_ASM consistently for exported symbols
Merged-by: Michael Niedermayer <michaelni@gmx.at>
And use the value from the specification.
Sample-Id: 00000451-google
Found-by: Mateusz j00ru Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* commit '017a06a9ee86b047079166c2694c9c655ff03356':
x86: dsputil: Use correct file name as multiple inclusion guard
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'ba42c852477e87f6e47a5587e8f7829c46c52032':
bit_depth_template: Use file name as multiple inclusion guard
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Makes fate-h264 pass under valgrind --undef-value-errors=yes with
-cpuflags none. {avg,put}_h264_chroma_mc8_8 approximately 5% faster,
{avg,put}_h264_chroma_mc4_8 2% faster both on x86 and arm.
Fixes out of array read
Fixes: 08e48e9daae7d8f8ab6dbe3919e797e5-asan_heap-oob_157461c_5295_cov_1266798650_firefing.mpg
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '175e5063320f585118a5461f15dbacf2ce17e97d':
hevc: Mention the missing SPS in the error message
Merged-by: Michael Niedermayer <michaelni@gmx.at>
EXR files have, like tiffs, multiple channels and layers. I have a
patch for exr.c that allows you to select the layer you want to process
thru a -layer flag. It is not bulletproof but works for all layers that
have 3 channels in them (normals, motion vectors, etc).
The calling convention for ffmpeg is:
ffmpeg -layer Diffuse -i myexr.%d.exr test.mov
Here's an exr image with multiple layers:
http://www.datafilehost.com/d/e45d9a1c
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: d4476f68ca1c1c57afbc45806f581963-asan_heap-oob_2266b27_8607_cov_4044577381_snow_chroma_bug.avi
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '4d7ab5cfebef91820af2933ef2f622ea598e6b53':
doxygen: Add a number of missing function parameter descriptions
Conflicts:
libavformat/avformat.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '614b9e4db8f3d7c23fc0410fc04745a727a82f4e':
h264: use avpriv_request_sample for chroma_format_idc
Conflicts:
libavcodec/h264_ps.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: 5f9698e86d92f19bb08d54ff0d57027f-signal_sigsegv_b30756_3795_cov_2693691257_ansi256.ans
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Results are from a Win64 build running on an AMD FX 6300
1121 decicycles in ttafilter_process_dec_c, 16777112 runs, 104 skips
522 decicycles in ff_ttafilter_process_dec_ssse3, 16777149 runs, 67 skips
477 decicycles in ff_ttafilter_process_dec_sse4, 16777156 runs, 60 skips
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The advantage of this is that the is32x32 division branch in
decode_coeffs_b is removed from the inner loop to outside the block
coef decoding loop in decode_coeffs. Also, it allows us to merge the
txsz branches from the block coef decoding loop, the context merge
and the context split.
Fixes out of array read
Fixes: caa65cc01655505705129b677189f036-signal_sigsegv_fdcc43_2681_cov_3043376737_PPH422I5_Panasonic_A.264
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array access
Fixes: 14a74a0a2dc67ede543f0e35d834fbbe-asan_heap-oob_49572c_556_cov_215466444_44_001_engine_room.mov
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
g2meet: validate bpp and bitmasks in the display info
Conflicts:
libavcodec/g2meet.c
See: ae95b2f810
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array access
Fixes: abd3c041acbcb816be113455d138166b-asan_heap-oob_b11634_3707_cov_1707137151_als_05_2ch48k16b.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes compilation with flac decoder disabled and encoder enabled
Signed-off-by: James Almer <jamrial@gmail.com>
Reviewed-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '9cd4bc41760f8ad879e248920eacbe1e7757152c':
ac3dec: set AV_FRAME_DATA_DOWNMIX_INFO side data.
Conflicts:
libavcodec/version.h
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes inconsistency and out of array accesses
Fixes: 10cdd7e63e7f66e3e66273939e0863dd-asan_heap-oob_1a4ff32_7078_cov_4056274555_mov_h264_aac__mp4box_frag.mp4
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
h264_parser: use enum values in h264_find_frame_end()
Conflicts:
libavcodec/h264_parser.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
I found the optimisation of 2 samples per iteration obscured the
underlying algorithm. I had to write it out on paper and translate into
a mathematical sum to see that the two samples are unconnected. I hope
that if anyone else is struggling to understand the code that this will
be useful.
Reviewed-by: Christophe Gisquet <christophe.gisquet@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '3fbad00714698f59c6326edfcc63db87f525e7c0':
utvideoenc: Enable support for multiple slices and use them
Conflicts:
libavcodec/utvideoenc.c
tests/fate/utvideo.mak
See: efec857c9f
Merged-by: Michael Niedermayer <michaelni@gmx.at>
The official Ut Video decoder only threads with slices, thus until
now any files encoded by the libavcodec encoder have only been
decodable with a single thread. The default slice count is now
set to subsampled_height / 120.
Also sets slices to 1 for the Ut Video encoder tests to keep them
green.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Altivec can only load naturally aligned vectors. To handle possibly
unaligned data a second vector is loaded from an offset of the original
location and the data is recovered through a vector permutation.
Overreads are minimal if the offset for second load points to the last
element of data. This is 7 for loading eight 8-bit pixels and overreads
are reduced from 16 bytes to 8 bytes if the pixels are 64-bit aligned.
For unaligned pixels the overread is reduced from 23 bytes to 15 bytes
in the worst case.
The official Ut Video decoder only threads with slices, thus until
now any files encoded by the libavcodec encoder have only been
decodable with a single thread. The default slice count is now
set to subsampled_height / 120.
Also sets slices to 1 for the Ut Video encoder tests to keep them
green.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
* commit '304e916a92bc17385a485bec2f957e192257ddb6':
h264_sei: name buffering period type consistently
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '3a0576702825423abecb32627c530dbc4c0f73bc':
h264: store current_sps_id inside the current sps
Conflicts:
libavcodec/h264.c
libavcodec/h264_ps.c
The current_sps_id is not removed as it used in security related code.
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '73e8fab31dc19c4371499e612856accbc00b2820':
h264: print values in case of error
Conflicts:
libavcodec/h264.c
libavcodec/h264_ps.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mpeg12dec: do not add stereo3D side data to a non-existing frame
Conflicts:
libavcodec/mpeg12dec.c
See: fe285b04bb
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Tested on an AMD FX 6300
679081 decicycles in ff_flac_lpc_32_xop, 32768 runs
774425 decicycles in ff_flac_lpc_32_sse4, 32768 runs
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
User data is usually coded before slice data. That means the frame
the user data belongs to is not available while parsing the user data.
The stereo3D side data has to use the same indirection over the private
context as pan scan information and A53 captions.
Bug-Id:632
AVFrame.sample_rate is set in ff_get_buffer, but aacdec calls
ff_get_buffer before the samplerate is known. So it needs to be
set again before returning the frame.
The buffer holding the coefficients must be padded with 0 so as to use DSP
functions that may overread. Currently, the SSE2/3 versions is an example,
as they process batches of 16 bytes.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
rpza: limit the number of blocks to the total remaining blocks in the frame
See: 3819db745d
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'a46dc49744bdc4f2e31725b63ac8e41f701e4fa1':
rpza: move some variables to the blocks where they are used
Merged-by: Michael Niedermayer <michaelni@gmx.at>
It was done only in check_mvset(), while mv_scale() is called also by
dist_scale().
Sample-Id: 00001579-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
The directional intra predictors either don't care about order (dc, h,
dc_left, tm), or they prefer inverted order (vr, dr, hd). This allows
more efficient SIMD implementations.
The spec doesn't describe how it should be decoded so this is probably
the safest thing to do. Fixes valgrind errors on fuzzed11.ivf and fixes
valgrind errors on fuzzed10.ivf differently.
When request_channel_layout is 0,
all substreams should be decoded.
Thanks to Michael Niedermayer for spotting.
Also fix a mismatch between the parser and
decoder when request_channel_layout is a
subset of Stereo.
* commit '5c1c6e82261b856214499b9fef3a08bf3ff6e0ae':
dca: include dcadsp.h in {arm,x86}/dca.h for checkheaders
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '0cffd6fff59f192120dc93aa6c3cb8180f5506e3':
x86: use the inline int8x8_fmul_int32 only if inline SSE2 is availbale
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Don't decode further substreams if request_channel_layout
is a subset of the current substream's channel_layout.
Before, we would only discard further substreams if
request_channel_layout matched the substream's
channel_layout extactly, thus decoding additional
channels which the caller would probably end up downmixing.
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.
Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.
On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5bcbb516f2ff45290ef7995b081762e668693672':
arm: Add X() around all references to extern symbols
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes: 93728afd9aa074ba14a09bfd93a632fd-asan_static-oob_124a17d_1445_cov_1021181966_DBLK_D_VIXS_1.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
The x86 runs short on registers because numerous elements are not static.
In addition, splitting them allows more optimized code, at least for x86.
Arm asm changes by Janne Grunau.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
For the callable function (as opposed to the inline one):
C SSE SSE2 SSE4
Win32: 47 42 29 26
Win64: 30 33 25 23
The SSE version is neither compiled nor set for ARCH_X86_64, as the
inlinable function takes over.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
It is currently declared as a macro who is set to inlinable functions,
among which a Neon and a default C implementations.
Add a DSP parameter to each inline function, unused except by the
default C implementation which calls a function from the DSP context.
On an Arrandale CPU, gain for an inlined SSE2 function vs. a call:
- Win32: 29 to 26 cycles
- Win64: 25 to 23 cycles
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
Fixes use of uninitialized memory
Fixes out of array read
Fixes assertion failure
Fixes part of cb307d24befbd109c6f054008d6777b5/asan_static-oob_124a175_1445_cov_2355279992_DBLK_D_VIXS_1.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes inconsistencies
Fixes use of uninitilaized memory
Fixes part of cb307d24befbd109c6f054008d6777b5/asan_static-oob_124a175_1445_cov_2355279992_DBLK_D_VIXS_1.bit
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
vp8: Use 2 registers for dst_stride and src_stride in neon bilin filter
Conflicts:
libavcodec/arm/vp8dsp_neon.S
Merged-by: Michael Niedermayer <michaelni@gmx.at>
benchmarked on sandybridge x86_64:
1358232 decicycles in flac_lpc_32_c
1244575 decicycles in flac_lpc_32_sse4, James Almer's patch
650045 decicycles in flac_lpc_32_sse4, this patch
I haven't tested the edgecases such as odd block lengths
odd block length tested-by: James Almer <jamrial@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Enable compilation on machines with an old libfdk-aac.
Signed-off-by: Timothy Gu <timothygu99@gmail.com>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Also adjust header #include order and some comments.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
Signed-off-by: Vittorio Giovara <vittorio.giovara@gmail.com>
* commit '1f097d168d9cad473dd44010a337c1413a9cd198':
h264: reset data partitioning at the beginning of each decode call
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit '5de64bb34d68d6c224dca90003172d7a27958825':
utvideoenc: Add support for the new BT.709 FourCCs for YCbCr
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* commit 'b25e84b7399bd91605596b67d761d3464dbe8a6e':
hevc: check that the VCL NAL types are the same for all slice segments of a frame
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Prevents using GetBitContexts with data from previous calls.
Fixes access to freed memory.
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC:libav-stable@libav.org
With cli usage the decoder might have not set the colorspace during
encoder init, manual colorspace override might be needed in such
cases.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This avoids them being cleared before the full initialization finished
Fixes out of array read
Fixes: asan_heap-oob_f0c5e6_7071_cov_1605985132_mov_h264_aac__Demo_FlagOfOurFathers.mov
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Such changes are forbidden in H.264 and lead to race conditions
Fixes out of array read
Fixes: signal_sigsegv_f9796a_1613_cov_3114610371_FM1_BT_B.h264
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: asan_static-oob_1efed25_1887_cov_2013541199_HeyYa_RA10_AAC_192K_30s.rm
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Fixes out of array read
Fixes: signal_sigsegv_1326a09_1752_cov_245452111_GRTH301.HNS
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Even though the most common framerate for RoQ is 30fps,
the format supports other framerates too.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* commit '09e2203b8ba6943d5c0fe6d73b65b145c3fdf98e':
hevc: Consider first quantization group any reference to 0, 0
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes use of uninitialized memory
Fixes out of array read
Fixes: asan_static-oob_123cee5_2630_cov_1869071233_PICSIZE_A_Bossen_1.bin
Found-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
It is not necessarily an error when a chunk does not cover a whole block.
Messages did not reflect the actual situation either.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
A dependent slice cannot have address 0.
Prevent an out of array bound load in ff_hevc_cabac_init().
Sample-Id: 00001406-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org
According to my understanding of T-REC-H.265-2013044 chapter 8.6.1.
Sample-Id: 00001438-google
Reported-by: Mateusz "j00ru" Jurczyk and Gynvael Coldwind
CC: libav-stable@libav.org