Adds a new member to MpegEncContext to hold the number of used slice
contexts. Fixes segfaults with '-threads 17 -thread_type slice' and
fate-vsynth{1,2}-mpeg{2,4}thread{,_ilace} with --disable-pthreads.
The extra thread added in {frame_}*thread_init was not taken into
account. Explicitly sets thread_count to 1 if only one CPU core was
detected. Also fixes two typos in comments.
Some external codecs have their own code to determine the best number
of threads. This number is not necessary the number of cpu cores.
Thread_count will be only 0 if the codec has CODEC_CAP_AUTO_THREADS.
The safe bitstream reader does not allow using skip_bits_long() to seek to a
point before the start of the buffer, which was needed by the mp3 decoder.
This change instead calculates the start point of the first valid granule and
skips to that position.
Signed-off-by: Justin Ruggles <justin.ruggles@gmail.com>
Since the conditions for the actual usage are more specific a less
preferred method can be used. This would cause compilation errors
because necessary headers are not included.
Originally, prior to 8742a4ff8, the caller code was compiled
within this condition:
ARCH_X86 && HAVE_7REGS && HAVE_EBX_AVAILABLE && !defined(BROKEN_RELOCATIONS)
Since HAVE_7REGS is defined as
(ARCH_X86_64 || (HAVE_EBX_AVAILABLE && HAVE_EBP_AVAILABLE))
the subcondition HAVE_7REGS && HAVE_EBX_AVAILABLE is equal
to HAVE_7REGS (for 32 bit at least). The correct simplification
of the original condition thus is HAVE_7REGS, not
HAVE_EBX_AVAILABLE.
This fixes compilation in some cases where HAVE_EBP_AVAILABLE = 0
and HAVE_EBX_AVAILABLE = 1.
Signed-off-by: Martin Storsjö <martin@martin.st>
The format is a per-frame property, having it in AVFrame simplify the
operation of extraction of that information, since avoids the need to
access the codec/stream context.
width and height are per-frame properties, setting these values in
AVFrame simplify the operation of extraction of that information,
since avoids the need to check the codec/stream context.
The sample aspect ratio is a per-frame property, so it makes sense to
define it in AVFrame rather than in the codec/stream context.
Simplify application-level sample aspect ratio information extraction,
and allow further simplifications.
Based on a patch by Michael Niedermayer <michaelni@gmx.at>
Fixes NGS00145, CVE-2011-4352
Found-by: Phillip Langlois
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
Use sched_getaffinity to determine the number of logical CPUs.
Limits the number of threads to 16 since slice threading of H.264
seems to be buggy with more than 16 threads.
This file does not use anything from get_bits.h but needs
intreadwrite.h.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
The 'fiel' atoms can be found in H.264 tracks clobbering the extradata.
MJPEG supports non field based extradata, and this data should be
preserved when copying.
Also define a codec capability for codecs that can handle
parameters changed externally between decoded packets.
Signed-off-by: Martin Storsjö <martin@martin.st>
On 32-bit OS X with gcc 4.0/4.2 and shared libraries enabled, the ebx register
is not available, but required to assemble the functions.
This reverts commit 8742a4f to a simplified version of the original constraints.
This fixes a deadlock VLC triggered with multithreaded decoding. The
wait forces one of the current waiters to wake and not the thread
which calls pthread_cond_signal() itself.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Interlaced content for most codec requires it.
This patch is a stop-gap pending a serious rework to support
codecs with non 16 pixel macroblocks.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
Trailing bits are likely to be non-zero if the NAL unit is truncated.
Clearing the bits make overreads of the bitstream less likely in this
case. Fixes playback of
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4 which
has a forbidden byte sequence of 0x00 0x00 0x00 in it SPS.
Start code emulation prevention is only required in Annex B bytestream
packed NAL units. For other coding formats the size is already known.
Looking for a start code prefix can result in false positives like in
http://streams.videolan.org/streams/mp4/Mr_MrsSmith-h264_aac.mp4
which has a false positive in the SPS.
Width and height might get passed as 0 and would cause floating point
exceptions in decode_frame.
Fixes bugzilla #149
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
This was intended as an optimisation for skipped blocks in MPEG2
P-frames and never used elsewhere. Removing this "optimisation"
speeds up MPEG2 decoding by 1-2% (ARM Cortex-A9).
Signed-off-by: Mans Rullgard <mans@mansr.com>
Previously the decoder only worked if the user had set avctx->pix_fmt
manually. For some reason the libavformat tmv demuxer sets this, so
the problem was not visible in avplay etc.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The buffer splicing relies on the bitstream reader over-reading
the end of the buffer as declared in init_get_bits(), although
more data is actually present. Manually moving the bitstream
boundary after init_get_bits() allows this to work as expected.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When turned on, H264/CAVLC gets ~15% (CVPCMNL1_SVA_C.264) slower for
ultra-high-bitrate files, or ~2.5% (CVFI1_SVA_C.264) for lower-bitrate
files. Other codecs are affected to a lesser extent because they are
less optimized; e.g., VC-1 slows down by less than 1% (all on x86).
The patch generated 3 extra instructions (cmp, cmovae and mov) per
call to get_bits().
The performance penalty on ARM is within the error margin for most
files, up to 4% in extreme cases such as CVPCMNL1_SVA_C.264.
Based on work (for GCI) by Aneesh Dogra <lionaneesh@gmail.com>, and
inspired by patch in Chromium by Chris Evans <cevans@chromium.org>.
The A32 bitstream reader variant is only used on ARMv5 and for
Prores due to the larger bit cache this decoder requires.
In benchmarks on ARMv5 (Marvell Sheeva) with gcc 4.6, the only
statistically significant difference between ALT and A32 is
a 4% advantage for ALT in FLAC decoding. There is thus no (longer)
any reason to keep the A32 reader from this point of view.
This patch adds an option to the ALT reader increasing the bit
cache to 32 bits as required by the Prores decoder. Benchmarking
shows no significant change in speed on Intel i7. Again, the
A32 reader fails to justify its existence.
Signed-off-by: Mans Rullgard <mans@mansr.com>
In the case that (frame_flags & 0x03) == 3, hybrid_maxclip
may have had a signed integer overflow.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
It doesn't make much sense to clip pre-shift,
nor is it correct for proper decoding.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The keyframe after a POC reset may not be the first to be returned to
the user. Therefore, don't reset the expected next POC once we return
a keyframe to the user, but once we know that the next frame in the
return-queue is a keyframe.
The mode is set in libgsm_decode_init, but the decoder
object is simply destroyed and recreated in the flush
function - therefore the mode has to be set again.
This fixes playback using the libgsm_ms decoder in avplay.
Signed-off-by: Martin Storsjö <martin@martin.st>
v410 is a packed 10-bit 4:4:4 YCbCr format used in
QuickTime.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Diego Biurrun <diego@biurrun.de>
This patch is a generalization of what Michael Niedermayer
fixed in a single case.
The wmv8-drm fate test had been updated accordingly.
Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
The change in 599b4c6ef didn't turn out to work properly on
i386 on OS X, where it broke building with PIC enabled.
Signed-off-by: Martin Storsjö <martin@martin.st>
Make the function prototype match the argument of
AVCodecCntext.execute() and remove the cast hiding
this mismatch.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This replaces the explicit offset(reg) memory references with
"m" operands for the same locations. As a result, one fewer
register operand is needed for these inline asm statements.
Signed-off-by: Mans Rullgard <mans@mansr.com>
When the buf and last pointers are equal, the FFSWAP() results
in an invalid call to memcpy() with same source and destination
on some targets. Although assigning a struct to itself is valid
C99, gcc does not check for this before calling memcpy().
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667
Signed-off-by: Mans Rullgard <mans@mansr.com>
The existing functions defined in intfloat_readwrite.[ch] are
both slow and incorrect (infinities are not handled).
This introduces a new header with fast, inline conversion
functions using direct union punning assuming an IEEE-754
system, an assumption already made throughout the code.
The one use of Intel/Motorola extended 80-bit format is
replaced by simpler code sufficient under the present
constraints (positive normal values).
The old functions are marked deprecated and retained for
compatibility.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Return the whole packet as consumed in this case and not the size the
packet should have had. Move the insufficient data check into the for
condition to fix a ISO C90 error on bigendian.
This groups the encode/decode parts under single ifdefs and
eliminates the encode_init() function as it merely calls
common_init(). Also fix whitespace in moved code.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Earlier, bits per sample was defined as 8, since
bits_per_coded_sample was used to indicate whether to ignore
the lower bits of the codeword, having values 6, 7 or 8.
g722 encodes 2 samples into one byte codeword, therefore the
bits per sample is 4. By changing this, the generated timestamps
for streams encoded with g722 become correct.
This makes timestamp generation for g722 data correct (both when
encoding and when demuxing from raw g722 files).
Signed-off-by: Martin Storsjö <martin@martin.st>
This avoids using bits_per_coded_sample for this information.
bits_per_coded_sample should be 4 for this codec normally,
since two samples are encoded into one 8 bit codeword.
In principle, this might be info that needs to be passed from
a demuxer, and in that case, a private AVOption isn't the best
choice, but no such samples are available at the moment, so
that use case is purely theoretical at the moment.
Signed-off-by: Martin Storsjö <martin@martin.st>
When decoding lossy WavPack samples, they are supposed
to be clipped, in order to be decoded correctly.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Pass the correct size in bits to mpeg4audio_get_config and add a flag
to disable parsing of the sync extension when the size is not known.
Latm with AudioMuxVersion 0 does not specify the size of the audio
specific config. Data after the audio specific config can be
misinterpreted as sync extension resulting in random and wrong configs.
Add AV_NUM_DATA_POINTERS to simplify the bump transition.
This will allow for supporting more planar audio channels without having to
allocate separate pointer arrays.
There's no reason to use two arrays for this.
Based off commit 2fea60c600
to FFmpeg by Michael Niedermayer.
Signed-off-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The next call to decode() will update from an invalid index, which will
either lead to a memcpy() where dest==src (2 threads), or lead to a
crash (>2 threads).
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Interlaced videos can contain progressive frames too and now wrong scantable
is selected for them.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This fixes a compile error on mingw32 when using p->thread
directly (as if it were a pointer) to track thread existence,
because the type is opaque and may be a non-pointer.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
The way these values are used, they should have an unsigned type.
A similar change was made for mpegvideo in cb66847.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This simplifies the decoder so it doesn't have to process an in-packet header
or handle arbitrary-sized packets. It also fixes decoding of files with large
headers.
Instead of using fixed coefficients, the correct way is to calculate the
coefficients using the highpass cutoff frequency from the ADX stream header
and the sample rate.
This multiplication can overflow the signed range but not the
unsigned. After right-shifting it will thus fit in the signed
range again.
Signed-off-by: Mans Rullgard <mans@mansr.com>
It makes more sense for a bit mask to use an unsigned type.
The change should be source and binary compatible on all
supported systems, hence micro version bump.
Fixes a few invalid shifts.
Signed-off-by: Mans Rullgard <mans@mansr.com>
This is a hand-tuned version of the code with impossible parts of
the FASTDIV function ommitted.
2-5% faster overall on Cortex-A8.
Signed-off-by: Mans Rullgard <mans@mansr.com>
The initial values are not checked against the number of block sizes.
Initializing them to frame_len_bits will result in a block size index of 0
in these cases instead of something that might be out-of-range.
Fixes Bug 81.
This prevents build errors when compiler and assembler default
targets differ. Ideally each file would declare the highest
level it requires. This is however not easily possible as it
complicates assembling pre-armv6t2 code in Thumb-2 mode.
HAVE_NEON is used as indicator for ARMv7-A since no other
symbol exists for this and NEON is only available in this
variant.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Adding the thread count in frame level multithreading to has_b_frames
as an additional delay causes more problems than it solves.
For example inconsistent behaviour during timestamp calculation in
libavformat.
Thread count and frame level multithreading are both set by the user.
If the additional delay caused by frame level multithreading needs
to be considered in the calling code it has all information to take
it into account.
Should it become necessary to calculate a maximum delay inside
libavcodec it should be exported as its own field and not reusing
an existing field.
Based on a patch by Michael Niedermayer.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
A new field, AVCodecContext.internal is used to hold a new struct
AVCodecInternal, which has private fields that are not codec-specific and are
used by general libavcodec functions.
Moved internal_buffer, internal_buffer_count, and is_copy.
libavcodec/options.c:583: warning: assignment discards qualifiers from pointer
target type
libavcodec/options.c:589: warning: initialization discards qualifiers from
pointer target type
http://samples.mplayerhq.hu/V-codecs/CVID/bad_cinepak_frame_size.mov
This fix works around another work around which handles a different type
of odd Cinepak data.
Thanks to Matthew Hoops (clone2727 - gmail.com) for the sample and fix.
Signed-off-by: Martin Storsjö <martin@martin.st>
It does not make much sense to factor the error handling to its own
av_always_inline function. Fixes "format not a string literal and no
format arguments" warning in the av_log.