Replace our incomplete w32threads implementation with x264's pthreads
w32threads wrapper.
Relicensed to LGPL with kind permission by Pegasys Inc.
Signed-off-by: Janne Grunau <janne-libav@jannau.net>
This allows concurrent decoding of the last field/frame, rather than
only the last slice, of data packets with multiple NAL units packed
together.
This will fix the slowdown reported in e.g. bug 52.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Correct computation of implicit weight tables when referencing pictures
that are marked for long reference.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
High bitdepth H.264 needs 32-bit transform coefficients, whereas
dnxhd does not. This creates a conflict with the templated
functions operating on DCTELEM data. This patch adds a field
allowing the caller to choose the element size in dsputil_init()
and adds the required functions.
Signed-off-by: Mans Rullgard <mans@mansr.com>
FF_COMMON_FRAME holds the contents of the AVFrame structure and is also copied
to struct Picture. Replace by an embedded AVFrame structure in struct Picture.
By observation it did not seem to handle prev_frame_num > frame_num.
This does not affect any files I have.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
One of the causes of this bug is that the h264 parser defaults low_delay
to 1, but the h264 codec defaults low_delay to 0. Really Ugly.
After many hours of looking at this, I'm still not sure how has_b_frames
is *intended* to behave, but to me the implementation appears way more
complicated than it ought to be.
My patch relies on the encoder to set an optional field in the SPS. This
works for libx264 streams, but I'm not sure that all h264 encoders will
set it.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
When backing up the top-left border, check that the top-left
(rather than left) MB indeed does belong to our slice. If it
doesn't, backing up has no positive effect but may accidentally
interfere with other threads writing in the same space.
Fixes occasional one-off effects when enabling slice-MT.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
In high bit depth, the QP values may now be up to (51 + 6*(bit_depth-8)).
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
In high bit depth the pixels will not be stored in uint8_t like in the
normal case, but in uint16_t. The pixel size is thus 1 in normal bit
depth and 2 in high bit depth.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
It is pretty hopeless that other considerable projects will adopt
libavutil alone in other projects. Projects that need small footprint
are better off with more specialized libraries such as gnulib or rather
just copy the necessary parts that they need. With this in mind, nobody
is helped by having libavutil and libavcore split. In order to ease
maintenance inside and around FFmpeg and to reduce confusion where to
put common code, avcore's functionality is merged (back) to avutil.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
Don't free RBSP tables (containing decoded NAL units) on resolution
change, because we actually need this data to decode the frame after
reiniting (with new resolution). Fixed issue 2393.
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
It's incomplete, no one is working on it, and when someone asks about
working on it we advise them not to.
Signed-off-by: Mans Rullgard <mans@mansr.com>
No speed improvement, but necessary for some future stuff.
Also opens up the possibility of asm chroma dc idct/dequant.
Originally committed as revision 26349 to svn://svn.ffmpeg.org/ffmpeg/trunk
Doesn't help speed as there isn't an asm implementation yet, but consistency
is a good thing.
Originally committed as revision 26348 to svn://svn.ffmpeg.org/ffmpeg/trunk
Useful so that we don't have to run the hierarchical DC iDCT if there aren't
any coefficients. Opens up some future opportunities for optimization as well.
Originally committed as revision 26337 to svn://svn.ffmpeg.org/ffmpeg/trunk
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.
Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
It was an ugly hack to begin with and didn't give any performance.
NOTE: this patch opens up some future simplifications to be made (such as
removing some of the scantables from H264Context) but doesn't take advantage
of them yet.
Originally committed as revision 26329 to svn://svn.ffmpeg.org/ffmpeg/trunk
svq3 still doesn't support multithreading, but it's simpler for clients if
they can enable threading for all codecs by default.
Originally committed as revision 26015 to svn://svn.ffmpeg.org/ffmpeg/trunk
Contrary to progressive, just being able to crop up to 14/15 pixels
is not enough to encode all supported resolutions, and the new
behaviour is also consistent with e.g. MPEG-2 etc.
Originally committed as revision 25669 to svn://svn.ffmpeg.org/ffmpeg/trunk
r25218 made assumptions about the existence of past reference frames that
weren't necessarily true.
Originally committed as revision 25243 to svn://svn.ffmpeg.org/ffmpeg/trunk
h264dsp_mmx.c to h264_idct.asm (as yasm code). Because the loops are now
coded in asm instead of C, this is (depending on the function) up to 50%
faster for cases where gcc didn't do a great job at looping.
Since h264_idct_add8() is now faster than the manual loop setup in h264.c,
in-asm idct calling can now be enabled for chroma as well (see r16207). For
MMX, this is 5% faster. For SSE2 (which isn't done for chroma if h264.c does
the looping), this makes it up to 50% faster. Speed gain overall is ~0.5-1.0%.
Originally committed as revision 25119 to svn://svn.ffmpeg.org/ffmpeg/trunk
Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk