* qatar/master:
lavr: fix handling of custom mix matrices
fate: force pix_fmt in lagarith-rgb32 test
fate: add tests for lagarith lossless video codec.
ARMv6: vp8: fix stack allocation with Apple's assembler
ARM: vp56: allow inline asm to build with clang
fft: 3dnow: fix register name typo in DECL_IMDCT macro
x86: dct32: port to cpuflags
x86: build: replace mmx2 by mmxext
Revert "wmapro: prevent division by zero when sample rate is unspecified"
wmapro: prevent division by zero when sample rate is unspecified
lagarith: fix color plane inversion for YUY2 output.
lagarith: pad RGB buffer by 1 byte.
dsputil: make add_hfyu_left_prediction_sse4() support unaligned src.
Conflicts:
doc/APIchanges
libavcodec/lagarith.c
libavfilter/x86/gradfun.c
libavutil/cpu.h
libavutil/version.h
libswscale/utils.c
libswscale/version.h
libswscale/x86/yuv2rgb.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
vc1dec: Remove separate scaling function for interlaced field MVs
vc1dec: Invoke edge_emulation regardless of MV precision
x86: Use consistent 3dnowext function and macro name suffixes
g723_1: scale output as supposed for the case with postfilter disabled
g723_1: increase excitation storage by 4
g723_1: fix upper bound parameter from inverse maximum autocorrelation
g723_1: make scale_vector() behave like the reference
g723_1: fix off-by-one error in normalize_bits()
g723_1: save/restore excitation with offset to store LPC history
wmapro: prevent division by zero when sample rate is unspecified
x86: proresdsp: improve SIGNEXTEND macro comments
x86: h264dsp: K&R formatting cosmetics
LICENSE: Document all GPL files
Conflicts:
libavcodec/g723_1.c
libavcodec/wmaprodec.c
libavcodec/x86/h264dsp_mmx.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Refactoring mmx2/mmxext YASM code with cpuflags will force renames.
So switching to a consistent naming scheme beforehand is sensible.
The name "mmxext" is more official and widespread and also the name
of the CPU flag, as reported e.g. by the Linux kernel.
* qatar/master:
dca: Switch dca_sample_rates to avpriv_ prefix; it is used across libs
ARM: use =const syntax instead of explicit literal pools
ARM: use standard syntax for all LDRD/STRD instructions
fft: port FFT/IMDCT 3dnow functions to yasm, and disable on x86-64.
dct-test: allow to compile without HAVE_INLINE_ASM.
x86/dsputilenc: bury inline asm under HAVE_INLINE_ASM.
dca: Move tables used outside of dcadec.c to a separate file.
dca: Rename dca.c ---> dcadec.c
x86: h264dsp: Remove unused variable ff_pb_3_1
apetag: change a forgotten return to return 0
Conflicts:
libavcodec/Makefile
libavcodec/dca.c
libavcodec/x86/fft_3dn.c
libavcodec/x86/fft_3dn2.c
libavcodec/x86/fft_mmx.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
mpc8: return more meaningful error codes.
mpc: return more meaningful error codes.
wv,mpc8: don't return apetag data in packets.
rtmp: do not warn about receiving metadata packets
x86: h264dsp: Adjust YASM #ifdefs
x86: yadif: Mark mmxext optimizations as such
h264: convert loop filter strength dsp function to yasm.
Improve descriptiveness of a number of codec and container long names
Conflicts:
libavcodec/flvdec.c
libavcodec/libopenjpegdec.c
libavformat/apetag.c
libavformat/mp3dec.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This completes the conversion of h264dsp to yasm; note that h264 also
uses some dsputil functions, most notably qpel. Performance-wise, the
yasm-version is ~10 cycles faster (182->172) on x86-64, and ~8 cycles
faster (201->193) on x86-32.
* qatar/master: (35 commits)
h264_idct_10bit: port x86 assembly to cpuflags.
x86inc: clip num_args to 7 on x86-32.
x86inc: sync to latest version from x264.
fft: rename "z" to "zc" to prevent name collision.
wv: return meaningful error codes.
wv: return AVERROR_EOF on EOF, not EIO.
mp3dec: forward errors for av_get_packet().
mp3dec: remove a pointless local variable.
mp3dec: remove commented out cruft.
lavfi: bump minor to mark stabilizing the ABI.
FATE: add tests for yadif.
FATE: add a test for delogo video filter.
FATE: add a test for amix audio filter.
audiogen: allow specifying random seed as a commandline parameter.
vc1dec: Override invalid macroblock quantizer
vc1: avoid reading beyond the last line in vc1_draw_sprites()
vc1dec: check that coded slice positions and interlacing match.
vc1dec: Do not ignore ff_vc1_parse_frame_header_adv return value
configure: Move parts that should not be user-selectable to CONFIG_EXTRA
lavf: remove commented out cruft in avformat_find_stream_info()
...
Conflicts:
Makefile
configure
libavcodec/vc1dec.c
libavcodec/x86/h264_deblock.asm
libavcodec/x86/h264_deblock_10bit.asm
libavcodec/x86/h264dsp_mmx.c
libavfilter/version.h
libavformat/mp3dec.c
libavformat/utils.c
libavformat/wv.c
libavutil/x86/x86inc.asm
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master:
x86: Only use optimizations with cmov if the CPU supports the instruction
x86: Add CPU flag for the i686 cmov instruction
x86: remove unused inline asm macros from dsputil_mmx.h
x86: move some inline asm macros to the only places they are used
lavfi: Add the af_channelmap audio channel mapping filter.
lavfi: add join audio filter.
lavfi: allow audio filters to request a given number of samples.
lavfi: support automatically inserting the fifo filter when needed.
lavfi/audio: eliminate ff_default_filter_samples().
Conflicts:
Changelog
libavcodec/x86/h264dsp_mmx.c
libavfilter/Makefile
libavfilter/allfilters.c
libavfilter/avfilter.h
libavfilter/avfiltergraph.c
libavfilter/version.h
libavutil/x86/cpu.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
* qatar/master: (35 commits)
flvdec: Do not call parse_keyframes_index with a NULL stream
libspeexdec: include system headers before local headers
libspeexdec: return meaningful error codes
libspeexdec: cosmetics: reindent
libspeexdec: decode one frame at a time.
swscale: fix signed shift overflows in ff_yuv2rgb_c_init_tables()
Move timefilter code from lavf to lavd.
mov: add support for hdvd and pgapmetadata atoms
mov: rename function _stik, some indentation cosmetics
mov: rename function _int8 to remove ambiguity, some indentation cosmetics
mov: parse the gnre atom
mp3on4: check for allocation failures in decode_init_mp3on4()
mp3on4: create a separate flush function for MP3onMP4.
mp3on4: ensure that the frame channel count does not exceed the codec channel count.
mp3on4: set channel layout
mp3on4: fix the output channel order
mp3on4: allocate temp buffer with av_malloc() instead of on the stack.
mp3on4: copy MPADSPContext from first context to all contexts.
fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm
fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm
...
Conflicts:
libavcodec/arm/h264dsp_init_arm.c
libavcodec/h264.c
libavcodec/h264.h
libavcodec/h264_cabac.c
libavcodec/h264_cavlc.c
libavcodec/h264_ps.c
libavcodec/h264dsp_template.c
libavcodec/h264idct_template.c
libavcodec/h264pred.c
libavcodec/h264pred_template.c
libavcodec/x86/h264dsp_mmx.c
libavdevice/Makefile
libavdevice/jack_audio.c
libavformat/Makefile
libavformat/flvdec.c
libavformat/flvenc.c
libavutil/pixfmt.h
libswscale/utils.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
Fixes regression in 836f47d34b in ICC-10.x,
since ICC<=11.0 doesn't align stack upon function calls.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Ports the majority of IDCT functions for 10-bit H.264.
Parts are inspired from 8-bit IDCT code in Libav; other parts ported from x264 with relicensing permission from author.
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
* qatar/master: (32 commits)
10-bit H.264 x86 chroma v loopfilter asm
Port SMPTE S302M audio decoder from FFmbc 0.3. [Copyright headers corrected]
Fix crash of interlaced MPEG2 decoding
h264pred: fix one more aliasing violation.
doc/APIchanges: fill in missing hashes and dates.
flacenc: use proper initializers for AVOption default values.
lavc: deprecate named constants for deprecated antialias_algo.
aac: workaround for compilation on cygwin
swscale: extend YUV422p support to 10bits depth
tiff: add support for inverted FillOrder for uncompressed data
Remove unused softfloat implementation.
h264pred: fix aliasing violations.
rotozoom: Eliminate French variable name.
rotozoom: Check return value of fread().
rotozoom: Return an error value instead of calling exit().
rotozoom: Make init_demo() return int and check for errors on invocation.
rotozoom: Drop silly UINT8 typedef.
rotozoom: Drop some unnecessary parentheses.
rotozoom: K&R coding style cosmetics
rtsp: Only do keepalive using GET_PARAMETER if the server supports it
...
Conflicts:
Changelog
cmdutils.c
doc/APIchanges
doc/general.texi
ffmpeg.c
ffplay.c
libavcodec/h264pred_template.c
libavcodec/resample.c
libavutil/pixfmt.h
libavutil/softfloat.c
libavutil/softfloat.h
tests/rotozoom.c
Merged-by: Michael Niedermayer <michaelni@gmx.at>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).
Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.
Preparatory patch for high bit depth h264 decoding support.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.
Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
inline asm works for gcc-3.x also (hopefully). Should fix gcc-3.x FATE
breakage after r25254.
Originally committed as revision 25262 to svn://svn.ffmpeg.org/ffmpeg/trunk
from memory locations/offsets depending on b_idx plus constants, rather than
having gcc do this. This saves several lea calls and together saves about
10 cycles in h264_loop_filter_strength_mmx2().
Originally committed as revision 25256 to svn://svn.ffmpeg.org/ffmpeg/trunk
a pxor, or remove the instruction alltogether. Altogether, this saves 1
instruction.
Originally committed as revision 25255 to svn://svn.ffmpeg.org/ffmpeg/trunk
This has no measurable speed effect because the surrounding code doesn't
take advantage of this yet.
Originally committed as revision 25254 to svn://svn.ffmpeg.org/ffmpeg/trunk
of the d_idx variable and therefore allows for future optimizations. No speed
difference by this commit itself.
Originally committed as revision 25253 to svn://svn.ffmpeg.org/ffmpeg/trunk
inlining various constants within the loop code. 20 cycles faster on
cathedral sample.
Originally committed as revision 25252 to svn://svn.ffmpeg.org/ffmpeg/trunk