12 Commits

Author SHA1 Message Date
Michael Niedermayer
cd8cb54990 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ARM: ac3dsp: optimised update_bap_counts()
  mpegaudiodec: Fix av_dlog() invocation.
  h264/10bit: add HAVE_ALIGNED_STACK checks.
  Update 8-bit H.264 IDCT function names to reflect bit-depth.
  Add IDCT functions for 10-bit H.264.
  mpegaudioenc: Fix broken av_dlog statement.
  Employ correct printf format specifiers, mostly in debug output.
  ARM: fix MUL64 inline asm for pre-armv6

Conflicts:
	libavcodec/mpegaudioenc.c
	libavformat/ape.c
	libavformat/mxfdec.c
	libavformat/r3d.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-02 05:12:10 +02:00
Daniel Kang
348493db60 Update 8-bit H.264 IDCT function names to reflect bit-depth.
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00
Michael Niedermayer
b4bcd1e2f1 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Fix compilation of iirfilter-test.
  libx264: handle closed GOP codec flag
  lavf: remove duplicate assignment in avformat_alloc_context.
  lavf: use designated initializers for AVClasses.
  flvdec: clenup debug code
  asfdec: fix possible overread on broken files.
  asfdec: do not fall back to binary/generic search
  asfdec: reindent after previous commit c7bd5ed
  asfdec: fallback to binary search internally
  mpegaudio: add _fixed suffix to some names
  Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
  dct: build dct32 as separate object files
  qdm2: include correct header for rdft

Conflicts:
	ffpresets/libx264-fast.ffpreset
	ffpresets/libx264-fast_firstpass.ffpreset
	ffpresets/libx264-faster.ffpreset
	ffpresets/libx264-faster_firstpass.ffpreset
	ffpresets/libx264-medium.ffpreset
	ffpresets/libx264-medium_firstpass.ffpreset
	ffpresets/libx264-placebo.ffpreset
	ffpresets/libx264-placebo_firstpass.ffpreset
	ffpresets/libx264-slow.ffpreset
	ffpresets/libx264-slow_firstpass.ffpreset
	ffpresets/libx264-slower.ffpreset
	ffpresets/libx264-slower_firstpass.ffpreset
	ffpresets/libx264-superfast.ffpreset
	ffpresets/libx264-superfast_firstpass.ffpreset
	ffpresets/libx264-ultrafast.ffpreset
	ffpresets/libx264-ultrafast_firstpass.ffpreset
	ffpresets/libx264-veryfast.ffpreset
	ffpresets/libx264-veryfast_firstpass.ffpreset
	ffpresets/libx264-veryslow.ffpreset
	ffpresets/libx264-veryslow_firstpass.ffpreset
	libavformat/flvdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-18 05:42:42 +02:00
Daniel Kang
d0005d347d Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
Arguments for variable size instructions are added to many macros, along
with other various changes. The x86util.asm code was ported from x264.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-17 20:44:48 +02:00
Michael Niedermayer
5a153604c9 Merge remote branch 'qatar/master'
* qatar/master:
  Fix FSF address copy paste error in some license headers.
  Add an aac sample which uses LTP to fate-aac.
DUPLICATE  [PATCH] Update pixdesc_be fate refs after adding 9/10bit YUV420P formats.
  arm: properly mark external symbol call

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil_yasm.asm
	libavcodec/x86/dsputilenc_yasm.asm
	libavcodec/x86/fft_mmx.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp_yasm.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm
	libavcodec/x86/x86util.asm
	libswscale/ppc/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-15 04:44:07 +02:00
Diego Biurrun
888fa31eca Fix FSF address copy paste error in some license headers. 2011-05-14 21:32:31 +02:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Jason Garrett-Glaser
19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Reimar Döffinger
02b424d9c8 Add d suffix to movd target register to make it work with nasm.
Originally committed as revision 25206 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-26 09:15:18 +00:00
Ronald S. Bultje
ae11291865 Unroll loop in h264_idct_add16intra_sse2(). Basically identical to r25171, this
inlines scan8[] and removes loop setup. 15% faster, 0.4% overall.

See "[PATCH] unroll loop in h264_idct_add8_sse2()" thread on ML.

Originally committed as revision 25172 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-24 14:07:23 +00:00
Ronald S. Bultje
4bca677494 Unroll loop in h264_idct_add8_sse2(). This means we can inline scan8[] in the
code directly also and remove loop setup. 20% faster in function, 0.8% overall.

See "[PATCH] unroll loop in h264_idct_add8_sse2()" thread on ML.

Originally committed as revision 25171 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-24 14:05:45 +00:00
Ronald S. Bultje
1d16a1cf99 Rename h264_idct_sse2.asm to h264_idct.asm; move inline IDCT asm from
h264dsp_mmx.c to h264_idct.asm (as yasm code). Because the loops are now
coded in asm instead of C, this is (depending on the function) up to 50%
faster for cases where gcc didn't do a great job at looping.

Since h264_idct_add8() is now faster than the manual loop setup in h264.c,
in-asm idct calling can now be enabled for chroma as well (see r16207). For
MMX, this is 5% faster. For SSE2 (which isn't done for chroma if h264.c does
the looping), this makes it up to 50% faster. Speed gain overall is ~0.5-1.0%.

Originally committed as revision 25119 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-14 13:36:26 +00:00