122 Commits

Author SHA1 Message Date
Michael Niedermayer
889639969b dsputil_mmx: try to fix compilation without yasm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 02:02:24 +02:00
Michael Niedermayer
976a8b2179 Merge remote-tracking branch 'qatar/master'
* qatar/master: (40 commits)
  H.264: template left MB handling
  H.264: faster fill_decode_caches
  H.264: faster write_back_*
  H.264: faster fill_filter_caches
  H.264: make filter_mb_fast support the case of unavailable top mb
  Do not include log.h in avutil.h
  Do not include pixfmt.h in avutil.h
  Do not include rational.h in avutil.h
  Do not include mathematics.h in avutil.h
  Do not include intfloat_readwrite.h in avutil.h
  Remove return statements following infinite loops without break
  RTSP: Doxygen comment cleanup
  doxygen: Escape '\' in Doxygen documentation.
  md5: cosmetics
  md5: use AV_WL32 to write result
  md5: add fate test
  md5: include correct headers
  md5: fix test program
  doxygen: Drop array size declarations from Doxygen parameter names.
  doxygen: Fix parameter names to match the function prototypes.
  ...

Conflicts:
	libavcodec/x86/dsputil_mmx.c
	libavformat/flvenc.c
	libavformat/oggenc.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 00:45:21 +02:00
Daniel Kang
9bfa5363da H.264: Add x86 assembly for 10-bit H.264 qpel functions.
Mainly ported from 8-bit H.264 qpel.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 07:43:38 -07:00
Michael Niedermayer
3074f03a07 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  get_bits: remove x86 inline asm in A32 bitstream reader
  doc: Remove outdated information about our issue tracker
  avidec: Factor out the sync fucntionality.
  fate-aac: Expand coverage.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
  ac3dsp: simplify extract_exponents() now that it does not need to do clipping.
  ac3enc: clip coefficients after MDCT.
  ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
  swscale: for >8bit scaling, read in native bit-depth.
  matroskadec: matroska_read_seek after after EBML_STOP leads to failure.
  doxygen: fix usage of @file directive in libavutil/{dict,file}.h
  doxygen: Help doxygen parser to understand the DECLARE_ALIGNED and offsetof macros

Conflicts:
	doc/issue_tracker.txt
	libavformat/avidec.c
	libavutil/dict.h
	libswscale/swscale.c
	libswscale/utils.c
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-02 03:24:32 +02:00
Justin Ruggles
6054cd25b4 ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions. 2011-07-01 13:02:11 -04:00
Michael Niedermayer
bb9d5171a7 Merge remote-tracking branch 'qatar/master'
* qatar/master: (21 commits)
  swscale: Add Doxygen for hyscale_fast/hScale.
  fate: enable lavfi-pixmt tests on big endian systems
  PPC: swscale: disable altivec functions for unsupported formats
  fate: merge identical pixdesc_be/le tests
  swscale: Add Doxygen for yuv2planar*/yuv2packed* functions.
  build: call texi2pod.pl with full path instead of symlink
  build: include sub-makefiles using full path instead of symlinks
  swscale: update big endian reference values after dff5a835.
  wavpack: skip blocks with no samples
  cosmetics: remove outdated comment that is no longer true
  build: replace some addprefix/addsuffix with substitution refs
  avutil: Remove unused arbitrary precision integer code.
  configure: Drop check for availability of ten assembler operands.
  aacenc: Save channel configuration for later use.
  aacenc: Fix codebook trellising for zeroed bands.
  swscale: change prototypes of scaled YUV output functions.
  swscale: re-add support for non-native endianness.
  swscale: disentangle yuv2rgbX_c_full() into small functions.
  swscale: split yuv2packed[12X]_c() remainders into small functions.
  swscale: split yuv2packedX_altivec in smaller functions.
  ...

Conflicts:
	Makefile
	configure
	libavcodec/x86/dsputil_mmx.c
	libavfilter/Makefile
	libavformat/Makefile
	libavutil/integer.c
	libavutil/integer.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixdesc_le
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-29 05:23:12 +02:00
Diego Biurrun
d2ee495fb2 configure: Drop check for availability of ten assembler operands.
This was done to support gcc 2.95, which is an old legacy compiler
that fails to compile the current codebase anyway.
2011-06-28 13:14:37 +02:00
Michael Niedermayer
83f9bc8aee Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavf: prevent crash in av_open_input_file() if ap == NULL.
  more Changelog additions
  lavf: add a forgotten NULL check in convert_format_parameters().
  Fix build if yasm is not available.
  H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.

Conflicts:
	Changelog

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-19 04:02:06 +02:00
Ronald S. Bultje
ed63f527f2 Fix build if yasm is not available. 2011-06-18 08:34:14 -04:00
Daniel Kang
f188a1e0ca H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.
Mainly ported from 8-bit H.264 MC Chroma.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-18 07:52:19 -04:00
Michael Niedermayer
c137fdd778 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: remove misplaced comment.
  ffmpeg: fix streaming to ffserver.
  swscale: split out RGB48 output functions from yuv2packed[12X]_c().
  build: move vpath directives to main Makefile
  swscale: fix JPEG-range YUV scaling artifacts.
  build: move ALLFFLIBS to a more logical place
  ARM: factor some repetitive code into macros
  Fix SVQ3 after adding 4:4:4 H.264 support
  H.264: fix CODEC_FLAG_GRAY
  4:4:4 H.264 decoding support
  ac3enc: fix allocation of floating point samples.

Conflicts:
	ffmpeg.c
	libavcodec/dsputil_template.c
	libavcodec/h264.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libswscale/swscale.c
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-15 02:15:25 +02:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Michael Niedermayer
612122b187 Merge remote branch 'qatar/master'
* qatar/master: (32 commits)
  10-bit H.264 x86 chroma v loopfilter asm
  Port SMPTE S302M audio decoder from FFmbc 0.3. [Copyright headers corrected]
  Fix crash of interlaced MPEG2 decoding
  h264pred: fix one more aliasing violation.
  doc/APIchanges: fill in missing hashes and dates.
  flacenc: use proper initializers for AVOption default values.
  lavc: deprecate named constants for deprecated antialias_algo.
  aac: workaround for compilation on cygwin
  swscale: extend YUV422p support to 10bits depth
  tiff: add support for inverted FillOrder for uncompressed data
  Remove unused softfloat implementation.
  h264pred: fix aliasing violations.
  rotozoom: Eliminate French variable name.
  rotozoom: Check return value of fread().
  rotozoom: Return an error value instead of calling exit().
  rotozoom: Make init_demo() return int and check for errors on invocation.
  rotozoom: Drop silly UINT8 typedef.
  rotozoom: Drop some unnecessary parentheses.
  rotozoom: K&R coding style cosmetics
  rtsp: Only do keepalive using GET_PARAMETER if the server supports it
  ...

Conflicts:
	Changelog
	cmdutils.c
	doc/APIchanges
	doc/general.texi
	ffmpeg.c
	ffplay.c
	libavcodec/h264pred_template.c
	libavcodec/resample.c
	libavutil/pixfmt.h
	libavutil/softfloat.c
	libavutil/softfloat.h
	tests/rotozoom.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-12 04:51:24 +02:00
Michael Niedermayer
59eb12faff Merge remote branch 'qatar/master'
* qatar/master: (30 commits)
  AVOptions: make default_val a union, as proposed in AVOption2.
  arm/h264pred: add missing argument type.
  h264dsp_mmx: place bracket outside #if/#endif block.
  lavf/utils: fix ff_interleave_compare_dts corner case.
  fate: add 10-bit H264 tests.
  h264: do not print "too many references" warning for intra-only.
  Enable decoding of high bit depth h264.
  Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
  Add support for higher QP values in h264.
  Add the notion of pixel size in h264 related functions.
  Make the h264 loop filter bit depth aware.
  Template dsputil_template.c with respect to pixel size, etc.
  Template h264idct_template.c with respect to pixel size, etc.
  Preparatory patch for high bit depth h264 decoding support.
  Move some functions in dsputil.c into a new file dsputil_template.c.
  Move the functions in h264idct into a new file h264idct_template.c.
  Move the functions in h264pred.c into a new file h264pred_template.c.
  Preparatory patch for high bit depth h264 decoding support.
  Add pixel formats for 9- and 10-bit yuv420p.
  Choose h264 chroma dc dequant function dynamically.
  ...

Conflicts:
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/alpha/dsputil_alpha.c
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/arm/dsputil_init_neon.c
	libavcodec/arm/dsputil_iwmmxt.c
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/bfin/dsputil_bfin.c
	libavcodec/dsputil.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_ps.c
	libavcodec/h264_refs.c
	libavcodec/h264dsp.c
	libavcodec/h264idct.c
	libavcodec/h264pred.c
	libavcodec/mlib/dsputil_mlib.c
	libavcodec/options.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/h264_altivec.c
	libavcodec/ps2/dsputil_mmi.c
	libavcodec/sh4/dsputil_align.c
	libavcodec/sh4/dsputil_sh4.c
	libavcodec/sparc/dsputil_vis.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/dsputil_mmx.c
	libavformat/options.c
	libavformat/utils.c
	libavutil/pixfmt.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_template.c
	tests/ref/seek/lavf_avi

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-11 05:47:02 +02:00
Jason Garrett-Glaser
9f3d6ca4f1 Port x86 10-bit H.264 deblock asm from x264 2011-05-10 20:02:15 -07:00
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Baptiste Coudurier
6d4c49a2af Move png mmx functions into x86/png_mmx.c, remove them from DSPContext.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 20:08:09 +02:00
Oskar Arvidsson
8dbe585641 Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Michael Niedermayer
3c8493074b Merge remote-tracking branch 'newdev/master'
* newdev/master:
  dsputil: allow to skip drawing of top/bottom edges.
  Split fate-psx-str-v3 into a video-only and audio-only test.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-27 01:40:18 +01:00
Alexander Strange
1500be13f2 dsputil: allow to skip drawing of top/bottom edges. 2011-03-26 17:45:38 -04:00
Michael Niedermayer
2fd41c9067 Merge remote-tracking branch 'newdev/master'
* newdev/master:
  avio: make udp_set_remote_url/get_local_port internal.
  asfdec: also subtract preroll when reading simple index object
  matroskaenc: remove a variable that's unused after bc17bd9.
  avio: cosmetics - nicer vertical alignment.
  Remove unnecessary icc version checks
  Disable 'attribute "foo" ignored' warnings from icc
  rtsp: Don't use a locale dependent format string
  Add xd55 codec tag for XDCAM HD422 720p25 CBR files.
  configure: get libavcodec version from new version.h header
  lavc: move the version macros to a new installed header.
  matroskaenc: simplify get_aac_sample_rates by using ff_mpeg4audio_get_config
  Do not use format string "%0.3f" for RTSP Range field.
  Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder.
  Document usage of import libraries created by dlltool
  configure: Set the correct lib target for arm/wince dlltool
  fate: simplify regression-funcs.sh
  fate: add support for multithread testing

Conflicts:
	libavformat/rtspdec.c
	libavutil/attributes.h
	libavutil/internal.h
	libavutil/mem.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-24 02:16:11 +01:00
Justin Ruggles
e6e9823488 Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Michael Niedermayer
d375c10400 Fake-Merge remote-tracking branch 'ffmpeg-mt/master' 2011-03-22 22:36:57 +01:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Ronald S. Bultje
6a717eb4aa dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
(cherry picked from commit bf6fa732459399fac215bdfa44dd39a6fb1d1e01)
2011-02-20 19:05:46 +01:00
Ronald S. Bultje
bf6fa73245 dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
2011-02-19 10:51:15 -05:00
Ronald S. Bultje
9a1ced321b dsputil: move VC1-specific stuff into VC1DSPContext.
(cherry picked from commit 12802ec0601c3bd7b9c7a2503518e28fd5e7d744)
2011-02-18 19:52:40 +01:00
Ronald S. Bultje
12802ec060 dsputil: move VC1-specific stuff into VC1DSPContext. 2011-02-17 17:35:35 -05:00
Justin Ruggles
fe2ff6d247 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit c73d99e672329c8f2df290736ffc474c360ac4ae)
2011-02-04 03:08:09 +01:00
Justin Ruggles
c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Ronald S. Bultje
baffa091af Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
(cherry picked from commit 81f2a3f4ffcc6935b8b8ada4954700b3f333ae4f)
2011-02-02 03:40:49 +01:00
Justin Ruggles
389b5bfa34 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit d19b744a36987e1dd0c3239a2e1baa1e71d07a77)
2011-02-02 03:40:49 +01:00
Justin Ruggles
a8ae4e0e7b Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 80ba1ddb58b5923b9f36a6acd542affc4ca722eb)
2011-02-02 03:40:48 +01:00
Ronald S. Bultje
81f2a3f4ff Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Justin Ruggles
d19b744a36 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:30:15 +00:00
Justin Ruggles
80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Justin Ruggles
015f9f1ad3 Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 6eabb0d3ad42b91c1b4c298718c29961f7c1653a)
2011-01-23 19:32:08 +01:00
Justin Ruggles
6eabb0d3ad Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
Mans Rullgard
ef4a65149d Replace ASMALIGN() with .p2align
This macro has unconditionally used .p2align for a long time and
serves no useful purpose.
2011-01-18 20:48:24 +00:00
Mans Rullgard
ac3c9d0169 x86: remove VLA in ac3_downmix_sse 2011-01-18 20:48:24 +00:00
Ronald S. Bultje
ec3233a855 Fix ff_pw_3 alignment.
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 23:26:34 +00:00
Jason Garrett-Glaser
19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Ronald S. Bultje
8d147f1f60 For rounding in chroma MC SSSE3, use 16-byte pw_3/4 instead of reading 8 bytes
and then using movlhps to dup it into the higher half of the register.

Originally committed as revision 26086 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-24 17:23:22 +00:00
Baptiste Coudurier
90f1f3bf00 In yadif filter, declare asm constants directly to avoid dependency on libavcodec
Originally committed as revision 25895 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-06 00:14:15 +00:00
Baptiste Coudurier
9e95999e2a 10l, add ff_pw_1 to dsputil_mmx for yadif sse2
Originally committed as revision 25881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-04 13:06:06 +00:00
İsmail Dönmez
80e33d2451 dsputil: Use explicit movzbl instead of movzx
This fixes compilation with the latest clang trunk version.

Patch by İsmail Dönmez, ismail at namtrac dot org

Originally committed as revision 25628 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-11-01 19:35:51 +00:00
Ramiro Polla
153ca56b38 xmm_clobbers: list xmm registers first in clobber list
suncc does not like the leading commas inside the macro, but it has no problem
with trailing commas.

Originally committed as revision 25615 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 18:14:48 +00:00
Ramiro Polla
5d543a3d13 dsputil_mmx: add xmm registers to clobber list
Originally committed as revision 25611 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:57:58 +00:00