704 Commits

Author SHA1 Message Date
Daniel Kang
406fbd24dc H.264: Add optimizations to predict x86 assembly.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-22 14:54:33 -07:00
Michael Niedermayer
4095fa9038 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dnxhddec: optimise dnxhd_decode_dct_block()
  rtp: remove disabled code
  eac3enc: use different numbers of blocks per frame to allow higher bitrates
  dnxhd: add regression test for 10-bit
  dnxhd: 10-bit support
  dsputil: update per-arch init funcs for non-h264 high bit depth
  dsputil: template get_pixels() for different bit depths
  dsputil: create 16/32-bit dctcoef versions of some functions
  jfdctint: add 10-bit version
  mov: add clcp type track as Subtitle stream.
  mpeg4: add Mpeg4 Profiles names.
  mpeg4: decode Level Profile for MPEG4 Part 2.
  ffprobe: display bitstream level.
  imgconvert: remove unused glue and xglue macros

Conflicts:
	libavcodec/dsputil_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-22 12:08:52 +02:00
Joseph Artsimovich
5ab21439fd dnxhd: 10-bit support
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:44:40 +01:00
Mans Rullgard
a617c6aaa3 dsputil: update per-arch init funcs for non-h264 high bit depth
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
874f1a901d dsputil: template get_pixels() for different bit depths
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Mans Rullgard
0a72533e98 jfdctint: add 10-bit version
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Michael Niedermayer
f93f6963ba Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rv30: return AVERROR(EINVAL) instead of EINVAL
  build: add -L flags before existing LDFLAGS
  simple_idct: whitespace cosmetics
  simple_idct: make repeated code a macro
  dsputil: remove huge #if 0 block
  simple_idct: change 10-bit add/put stride from pixels to bytes
  dsputil: allow 9/10-bit functions for non-h264 codecs
  dnxhd: rename some data tables
  dnxhdenc: remove inline from function only called through pointer
  dnxhdenc: whitespace cosmetics
  swscale: mark YUV422P10(LE,BE) as supported for output
  configure: add -xc99 to LDFLAGS for Sun CC
  Remove unused and non-compiling vestigial g729 decoder
  Remove unused code under G729_BITEXACT #ifdef.
  mpegvideo: fix invalid picture unreferencing.
  dsputil: Remove extra blank line at end.
  dsputil: Replace a LONG_MAX check with HAVE_FAST_64BIT.
  simple_idct: add 10-bit version

Conflicts:
	Makefile
	libavcodec/g729data.h
	libavcodec/g729dec.c
	libavcodec/rv30.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-21 16:28:53 +02:00
Mans Rullgard
e7a972e113 simple_idct: add 10-bit version
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-20 17:49:48 +01:00
Michael Niedermayer
3c3daf4d19 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vf_libopencv: replace opencv/cxtypes.h #include by opencv/cxcore.h
  dsputil: remove disabled code
  tta: remove disabled code
  gxfenc: place variable declarations before statements
  x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
  random_seed: use proper #includes

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-18 16:43:46 +02:00
Diego Biurrun
65083b4911 dsputil: remove disabled code
2011-07-18 11:48:35 +02:00
Martin Storsjö
8f62ef0f95 x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
Signed-off-by: Martin Storsjö <martin@martin.st>
2011-07-18 00:10:45 +03:00
Michael Niedermayer
78accb876c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ffmpeg: fix some indentation
  ffmpeg: fix operation with --disable-avfilter
  simple_idct: remove disabled code
  motion_est: remove disabled code
  vc1: remove disabled code
  fate: separate lavf-mxf_d10 test from lavf-mxf
  cabac: Move code only used in the cabac test program to cabac.c.
  ffplay: warn that -pix_fmt is no longer working, suggest alternative
  ffplay: warn that -s is no longer working, suggest alternative
  lavf: rename enc variable in utils.c:has_codec_parameters()
  lavf: use designated initialisers for all (de)muxers.
  wav: remove a use of deprecated AV_METADATA_ macro
  rmdec: remove useless ap parameter from rm_read_header_old()
  dct-test: remove write-only variable
  des: fix #if conditional around P_shuffle
  Use LOCAL_ALIGNED in ff_check_alignment()

Conflicts:
	ffmpeg.c
	libavformat/avidec.c
	libavformat/matroskaenc.c
	libavformat/mp3enc.c
	libavformat/oggenc.c
	libavformat/utils.c
	tests/ref/lavf/mxf

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-17 20:12:02 +02:00
Diego Biurrun
e0ae2174db simple_idct: remove disabled code
2011-07-17 17:32:37 +02:00
Michael Niedermayer
5dc6bd86f0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  APIchanges: fill in missing hashes and dates.
  Add an APIChanges entry and bump minor versions for recent changes.
  ffmpeg: print the low bitrate warning after the codec is openend.
  doxygen: Move function documentation into the macro generating the function.
  doxygen: Make sure parameter names match between .c and .h files.
  h264: move fill_decode_neighbors()/fill_decode_caches() to h264_mvpred.h
  H.264: Add more x86 assembly for 10-bit H.264 predict functions
  lavf: fix invalid reads in avformat_find_stream_info()
  cmdutils: replace opt_default with opt_default2() and remove set_context_opts
  ffmpeg: use new avcodec_open2 and avformat_find_stream_info API.
  ffplay: use new avcodec_open2 and avformat_find_stream_info API.
  cmdutils: store all codec options in one dict instead of video/audio/sub
  ffmpeg: check experimental flag after codec is opened.
  ffmpeg: do not set GLOBAL_HEADER flag in the options context

Conflicts:
	cmdutils.c
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/version.h
	libavformat/version.h
	libswscale/swscale_unscaled.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-14 20:44:58 +02:00
Daniel Kang
ac4a85f476 H.264: Add more x86 assembly for 10-bit H.264 predict functions
Mainly ported from 8-bit H.264 predict.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-13 18:44:51 -07:00
Michael Niedermayer
e10979ff56 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  changelog: misc typo and wording fixes
  H.264: add filter_mb_fast support for >8-bit decoding
  doc: Remove outdated comments about gcc 2.95 and gcc 3.3 support.
  lls: use av_lfg instead of rand() in test program
  build: remove unnecessary dependency on libs from 'all' target
  H.264: avoid redundant alpha/beta calculations in loopfilter
  H.264: optimize intra/inter loopfilter decision
  mpegts: fix Continuity Counter error detection
  build: remove unnecessary FFLDFLAGS variable
  vp8/mt: flush worker thread, not application thread context, on seek.
  mt: proper locking around release_buffer calls.
  DxVA2: unbreak build after [657ccb5ac7]
  hwaccel: unbreak build
  Eliminate FF_COMMON_FRAME macro.

Conflicts:
	Changelog
	Makefile
	doc/developer.texi
	libavcodec/avcodec.h
	libavcodec/h264.c
	libavcodec/mpeg4videodec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-12 01:42:32 +02:00
Jason Garrett-Glaser
b5bbc84fe2 H.264: add filter_mb_fast support for >8-bit decoding
Much faster high bit depth deblocking.
2011-07-11 14:58:50 -07:00
Michael Niedermayer
3602ad7ee6 Merge commit '142e76f1055de5dde44696e71a5f63f2cb11dedf'
* commit '142e76f1055de5dde44696e71a5f63f2cb11dedf':
  swscale: fix crash with dithering due incorrect offset calculation.
  matroskadec: fix stupid typo (!= -> ==)
  build: remove duplicates from order-only directory prerequisite list
  build: rework rules for things in the tools dir
  configure: fix --cpu=host with gcc 4.6
  ARM: use const macro to define constant data in asm
  bitdepth: simplify FUNC/FUNCC macros
  dsputil: remove ff_emulated_edge_mc macro used in one place
  9/10-bit: simplify clipping macros
  matroskadec: reindent
  matroskadec: defer parsing of cues element until we seek.
  lavc: add support for codec-specific defaults.
  lavc: make avcodec_alloc_context3 officially public.
  lavc: remove a half-working attempt at different defaults for audio/video codecs.
  ac3dec: add a drc_scale private option
  lavf: add avformat_find_stream_info()
  lavc: introduce avcodec_open2() as a replacement for avcodec_open().

Conflicts:
	Makefile
	libavcodec/utils.c
	libavformat/avformat.h
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-11 04:23:25 +02:00
Mans Rullgard
710b8df949 dsputil: remove ff_emulated_edge_mc macro used in one place
This macro can cause problems in conjunction with the bitdepth
template expansion.  It was presumably added to keep source
compatibility when high bitdepth support was added.  However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-10 17:55:58 +01:00
Michael Niedermayer
2f56a97f24 Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  H.264: fix filter_mb_fast with 4:4:4 + 8x8dct
  alsa: limit buffer_size to 32768 frames.
  alsa: fallback to buffer_size/4 for period_size.
  doc: replace @pxref by @ref where appropriate
  mpeg1video: don't abort if thread_count is too high.
  segafilm: add support for videos with cri adx adpcm
  gxf: Fix 25 fps DV material in GXF being misdetected as 50 fps
  libxvid: Add const qualifier to silence compiler warning.
  H.264: improve qp_thresh check
  H.264: use fill_rectangle in CABAC decoding
  H.264: Remove redundant hl_motion_16/8 code
  H.264: merge fill_rectangle into P-SKIP MV prediction, to match B-SKIP
  H.264: faster P-SKIP decoding
  H.264: av_always_inline some more functions
  H.264: Add x86 assembly for 10-bit H.264 predict functions
  swscale: rename uv_off/uv_off2 to uv_off_px/byte.
  swscale: implement error dithering in planarCopyWrapper.
  swscale: error dithering for 16/9/10-bit to 8-bit.
  swscale: fix overflow in 16-bit vertical scaling.
  swscale: fix crash in 8-bpc bilinear output without alpha.
  ...

Conflicts:
	doc/developer.texi
	libavdevice/alsa-audio.h
	libavformat/gxf.c
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_unscaled.c
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_crop
	tests/ref/lavfi/pixfmts_hflip
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-10 04:28:50 +02:00
Daniel Kang
c0483d0c7a H.264: Add x86 assembly for 10-bit H.264 predict functions
Mainly ported from 8-bit H.264 predict.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-08 15:59:29 -07:00
Michael Niedermayer
5d4fd1d1ad Merge remote-tracking branch 'qatar/master'
* qatar/master: (36 commits)
  ARM: allow unaligned buffer in fixed-point NEON FFT4
  fate: test more FFT etc sizes
  dca: set AVCodecContext frame_size for DTS audio
  YASM: Shut up unused variable compiler warning with --disable-yasm.
  x86_32: Fix build on x86_32 with --disable-yasm.
  iirfilter: add fate test
  doxygen: Add qmul docs.
  ogg: propagate return values and return more meaningful error values
  H.264: fix overreads of qscale_table
  Remove unused static tables and static inline functions.
  eval: clear Parser instances before using
  dct-test: remove 'ref' function pointer from tables
  build: Remove deleted 'check' target from .PHONY list.
  oggdec: Abort Ogg header parsing when encountering a data packet.
  Add LGPL license boilerplate to files lacking it.
  mxfenc: small typo fix
  doxygen: Fix documentation for some VP8 functions.
  sha: use AV_RB32() instead of assuming buffer can be cast to uint32_t*
  des: allow unaligned input and output buffers
  aes: allow unaligned input and output buffers
  ...

Conflicts:
	libavcodec/dct-test.c
	libavcodec/libvpxenc.c
	libavcodec/x86/dsputil_mmx.c
	libavcodec/x86/h264_qpel_mmx.c
	libavfilter/x86/gradfun.c
	libavformat/oggdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-05 02:26:17 +02:00
Daniel Kang
3c7c16fde3 YASM: Shut up unused variable compiler warning with --disable-yasm.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-07-04 18:49:09 +02:00
Daniel Kang
567a32b5b2 x86_32: Fix build on x86_32 with --disable-yasm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-04 08:47:09 -07:00
Daniel Kang
58f7aad051 Fix build with --disable-yasm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 22:56:09 -07:00
Michael Niedermayer
145293b335 h264_qpel_mmx: add another forgotten have_yasm
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 03:05:24 +02:00
Michael Niedermayer
889639969b dsputil_mmx: try to fix compilation without yasm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 02:02:24 +02:00
Michael Niedermayer
976a8b2179 Merge remote-tracking branch 'qatar/master'
* qatar/master: (40 commits)
  H.264: template left MB handling
  H.264: faster fill_decode_caches
  H.264: faster write_back_*
  H.264: faster fill_filter_caches
  H.264: make filter_mb_fast support the case of unavailable top mb
  Do not include log.h in avutil.h
  Do not include pixfmt.h in avutil.h
  Do not include rational.h in avutil.h
  Do not include mathematics.h in avutil.h
  Do not include intfloat_readwrite.h in avutil.h
  Remove return statements following infinite loops without break
  RTSP: Doxygen comment cleanup
  doxygen: Escape '\' in Doxygen documentation.
  md5: cosmetics
  md5: use AV_WL32 to write result
  md5: add fate test
  md5: include correct headers
  md5: fix test program
  doxygen: Drop array size declarations from Doxygen parameter names.
  doxygen: Fix parameter names to match the function prototypes.
  ...

Conflicts:
	libavcodec/x86/dsputil_mmx.c
	libavformat/flvenc.c
	libavformat/oggenc.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 00:45:21 +02:00
Daniel Kang
9bfa5363da H.264: Add x86 assembly for 10-bit H.264 qpel functions.
Mainly ported from 8-bit H.264 qpel.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 07:43:38 -07:00
Michael Niedermayer
3074f03a07 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  get_bits: remove x86 inline asm in A32 bitstream reader
  doc: Remove outdated information about our issue tracker
  avidec: Factor out the sync fucntionality.
  fate-aac: Expand coverage.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
  ac3dsp: simplify extract_exponents() now that it does not need to do clipping.
  ac3enc: clip coefficients after MDCT.
  ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
  swscale: for >8bit scaling, read in native bit-depth.
  matroskadec: matroska_read_seek after after EBML_STOP leads to failure.
  doxygen: fix usage of @file directive in libavutil/{dict,file}.h
  doxygen: Help doxygen parser to understand the DECLARE_ALIGNED and offsetof macros

Conflicts:
	doc/issue_tracker.txt
	libavformat/avidec.c
	libavutil/dict.h
	libswscale/swscale.c
	libswscale/utils.c
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-02 03:24:32 +02:00
Justin Ruggles
f99a5ef92e ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
2011-07-01 13:02:11 -04:00
Justin Ruggles
6054cd25b4 ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
2011-07-01 13:02:11 -04:00
Carl Eugen Hoyos
4d08dfefa9 Remove gcc 2.95.3 remnants.
2011-06-29 10:07:39 +02:00
Michael Niedermayer
bb9d5171a7 Merge remote-tracking branch 'qatar/master'
* qatar/master: (21 commits)
  swscale: Add Doxygen for hyscale_fast/hScale.
  fate: enable lavfi-pixmt tests on big endian systems
  PPC: swscale: disable altivec functions for unsupported formats
  fate: merge identical pixdesc_be/le tests
  swscale: Add Doxygen for yuv2planar*/yuv2packed* functions.
  build: call texi2pod.pl with full path instead of symlink
  build: include sub-makefiles using full path instead of symlinks
  swscale: update big endian reference values after dff5a835.
  wavpack: skip blocks with no samples
  cosmetics: remove outdated comment that is no longer true
  build: replace some addprefix/addsuffix with substitution refs
  avutil: Remove unused arbitrary precision integer code.
  configure: Drop check for availability of ten assembler operands.
  aacenc: Save channel configuration for later use.
  aacenc: Fix codebook trellising for zeroed bands.
  swscale: change prototypes of scaled YUV output functions.
  swscale: re-add support for non-native endianness.
  swscale: disentangle yuv2rgbX_c_full() into small functions.
  swscale: split yuv2packed[12X]_c() remainders into small functions.
  swscale: split yuv2packedX_altivec in smaller functions.
  ...

Conflicts:
	Makefile
	configure
	libavcodec/x86/dsputil_mmx.c
	libavfilter/Makefile
	libavformat/Makefile
	libavutil/integer.c
	libavutil/integer.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixdesc_le
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-29 05:23:12 +02:00
Diego Biurrun
d2ee495fb2 configure: Drop check for availability of ten assembler operands.
This was done to support gcc 2.95, which is an old legacy compiler
that fails to compile the current codebase anyway.
2011-06-28 13:14:37 +02:00
Reimar Döffinger
5c13b5bb39 Add operand size to add instructions.
In these cases it can't be guessed from the operands (at least
not necessarily), and it seems some clang versions refuse to
compiler it.
Fixes ticket #303.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
2011-06-26 13:29:17 +02:00
Michael Niedermayer
686959e87e Merge remote-tracking branch 'qatar/master'
* qatar/master:
  doxygen: Consistently use '@' instead of '\' for Doxygen markup.
  Use av_printf_format to check the usage of printf style functions
  Add av_printf_format, for marking printf style format strings and their parameters
  ARM: enable thumb for Cortex-M* CPUs
  nsvdec: Propagate error values instead of returning 0 in nsv_read_header().
  build: remove SRC_PATH_BARE variable
  build: move basic rules and variables to main Makefile
  build: move special targets to end of main Makefile
  lavdev: improve feedback in case of invalid frame rate/size
  vfwcap: prefer "framerate_q" over "fps" in vfw_read_header()
  v4l2: prefer "framerate_q" over "fps" in v4l2_set_parameters()
  fbdev: prefer "framerate_q" over "fps" in device context
  bktr: prefer "framerate" over "fps" for grab_read_header()
  ALSA: implement channel layout for playback.
  alsa: support unsigned variants of already supported signed formats.
  alsa: add support for more formats.
  ARM: allow building in Thumb2 mode

Conflicts:
	common.mak
	doc/APIchanges
	libavcodec/vdpau.h
	libavdevice/alsa-audio-common.c
	libavdevice/fbdev.c
	libavdevice/libdc1394.c
	libavutil/avutil.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-24 03:07:04 +02:00
Diego Biurrun
adbfc605f6 doxygen: Consistently use '@' instead of '\' for Doxygen markup.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-24 00:37:49 +02:00
Carl Eugen Hoyos
81ef892ca8 Use HAVE_TEN_OPERANDS for new decode_significance* functions.
2011-06-22 21:45:03 +02:00
Michael Niedermayer
043d2affbb Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rawdec: Fix decoding of QT WRAW files.
  configure: report optimization for size separately
  mov: Support Digital Voodoo SD 8 Bit and DTS codec identifiers.
  mov: Support R10g codec identifier.
  riff/img2: Add JPEG 2000 codec IDs.
  riff: Add DAVC fourcc.
  riff: Add M263, XVIX, MMJP, CDV5 fourccs.
  rawvideo: Support auv2 fourcc.
  swscale: Remove unused variable from ff_bfin_get_unscaled_swscale().
  h264: Fix assert that failed to compile with -DDEBUG.
  h264: Add x86 assembly for 10-bit weight/biweight H.264 functions.
  fate: remove output redirections from old regtest scripts

Conflicts:
	configure
	libavcodec/rawdec.c
	libavformat/isom.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-22 05:16:40 +02:00
Reimar Döffinger
5f654897e3 A cmp instruction with two constants is invalid, thus "g" constraint
is not correct but must be "rm" instead.

Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-21 21:41:04 +02:00
Daniel Kang
84e70ef004 h264: Add x86 assembly for 10-bit weight/biweight H.264 functions.
Mainly ported from 8-bit H.264 weight/biweight.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-21 15:24:13 +02:00
Michael Niedermayer
6cbe81999b Merge remote-tracking branch 'qatar/master'
* qatar/master: (28 commits)
  Replace usages of av_get_bits_per_sample_fmt() with av_get_bytes_per_sample().
  x86: cabac: fix register constraints for 32-bit mode
  cabac: move x86 asm to libavcodec/x86/cabac.h
  x86: h264: cast pointers to intptr_t rather than int
  x86: h264: remove hardcoded edi in decode_significance_8x8_x86()
  x86: h264: remove hardcoded esi in decode_significance[_8x8]_x86()
  x86: h264: remove hardcoded edx in decode_significance[_8x8]_x86()
  x86: h264: remove hardcoded eax in decode_significance[_8x8]_x86()
  x86: cabac: change 'a' constraint to 'r' in get_cabac_inline()
  x86: cabac: remove hardcoded esi in get_cabac_inline()
  x86: cabac: remove hardcoded edx in get_cabac_inline()
  x86: cabac: remove unused macro parameter
  x86: cabac: remove hardcoded ebx in inline asm
  x86: cabac: remove hardcoded struct offsets from inline asm
  cabac: remove inline asm under #if 0
  cabac: remove BRANCHLESS_CABAC_DECODER switch
  cabac: remove #if 0 cascade under never-set #ifdef ARCH_X86_DISABLED
  document libswscale bump
  error_resilience: skip last-MV predictor step if MVs are not available.
  error_resilience: actually add counter when adding a MV predictor.
  ...

Conflicts:
	Changelog
	libavcodec/error_resilience.c
	libavfilter/defaults.c
	libavfilter/vf_drawtext.c
	libswscale/swscale.h
	tests/ref/vsynth1/error
	tests/ref/vsynth2/error

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-21 03:38:25 +02:00
Mans Rullgard
c5ee740745 x86: cabac: fix register constraints for 32-bit mode
Some operands need to be accessed in byte mode, which restricts the
available registers in 32-bit mode.  Using the 'q' constraint selects
a suitable register.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 23:36:40 +01:00
Mans Rullgard
2143d69bdd cabac: move x86 asm to libavcodec/x86/cabac.h
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
d075e7d540 x86: h264: cast pointers to intptr_t rather than int
Only the low-order bits are used here so the type is not important,
but this avoids a compiler warning.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
3a4edb76d6 x86: h264: remove hardcoded edi in decode_significance_8x8_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
b92c1a6d26 x86: h264: remove hardcoded esi in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
3fc4e36c78 x86: h264: remove hardcoded edx in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:31 +01:00
Mans Rullgard
e4b5a204aa x86: h264: remove hardcoded eax in decode_significance[_8x8]_x86()
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
018c33838e x86: cabac: remove hardcoded ebx in inline asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Mans Rullgard
6b712acc0e x86: cabac: remove hardcoded struct offsets from inline asm
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-06-20 22:36:30 +01:00
Michael Niedermayer
83f9bc8aee Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavf: prevent crash in av_open_input_file() if ap == NULL.
  more Changelog additions
  lavf: add a forgotten NULL check in convert_format_parameters().
  Fix build if yasm is not available.
  H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.

Conflicts:
	Changelog

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-19 04:02:06 +02:00
Ronald S. Bultje
ed63f527f2 Fix build if yasm is not available.
2011-06-18 08:34:14 -04:00
Daniel Kang
f188a1e0ca H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.
Mainly ported from 8-bit H.264 MC Chroma.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-18 07:52:19 -04:00
Carl Eugen Hoyos
5fb67d8039 Fix compilation with old yasm.
2011-06-16 23:18:50 +02:00
Michael Niedermayer
c137fdd778 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: remove misplaced comment.
  ffmpeg: fix streaming to ffserver.
  swscale: split out RGB48 output functions from yuv2packed[12X]_c().
  build: move vpath directives to main Makefile
  swscale: fix JPEG-range YUV scaling artifacts.
  build: move ALLFFLIBS to a more logical place
  ARM: factor some repetitive code into macros
  Fix SVQ3 after adding 4:4:4 H.264 support
  H.264: fix CODEC_FLAG_GRAY
  4:4:4 H.264 decoding support
  ac3enc: fix allocation of floating point samples.

Conflicts:
	ffmpeg.c
	libavcodec/dsputil_template.c
	libavcodec/h264.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libswscale/swscale.c
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-15 02:15:25 +02:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Michael Niedermayer
45fb647495 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  bitstream: Properly promote av_reverse values before shifting.
  libavutil/swscale: YUV444P10/YUV444P9 support.
  H.264: Fix high bit depth explicit biweight
  h264: Fix 10-bit H.264 x86 chroma v loopfilter asm.
  Replace DEBUG_SEEK/DEBUG_SI + av_log combinations by av_dlog.
  Update copyright year for ac3enc_opts_template.c.
  adts: Adjust frame size mask to follow the specification.
  movenc: Add RTP muxer/hinter options
  movenc: Pass the RTP AVFormatContext to the SDP generation
  rtspenc: Add RTP muxer options
  rtspenc: Add an AVClass for setting muxer specific options
  rtpenc_chain: Pass the rtpflags options through to the chained muxer
  rtpenc: Declare the rtp flags private AVOptions in rtpenc.h
  sdp: Reindent after the previous commit
  rtpenc: MP4A-LATM payload support
  avoptions: Add an av_opt_flag_is_set function for inspecting flag fields
  sdp: Allow passing an AVFormatContext to the SDP generation
  mov: Fix wrong timestamp generation for fragmented movies that have time offset caused by the first edit list entry.
  mpeg12: more advanced ffmpeg mpeg2 aspect guessing code.
  swscale: split YUYV output out of yuv2packed[12X]_c().

Conflicts:
	doc/APIchanges
	libavcodec/Makefile
	libavcodec/h264dsp_template.c
	libavcodec/mpeg12.c
	libavformat/aacdec.c
	libavformat/avidec.c
	libavformat/internal.h
	libavformat/movenc.c
	libavformat/rtpenc.c
	libavformat/rtpenc_latm.c
	libavformat/sdp.c
	libavformat/version.h
	libavutil/avutil.h
	libavutil/pixfmt.h
	libswscale/swscale.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-11 03:51:36 +02:00
Oskar Arvidsson
6c031a3338 h264: Fix 10-bit H.264 x86 chroma v loopfilter asm.
The tc variable was not splatted correctly.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-10 14:44:57 -04:00
Michael Niedermayer
d552f616a2 Merge remote-tracking branch 'qatar/master'
* qatar/master: (28 commits)
  Remove some non-compiling debug messages.
  ffplay: Fix non-compiling debug printf and replace it by av_dlog.
  H264: x86 predict init cosmetics.
  ac3enc: Fix linking of AC-3 encoder without the E-AC-3 encoder.
  Move E-AC-3 encoder functions to a separate eac3enc.c file.
  ac3enc: remove convenience macro, #define DEBUG
  ac3enc: remove unused #define
  vc1: re-initialize tables after width/height change.
  APIchanges: fill-in git commit hash for av_get_bytes_per_sample() addition
  samplefmt: add av_get_bytes_per_sample()
  iirfilter: fix biquad filter coefficients.
  swscale: remove duplicate conversion routine in swScale().
  swscale: add yuv2planar/packed function typedefs.
  swscale: integrate yuv2nv12X_C into yuv2yuvX() function pointers.
  swscale: reindent x86 init code.
  swscale: extract SWS_FULL_CHR_H_INT conditional into init code.
  swscale: cosmetics.
  swscale: remove alp/chr/lumSrcOffset.
  swscale: un-special-case yuv2yuvX16_c().
  shorten: Remove stray DEBUG #define and corresponding av_dlog statement.
  ...

Conflicts:
	doc/APIchanges
	libavcodec/ac3enc.c
	libavutil/avutil.h
	libavutil/samplefmt.c
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-08 05:25:28 +02:00
Daniel Kang
4de83b7b6d H264: x86 predict init cosmetics.
Change indentation and whitespace; also move HAVE_YASM blocks.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-08 00:22:52 +02:00
Michael Niedermayer
f9569249c2 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Remove some unused scripts from tools/.
  Add x86 assembly for some 10-bit H.264 intra predict functions.
  v4l2: do not force NTSC as standard
  Skip tableprint.h during 'make checkheaders'.
  Remove unnecessary LIBAVFORMAT_BUILD #ifdef.
  Drop explicit filenames from @file Doxygen tags.
  Skip generated table headers during 'make checkheaders'.
  lavf,lavc: free avoptions in a generic way.
  AVOptions: add av_opt_free convenience function.
  tableprint: Restore mistakenly deleted common.h #include for FF_ARRAY_ELEMS.
  tiff: print log in case of unknown / unsupported tag.
  tiff: fix linesize for mono-white/black formats.
  Fix build of eval-test program
  configure: Document --enable-vaapi
  ac3enc: extract all exponents for the frame at once

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-06 03:33:58 +02:00
Daniel Kang
a8d44f9dd5 Add x86 assembly for some 10-bit H.264 intra predict functions.
Parts are inspired from the 8-bit H.264 predict code in Libav.
Other parts ported from x264 with relicensing permission from author.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-06-06 01:31:02 +02:00
Michael Niedermayer
99eb31e263 Merge remote-tracking branch 'qatar/master'
* qatar/master: (25 commits)
  Replace custom DEBUG preprocessor trickery by the standard one.
  vorbis: Remove non-compiling debug statement.
  vorbis: Remove pointless DEBUG #ifdef around debug output macros.
  cook: Remove non-compiling debug output.
  Remove pointless #ifdefs around function declarations in a header.
  Replace #ifdef + av_log() combinations by av_dlog().
  Replace custom debug output functions by av_dlog().
  cook: Remove unused debug functions.
  Remove stray extra arguments from av_dlog() invocations.
  targa: fix big-endian build
  v4l2: remove one forgotten use of AVFormatParameters.pix_fmt.
  vfwcap: add a framerate private option.
  v4l2: add a framerate private option.
  libdc1394: add a framerate private option.
  fbdev: add a framerate private option.
  bktr: add a framerate private option.
  oma: check avio_read() return value
  nutdec: remove unused variable
  Remove unused variables
  swscale: allocate larger buffer to handle altivec overreads.
  ...

Conflicts:
	ffmpeg.c
	libavcodec/dca.c
	libavcodec/dirac.c
	libavcodec/error_resilience.c
	libavcodec/h264.c
	libavcodec/mpeg12.c
	libavcodec/mpeg4videodec.c
	libavcodec/mpegvideo.c
	libavcodec/mpegvideo_enc.c
	libavcodec/pthread.c
	libavcodec/rv10.c
	libavcodec/s302m.c
	libavcodec/shorten.c
	libavcodec/truemotion2.c
	libavcodec/utils.c
	libavdevice/dv1394.c
	libavdevice/fbdev.c
	libavdevice/libdc1394.c
	libavdevice/v4l2.c
	libavformat/4xm.c
	libavformat/apetag.c
	libavformat/asfdec.c
	libavformat/avidec.c
	libavformat/mmf.c
	libavformat/mpeg.c
	libavformat/mpegenc.c
	libavformat/mpegts.c
	libavformat/oggdec.c
	libavformat/oggparseogm.c
	libavformat/rl2.c
	libavformat/rmdec.c
	libavformat/rpl.c
	libavformat/rtpdec_latm.c
	libavformat/sauce.c
	libavformat/sol.c
	libswscale/utils.c
	tests/ref/vsynth1/error
	tests/ref/vsynth2/error

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-03 05:19:30 +02:00
Loren Merritt
53be7b23e9 Cosmetic changes to h264_idct_10bit.asm.
Removes redundant dword tags and whitespace changes.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:15 -07:00
Loren Merritt
994c3550ff 2x faster h264_idct_add8_10.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-02 07:07:02 -07:00
Ronald S. Bultje
e6635a9a19 h264: remove CONFIG_GPL from x86 intra prediction code.
The authors permitted relicensing to LGPL a long time ago (Holger,
Loren and Jason).
2011-06-02 07:02:46 -07:00
Michael Niedermayer
cd8cb54990 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ARM: ac3dsp: optimised update_bap_counts()
  mpegaudiodec: Fix av_dlog() invocation.
  h264/10bit: add HAVE_ALIGNED_STACK checks.
  Update 8-bit H.264 IDCT function names to reflect bit-depth.
  Add IDCT functions for 10-bit H.264.
  mpegaudioenc: Fix broken av_dlog statement.
  Employ correct printf format specifiers, mostly in debug output.
  ARM: fix MUL64 inline asm for pre-armv6

Conflicts:
	libavcodec/mpegaudioenc.c
	libavformat/ape.c
	libavformat/mxfdec.c
	libavformat/r3d.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-02 05:12:10 +02:00
Daniel Kang
f3aa65af3a h264/10bit: add HAVE_ALIGNED_STACK checks.
Fixes regression in 836f47d34b in ICC-10.x,
since ICC<=11.0 doesn't align stack upon function calls.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-31 21:43:20 -07:00
Daniel Kang
348493db60 Update 8-bit H.264 IDCT function names to reflect bit-depth.
Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00
Daniel Kang
836f47d34b Add IDCT functions for 10-bit H.264.
Ports the majority of IDCT functions for 10-bit H.264.

Parts are inspired from 8-bit IDCT code in Libav; other parts ported from x264 with relicensing permission from author.

Signed-off-by: Ronald S. Bultje <rbultje@google.com>
2011-05-31 15:02:32 -07:00
Michael Niedermayer
b8a43bc1b5 Merge remote-tracking branch 'qatar/master' into master
* qatar/master: (27 commits)
  ac3enc: fix LOCAL_ALIGNED usage in count_mantissa_bits()
  ac3dsp: do not use the ff_* prefix when referencing ff_ac3_bap_bits.
  ac3dsp: fix loop condition in ac3_update_bap_counts_c()
  ARM: unbreak build
  ac3enc: modify mantissa bit counting to keep bap counts for all values of bap instead of just 0 to 4.
  ac3enc: split mantissa bit counting into a separate function.
  ac3enc: store per-block/channel bap pointers by reference block in a 2D array rather than in the AC3Block struct.
  get_bits: add av_unused tag to cache variable
  sws: replace all long with int.
  ARM: aacdec: fix constraints on inline asm
  ARM: remove unnecessary volatile from inline asm
  ARM: add "cc" clobbers to inline asm where needed
  ARM: improve FASTDIV asm
  ac3enc: use LOCAL_ALIGNED macro
  APIchanges: fill in git hash for av_get_pix_fmt_name (0420bd7).
  lavu: add av_get_pix_fmt_name() convenience function
  cmdutils: remove OPT_FUNC2
  swscale: fix crash in bilinear scaling.
  vpxenc: add VP8E_SET_STATIC_THRESHOLD mapping
  webm: support stereo videos in matroska/webm muxer
  ...

Conflicts:
	Changelog
	cmdutils.c
	cmdutils.h
	doc/APIchanges
	doc/muxers.texi
	ffmpeg.c
	ffplay.c
	libavcodec/ac3enc.c
	libavcodec/ac3enc_float.c
	libavcodec/avcodec.h
	libavcodec/get_bits.h
	libavcodec/libvpxenc.c
	libavcodec/version.h
	libavdevice/libdc1394.c
	libavformat/matroskaenc.c
	libavutil/avutil.h
	libswscale/rgb2rgb.c
	libswscale/swscale.c
	libswscale/swscale_template.c
	libswscale/x86/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-29 03:34:35 +02:00
Justin Ruggles
70bb747a57 ac3dsp: do not use the ff_* prefix when referencing ff_ac3_bap_bits.
this should fix the windows builds

Signed-off-by: Martin Storsjö <martin@martin.st>
2011-05-28 22:43:40 +03:00
Justin Ruggles
6ca23db9cc ac3enc: modify mantissa bit counting to keep bap counts for all values of bap
instead of just 0 to 4.

This does all the actual bit counting as a final step.
2011-05-28 12:39:28 -04:00
Michael Niedermayer
8381ab1437 Merge remote-tracking branch 'qatar/master'
* qatar/master: (29 commits)
  ARM: disable ff_vector_fmul_vfp on VFPv3 systems
  ARM: check for VFPv3
  swscale: Remove unused variables in x86 code.
  doc: Drop DJGPP section, Libav now compiles out-of-the-box on FreeDOS.
  x86: Add appropriate ifdefs around certain AVX functions.
  cmdutils: use sws_freeContext() instead of av_freep().
  swscale: delay allocation of formatConvBuffer().
  swscale: fix build with --disable-swscale-alpha.
  movenc: Deprecate the global RTP hinting flag, use a private AVOption instead
  movenc: Add an AVClass for setting muxer specific options
  swscale: fix non-bitexact yuv2yuv[X2]() MMX/MMX2 functions.
  configure: report yasm/nasm presence properly
  tcp: make connect() timeout properly
  rawdec: factor video demuxer definitions into a macro.
  rtspdec: add initial_pause private option.
  lavf: deprecate AVFormatParameters.width/height.
  tty: add video_size private option.
  rawdec: add video_size private option.
  x11grab: add video_size private option.
  x11grab: factorize returning error codes.
  ...

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-27 23:48:22 +02:00
Diego Biurrun
5e528cffcf x86: Add appropriate ifdefs around certain AVX functions.
nasm versions prior to 2.09 have trouble assembling some of our AVX code.
Protect these sections by preprocessor macros to allow compilation to pass.
2011-05-27 21:18:12 +02:00
Reimar Döffinger
7e637b70ec Fix compilation with YASM/NASM versions not supporting AVX. 2011-05-26 19:44:39 +02:00
Reimar Döffinger
384d10360b Fix register types for LOAD_AB arguments, fixes compilation with NASM. 2011-05-24 22:24:08 +02:00
Michael Niedermayer
26ed595bd0 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  configure: Add -U__STRICT_ANSI__ to CPPFLAGS on Cygwin and DOS.
  aacdec: fix typo in scalefactor clipping check
  fate: fix fate-h264-conformance-frext-pph10i4-panasonic-a crcs.
  fate: update 9/10bit refs.
  h264: Properly set coded_{width, height} when parsing H.264.
  x86 asm: Add SECTION_TEXT to dct32_sse.asm.
  Fix 9/10 bit in swscale.

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-24 04:35:08 +02:00
Dave Yeo
a10fb79070 x86 asm: Add SECTION_TEXT to dct32_sse.asm.
This fixes the following error on OS/2:
error: segment name `.text align=16' not recognized

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-23 12:47:53 +02:00
Michael Niedermayer
01a73d6cef Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ffmpeg: Don't trigger url_interrupt_cb on the first signal
  avoptions: Check the return value from av_get_number
  dct32_sse: eliminate some spills
  Fix dct32() compilation with --disable-yasm

Conflicts:
	ffmpeg.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-23 04:29:51 +02:00
Michael Niedermayer
94ea17075b dct32: Replacing libav by ffmpeg in the license header with the authors permission.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-22 21:31:27 +02:00
Loren Merritt
422b2362fc dct32_sse: eliminate some spills
125->104 cycles on penryn (x86_64 only)
2011-05-22 19:27:18 +02:00
Vitor Sessak
e6c1791b47 Fix compilation with --disable-yasm. 2011-05-22 13:41:13 +02:00
Vitor Sessak
165c7c420d Fix dct32() compilation with --disable-yasm
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-22 07:10:19 -04:00
Michael Niedermayer
bf8bb94322 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  ffmpeg: get rid of the -vglobal option.
  dct32: Add AVX implementation of 32-point DCT
  dct32: Change pass 6 permutation to allow for AVX implementation
  dct32: port SSE 32-point DCT to YASM
  multiple inclusion guard cleanup
  avio: document buffer must created with av_malloc() and friends
  avio: check AVIOContext malloc failure
  swscale: point out an alternative to sws_getContext
  svq3: Do initialization after parsing the extradata
  add changelog entries for 0.7_beta2
  mp3lame: add #include required for AV_RB32 macro.

Conflicts:
	Changelog
	libavcodec/svq3.c
	libavcodec/x86/dct32_sse.c
	libavfilter/vsrc_buffer.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-22 04:53:19 +02:00
Vitor Sessak
6204feb160 dct32: Add AVX implementation of 32-point DCT 2011-05-21 17:42:26 +02:00
Vitor Sessak
4e653b98c8 dct32: Change pass 6 permutation to allow for AVX implementation 2011-05-21 17:42:26 +02:00
Vitor Sessak
3758eb0eb9 dct32: port SSE 32-point DCT to YASM 2011-05-21 17:42:26 +02:00
Diego Biurrun
153382e1b6 multiple inclusion guard cleanup
Add missing multiple inclusion guards; clean up #endif comments;
add missing library prefixes; keep guard names consistent.
2011-05-21 13:48:10 +02:00
Michael Niedermayer
6d32bcd770 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  configure: make executable again
  LATM/AAC: Free previously initialized context on reinit.
  configure: Do not unconditionally add -Wall to host CFLAGS.
  configure: Set OS/2 objformat to a.out.
  Add support for a.out object format to assembler macros.
  fate: disable threading for encoding
  fate: add comment field
  fate: allow overriding default build and install dirs
  mpegtsenc: Add an AVClass pointer to the private data
  mpegaudio: clean up #includes
  mpegaudio: move all header parsing to mpegaudiodecheader.[ch]

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-21 05:32:03 +02:00
Dave Yeo
d69f9a4234 Add support for a.out object format to assembler macros.
This format is still used by e.g. OS/2.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-20 17:52:21 +02:00
Michael Niedermayer
80d156d7fd Merge remote-tracking branch 'qatar/master'
* qatar/master:
  qdm2: Use floating point synthesis filter.
  h264: correct border check.
  h264: fix loopfilter with threading at slice boundaries.
  Fix ff_mpa_synth_filter_fixed() prototype
  Rename costablegen.c ---> cos_tablegen.c.
  Collapse tableprint.c into tableprint.h.
  Simplify trig table rules
  Remove potentially unstable filenames from comments in generated files.
  Ignore generated tables and generated table generator programs.
  Simplify CLEANFILES make variable by using wildcards.
  Remove silly insults from avformat_version() Doxygen documentation.
  mpegaudiodsp: fix x86 and ppc makefiles
  configure: Adjust AVX assembler check.
  mpegaudio: remove unused version of SAME_HEADER_MASK
  mpegaudio: remove useless #undef at end of file
  asfdec: add missing #include for av_bswap32()
  mpegaudio: merge two #if CONFIG_FLOAT blocks
  mpegaudio: move some struct definitions from mpegaudio.h
  Move some mpegaudio functions to new mpegaudiodsp subsystem

Conflicts:
	libavcodec/h264.c
	libavcodec/x86/Makefile

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-20 05:48:22 +02:00
Mans Rullgard
0b5e44ed29 mpegaudiodsp: fix x86 and ppc makefiles
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 16:32:24 +01:00
Mans Rullgard
c4f5c2d6f4 Move some mpegaudio functions to new mpegaudiodsp subsystem
This separation allows these functions to be used in a cleaner
fashion from other codecs (e.g. qdm2) and simplifies creating
optimised versions of them.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-05-19 12:25:34 +01:00
Michael Niedermayer
3c7650a83d Merge remote-tracking branch 'qatar/master'
This early morning merge should fix --disable-yasm

* qatar/master:
  Clean up #includes in cmdutils.h.
  g729: Merge g729.h into g729dec.c.
  10l: wrap float_interleave functions in HAVE_YASM.

Conflicts:
	libavcodec/g729.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-19 13:00:31 +02:00
Michael Niedermayer
75a37b57a5 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  APIchanges: fill in date and commit for request_sample_fmt
  Add floating-point sample format support to the ac3, eac3, dca, aac, and vorbis decoders.
  Add support for request_sample_format in ffmpeg and ffplay.
  Add APIchanges entry for request_sample_fmt.
  Add request_sample_fmt field to AVCodecContext.
  Add float_interleave() to FmtConvertContext with x86-optimized versions.
  Remove unused make variable SEEK_REFFILE
  fate: remove redundant aref and vref references
  fate: remove do_ffmpeg_nocheck function
  fate: do not collect -benchmark output
  mpegaudiodec: remove decode_end() function
  fate: run aref and vref as regular tests
  mpegaudio: sanitise compute_antialias_* names
  mpeg12: add slice-threading checks to slice-threading initializers.
  h264: copy pixel_shift between slice threading contexts.
  mdec: enable frame-level multithreading.
  mdec.c: fix overread.

Conflicts:
	libavcodec/aacdec.c
	libavcodec/ac3dec.c
	libavcodec/avcodec.h
	libavcodec/dca.c
	libavcodec/h264.c
	libavcodec/mdec.c
	libavcodec/mpeg12.c
	libavcodec/options.c
	libavcodec/version.h
	libavcodec/vorbisdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-19 06:00:31 +02:00
Justin Ruggles
e98a95e779 10l: wrap float_interleave functions in HAVE_YASM.
fixes compilation with --disable-yasm
2011-05-18 20:18:08 -04:00
Justin Ruggles
32f8fb8ecf Add float_interleave() to FmtConvertContext with x86-optimized versions.
Partially based on patches by clsid2 in ffdshow-tryout.
ff_float_interleave6() x86 improvements by Loren Merritt.
2011-05-18 17:27:05 -04:00
Michael Niedermayer
b4bcd1e2f1 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Fix compilation of iirfilter-test.
  libx264: handle closed GOP codec flag
  lavf: remove duplicate assignment in avformat_alloc_context.
  lavf: use designated initializers for AVClasses.
  flvdec: clenup debug code
  asfdec: fix possible overread on broken files.
  asfdec: do not fall back to binary/generic search
  asfdec: reindent after previous commit c7bd5ed
  asfdec: fallback to binary search internally
  mpegaudio: add _fixed suffix to some names
  Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
  dct: build dct32 as separate object files
  qdm2: include correct header for rdft

Conflicts:
	ffpresets/libx264-fast.ffpreset
	ffpresets/libx264-fast_firstpass.ffpreset
	ffpresets/libx264-faster.ffpreset
	ffpresets/libx264-faster_firstpass.ffpreset
	ffpresets/libx264-medium.ffpreset
	ffpresets/libx264-medium_firstpass.ffpreset
	ffpresets/libx264-placebo.ffpreset
	ffpresets/libx264-placebo_firstpass.ffpreset
	ffpresets/libx264-slow.ffpreset
	ffpresets/libx264-slow_firstpass.ffpreset
	ffpresets/libx264-slower.ffpreset
	ffpresets/libx264-slower_firstpass.ffpreset
	ffpresets/libx264-superfast.ffpreset
	ffpresets/libx264-superfast_firstpass.ffpreset
	ffpresets/libx264-ultrafast.ffpreset
	ffpresets/libx264-ultrafast_firstpass.ffpreset
	ffpresets/libx264-veryfast.ffpreset
	ffpresets/libx264-veryfast_firstpass.ffpreset
	ffpresets/libx264-veryslow.ffpreset
	ffpresets/libx264-veryslow_firstpass.ffpreset
	libavformat/flvdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-18 05:42:42 +02:00
Daniel Kang
d0005d347d Modify x86util.asm to ease transitioning to 10-bit H.264 assembly.
Arguments for variable size instructions are added to many macros, along
with other various changes. The x86util.asm code was ported from x264.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-17 20:44:48 +02:00
Michael Niedermayer
f8ae3a2108 Merge remote branch 'qatar/master'
12 files changed, 36 insertions(+), 81 deletions(-)
yes, that's 36 new lines in 14 commits

* qatar/master:
  ffmpeg: fix -aspect cli option
  Restructure video filter implementation in ffmpeg.c.
  ffplay: remove audio_write_get_buf_size() forward declaration
  lavfi: print key-frame and picture type information in ff_dlog_ref()
  mathops: remove ancient confusing comment
  cws2fws: Improve error message wording.
  tools: Check the return value of write().
  mpegaudio: move OUT_FMT macro to mpegaudiodec.c
  mpegaudio: remove OUT_MIN/MAX macros
  Add missing #includes to mp3_header_(de)compress bsf
  dct: fix indentation
  dct: bypass table allocation for DCT_II of size 32
  h264dsp_mmx: Add #ifdefs around some mmxext functions on x86_64.
  Remove unused header mpegaudio3.h.

Conflicts:
	ffmpeg.c
	libavcodec/mpegaudio.h
	libavcodec/mpegaudio3.h
	libavfilter/avfilter.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-17 04:51:33 +02:00
Gil Pedersen
257de5fb25 h264dsp_mmx: Add #ifdefs around some mmxext functions on x86_64.
This fixes linking errors due to undefined symbols on x86_64 OS X.

Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-05-16 15:35:53 +02:00
Michael Niedermayer
5a153604c9 Merge remote branch 'qatar/master'
* qatar/master:
  Fix FSF address copy paste error in some license headers.
  Add an aac sample which uses LTP to fate-aac.
DUPLICATE  [PATCH] Update pixdesc_be fate refs after adding 9/10bit YUV420P formats.
  arm: properly mark external symbol call

Conflicts:
	libavcodec/x86/ac3dsp.asm
	libavcodec/x86/deinterlace.asm
	libavcodec/x86/dsputil_yasm.asm
	libavcodec/x86/dsputilenc_yasm.asm
	libavcodec/x86/fft_mmx.asm
	libavcodec/x86/fmtconvert.asm
	libavcodec/x86/h264_chromamc.asm
	libavcodec/x86/h264_deblock.asm
	libavcodec/x86/h264_idct.asm
	libavcodec/x86/h264_intrapred.asm
	libavcodec/x86/h264_weight.asm
	libavcodec/x86/vc1dsp_yasm.asm
	libavcodec/x86/vp3dsp.asm
	libavcodec/x86/vp56dsp.asm
	libavcodec/x86/vp8dsp.asm
	libavcodec/x86/x86util.asm
	libswscale/ppc/swscale_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-15 04:44:07 +02:00
Diego Biurrun
888fa31eca Fix FSF address copy paste error in some license headers. 2011-05-14 21:32:31 +02:00
Michael Niedermayer
612122b187 Merge remote branch 'qatar/master'
* qatar/master: (32 commits)
  10-bit H.264 x86 chroma v loopfilter asm
  Port SMPTE S302M audio decoder from FFmbc 0.3. [Copyright headers corrected]
  Fix crash of interlaced MPEG2 decoding
  h264pred: fix one more aliasing violation.
  doc/APIchanges: fill in missing hashes and dates.
  flacenc: use proper initializers for AVOption default values.
  lavc: deprecate named constants for deprecated antialias_algo.
  aac: workaround for compilation on cygwin
  swscale: extend YUV422p support to 10bits depth
  tiff: add support for inverted FillOrder for uncompressed data
  Remove unused softfloat implementation.
  h264pred: fix aliasing violations.
  rotozoom: Eliminate French variable name.
  rotozoom: Check return value of fread().
  rotozoom: Return an error value instead of calling exit().
  rotozoom: Make init_demo() return int and check for errors on invocation.
  rotozoom: Drop silly UINT8 typedef.
  rotozoom: Drop some unnecessary parentheses.
  rotozoom: K&R coding style cosmetics
  rtsp: Only do keepalive using GET_PARAMETER if the server supports it
  ...

Conflicts:
	Changelog
	cmdutils.c
	doc/APIchanges
	doc/general.texi
	ffmpeg.c
	ffplay.c
	libavcodec/h264pred_template.c
	libavcodec/resample.c
	libavutil/pixfmt.h
	libavutil/softfloat.c
	libavutil/softfloat.h
	tests/rotozoom.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-12 04:51:24 +02:00
Jason Garrett-Glaser
5705b02079 10-bit H.264 x86 chroma v loopfilter asm
Also delete some unused deblock asm macros.
2011-05-11 11:09:10 -07:00
Michael Niedermayer
59eb12faff Merge remote branch 'qatar/master'
* qatar/master: (30 commits)
  AVOptions: make default_val a union, as proposed in AVOption2.
  arm/h264pred: add missing argument type.
  h264dsp_mmx: place bracket outside #if/#endif block.
  lavf/utils: fix ff_interleave_compare_dts corner case.
  fate: add 10-bit H264 tests.
  h264: do not print "too many references" warning for intra-only.
  Enable decoding of high bit depth h264.
  Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
  Add support for higher QP values in h264.
  Add the notion of pixel size in h264 related functions.
  Make the h264 loop filter bit depth aware.
  Template dsputil_template.c with respect to pixel size, etc.
  Template h264idct_template.c with respect to pixel size, etc.
  Preparatory patch for high bit depth h264 decoding support.
  Move some functions in dsputil.c into a new file dsputil_template.c.
  Move the functions in h264idct into a new file h264idct_template.c.
  Move the functions in h264pred.c into a new file h264pred_template.c.
  Preparatory patch for high bit depth h264 decoding support.
  Add pixel formats for 9- and 10-bit yuv420p.
  Choose h264 chroma dc dequant function dynamically.
  ...

Conflicts:
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/alpha/dsputil_alpha.c
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/arm/dsputil_init_neon.c
	libavcodec/arm/dsputil_iwmmxt.c
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/bfin/dsputil_bfin.c
	libavcodec/dsputil.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_ps.c
	libavcodec/h264_refs.c
	libavcodec/h264dsp.c
	libavcodec/h264idct.c
	libavcodec/h264pred.c
	libavcodec/mlib/dsputil_mlib.c
	libavcodec/options.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/h264_altivec.c
	libavcodec/ps2/dsputil_mmi.c
	libavcodec/sh4/dsputil_align.c
	libavcodec/sh4/dsputil_sh4.c
	libavcodec/sparc/dsputil_vis.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/dsputil_mmx.c
	libavformat/options.c
	libavformat/utils.c
	libavutil/pixfmt.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_template.c
	tests/ref/seek/lavf_avi

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-11 05:47:02 +02:00
Jason Garrett-Glaser
9f3d6ca4f1 Port x86 10-bit H.264 deblock asm from x264 2011-05-10 20:02:15 -07:00
Jason Garrett-Glaser
8ad77b65b5 Update x86 H.264 deblock asm
Includes AVX versions from x264.
2011-05-10 20:01:58 -07:00
Ronald S. Bultje
86b29553f8 h264dsp_mmx: place bracket outside #if/#endif block.
Should fix compile on systems missing yasm/nasm.
2011-05-10 08:39:38 -04:00
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (e.g. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Michael Niedermayer
be315a3232 Merge remote branch 'qatar/master'
* qatar/master:
Duplicate  AMV: disable DR1 and don't override EMU_EDGE
Duplicate  lavf: inspect more frames for fps when container time base is coarse
Wrong and we have correct fix: Fix races in default av_log handler
  vorbis: Replace sized int_fast integer types with plain int/unsigned.
  Remove disabled non-optimized code variants.
NO  bswap.h: Remove disabled code.
  Remove some disabled printf debug cruft.
  Replace more disabled printf() calls by av_dlog().
NO  tests: Remove disabled code.
NO  Replace some commented-out debug printf() / av_log() messages with av_dlog().
  vorbisdec: Replace some sizeof(type) by sizeof(*variable).
NO  vf_fieldorder: Replace FFmpeg by Libav in license boilerplate.

Conflicts:
	libavcodec/h264.c
	libavcodec/vorbisdec.c
	libavutil/log.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-30 01:58:26 +02:00
Diego Biurrun
a734fa575f Remove disabled non-optimized code variants. 2011-04-29 20:01:13 +02:00
Michael Niedermayer
52a81cd0e4 Fix add_paeth_prediction_mmx for rgb48
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 20:08:37 +02:00
Michael Niedermayer
afd2371d5c merge read and and in add_paeth_prediction
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 20:08:37 +02:00
Baptiste Coudurier
6d4c49a2af Move png mmx functions into x86/png_mmx.c, remove them from DSPContext.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 20:08:09 +02:00
Michael Niedermayer
d7e5aebae7 Merge remote branch 'qatar/master'
* qatar/master: (23 commits)
  ac3enc: correct the flipped sign in the ac3_fixed encoder
  Eliminate pointless '#if 1' statements without matching '#else'.
  Add AVX FFT implementation.
  Increase alignment of av_malloc() as needed by AVX ASM.
  Update x86inc.asm from x264 to allow AVX emulation using SSE and MMX.
  mjpeg: Detect overreads in mjpeg_decode_scan() and error out.
  documentation: extend documentation for ffmpeg -aspect option
  APIChanges: update commit hashes for recent additions.
  lavc: deprecate FF_*_TYPE macros in favor of AV_PICTURE_TYPE_* enums
  aac: add headers needed for log2f()
  lavc: remove FF_API_MB_Q cruft
  lavc: remove FF_API_RATE_EMU cruft
  lavc: remove FF_API_HURRY_UP cruft
  pad: make the filter parametric
  vsrc_movie: add key_frame and pict_type.
  vsrc_movie: fix leak in request_frame()
  lavfi: add key_frame and pict_type to AVFilterBufferRefVideo.
  vsrc_buffer: add sample_aspect_ratio fields to arguments.
  lavfi: add fieldorder filter
  scale: make the filter parametric
  ...

Conflicts:
	Changelog
	doc/filters.texi
	ffmpeg.c
	libavcodec/ac3dec.h
	libavcodec/dsputil.c
	libavfilter/avfilter.h
	libavfilter/vf_scale.c
	libavfilter/vf_yadif.c
	libavfilter/vsrc_buffer.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 03:51:04 +02:00
Vitor Sessak
9d35fa520e Add AVX FFT implementation.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-04-26 18:25:24 +02:00
Vitor Sessak
33cbfa6fa3 Update x86inc.asm from x264 to allow AVX emulation using SSE and MMX.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
2011-04-26 18:18:22 +02:00
Carl Eugen Hoyos
5c0068758f Fix compilation with --disable-yasm. 2011-04-12 17:40:18 +02:00
Oskar Arvidsson
8dbe585641 Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (e.g. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth are not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Michael Niedermayer
3c8493074b Merge remote-tracking branch 'newdev/master'
* newdev/master:
  dsputil: allow to skip drawing of top/bottom edges.
  Split fate-psx-str-v3 into a video-only and audio-only test.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-27 01:40:18 +01:00
Alexander Strange
1500be13f2 dsputil: allow to skip drawing of top/bottom edges. 2011-03-26 17:45:38 -04:00
Michael Niedermayer
2fd41c9067 Merge remote-tracking branch 'newdev/master'
* newdev/master:
  avio: make udp_set_remote_url/get_local_port internal.
  asfdec: also subtract preroll when reading simple index object
  matroskaenc: remove a variable that's unused after bc17bd9.
  avio: cosmetics - nicer vertical alignment.
  Remove unnecessary icc version checks
  Disable 'attribute "foo" ignored' warnings from icc
  rtsp: Don't use a locale dependent format string
  Add xd55 codec tag for XDCAM HD422 720p25 CBR files.
  configure: get libavcodec version from new version.h header
  lavc: move the version macros to a new installed header.
  matroskaenc: simplify get_aac_sample_rates by using ff_mpeg4audio_get_config
  Do not use format string "%0.3f" for RTSP Range field.
  Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder.
  Document usage of import libraries created by dlltool
  configure: Set the correct lib target for arm/wince dlltool
  fate: simplify regression-funcs.sh
  fate: add support for multithread testing

Conflicts:
	libavformat/rtspdec.c
	libavutil/attributes.h
	libavutil/internal.h
	libavutil/mem.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-24 02:16:11 +01:00
Justin Ruggles
e6e9823488 Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Michael Niedermayer
d375c10400 Fake-Merge remote-tracking branch 'ffmpeg-mt/master' 2011-03-22 22:36:57 +01:00
Michael Niedermayer
d4a50a2100 Merge remote-tracking branch 'newdev/master'
Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-21 03:33:28 +01:00
Mans Rullgard
0aded9484d Move dct and rdft definitions to separate files
This leaves fft.h with only the core FFT and MDCT definitions
thus making it more manageable.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-20 17:15:33 +00:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Justin Ruggles
0f999cfddb ac3enc: add float_to_fixed24() with x86-optimized versions to AC3DSPContext
and use in scale_coefficients() for the floating-point AC-3 encoder.
2011-03-17 16:46:48 -04:00
Justin Ruggles
79414257e2 mathops: fix MULL() when the compiler does not inline the function.
If the function is not inlined, an immediate cannot be used for the
shift parameter, so the %cl register must be used instead in that case.

This fixes compilation for x86-32 using gcc with --disable-optimizations.
2011-03-15 20:49:37 -04:00
Justin Ruggles
aaff3b312e mathops: change "g" constraint to "rm" in x86-32 version of MUL64().
The 1-arg imul instruction cannot take an immediate argument, only a register
or memory argument.
2011-03-15 13:43:47 -04:00
Justin Ruggles
b181b8fb96 mathops: convert MULL/MULH/MUL64 to inline functions rather than macros.
This fixes unexpected name collisions that were occurring with variables
declared within the macros.
It also fixes the fate-acodec-ac3_fixed regression test on x86-32.
2011-03-15 13:43:47 -04:00
Justin Ruggles
f1efbca5e9 ac3enc: add SIMD-optimized shifting functions for use with the fixed-point AC3 encoder. 2011-03-14 08:45:31 -04:00
Mans Rullgard
a5444fee06 Add CONFIG_AC3DSP symbol to simplify makefiles
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-12 11:35:26 +00:00
Ronald S. Bultje
bf6fa73245 dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
2011-02-19 10:51:15 -05:00
Ronald S. Bultje
12802ec060 dsputil: move VC1-specific stuff into VC1DSPContext. 2011-02-17 17:35:35 -05:00
Justin Ruggles
1f004fc512 ac3dsp: Change punpckhqdq to movhlps in ac3_max_msb_abs_int16().
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-16 14:08:34 -05:00
Justin Ruggles
fbb6b49dab ac3enc: Add x86-optimized function to speed up log2_tab().
AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute
value of each element in an array of int16_t.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-13 16:49:39 -05:00
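The behaviour described for ac3_max_msb_abs_int16() can be sketched in plain C. This is an illustration under assumptions, not the code from the commit: the name `max_msb_abs_int16_sketch` is hypothetical, and the sketch assumes that OR-ing the absolute values together is acceptable because only the position of the highest set bit matters to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar sketch: OR together the absolute values of all
 * elements. The highest set bit of the result is the highest set bit
 * found in any element's absolute value. */
int max_msb_abs_int16_sketch(const int16_t *src, int len)
{
    int i, v = 0;
    assert(len >= 0);
    for (i = 0; i < len; i++) {
        int s = src[i];          /* widen first so -32768 negates safely */
        v |= s < 0 ? -s : s;
    }
    return v;
}
```

The return value is not the maximum itself, only a value whose top bit matches that of the maximum, which is enough to derive a shift or log2 amount.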
Loren Merritt
e6b1ed693a FFT: factor a shuffle out of the inner loop and merge it into fft_permute.
6% faster SSE FFT on Conroe, 2.5% on Penryn.

Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-02-13 15:36:39 +01:00
Justin Ruggles
dda3f0ef48 Add x86-optimized versions of exponent_min().
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-02-10 15:32:47 -05:00
Ronald S. Bultje
17cf7c68ed Fix ff_emu_edge_core_sse() on Win64.
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()
function calls on Win64.
2011-02-08 18:25:12 -05:00
Justin Ruggles
c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Alex Converse
770c410fbb Fix ff_imdct_calc_sse() on gcc-4.6
Gcc 4.6 only preserves the first value when using an array with an "m"
constraint.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:40:05 +00:00
Ronald S. Bultje
81f2a3f4ff Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Justin Ruggles
d19b744a36 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:30:15 +00:00
Justin Ruggles
80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Mans Rullgard
80944df720 x86: fix overflow in h264 8x8 planar prediction
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-24 23:24:28 +00:00
Justin Ruggles
6eabb0d3ad Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
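The signature change above (from dst=dst*src to dst=src0*src1) can be shown with a scalar sketch. The function name `vector_fmul_sketch` and its parameter order are assumptions made for illustration; the real DSPContext member and its SIMD versions are not reproduced here.

```c
#include <assert.h>

/* Hypothetical scalar sketch of the three-operand form:
 * dst[i] = src0[i] * src1[i]. */
void vector_fmul_sketch(float *dst, const float *src0,
                        const float *src1, int len)
{
    int i;
    assert(len >= 0);
    for (i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}
```

Passing the same buffer as `dst` and `src0` reproduces the old in-place behaviour, so the three-operand form is a strict generalisation.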
Justin Ruggles
1c189fc533 cosmetics related to LPC changes.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles
77a78e9bdc Separate window function from autocorrelation.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:59:08 +00:00
Justin Ruggles
56f8952b25 Move lpc_compute_autocorr() from DSPContext to a new struct LPCContext.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-21 19:58:59 +00:00
Ronald S. Bultje
b9c7f66e6d Fix horizontal/horizontal_up 8x8l intra prediction x86/simd functions.
The original functions did not work correctly for edge pixels, e.g.
when CODEC_FLAG_EMU_EDGE is set, leading to corrupt output in e.g. VLC.
Based on a patch by Daniel Kang <daniel d kang gmail com>.

Signed-off-by: Ronald S. Bultje <rsbultje gmail com>
2011-01-19 20:34:42 -05:00
Mans Rullgard
ef4a65149d Replace ASMALIGN() with .p2align
This macro has unconditionally used .p2align for a long time and
serves no useful purpose.
2011-01-18 20:48:24 +00:00
Mans Rullgard
ac3c9d0169 x86: remove VLA in ac3_downmix_sse
2011-01-18 20:48:24 +00:00
Janne Grunau
2c3589bfda consolidate .gitignore patterns into a single file
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-18 21:32:05 +01:00
Janne Grunau
348b8218f7 convert svn:ignore properties to .gitignore files
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
2011-01-17 15:50:14 +01:00
Ronald S. Bultje
1b3e43e4fd Fix overflow in pred16x16_plane x86 simd code. Fixes issue 2547.
Originally committed as revision 26381 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-15 22:00:44 +00:00
Ronald S. Bultje
ec3233a855 Fix ff_pw_3 alignment.
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 23:26:34 +00:00
Jason Garrett-Glaser
19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Daniel Kang
004357a11f Fix compilation on x86-32 with --disable-optimizations,
fixes issue 2127.

Patch by Daniel Kang, daniel.d.kang at gmail

Originally committed as revision 26204 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-03 11:30:04 +00:00
Daniel Kang
0790caba60 Fix invalid reads in valgrind fate, patch by Daniel Kang <daniel dot d dot
kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26177 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-31 01:29:06 +00:00
Daniel Kang
536e9b2f58 Port pred8x8l_down_left_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26162 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 23:48:44 +00:00
Daniel Kang
720ea2d5b2 Port pred4x4_down_right_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26159 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:55:51 +00:00
Daniel Kang
d0aebe23e2 Port pred4x4_vertical_right_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26158 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:52:41 +00:00
Daniel Kang
76497232ef Port pred4x4_horizontal_down_mmxext (H.264 intra prediction) from x264
(authors:Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26157 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:49:57 +00:00
Daniel Kang
e9c576a467 Port pred4x4_horizontal_up_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26156 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:42:33 +00:00
Daniel Kang
92f441ae86 Port pred4x4_vertical_left_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26155 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:35:34 +00:00
Ronald S. Bultje
e8d98764cc Merge a few superfluous CONFIG_GPL checks.
Originally committed as revision 26154 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 21:30:47 +00:00
Ronald S. Bultje
42a59278cf Whitespace cosmetics.
Originally committed as revision 26152 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:43:15 +00:00
Daniel Kang
57b1f334d1 Port pred8x8l_horizontal_down_sse2/ssse3 (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26151 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:42:15 +00:00
Daniel Kang
04cbdf3d24 Port pred8x8l_horizontal_down_mmxext (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26150 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:38:06 +00:00
Daniel Kang
98c6053cd0 Port pred8x8l_horizontal_up_mmxext/ssse3 (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26149 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:35:31 +00:00
Daniel Kang
ecc7efbbb6 Port pred8x8l_vertical_left_sse2/ssse3 (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26148 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 20:06:22 +00:00
Daniel Kang
bdd93f1b25 Port pred8x8l_vertical_right_sse2/ssse3 (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26147 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 19:54:05 +00:00
Daniel Kang
f25112fc09 Port pred8x8l_vertical_right_mmxext (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26146 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 19:46:09 +00:00
Daniel Kang
602a4cb25a Port pred8x8l_down_right_sse2/ssse3 (H.264 intra prediction) from x264
(authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot
d dot kang at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26145 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 19:19:49 +00:00
Daniel Kang
e916acbcd1 Port pred8x8l_down_right_mmxext (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang
at gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26143 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 19:12:02 +00:00
Daniel Kang
c249e66576 Port pred8x8l_down_left_sse2/ssse3 (H.264 intra prediction) from x264 (authors:
Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at
gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26142 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 19:02:50 +00:00
Daniel Kang
ee1ba9c326 Port pred8x8l_vertical_mmxext/ssse3 (H.264 intra prediction) from x264 to
FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-
Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and
Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing
for this code). Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as
part of Google's GCI 2010.

Originally committed as revision 26140 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:46:40 +00:00
Daniel Kang
04207ef353 Port pred8x8l_horizontal_mmxext/ssse3 (H.264 intra prediction) from x264 to
FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-
Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and
Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing
for this code). Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as
part of Google's GCI 2010.

Originally committed as revision 26139 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:40:53 +00:00
Daniel Kang
abab14eac0 Port pred8x8l_dc_mmx/ssse3 (H.264 intra prediction) from x264 to FFmpeg.
Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser
<darkshikari gmail com> (approves LGPL relicensing for this code) and Loren
Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for
this code). Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as
part of Google's GCI 2010.

Originally committed as revision 26138 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:33:10 +00:00
Daniel Kang
2e93fd4b5e Port pred8x8l_top_dc_mmxext/ssse3 (H.264 intra prediction) from x264 to FFmpeg.
Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser
<darkshikari gmail com> (approves LGPL relicensing for this code) and Loren
Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for
this code). Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as
part of Google's GCI 2010.

Originally committed as revision 26137 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:11:27 +00:00
Ronald S. Bultje
54a959e483 Move PRED4x4_LOWPASS up so it can be used in 8x8l predict functions while
keeping the functions ordered in the source file (i.e. cosmetics).

Originally committed as revision 26136 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:04:57 +00:00
Ronald S. Bultje
a2dfe8d18d Port pred8x8_dc_mmxext (H.264 intra prediction) from x264 to FFmpeg. Original
authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari
gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm
at u dot washington dot edu> (approves LGPL relicensing for this code). Patch
by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI
2010.

Originally committed as revision 26135 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 18:00:26 +00:00
Ronald S. Bultje
83ff3f72e5 Add missing authors to copyright headers.
Originally committed as revision 26133 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 17:45:26 +00:00
Daniel Kang
725a3f9dfb Port pred8x8_top_dc_mmxext (H.264 intra prediction) from x264 to FFmpeg.
Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser
<darkshikari gmail com> (approves LGPL relicensing for this code) and Loren
Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for
this code). Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as
part of Google's GCI 2010.

Originally committed as revision 26132 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 17:42:34 +00:00
Ronald S. Bultje
98928c83e0 Mark recently added pred4x4_down_left_mmxext as CONFIG_GPL. Although Holger
initially said he'd be OK with relicensing, he also said he wanted to have
another look at the patch, and then he went on vacation, so let's play it
safe for now. We can consider removing this again later.

Originally committed as revision 26131 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-29 17:34:00 +00:00
Daniel Kang
911b32f482 Port pred4x4_down_left_mmxext (H.264 intra prediction) from x264 to FFmpeg.
LGPL relicensing approved by original authors: Holger Lubitz <holger lubitz
org>, Jason Garrett-Glaser <darkshikari gmail com> and Loren Merritt <lorenm
at u dot washington dot edu>. Patch by Daniel Kang <daniel dot d dot kang at
gmail com>, as part of Google's GCI 2010.

Originally committed as revision 26087 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-24 22:43:07 +00:00
Ronald S. Bultje
8d147f1f60 For rounding in chroma MC SSSE3, use 16-byte pw_3/4 instead of reading 8 bytes
and then using movlhps to dup it into the higher half of the register.

Originally committed as revision 26086 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-24 17:23:22 +00:00
Baptiste Coudurier
90f1f3bf00 In yadif filter, declare asm constants directly to avoid dependency on libavcodec
Originally committed as revision 25895 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-06 00:14:15 +00:00
Baptiste Coudurier
9e95999e2a 10l, add ff_pw_1 to dsputil_mmx for yadif sse2
Originally committed as revision 25881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-04 13:06:06 +00:00
avcoder
1761272ba9 Use SECTION .text for yasm code.
Patch by avcoder, ffmpeg gmail

Originally committed as revision 25859 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-01 13:12:39 +00:00
Ramiro Polla
4f9d25ddc8 dnxhd_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25634 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-11-02 03:09:16 +00:00
İsmail Dönmez
80e33d2451 dsputil: Use explicit movzbl instead of movzx
This fixes compilation with the latest clang trunk version.

Patch by İsmail Dönmez, ismail at namtrac dot org

Originally committed as revision 25628 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-11-01 19:35:51 +00:00
Ramiro Polla
a4ece893e1 lpc_mmx: add xmm registers to clobber list
Originally committed as revision 25620 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 23:37:15 +00:00
Ramiro Polla
e5d5407e26 lpc_mmx: merge some asm blocks
These blocks depended on the compiler keeping xmm registers untouched between
them.

Originally committed as revision 25619 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 23:36:26 +00:00
Ramiro Polla
eed299b897 sad16_sse2: merge 2 asm blocks
Originally committed as revision 25617 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 21:20:20 +00:00
Ramiro Polla
153ca56b38 xmm_clobbers: list xmm registers first in clobber list
suncc does not like the leading commas inside the macro, but it has no problem
with trailing commas.

Originally committed as revision 25615 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 18:14:48 +00:00
Ramiro Polla
ba40452095 idct_sse2_xvid: only mark xmm>=8 as clobbered on x86_64
Originally committed as revision 25614 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 16:28:28 +00:00
Ramiro Polla
05c018078c motion_est_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25612 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 15:07:21 +00:00
Ramiro Polla
5d543a3d13 dsputil_mmx: add xmm registers to clobber list
Originally committed as revision 25611 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:57:58 +00:00
Ramiro Polla
e2d13c5882 cosmetics: split long line
Originally committed as revision 25610 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:46:17 +00:00
Ramiro Polla
0d729e0de2 fdct_mmx: add xmm registers to clobber list
Originally committed as revision 25609 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:45:04 +00:00
Ramiro Polla
616735eb97 idct_sse2_xvid: add xmm registers to clobber list
Originally committed as revision 25608 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:17:43 +00:00
Ramiro Polla
9943f3b91c mpegvideo_mmx: add xmm registers to clobber list
Originally committed as revision 25607 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:15:16 +00:00
Ramiro Polla
559738eff3 dsputil_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25606 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:13:53 +00:00
Ramiro Polla
51d592dbcb h264dsp: add xmm registers to clobber list
Originally committed as revision 25604 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-30 17:14:22 +00:00
Ramiro Polla
ac19f4a3e8 indent
Originally committed as revision 25598 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-28 18:31:30 +00:00
Ramiro Polla
cae05859e1 h264dsp: merge some more asm blocks
Originally committed as revision 25597 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-28 18:22:21 +00:00
Ramiro Polla
c6a908be58 dct32: mark xmm registers in clobber list in ff_dct32_float_sse()
Originally committed as revision 25569 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-25 20:29:29 +00:00
Ramiro Polla
b32c9ca9a3 h264dsp: merge some asm blocks
Some code was initializing some xmm registers in one asm block and using them
in the following block, assuming they wouldn't be changed in between blocks.

Originally committed as revision 25568 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-25 18:02:02 +00:00
Reimar Döffinger
6c2142809c Add d modifier to asm argument to fix nasm compilation.
Originally committed as revision 25397 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-07 19:18:18 +00:00
Ramiro Polla
326bf69acc fft: mark xmm registers as clobbered in ff_imdct_calc_sse
Originally committed as revision 25363 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-06 01:27:02 +00:00
Ronald S. Bultje
dd68d4db43 MMX, MMX2, SSE2 and SSSE3 optimizations for pred16x16/8x8_plane H264 intra
prediction (plus some with different rounding for svq3/rv40). Speedup (for
SSSE3) about 6-fold, 3.6% faster overall with cathedral sample.

Originally committed as revision 25361 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-05 22:06:18 +00:00
İsmail Dönmez
9276bdddca snowdsp: Explicitly state the operand sizes
Fixes compilation with clang's builtin assembler

Patch by İsmail Dönmez, ismail at namtrac dot org

Originally committed as revision 25331 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-04 13:08:13 +00:00
Ronald S. Bultje
a52ffc3f54 Move static inline function to a macro, so that constant propagation in
inline asm works for gcc-3.x also (hopefully). Should fix gcc-3.x FATE
breakage after r25254.

Originally committed as revision 25262 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 17:42:26 +00:00
Eli Friedman
329d689f75 Use sse2 variant of put_pixels16() for no_rnd also. Provides a minor speed
increase to e.g. vc1, snow and mpeg decoding.

Patch by Eli Friedman <eli dot friedman gmail com>.

Originally committed as revision 25259 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 15:34:43 +00:00
Ronald S. Bultje
cd17285e6c Merge b_idx and edge variables, and optimize the ASM to directly load variables
from memory locations/offsets depending on b_idx plus constants, rather than
having gcc do this. This saves several lea calls and together saves about
10 cycles in h264_loop_filter_strength_mmx2().

Originally committed as revision 25256 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 14:04:39 +00:00
Ronald S. Bultje
0cc8a5d088 Remove mv_mask variable. Replace the related pand -1/0 instructions by either
a pxor, or remove the instruction altogether. Altogether, this saves 1
instruction.

Originally committed as revision 25255 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 14:03:30 +00:00
Ronald S. Bultje
c0673f2cf4 Remove d_idx as a variable, and instead load it as a constant in the asm.
This has no measurable speed effect because the surrounding code doesn't
take advantage of this yet.

Originally committed as revision 25254 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 14:02:32 +00:00
Ronald S. Bultje
2c3135f6d3 Unroll inner bidir loop in h264_loop_filter_strength_mmx2(), which gets rid
of the d_idx variable and therefore allows for future optimizations. No speed
difference by this commit itself.

Originally committed as revision 25253 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 13:35:24 +00:00
Ronald S. Bultje
4b81511cab Unloop the outer loop in h264_loop_filter_strength_mmx2(), which allows
inlining various constants within the loop code. 20 cycles faster on
cathedral sample.

Originally committed as revision 25252 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 13:34:20 +00:00
Reimar Döffinger
02b424d9c8 Add d suffix to movd target register to make it work with nasm.
Originally committed as revision 25206 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-26 09:15:18 +00:00
Reimar Döffinger
dc77e985b7 Split and then simplify address generation macro.
Allows nasm to work for this code.

Originally committed as revision 25205 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-26 09:08:11 +00:00
Ronald S. Bultje
7e117771cd Remove unused variable.
Originally committed as revision 25173 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-24 15:31:46 +00:00
Ronald S. Bultje
ae11291865 Unroll loop in h264_idct_add16intra_sse2(). Basically identical to r25171, this
inlines scan8[] and removes loop setup. 15% faster, 0.4% overall.

See "[PATCH] unroll loop in h264_idct_add8_sse2()" thread on ML.

Originally committed as revision 25172 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-24 14:07:23 +00:00
Ronald S. Bultje
4bca677494 Unroll loop in h264_idct_add8_sse2(). This means we can inline scan8[] in the
code directly also and remove loop setup. 20% faster in function, 0.8% overall.

See "[PATCH] unroll loop in h264_idct_add8_sse2()" thread on ML.

Originally committed as revision 25171 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-24 14:05:45 +00:00
Måns Rullgård
c0bc8b9afb x86: disable SSE functions using stack when stack is not aligned
This fixes crashes with ICC 10.1.

Originally committed as revision 25153 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-21 17:57:21 +00:00
Måns Rullgård
f41237c9db x86: remove hack disabling sse2 h264 loop filter with 32-bit icc
Originally committed as revision 25146 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-18 20:44:32 +00:00
Ronald S. Bultje
ada65af9d1 Don't access upper 32 bits of a 32-bit int on 64-bit systems.
Originally committed as revision 25140 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-17 12:24:22 +00:00
Ronald S. Bultje
6c3d021891 Properly add HAVE_YASM around yasmified symbols. Should fix compile error
on configurations using --disable-yasm.

Originally committed as revision 25138 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-17 03:01:57 +00:00
Ronald S. Bultje
e2e341048e Move hadamard_diff{,16}_{mmx,mmx2,sse2,ssse3}() from inline asm to yasm,
which will hopefully solve the Win64/FATE failures caused by these functions.

Originally committed as revision 25137 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-17 01:56:06 +00:00
Ronald S. Bultje
d0acc2d2e9 Move sse16_sse2() from inline asm to yasm. It is one of the functions causing
Win64/FATE issues.

Originally committed as revision 25136 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-17 01:44:17 +00:00
Ronald S. Bultje
1d16a1cf99 Rename h264_idct_sse2.asm to h264_idct.asm; move inline IDCT asm from
h264dsp_mmx.c to h264_idct.asm (as yasm code). Because the loops are now
coded in asm instead of C, this is (depending on the function) up to 50%
faster for cases where gcc didn't do a great job at looping.

Since h264_idct_add8() is now faster than the manual loop setup in h264.c,
in-asm idct calling can now be enabled for chroma as well (see r16207). For
MMX, this is 5% faster. For SSE2 (which isn't done for chroma if h264.c does
the looping), this makes it up to 50% faster. Speed gain overall is ~0.5-1.0%.

Originally committed as revision 25119 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-14 13:36:26 +00:00
Jason Garrett-Glaser
8acb554aff LGPL SSE2 H.264 iDCT
This leaves no more GPL-only H.264 decoding asm code.

Approved by Loren.

Originally committed as revision 25092 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-10 02:25:12 +00:00
Stefano Sabatini
c6c98d0897 Move mm_support() from libavcodec to libavutil, make it a public
function and rename it to av_get_cpu_flags().

Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 15:07:14 +00:00
Reimar Döffinger
b1c32fb5e5 Use "d" suffix for general-purpose registers used with movd.
This increases compatibility with nasm and is also more consistent,
e.g. with h264_intrapred.asm and h264_chromamc.asm that already
do it that way.

Originally committed as revision 25042 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-05 10:10:16 +00:00
Stefano Sabatini
7160bb716b Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_
symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h.

Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-04 09:59:08 +00:00
Ronald S. Bultje
2c166c3af1 Port latest x264 deblock asm (before they moved to using NV12 as internal
format), LGPL'ed with permission from Jason and Loren. This includes mmx2
code, so remove inline asm from h264dsp_mmx.c accordingly.

Originally committed as revision 25031 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-03 16:52:46 +00:00
Eli Friedman
a10a9f5cd0 Fix typo in r25019.
Patch by Eli Friedman <eli.friedman at gmail dot com>.

Originally committed as revision 25022 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-01 23:19:36 +00:00
Ronald S. Bultje
615da9b1d9 Unscrew breakage after my last commit because of symbol prefixes.
Originally committed as revision 25020 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-01 21:10:19 +00:00
Ronald S. Bultje
a33a2562c1 Rename h264_weight_sse2.asm to h264_weight.asm; add 16x8/8x16/8x4 non-square
biweight code to sse2/ssse3; add sse2 weight code; and use that same code to
create mmx2 functions also, so that the inline asm in h264dsp_mmx.c can be
removed. OK'ed by Jason on IRC.

Originally committed as revision 25019 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-01 20:56:16 +00:00
Ronald S. Bultje
14bc1f2485 Split h264dsp_mmx.c (which was #included in dsputil_mmx.c) in h264_qpel_mmx.c,
still #included in dsputil_mmx.c and is part of DSPContext, and h264dsp_mmx.c,
which represents H264DSPContext and is now compiled on its own.

Originally committed as revision 25018 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-01 20:48:59 +00:00
Ronald S. Bultje
5929b3a651 Fix vertical align.
Originally committed as revision 25009 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-31 12:32:24 +00:00
Ronald S. Bultje
79ce0f002e Fix compilation failure if yasm is disabled (missing vp3 symbols).
Originally committed as revision 24992 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 20:30:40 +00:00
Ronald S. Bultje
de1c253bab Split intra prediction initialization (i.e. assigning of function pointers)
into its own file, it doesn't belong in h264dsp_mmx.c (much less so in
dsputil_mmx.c).

Originally committed as revision 24990 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:34:13 +00:00
Ronald S. Bultje
d0eb5a1174 Move H264 chroma MC from inline asm to yasm. This fixes VP3/5/6 and VC-1
fate failures on Win64.

Originally committed as revision 24989 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:31:04 +00:00
Ronald S. Bultje
e9f5f020c6 Move VP3 IDCT functions from inline ASM to YASM. This fixes part of the VP3/5/6
issues on Win64.

Originally committed as revision 24988 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:25:46 +00:00
Ronald S. Bultje
7e7c4b6008 Put ff_ prefix on non-static {put_signed,put,add}_pixels_clamped_mmx()
functions.

Originally committed as revision 24987 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:22:27 +00:00
Loren Merritt
19d929f9a3 cosmetics in imdct_sse
Originally committed as revision 24958 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-28 21:03:13 +00:00
Ronald S. Bultje
4eca52ed19 Fix typos when converting inline asm to yasm, fixes MMX-only fate-ea-vp61.
Originally committed as revision 24948 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-26 14:33:39 +00:00
Ronald S. Bultje
6697bc33e2 Revert r24931, it broke Win32 and some BSD compiles (yay fate).
Originally committed as revision 24934 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 20:36:35 +00:00
Ronald S. Bultje
72f642400b Mark xmm6 and xmm7 as clobbered in ff_vp3_idct_sse2(), which is contributing
to the VP6 fate failures on Win64.

Originally committed as revision 24931 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 19:57:05 +00:00
Måns Rullgård
69dad87c48 VP6: fix vp6_filter_diag4_mmx/sse on 64-bit
The stride can be negative and must be sign extended before being
used in pointer arithmetic.

Originally committed as revision 24926 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 15:41:11 +00:00
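The sign-extension issue described above can be demonstrated in C. This is a sketch with hypothetical helper names, not the VP6 assembly: it contrasts widening a negative 32-bit stride with sign extension against the zero extension that results when only the low 32 bits of a 64-bit register are written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Correct widening: the sign bit is replicated into the upper 32 bits. */
int64_t stride_sign_extended(int32_t stride)
{
    return (int64_t)stride;
}

/* Faulty widening: the upper 32 bits are zero, so a negative stride
 * becomes a large positive offset. */
uint64_t stride_zero_extended(int32_t stride)
{
    return (uint64_t)(uint32_t)stride;
}

/* Hypothetical helper: step a pixel pointer by one row, widening the
 * stride to ptrdiff_t before the pointer arithmetic. */
const uint8_t *row_step_sketch(const uint8_t *p, int32_t stride)
{
    assert(p != NULL);
    return p + (ptrdiff_t)stride;
}
```

With a stride of -16, the sign-extended offset stays -16, while the zero-extended one becomes 4294967280, which would address memory far outside the picture buffer.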
Ronald S. Bultje
89fa3504ed Move vp6_filter_diag4() x86 SIMD code from inline ASM to YASM. This should
help in fixing the Win64 fate failures.

Originally committed as revision 24922 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 13:44:16 +00:00
Ronald S. Bultje
3a0885146c Move vp6_filter_diag4() from DSPContext to VP56DSPContext.
Originally committed as revision 24921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 13:42:28 +00:00
Måns Rullgård
c0ec9918b0 Remove global mm_flags variable
Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-24 17:47:05 +00:00
Ronald S. Bultje
3611c45ab7 Mark xmm registers as clobbered in simple loopfilter. Should fix the last
two VP8-related fate failures on Win64.

Originally committed as revision 24908 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-24 16:52:27 +00:00
Alex Converse
cb4f12466b imdct/x86: Use "s->mdct_size" instead of "1 << s->mdct_bits".
It generates smaller cleaner code.

Originally committed as revision 24887 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-23 15:51:09 +00:00
Ronald S. Bultje
684d608bde Fix segfaults in VP8 SIMD code on Win64 (and FATE/win64 failures).
Originally committed as revision 24871 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-23 02:41:22 +00:00
Alex Converse
78b5c97d3e Convert ff_imdct_half_sse() to yasm.
This is to avoid split asm sections that attempt to preserve some
registers between sections.

Originally committed as revision 24869 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-22 14:39:58 +00:00
Jason Garrett-Glaser
05c04cdf54 VP5/6/8: ~7% faster arithmetic decoding
Grab from the bitstream in 16-bit chunks instead of 8-bit chunks.
TODO: grab in 32-bit chunks on 64-bit systems.

Originally committed as revision 24783 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-12 01:11:32 +00:00
Jason Garrett-Glaser
4a384de5b8 Split h264dsp and h264pred in configure.
Many H.264 derivatives, like RV40 and VP8, use the H.264 prediction functions
but not the weight/loopfilter functions.
This should reduce the size of builds with one of these derivatives but without
H.264 decoding itself.

Originally committed as revision 24741 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-07 23:10:25 +00:00
Jason Garrett-Glaser
98fe09df7b Add file missing in r24702
Originally committed as revision 24703 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 00:49:48 +00:00
Eli Friedman
c12d6955e2 H.264: SSE2/SSSE3 weighted prediction asm
Patch by Eli Friedman <eli.friedman at gmail dot com>

Originally committed as revision 24702 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 00:13:38 +00:00
Måns Rullgård
f079a64aea Move cavs dsp functions to their own struct
Originally committed as revision 24685 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 20:59:00 +00:00
Jason Garrett-Glaser
8b9b5e085f VP5/6/8: add one inline missed in r24677
Originally committed as revision 24682 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 11:21:22 +00:00
Jason Garrett-Glaser
827d43bb9d VP8: move zeroing of luma DC block into the WHT
Lets us do the zeroing in asm instead of C.
Also makes it consistent with the way the regular iDCT code does it.

Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-02 20:18:09 +00:00
Ronald S. Bultje
6341838f3c Use word-writing instead of dword-writing (with two cached but otherwise
unchanged bytes) in the horizontal simple loopfilter. This makes the filter
quite a bit faster in itself (~30 cycles less on Core1), probably mostly
because we don't need a complex 4x4 transpose, but only a simple byte
interleave. Also allows using pextrw on SSE4, which speeds up even more
(e.g. 25% faster on Core i7).

Originally committed as revision 24638 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-31 23:13:15 +00:00
Vitor Sessak
fa738b3ad1 Remove x86/mmx.h. It is not used anymore and has been deprecated for years.
Originally committed as revision 24618 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-31 16:20:45 +00:00
Vitor Sessak
de4bc44abb Convert deinterlacing MMX code to YASM
Originally committed as revision 24615 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-31 14:50:51 +00:00
Vitor Sessak
740dfe7012 Fix compilation in x86_64. I broke it with r24580.
Originally committed as revision 24582 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-29 22:45:21 +00:00
Vitor Sessak
2c3dda6838 Translate libmpeg2 MMX IDCT to plain asm
Originally committed as revision 24580 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-29 22:19:54 +00:00
Ronald S. Bultje
ab4d031889 Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster.
Originally committed as revision 24514 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 21:18:19 +00:00
Jason Garrett-Glaser
e25dee602f VP8: Much faster SSE2 MC
5-10% faster or more on Phenom, Athlon 64, and some others.
Helps some on pre-SSSE3 Intel chips as well, but not as much.

Originally committed as revision 24513 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 19:34:00 +00:00
Ronald S. Bultje
48adb7e7a4 Enable no-loop memory/register saving for ssse3/sse4 also.
Originally committed as revision 24511 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 14:07:57 +00:00
Ronald S. Bultje
2a180c69ea Save a register (or regsize of stackspace for x86-32) for the no-loop
mbedge loopfilter functions, by re-using space that holds a variable
that we no longer need.

Originally committed as revision 24510 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 14:00:15 +00:00
Ronald S. Bultje
bcd4aa6498 Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this
construct was always enabled, even for <ssse3 versions).

Originally committed as revision 24509 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 13:56:51 +00:00
Ronald S. Bultje
2208053bd3 Split pextrw macro-spaghetti into several opt-specific macros; this will make
future optimizations (imagine an sse5) much easier. Also fix a bug where
we used the direction (%2) rather than optimization (%1) to enable this, which
means it wasn't ever actually used...

Originally committed as revision 24507 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-26 13:50:59 +00:00
Ronald S. Bultje
6de5b7c6b8 Fix obvious bug in assignment. Somehow, the test vectors don't test this...
Originally committed as revision 24489 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-25 02:42:40 +00:00
Ronald S. Bultje
e3f7bf774c Fix SPLATB_REG mess. Used to be an if/elseif/elseif/elseif spaghetti, so this
splits it into small optimization-specific macros which are selected for each
DSP function. The advantage of this approach is that the sse4 functions now
use the ssse3 codepath also without needing an explicit sse4 codepath.

Originally committed as revision 24487 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-24 19:33:05 +00:00
Eli Friedman
3611e7a309 Inline asm for VP56 arith coder
This gets cmov far more reliably than trying to trick gcc into generating it,
which is useful since it's 2% faster overall.

Patch by Eli Friedman <eli.friedman at gmail>

Originally committed as revision 24471 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 21:46:30 +00:00
Jason Garrett-Glaser
3ae079a3c8 VP8: optimize DC-only chroma case in the same way as luma.
Add MMX idct_dc_add4uv function for this case.
~40% faster chroma idct.

Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 06:02:52 +00:00
Jason Garrett-Glaser
51c9156438 VP8 asm: cosmetics (spacing)
Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 03:02:56 +00:00
Jason Garrett-Glaser
8a467b2d44 VP8: 30% faster idct_mb
Take shortcuts based on statistically common situations.
Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT
blocks are common.
TODO: tie this more directly into the MB mode, since the DC-level transform is
only used for non-splitmv blocks?

Originally committed as revision 24452 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 02:58:27 +00:00
Jason Garrett-Glaser
c25c776708 VP8: clear DCT blocks in iDCT instead of using clear_blocks.
~0.3% faster overall.

Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-23 00:07:16 +00:00
Ronald S. Bultje
dc5eec8085 Use pextrw for SSE4 mbedge filter result writing, a speedup of 5-10 cycles on
CPUs supporting it.

Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 19:59:34 +00:00
Ronald S. Bultje
003243c3c2 Fix and enable horizontal >=SSE2 mbedge loopfilter.
Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 01:35:26 +00:00
Loren Merritt
c7b1d9768c relicense h264 deblock sse2 to lgpl
Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 00:39:49 +00:00
Loren Merritt
532e769701 sync yasm macros from x264
Originally committed as revision 24406 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 22:45:16 +00:00
Jason Garrett-Glaser
8731dbd890 Eliminate one instruction in VP8 dc_add_sse4
Originally committed as revision 24405 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 22:41:37 +00:00
Jason Garrett-Glaser
7dd224a42d Various VP8 x86 deblocking speedups
SSSE3 versions, improve SSE2 versions a bit.
SSE2/SSSE3 mbedge h functions are currently broken, so explicitly disable them.

Originally committed as revision 24403 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 22:11:03 +00:00
Jason Garrett-Glaser
b8b231b5dc Make mmx VP8 WHT faster
Avoid pextrw, since it's slow on many older CPUs.
Now it doesn't require mmxext either.

Originally committed as revision 24397 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 20:51:01 +00:00
David Conrad
af521abc28 Add header declarations for mmx/sse constants missing them
Originally committed as revision 24381 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 10:02:07 +00:00
David Conrad
c7eec58170 Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c
Should fix compilation with icc and should help prevent any future duplicates

Originally committed as revision 24380 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 10:02:03 +00:00