Commit Graph

257 Commits

Author SHA1 Message Date
Ronald S. Bultje
c3af52fa8b dsputil: use vertical component for drawing bottom edge.
Current code only writes 8 pixels of vertical edge for YUV422, which
causes MC artifacts when subsequent frames use data from that edge.
2012-01-25 18:06:36 +08:00
Michael Niedermayer
e462257242 Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  applehttp: Properly clean up if unable to probe a segment
  applehttp: Avoid reading uninitialized memory
  fate: Replace misleading "aac" in the name of an ADTS test with "adts".
  fate: Drop pointless "-an" from pictor test command.
  fate: split off image codec FATE tests into their own file
  fate: split off WMA codec FATE tests into their own file
  fate: split off lossless video and audio FATE tests into their own files
  fate: split off qtrle codec FATE tests into their own file
  fate: split off Ut Video codec FATE tests into their own file
  fate: split off screen codec FATE tests into their own file
  fate: split off Real Inc. codec FATE tests into their own file
  fate: split off AC-3 codec FATE tests into their own file
  mpegvideo: remove abort() in ff_find_unused_picture()
  rv40: NEON optimised loop filter strength selection
  rv40: rearrange loop filter functions
  configure: cosmetics: sort some lists where appropriate
  swscale_mmx: drop no longer required parameters from VSCALEX macros
  swscale: Mark yuv2planeX_8_mmx as MMX2; it contains MMX2 instructions.
  build: conditionally compile x86 H.264 chroma optimizations
  v410 encoder and decoder
  ...

Conflicts:
	Changelog
	configure
	doc/developer.texi
	doc/general.texi
	libavcodec/arm/asm.S
	libavcodec/avcodec.h
	libavcodec/v410dec.c
	libavcodec/v410enc.c
	libavcodec/version.h
	libavcodec/x86/Makefile
	libavcodec/x86/dsputil_mmx.c
	libswscale/x86/swscale_mmx.c
	tests/Makefile
	tests/fate2.mak

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-12-14 23:58:10 +01:00
Diego Biurrun
88b9735753 build: conditionally compile x86 H.264 chroma optimizations 2011-12-14 11:58:45 +01:00
Michael Niedermayer
0b9a69f244 Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  aacdec: Fix PS in ADTS.
  avconv: Consistently use PIX_FMT_NONE.
  dsputil: use cpuflags in x86 emu_edge_core
  dsputil: use movups instead of movdqu in ff_emu_edge_core_sse()
  wma: initialize prev_block_len_bits, next_block_len_bits, and block_len_bits.
  mov: Remove some redundant and obsolete comments.
  Add libavutil/mathematics.h #includes for INFINITY
  doxy: structure libavformat groups
  doxy: introduce an empty structure in libavcodec
  doxy: provide a start page and document libavutil
  doxy: cleanup pixfmt.h
  regtest: split video encode/decode tests into individual targets
  ARM: add explicit .arch and .fpu directives to asm.S
  pthread: do not touch has_b_frames
  avconv: cleanup the transcoding loop in output_packet().
  avconv: split subtitle transcoding out of output_packet().
  avconv: split video transcoding out of output_packet().
  avconv: split audio transcoding out of output_packet().
  avconv: reindent.
  avconv: move streamcopy-only code out of decoding loop.
  ...

Conflicts:
	avconv.c
	libavcodec/aaccoder.c
	libavcodec/pthread.c
	libavcodec/version.h
	libavutil/audioconvert.h
	libavutil/avutil.h
	libavutil/mem.h
	tests/ref/vsynth1/dv
	tests/ref/vsynth1/mpeg2thread
	tests/ref/vsynth2/dv
	tests/ref/vsynth2/mpeg2thread

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-23 04:02:17 +01:00
Justin Ruggles
395f2e70dd dsputil: use movups instead of movdqu in ff_emu_edge_core_sse()
This allows emulated_edge_mc_sse() and gmc_sse() to be used under
AV_CPU_FLAG_SSE.
2011-11-22 15:40:51 -05:00
Michael Niedermayer
29582df797 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vble: remove vble_error_close
  VBLE Decoder
  tta: use an integer instead of a pointer to iterate output samples
  shorten: do not modify samples pointer when interleaving
  mpc7: only support stereo input.
  dpcm: do not try to decode empty packets
  dpcm: remove unneeded buf_size==0 check.
  twinvq: add SSE/AVX optimized sum/difference stereo interleaving
  vqf/twinvq: pass vqf COMM chunk info in extradata
  vqf: do not set bits_per_coded_sample for TwinVQ.
  twinvq: check for allocation failure in init_mdct_win()
  swscale: add padding to conversion buffer.
  rtpdec: Simplify finalize_packet
  http: Handle proxy authentication
  http: Print an error message for Authorization Required, too
  AVOptions: don't return an invalid option when option list is empty
  AIFF: add 'twos' FourCC for the mux/demuxer (big endian PCM audio)

Conflicts:
	libavcodec/avcodec.h
	libavcodec/tta.c
	libavcodec/vble.c
	libavcodec/version.h
	libavutil/opt.c
	libswscale/utils.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-12 02:50:25 +01:00
Justin Ruggles
9d06037d48 twinvq: add SSE/AVX optimized sum/difference stereo interleaving 2011-11-11 14:13:58 -05:00
Michael Niedermayer
0bd42ae72c Merge remote-tracking branch 'qatar/master'
* qatar/master:
  avformat: Avoid a warning about mixed declarations and code
  BMV demuxer and decoder
  matroskaenc: Make sure the seekhead struct is freed even on seek failure
  mpeg12enc: Remove write-only variables.
  mpeg12enc: Don't set up run-level info for level 0.
  msmpeg4: Don't set up run-level info for level 0.
  avformat: Warn about using network functions without calling avformat_network_init
  avformat: Revise wording
  rdt: Set AVFMT_NOFILE on ff_rdt_demuxer
  rdt: Check the return value of avformat_open
  rtsp: Discard the dynamic handler, if it has an alloc function which failed
  dsputil: use cpuflags in x86 versions of vector_clip_int32()

Conflicts:
	libavcodec/avcodec.h
	libavcodec/version.h
	libavformat/Makefile
	libavformat/allformats.c
	libavformat/version.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-11-08 02:03:14 +01:00
Justin Ruggles
b8f02f5b4e dsputil: use cpuflags in x86 versions of vector_clip_int32() 2011-11-06 20:50:06 -05:00
multiple authors
5d50fcc549 DIRAC Decoder stable version, MMX support removed.
Look for MMX_DISABLED to find the disabled functions.

Authors of this code are Marco Gerards <marco@gnu.org> and David Conrad <lessen42@gmail.com>
With changes from Jordi Ortiz <nenjordi@gmail.com>

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-30 21:50:08 +01:00
David Conrad
25a6c59487 MMX put_no_rnd_pixels_l2 2011-10-30 19:06:57 +01:00
Michael Niedermayer
173715d291 Merge remote-tracking branch 'qatar/master'
* qatar/master: (35 commits)
  libopencore-amr: check output buffer size before decoding
  libopencore-amr: remove unneeded buf_size==0 check.
  libopencore-amr: remove unneeded frame_count field.
  aac_latm: remove unneeded check for zero-size packet.
  pcmdec: fix output buffer size check by calculating the actual output size prior to decoding.
  pcmdec: move codec-specific variable declarations to the corresponding codec blocks.
  pcmdec: return buf_size instead of src-buf.
  avcodec: remove the Zork PCM encoder.
  pcm_zork: use AV_SAMPLE_FMT_U8 instead of shifting all samples by 8.
  pcmenc: remove unneeded sample_fmt check.
  pcmdec: move number of channels check to pcm_decode_init()
  pcmdec: remove unnecessary check for sample_fmt change
  pcmdec: move DVD PCM bits_per_coded_sample check near to the code that sets the sample size.
  pcmdec: do not needlessly set *data_size to 0
  alacdec: remove unneeded NULL or zero-size packet checks.
  alacdec: simplify buffer allocation by using FF_ALLOC_OR_GOTO()
  alacdec: ask for a sample for unsupported sample depths.
  alacdec: cosmetics: use 'ch' instead of 'chan' to iterate channels
  alacdec: move some declarations to the top of the function
  alacdec: always use get_sbits_long() for uncompressed samples
  ...

Conflicts:
	libavcodec/pcm.c
	tests/ref/acodec/pcm

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-27 01:39:04 +02:00
Daniel Kang
ded3e9f054 H.264: Cometics to dsputil_mmx.c
Add whitespace.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-10-26 06:41:32 -07:00
Michael Niedermayer
b81f8880e0 Merge remote-tracking branch 'qatar/master'
* qatar/master: (23 commits)
  fix AC3ENC_OPT_MODE_ON/OFF
  h264: fix HRD parameters parsing
  prores: implement multithreading.
  prores: idct sse2/sse4 optimizations.
  swscale: use aligned move for storage into temporary buffer.
  prores: extract idct into its own dspcontext and merge with put_pixels.
  h264: fix invalid shifts in init_cavlc_level_tab()
  intfloat_readwrite: fix signed addition overflows
  mov: do not misreport empty stts
  mov: cosmetics, fix for and if spacing
  id3v2: fix NULL pointer dereference
  mov: read album_artist atom
  mov: fix disc/track numbers and totals
  doc: fix references to obsolete presets directories for avconv/ffmpeg
  flashsv: return more meaningful error value
  flashsv: fix typo in av_log() message
  smacker: validate channels and sample format.
  smacker: check buffer size before reading output size
  smacker: validate number of channels
  smacker: Separate audio flags from sample rates in smacker demuxer.
  ...

Conflicts:
	cmdutils.h
	doc/ffmpeg.texi
	libavcodec/Makefile
	libavcodec/motion_est_template.c
	libavformat/id3v2.c
	libavformat/mov.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-10-12 05:40:57 +02:00
Ronald S. Bultje
e3f530feca prores: idct sse2/sse4 optimizations.
~3.0-3.5x as fast as original C version, 1.6x as fast overall.
2011-10-11 07:50:48 -07:00
Michael Niedermayer
1a34478b71 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  Fix NASM include directive
  dsputil_mmx: Honor HAVE_AMD3DNOW
  lavf,lavd: remove all usage of AVFormatParameters from demuxers.
  jack: add 'channels' private option.
  VC-1: fix reading of custom PAR.
  Remove redundant and dubious video codec detection by its extradata
  mpeg12: remove repeat-field code disabled since May 2002
  patch checklist: suggest fate instead of regression tests
  Turn on resampling on sudden size change instead of bailing out during recode.
  avtools: reinitialise filter chain when input video stream changes dimensions

Conflicts:
	Makefile
	avconv.c
	doc/developer.texi
	ffplay.c
	libavcodec/x86/dsputil_mmx.c
	libavdevice/libdc1394.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-08-15 23:35:53 +02:00
Alex Converse
48f7163f13 dsputil_mmx: Honor HAVE_AMD3DNOW 2011-08-15 11:20:08 -07:00
Baptiste Coudurier
9a33078b64 dsputil_mmx: fix indention
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-08-15 00:37:19 +02:00
Michael Niedermayer
0cb233cf46 Merge commit 'b2c087871dafc7d030b2d48457ddff597dfd4925'
* commit 'b2c087871dafc7d030b2d48457ddff597dfd4925':
  Move x86util.asm from libavcodec/ to libavutil/.
  Move x86inc.asm to libavutil/.
  APIchanges: note error_recognition in lavf
  lavf: add support for error_recognition, use it in avidec, and bump minor API version
  avconv: change semantics of -map
  avconv: get rid of new* options.
  cmdutils: allow precisely specifying a stream for AVOptions.
  configure: add missing CFLAGS to fix building on the HURD
  libx264: Include hint for possible values for configuring libx264
  cmdutils: allow ':'-separated modifiers in option names.
  avconv: make -map_metadata work consistently with the other options
  avconv: remove deprecated options.
  avconv: make -map_chapters accept only the input file index.
  Make a copy of ffmpeg under a new name -- avconv.
  ffmpeg: add a warning stating that the program is deprecated.
  Add weighted motion compensation for RV40 B-frames
  RV3/4: calculate B-frame motion weights once per frame
  Move RV3/4-specific DSP functions into their own context
  mjpeg: propagate decode errors from ff_mjpeg_decode_sos and ff_mjpeg_decode_dqt
  h264: notice memory allocation failure

Conflicts:
	.gitignore
	Makefile
	cmdutils.c
	configure
	doc/ffplay.texi
	doc/ffprobe.texi
	doc/ffserver.texi
	libavcodec/libx264.c
	libavformat/avformat.h
	libavformat/avidec.c
	libavformat/version.h
	tests/lavf-regression.sh
	tests/lavfi-regression.sh

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-08-13 02:56:08 +02:00
Kostya Shishkov
d241f51e0f Move RV3/4-specific DSP functions into their own context
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-08-11 16:07:15 -07:00
Michael Niedermayer
faba79e080 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  mxfdec: Include FF_INPUT_BUFFER_PADDING_SIZE when allocating extradata.
  H.264: tweak some other x86 asm for Atom
  probe: Fix insane flow control.
  mpegts: remove invalid error check
  s302m: use nondeprecated audio sample format API
  lavc: use designated initialisers for all codecs.
  x86: cabac: add operand size suffixes missing from 6c32576

Conflicts:
	libavcodec/ac3enc_float.c
	libavcodec/flacenc.c
	libavcodec/frwu.c
	libavcodec/pictordec.c
	libavcodec/qtrleenc.c
	libavcodec/v210enc.c
	libavcodec/wmv2dec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-30 06:46:08 +02:00
Jason Garrett-Glaser
a3bf7b864a H.264: tweak some other x86 asm for Atom 2011-07-29 12:24:15 -07:00
Michael Niedermayer
4095fa9038 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  dnxhddec: optimise dnxhd_decode_dct_block()
  rtp: remove disabled code
  eac3enc: use different numbers of blocks per frame to allow higher bitrates
  dnxhd: add regression test for 10-bit
  dnxhd: 10-bit support
  dsputil: update per-arch init funcs for non-h264 high bit depth
  dsputil: template get_pixels() for different bit depths
  dsputil: create 16/32-bit dctcoef versions of some functions
  jfdctint: add 10-bit version
  mov: add clcp type track as Subtitle stream.
  mpeg4: add Mpeg4 Profiles names.
  mpeg4: decode Level Profile for MPEG4 Part 2.
  ffprobe: display bitstream level.
  imgconvert: remove unused glue and xglue macros

Conflicts:
	libavcodec/dsputil_template.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-22 12:08:52 +02:00
Mans Rullgard
a617c6aaa3 dsputil: update per-arch init funcs for non-h264 high bit depth
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-21 18:10:58 +01:00
Michael Niedermayer
f93f6963ba Merge remote-tracking branch 'qatar/master'
* qatar/master:
  rv30: return AVERROR(EINVAL) instead of EINVAL
  build: add -L flags before existing LDFLAGS
  simple_idct: whitespace cosmetics
  simple_idct: make repeated code a macro
  dsputil: remove huge #if 0 block
  simple_idct: change 10-bit add/put stride from pixels to bytes
  dsputil: allow 9/10-bit functions for non-h264 codecs
  dnxhd: rename some data tables
  dnxhdenc: remove inline from function only called through pointer
  dnxhdenc: whitespace cosmetics
  swscale: mark YUV422P10(LE,BE) as supported for output
  configure: add -xc99 to LDFLAGS for Sun CC
  Remove unused and non-compiling vestigial g729 decoder
  Remove unused code under G729_BITEXACT #ifdef.
  mpegvideo: fix invalid picture unreferencing.
  dsputil: Remove extra blank line at end.
  dsputil: Replace a LONG_MAX check with HAVE_FAST_64BIT.
  simple_idct: add 10-bit version

Conflicts:
	Makefile
	libavcodec/g729data.h
	libavcodec/g729dec.c
	libavcodec/rv30.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-21 16:28:53 +02:00
Mans Rullgard
e7a972e113 simple_idct: add 10-bit version
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-20 17:49:48 +01:00
Michael Niedermayer
3c3daf4d19 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  vf_libopencv: replace opencv/cxtypes.h #include by opencv/cxcore.h
  dsputil: remove disabled code
  tta: remove disabled code
  gxfenc: place variable declarations before statements
  x86: Use LOCAL_ALIGNED in mpegvideo_mmx_template
  random_seed: use proper #includes

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-18 16:43:46 +02:00
Diego Biurrun
65083b4911 dsputil: remove disabled code 2011-07-18 11:48:35 +02:00
Michael Niedermayer
3602ad7ee6 Merge commit '142e76f1055de5dde44696e71a5f63f2cb11dedf'
* commit '142e76f1055de5dde44696e71a5f63f2cb11dedf':
  swscale: fix crash with dithering due incorrect offset calculation.
  matroskadec: fix stupid typo (!= -> ==)
  build: remove duplicates from order-only directory prerequisite list
  build: rework rules for things in the tools dir
  configure: fix --cpu=host with gcc 4.6
  ARM: use const macro to define constant data in asm
  bitdepth: simplify FUNC/FUNCC macros
  dsputil: remove ff_emulated_edge_mc macro used in one place
  9/10-bit: simplify clipping macros
  matroskadec: reindent
  matroskadec: defer parsing of cues element until we seek.
  lavc: add support for codec-specific defaults.
  lavc: make avcodec_alloc_context3 officially public.
  lavc: remove a half-working attempt at different defaults for audio/video codecs.
  ac3dec: add a drc_scale private option
  lavf: add avformat_find_stream_info()
  lavc: introduce avcodec_open2() as a replacement for avcodec_open().

Conflicts:
	Makefile
	libavcodec/utils.c
	libavformat/avformat.h
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-11 04:23:25 +02:00
Mans Rullgard
710b8df949 dsputil: remove ff_emulated_edge_mc macro used in one place
This macro can cause problems in conjunction with the bitdepth
template expansion.  It was presumably added to keep source
compatibility when high bitdepth support was added.  However,
emulated_edge_mc is a dsputil pointer and should not be called
directly, so there is little reason to keep such a macro.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-07-10 17:55:58 +01:00
Michael Niedermayer
2f56a97f24 Merge remote-tracking branch 'qatar/master'
* qatar/master: (22 commits)
  H.264: fix filter_mb_fast with 4:4:4 + 8x8dct
  alsa: limit buffer_size to 32768 frames.
  alsa: fallback to buffer_size/4 for period_size.
  doc: replace @pxref by @ref where appropriate
  mpeg1video: don't abort if thread_count is too high.
  segafilm: add support for videos with cri adx adpcm
  gxf: Fix 25 fps DV material in GXF being misdetected as 50 fps
  libxvid: Add const qualifier to silence compiler warning.
  H.264: improve qp_thresh check
  H.264: use fill_rectangle in CABAC decoding
  H.264: Remove redundant hl_motion_16/8 code
  H.264: merge fill_rectangle into P-SKIP MV prediction, to match B-SKIP
  H.264: faster P-SKIP decoding
  H.264: av_always_inline some more functions
  H.264: Add x86 assembly for 10-bit H.264 predict functions
  swscale: rename uv_off/uv_off2 to uv_off_px/byte.
  swscale: implement error dithering in planarCopyWrapper.
  swscale: error dithering for 16/9/10-bit to 8-bit.
  swscale: fix overflow in 16-bit vertical scaling.
  swscale: fix crash in 8-bpc bilinear output without alpha.
  ...

Conflicts:
	doc/developer.texi
	libavdevice/alsa-audio.h
	libavformat/gxf.c
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_unscaled.c
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixdesc
	tests/ref/lavfi/pixfmts_copy
	tests/ref/lavfi/pixfmts_crop
	tests/ref/lavfi/pixfmts_hflip
	tests/ref/lavfi/pixfmts_null
	tests/ref/lavfi/pixfmts_scale
	tests/ref/lavfi/pixfmts_vflip

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-10 04:28:50 +02:00
Daniel Kang
c0483d0c7a H.264: Add x86 assembly for 10-bit H.264 predict functions
Mainly ported from 8-bit H.264 predict.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-08 15:59:29 -07:00
Michael Niedermayer
5d4fd1d1ad Merge remote-tracking branch 'qatar/master'
* qatar/master: (36 commits)
  ARM: allow unaligned buffer in fixed-point NEON FFT4
  fate: test more FFT etc sizes
  dca: set AVCodecContext frame_size for DTS audio
  YASM: Shut up unused variable compiler warning with --disable-yasm.
  x86_32: Fix build on x86_32 with --disable-yasm.
  iirfilter: add fate test
  doxygen: Add qmul docs.
  ogg: propagate return values and return more meaningful error values
  H.264: fix overreads of qscale_table
  Remove unused static tables and static inline functions.
  eval: clear Parser instances before using
  dct-test: remove 'ref' function pointer from tables
  build: Remove deleted 'check' target from .PHONY list.
  oggdec: Abort Ogg header parsing when encountering a data packet.
  Add LGPL license boilerplate to files lacking it.
  mxfenc: small typo fix
  doxygen: Fix documentation for some VP8 functions.
  sha: use AV_RB32() instead of assuming buffer can be cast to uint32_t*
  des: allow unaligned input and output buffers
  aes: allow unaligned input and output buffers
  ...

Conflicts:
	libavcodec/dct-test.c
	libavcodec/libvpxenc.c
	libavcodec/x86/dsputil_mmx.c
	libavcodec/x86/h264_qpel_mmx.c
	libavfilter/x86/gradfun.c
	libavformat/oggdec.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-05 02:26:17 +02:00
Daniel Kang
3c7c16fde3 YASM: Shut up unused variable compiler warning with --disable-yasm.
Signed-off-by: Diego Biurrun <diego@biurrun.de>
2011-07-04 18:49:09 +02:00
Daniel Kang
58f7aad051 Fix build with --disable-yasm.
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 22:56:09 -07:00
Michael Niedermayer
889639969b dsputil_mmx: try to fix compilation without yasm.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 02:02:24 +02:00
Michael Niedermayer
976a8b2179 Merge remote-tracking branch 'qatar/master'
* qatar/master: (40 commits)
  H.264: template left MB handling
  H.264: faster fill_decode_caches
  H.264: faster write_back_*
  H.264: faster fill_filter_caches
  H.264: make filter_mb_fast support the case of unavailable top mb
  Do not include log.h in avutil.h
  Do not include pixfmt.h in avutil.h
  Do not include rational.h in avutil.h
  Do not include mathematics.h in avutil.h
  Do not include intfloat_readwrite.h in avutil.h
  Remove return statements following infinite loops without break
  RTSP: Doxygen comment cleanup
  doxygen: Escape '\' in Doxygen documentation.
  md5: cosmetics
  md5: use AV_WL32 to write result
  md5: add fate test
  md5: include correct headers
  md5: fix test program
  doxygen: Drop array size declarations from Doxygen parameter names.
  doxygen: Fix parameter names to match the function prototypes.
  ...

Conflicts:
	libavcodec/x86/dsputil_mmx.c
	libavformat/flvenc.c
	libavformat/oggenc.c
	libavformat/wtv.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-04 00:45:21 +02:00
Daniel Kang
9bfa5363da H.264: Add x86 assembly for 10-bit H.264 qpel functions.
Mainly ported from 8-bit H.264 qpel.

Some code ported from x264. LGPL ok by author.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-07-03 07:43:38 -07:00
Michael Niedermayer
3074f03a07 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  get_bits: remove x86 inline asm in A32 bitstream reader
  doc: Remove outdated information about our issue tracker
  avidec: Factor out the sync fucntionality.
  fate-aac: Expand coverage.
  ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents().
  ac3dsp: simplify extract_exponents() now that it does not need to do clipping.
  ac3enc: clip coefficients after MDCT.
  ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions.
  swscale: for >8bit scaling, read in native bit-depth.
  matroskadec: matroska_read_seek after after EBML_STOP leads to failure.
  doxygen: fix usage of @file directive in libavutil/{dict,file}.h
  doxygen: Help doxygen parser to understand the DECLARE_ALIGNED and offsetof macros

Conflicts:
	doc/issue_tracker.txt
	libavformat/avidec.c
	libavutil/dict.h
	libswscale/swscale.c
	libswscale/utils.c
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-07-02 03:24:32 +02:00
Justin Ruggles
6054cd25b4 ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions. 2011-07-01 13:02:11 -04:00
Michael Niedermayer
bb9d5171a7 Merge remote-tracking branch 'qatar/master'
* qatar/master: (21 commits)
  swscale: Add Doxygen for hyscale_fast/hScale.
  fate: enable lavfi-pixmt tests on big endian systems
  PPC: swscale: disable altivec functions for unsupported formats
  fate: merge identical pixdesc_be/le tests
  swscale: Add Doxygen for yuv2planar*/yuv2packed* functions.
  build: call texi2pod.pl with full path instead of symlink
  build: include sub-makefiles using full path instead of symlinks
  swscale: update big endian reference values after dff5a835.
  wavpack: skip blocks with no samples
  cosmetics: remove outdated comment that is no longer true
  build: replace some addprefix/addsuffix with substitution refs
  avutil: Remove unused arbitrary precision integer code.
  configure: Drop check for availability of ten assembler operands.
  aacenc: Save channel configuration for later use.
  aacenc: Fix codebook trellising for zeroed bands.
  swscale: change prototypes of scaled YUV output functions.
  swscale: re-add support for non-native endianness.
  swscale: disentangle yuv2rgbX_c_full() into small functions.
  swscale: split yuv2packed[12X]_c() remainders into small functions.
  swscale: split yuv2packedX_altivec in smaller functions.
  ...

Conflicts:
	Makefile
	configure
	libavcodec/x86/dsputil_mmx.c
	libavfilter/Makefile
	libavformat/Makefile
	libavutil/integer.c
	libavutil/integer.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/x86/swscale_template.c
	tests/ref/lavfi/pixdesc_le
	tests/ref/lavfi/pixfmts_scale

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-29 05:23:12 +02:00
Diego Biurrun
d2ee495fb2 configure: Drop check for availability of ten assembler operands.
This was done to support gcc 2.95, which is an old legacy compiler
that fails to compile the current codebase anyway.
2011-06-28 13:14:37 +02:00
Michael Niedermayer
83f9bc8aee Merge remote-tracking branch 'qatar/master'
* qatar/master:
  lavf: prevent crash in av_open_input_file() if ap == NULL.
  more Changelog additions
  lavf: add a forgotten NULL check in convert_format_parameters().
  Fix build if yasm is not available.
  H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.

Conflicts:
	Changelog

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-19 04:02:06 +02:00
Ronald S. Bultje
ed63f527f2 Fix build if yasm is not available. 2011-06-18 08:34:14 -04:00
Daniel Kang
f188a1e0ca H.264: Add x86 assembly for 10-bit MC Chroma H.264 functions.
Mainly ported from 8-bit H.264 MC Chroma.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-06-18 07:52:19 -04:00
Michael Niedermayer
c137fdd778 Merge remote-tracking branch 'qatar/master'
* qatar/master:
  swscale: remove misplaced comment.
  ffmpeg: fix streaming to ffserver.
  swscale: split out RGB48 output functions from yuv2packed[12X]_c().
  build: move vpath directives to main Makefile
  swscale: fix JPEG-range YUV scaling artifacts.
  build: move ALLFFLIBS to a more logical place
  ARM: factor some repetitive code into macros
  Fix SVQ3 after adding 4:4:4 H.264 support
  H.264: fix CODEC_FLAG_GRAY
  4:4:4 H.264 decoding support
  ac3enc: fix allocation of floating point samples.

Conflicts:
	ffmpeg.c
	libavcodec/dsputil_template.c
	libavcodec/h264.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libswscale/swscale.c
	libswscale/swscale_internal.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-06-15 02:15:25 +02:00
Jason Garrett-Glaser
c90b94424c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 21:16:30 -07:00
Jason Garrett-Glaser
504811baea Roll back 4:4:4 H.264 for now
Needs some ARM/PPC asm modifications.
2011-06-13 13:38:46 -07:00
Jason Garrett-Glaser
c9c493872c 4:4:4 H.264 decoding support
Note: this is 4:4:4 from the 2007 spec revision, not the previous (now deprecated) 4:4:4 mode in H.264.
2011-06-13 12:21:39 -07:00
Michael Niedermayer
612122b187 Merge remote branch 'qatar/master'
* qatar/master: (32 commits)
  10-bit H.264 x86 chroma v loopfilter asm
  Port SMPTE S302M audio decoder from FFmbc 0.3. [Copyright headers corrected]
  Fix crash of interlaced MPEG2 decoding
  h264pred: fix one more aliasing violation.
  doc/APIchanges: fill in missing hashes and dates.
  flacenc: use proper initializers for AVOption default values.
  lavc: deprecate named constants for deprecated antialias_algo.
  aac: workaround for compilation on cygwin
  swscale: extend YUV422p support to 10bits depth
  tiff: add support for inverted FillOrder for uncompressed data
  Remove unused softfloat implementation.
  h264pred: fix aliasing violations.
  rotozoom: Eliminate French variable name.
  rotozoom: Check return value of fread().
  rotozoom: Return an error value instead of calling exit().
  rotozoom: Make init_demo() return int and check for errors on invocation.
  rotozoom: Drop silly UINT8 typedef.
  rotozoom: Drop some unnecessary parentheses.
  rotozoom: K&R coding style cosmetics
  rtsp: Only do keepalive using GET_PARAMETER if the server supports it
  ...

Conflicts:
	Changelog
	cmdutils.c
	doc/APIchanges
	doc/general.texi
	ffmpeg.c
	ffplay.c
	libavcodec/h264pred_template.c
	libavcodec/resample.c
	libavutil/pixfmt.h
	libavutil/softfloat.c
	libavutil/softfloat.h
	tests/rotozoom.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-12 04:51:24 +02:00
Michael Niedermayer
59eb12faff Merge remote branch 'qatar/master'
* qatar/master: (30 commits)
  AVOptions: make default_val a union, as proposed in AVOption2.
  arm/h264pred: add missing argument type.
  h264dsp_mmx: place bracket outside #if/#endif block.
  lavf/utils: fix ff_interleave_compare_dts corner case.
  fate: add 10-bit H264 tests.
  h264: do not print "too many references" warning for intra-only.
  Enable decoding of high bit depth h264.
  Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
  Add support for higher QP values in h264.
  Add the notion of pixel size in h264 related functions.
  Make the h264 loop filter bit depth aware.
  Template dsputil_template.c with respect to pixel size, etc.
  Template h264idct_template.c with respect to pixel size, etc.
  Preparatory patch for high bit depth h264 decoding support.
  Move some functions in dsputil.c into a new file dsputil_template.c.
  Move the functions in h264idct into a new file h264idct_template.c.
  Move the functions in h264pred.c into a new file h264pred_template.c.
  Preparatory patch for high bit depth h264 decoding support.
  Add pixel formats for 9- and 10-bit yuv420p.
  Choose h264 chroma dc dequant function dynamically.
  ...

Conflicts:
	doc/APIchanges
	ffmpeg.c
	ffplay.c
	libavcodec/alpha/dsputil_alpha.c
	libavcodec/arm/dsputil_init_arm.c
	libavcodec/arm/dsputil_init_armv6.c
	libavcodec/arm/dsputil_init_neon.c
	libavcodec/arm/dsputil_iwmmxt.c
	libavcodec/arm/h264pred_init_arm.c
	libavcodec/bfin/dsputil_bfin.c
	libavcodec/dsputil.c
	libavcodec/h264.c
	libavcodec/h264.h
	libavcodec/h264_cabac.c
	libavcodec/h264_cavlc.c
	libavcodec/h264_loopfilter.c
	libavcodec/h264_ps.c
	libavcodec/h264_refs.c
	libavcodec/h264dsp.c
	libavcodec/h264idct.c
	libavcodec/h264pred.c
	libavcodec/mlib/dsputil_mlib.c
	libavcodec/options.c
	libavcodec/ppc/dsputil_altivec.c
	libavcodec/ppc/dsputil_ppc.c
	libavcodec/ppc/h264_altivec.c
	libavcodec/ps2/dsputil_mmi.c
	libavcodec/sh4/dsputil_align.c
	libavcodec/sh4/dsputil_sh4.c
	libavcodec/sparc/dsputil_vis.c
	libavcodec/utils.c
	libavcodec/version.h
	libavcodec/x86/dsputil_mmx.c
	libavformat/options.c
	libavformat/utils.c
	libavutil/pixfmt.h
	libswscale/swscale.c
	libswscale/swscale_internal.h
	libswscale/swscale_template.c
	tests/ref/seek/lavf_avi

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-05-11 05:47:02 +02:00
Jason Garrett-Glaser
9f3d6ca4f1 Port x86 10-bit H.264 deblock asm from x264 2011-05-10 20:02:15 -07:00
Oskar Arvidsson
19a0729b4c Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2011-05-10 07:24:36 -04:00
Baptiste Coudurier
6d4c49a2af Move png mmx functions into x86/png_mmx.c, remove them from DSPContext.
Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-27 20:08:09 +02:00
Oskar Arvidsson
8dbe585641 Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init chose dsp functions with respect to
the bit depth to decode. The naming scheme of bit depth dependent
functions is <base name>_<bit depth>[_<prefix>] (i.e. the old
clear_blocks_c is now named clear_blocks_8_c).

Note: Some of the functions for high bit depth is not dependent on the
bit depth, but only on the pixel size. This leaves some room for
optimizing binary size.

Preparatory patch for high bit depth h264 decoding support.

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2011-04-10 22:33:42 +02:00
Michael Niedermayer
3c8493074b Merge remote-tracking branch 'newdev/master'
* newdev/master:
  dsputil: allow to skip drawing of top/bottom edges.
  Split fate-psx-str-v3 into a video-only and audio-only test.

Conflicts:
	libavcodec/dsputil.c
	libavcodec/mpegvideo.c
	libavcodec/snow.c
	libavcodec/x86/dsputil_mmx.c

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-27 01:40:18 +01:00
Alexander Strange
1500be13f2 dsputil: allow to skip drawing of top/bottom edges. 2011-03-26 17:45:38 -04:00
Michael Niedermayer
2fd41c9067 Merge remote-tracking branch 'newdev/master'
* newdev/master:
  avio: make udp_set_remote_url/get_local_port internal.
  asfdec: also subtract preroll when reading simple index object
  matroskaenc: remove a variable that's unused after bc17bd9.
  avio: cosmetics - nicer vertical alignment.
  Remove unnecessary icc version checks
  Disable 'attribute "foo" ignored' warnings from icc
  rtsp: Don't use a locale dependent format string
  Add xd55 codec tag for XDCAM HD422 720p25 CBR files.
  configure: get libavcodec version from new version.h header
  lavc: move the version macros to a new installed header.
  matroskaenc: simplify get_aac_sample_rates by using ff_mpeg4audio_get_config
  Do not use format string "%0.3f" for RTSP Range field.
  Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder.
  Document usage of import libraries created by dlltool
  configure: Set the correct lib target for arm/wince dlltool
  fate: simplify regression-funcs.sh
  fate: add support for multithread testing

Conflicts:
	libavformat/rtspdec.c
	libavutil/attributes.h
	libavutil/internal.h
	libavutil/mem.h

Merged-by: Michael Niedermayer <michaelni@gmx.at>
2011-03-24 02:16:11 +01:00
Justin Ruggles
e6e9823488 Add apply_window_int16() to DSPContext with x86-optimized versions and use it
in the ac3_fixed encoder.
2011-03-22 21:08:30 -04:00
Michael Niedermayer
d375c10400 Fake-Merge remote-tracking branch 'ffmpeg-mt/master' 2011-03-22 22:36:57 +01:00
Mans Rullgard
2912e87a6c Replace FFmpeg with Libav in licence headers
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-03-19 13:33:20 +00:00
Ronald S. Bultje
6a717eb4aa dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
(cherry picked from commit bf6fa73245)
2011-02-20 19:05:46 +01:00
Ronald S. Bultje
bf6fa73245 dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
2011-02-19 10:51:15 -05:00
Ronald S. Bultje
9a1ced321b dsputil: move VC1-specific stuff into VC1DSPContext.
(cherry picked from commit 12802ec060)
2011-02-18 19:52:40 +01:00
Ronald S. Bultje
12802ec060 dsputil: move VC1-specific stuff into VC1DSPContext. 2011-02-17 17:35:35 -05:00
Justin Ruggles
fe2ff6d247 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit c73d99e672)
2011-02-04 03:08:09 +01:00
Justin Ruggles
c73d99e672 Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-02-02 02:44:53 +00:00
Ronald S. Bultje
baffa091af Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
(cherry picked from commit 81f2a3f4ff)
2011-02-02 03:40:49 +01:00
Justin Ruggles
389b5bfa34 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit d19b744a36)
2011-02-02 03:40:49 +01:00
Justin Ruggles
a8ae4e0e7b Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 80ba1ddb58)
2011-02-02 03:40:48 +01:00
Ronald S. Bultje
81f2a3f4ff Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
2011-01-31 20:55:56 -05:00
Justin Ruggles
d19b744a36 cosmetics: indentation
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:30:15 +00:00
Justin Ruggles
80ba1ddb58 Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()

Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-31 20:28:42 +00:00
Justin Ruggles
015f9f1ad3 Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
(cherry picked from commit 6eabb0d3ad)
2011-01-23 19:32:08 +01:00
Justin Ruggles
6eabb0d3ad Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
Signed-off-by: Mans Rullgard <mans@mansr.com>
2011-01-22 17:53:27 +00:00
Mans Rullgard
ef4a65149d Replace ASMALIGN() with .p2align
This macro has unconditionally used .p2align for a long time and
serves no useful purpose.
2011-01-18 20:48:24 +00:00
Mans Rullgard
ac3c9d0169 x86: remove VLA in ac3_downmix_sse 2011-01-18 20:48:24 +00:00
Ronald S. Bultje
ec3233a855 Fix ff_pw_3 alignment.
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 23:26:34 +00:00
Jason Garrett-Glaser
19fb234e4a H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.

NOTE: the way that the asm code handles large qmuls is a bit suboptimal.
If x264-style dequant was used (separate shift and qmul values), it might
be possible to get some extra speed.

Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk
2011-01-14 21:34:25 +00:00
Ronald S. Bultje
8d147f1f60 For rounding in chroma MC SSSE3, use 16-byte pw_3/4 instead of reading 8 bytes
and then using movlhps to dup it into the higher half of the register.

Originally committed as revision 26086 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-24 17:23:22 +00:00
Baptiste Coudurier
90f1f3bf00 In yadif filter, declare asm constants directly to avoid dependency on libavcodec
Originally committed as revision 25895 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-06 00:14:15 +00:00
Baptiste Coudurier
9e95999e2a 10l, add ff_pw_1 to dsputil_mmx for yadif sse2
Originally committed as revision 25881 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-12-04 13:06:06 +00:00
İsmail Dönmez
80e33d2451 dsputil: Use explicit movzbl instead of movzx
This fixes compilation with the latest clang trunk version.

Patch by İsmail Dönmez, ismail at namtrac dot org

Originally committed as revision 25628 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-11-01 19:35:51 +00:00
Ramiro Polla
153ca56b38 xmm_clobbers: list xmm registers first in clobber list
suncc does not like the leading commas inside the macro, but it has no problem
with trailing commas.

Originally committed as revision 25615 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 18:14:48 +00:00
Ramiro Polla
5d543a3d13 dsputil_mmx: add xmm registers to clobber list
Originally committed as revision 25611 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:57:58 +00:00
Ramiro Polla
559738eff3 dsputil_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25606 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-31 13:13:53 +00:00
Ronald S. Bultje
dd68d4db43 MMX, MMX2, SSE2 and SSSE3 optimizations for pred16x16/8x8_plane H264 intra
prediction (plus some with different rounding for svq3/rv40). Speedup (for
SSSE3) about ~6-fold, 3.6% faster overall with cathedral sample.

Originally committed as revision 25361 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-10-05 22:06:18 +00:00
Eli Friedman
329d689f75 Use sse2 variant of put_pixels16() for no_rnd also. Provides a minor speed
increase to e.g. vc1, snow and mpeg decoding.

Patch by Eli Friedman <eli dot friedman gmail com>.

Originally committed as revision 25259 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-29 15:34:43 +00:00
Stefano Sabatini
c6c98d0897 Move mm_support() from libavcodec to libavutil, make it a public
function and rename it to av_get_cpu_flags().

Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-08 15:07:14 +00:00
Stefano Sabatini
7160bb716b Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_
symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h.

Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-04 09:59:08 +00:00
Ronald S. Bultje
2c166c3af1 Port latest x264 deblock asm (before they moved to using NV12 as internal
format), LGPL'ed with permission from Jason and Loren. This includes mmx2
code, so remove inline asm from h264dsp_mmx.c accordingly.

Originally committed as revision 25031 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-03 16:52:46 +00:00
Ronald S. Bultje
14bc1f2485 Split h264dsp_mmx.c (which was #included in dsputil_mmx.c) in h264_qpel_mmx.c,
still #included in dsputil_mmx.c and is part of DSPContext, and h264dsp_mmx.c,
which represents H264DSPContext and is now compiled on its own.

Originally committed as revision 25018 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-09-01 20:48:59 +00:00
Ronald S. Bultje
79ce0f002e Fix compilation failure if yasm is disabled (missing vp3 symbols).
Originally committed as revision 24992 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 20:30:40 +00:00
Ronald S. Bultje
d0eb5a1174 Move H264 chroma MC from inline asm to yasm. This fixes VP3/5/6 and VC-1
fate failures on Win64.

Originally committed as revision 24989 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:31:04 +00:00
Ronald S. Bultje
e9f5f020c6 Move VP3 IDCT functions from inline ASM to YASM. This fixes part of the VP3/5/6
issues on Win64.

Originally committed as revision 24988 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:25:46 +00:00
Ronald S. Bultje
7e7c4b6008 Put ff_ prefix on non-static {put_signed,put,add}_pixels_clamped_mmx()
functions.

Originally committed as revision 24987 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-30 16:22:27 +00:00
Ronald S. Bultje
3a0885146c Move vp6_filter_diag4() from DSPContext to VP56DSPContext.
Originally committed as revision 24921 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-25 13:42:28 +00:00
Måns Rullgård
c0ec9918b0 Remove global mm_flags variable
Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-24 17:47:05 +00:00
Eli Friedman
c12d6955e2 H.264: SSE2/SSSE3 weighted prediction asm
Patch by Eli Friedman <eli.friedman at gmail dot com>

Originally committed as revision 24702 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-05 00:13:38 +00:00
Måns Rullgård
f079a64aea Move cavs dsp functions to their own struct
Originally committed as revision 24685 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-08-03 20:59:00 +00:00
Loren Merritt
c7b1d9768c relicense h264 deblock sse2 to lgpl
Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-22 00:39:49 +00:00
David Conrad
c7eec58170 Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c
Should fix compilation with icc and should help prevent any future duplicates

Originally committed as revision 24380 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-21 10:02:03 +00:00
Ronald S. Bultje
e9e456d850 VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16)
and chroma (width=8).

Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-20 22:58:56 +00:00
Ronald S. Bultje
a711eb4829 VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations.
Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-15 23:02:34 +00:00
David Conrad
7af8fbd348 Make ff_pw_4 128 bits
Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-11 22:52:55 +00:00
Ronald S. Bultje
f2a30bd840 Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-03 19:26:30 +00:00
Eli Friedman
b3858964d6 Add const to some pointer parameters.
Patch by Eli Friedman,  eli D friedman A gmail

Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 15:11:38 +00:00
Jason Garrett-Glaser
4af8cdfc3f 16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264
Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-25 18:25:49 +00:00
David Conrad
413abbe164 Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora
Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-04 04:46:26 +00:00
David Conrad
eb6a6cd788 vp3: DC-only IDCT
2-4% faster overall decode

Originally committed as revision 22896 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-04-17 02:04:30 +00:00
Måns Rullgård
4693b031a3 Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new
H264DSPContext.  The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by
non-h264 code.

Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-16 01:17:00 +00:00
Måns Rullgård
05aec7bb87 Separate DWT from snow and dsputil
This moves the DWT functions from snow.c and dsputil.c to a file of
their own.  A new struct, DWTContext, holds the function pointers
previously part of DSPContext.

Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-14 17:50:12 +00:00
Måns Rullgård
f49747e904 x86: move function prototypes to header files
Originally committed as revision 22266 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 22:37:08 +00:00
Måns Rullgård
84dc2d8afa Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant.  All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-03-06 14:24:59 +00:00
David Conrad
19530266a5 Enable SSE2 (put|avg)_pixels_16_sse2
SVQ1 chroma has been special-cased aligned to 16-bytes since at least r15466
Other architectures also assume 16-byte alignment here too but set STRIDE_ALIGN
to 16.

Originally committed as revision 21736 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-02-10 02:02:06 +00:00
Alex Converse
3deb53849e Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 23:07:58 +00:00
Måns Rullgård
c67278098d Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 03:25:11 +00:00
Gwenole Beauchesne
5716aec3f9 Fix XvMC. XvMCCreateBlocks() may not allocate 16-byte aligned blocks,
so we can't use SSE-optimized routines.

Originally committed as revision 21011 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-04 09:19:32 +00:00
Diego Biurrun
4052cbf161 Get rid of pointless CONFIG_ANY_H263 preprocessor definition.
Originally committed as revision 20975 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-30 11:33:59 +00:00
Loren Merritt
91e644ff77 r20739 broke compilation on systems without yasm
Originally committed as revision 20742 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 17:51:57 +00:00
Loren Merritt
b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
Loren Merritt
b10fa1bb8b port ape dsp functions from sse2 to mmx
now requires yasm

Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-03 18:53:12 +00:00
Loren Merritt
e17ccf60fe huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:47:25 +00:00
Loren Merritt
2f77923d72 simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv

Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:10:10 +00:00
Måns Rullgård
35de5d2412 cosmetics: fix indentation after previous commit
Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:52:00 +00:00
Måns Rullgård
952e872198 Drop unused args from vector_fmul_add_add, simpify code, and rename
The src3 and step arguments to vector_fmul_add_add() are always zero
and one, respectively.  This removes these arguments from the function,
simplifies the code accordingly, and renames the function to better
match the new operation.

Originally committed as revision 20061 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:51:54 +00:00
Vitor Sessak
9263a05aab Mark "i" parameter of vector_clipf_sse() as early-clobber
Originally committed as revision 19731 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 15:52:44 +00:00
Vitor Sessak
50e23ae9d3 Mark parameter src of vector_clipf() as const
Originally committed as revision 19729 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 15:38:59 +00:00
Vitor Sessak
0a68cd876e SSE optimized vector_clipf(). 10% faster TwinVQ decoding.
Originally committed as revision 19728 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 14:49:36 +00:00
Diego Biurrun
9be6f0d2f8 Do not check for both CONFIG_VC1_DECODER and CONFIG_WMV3_DECODER,
the former depends upon the latter.

Originally committed as revision 19533 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-29 09:54:49 +00:00
Diego Biurrun
99e5a9d1ea Do not redundantly check for both CONFIG_THEORA_DECODER and CONFIG_VP3_DECODER.
The Theora decoder depends on the VP3 decoder.

Originally committed as revision 19492 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-22 22:27:10 +00:00
Carl Eugen Hoyos
36904c4c9f Icc 11.1 still does not align the stack pointer, disable some x264 functions.
Originally committed as revision 19454 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-17 09:07:38 +00:00
Jason Garrett-Glaser
73b02e2460 SSE version of clear_blocks
Originally committed as revision 19206 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-06-16 17:33:57 +00:00
David Conrad
c21c835b8d avg_ pixel functions need to use (dst+pix+1)>>1 to average with existing
pixels, not (dst+pix)>>1.
This makes the mmx functions bitexact with the C functions.

Originally committed as revision 18527 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-15 19:10:16 +00:00
David Conrad
9bf0fdf378 VC1: extend MMX qpel MC to include MMX2 avg qpel
Originally committed as revision 18519 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-15 02:25:42 +00:00
David Conrad
8013da7364 VC1: add and use avg_no_rnd chroma MC functions
Originally committed as revision 18518 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-14 23:56:10 +00:00
David Conrad
c374691b28 Rename put_no_rnd_h264_chroma* to reflect its usage in VC1 only
Originally committed as revision 18517 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-14 23:55:39 +00:00
Stefano Sabatini
6b4343616c Rename FF_MM_MMXEXT to FF_MM_MMX2, for both clarity and consistency
with libswscale.

Originally committed as revision 18330 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-04 13:20:53 +00:00
Reimar Döffinger
0be9e73e38 Mark line_skip3 asm argument as output-only instead of using av_uninit.
Originally committed as revision 18327 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:03:49 +00:00
Reimar Döffinger
d7460a9cac Mark put_signed_pixels_clamped_mmx output operands as early-clobber because
they are. Hopefully fixes some FATE errors, too.

Originally committed as revision 18326 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:02:34 +00:00
Reimar Döffinger
531a3d2721 Use DECLARE_ASM_CONST for non-global ff_vector128 constant used via MANGLE
Originally committed as revision 18325 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:01:24 +00:00
Alex Converse
3dd6531208 Rewrite put_signed_pixels_clamped_mmx() to eliminate mmx.h from dsputil_mmx.c.
Originally committed as revision 18319 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-02 21:02:42 +00:00
Zuxy Meng
ecb24904fe add SSE2 version of vp6_filter_diag
original patch by Zuxy Meng  zuxy.meng _at_ gmail _dot_ com

Originally committed as revision 17195 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-13 00:02:33 +00:00
Sebastien Lucas
6af3c226c3 add MMX version of vp6_filter_diag
original patch by Sebastien Lucas  sebastien.lucas _at_ gmail _dot_ com

Originally committed as revision 17194 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-12 23:52:52 +00:00
Aurelien Jacobs
5110b25e1e convert ff_pw_64 into an xmm_reg for future use in vp6 sse code
Originally committed as revision 17192 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-12 23:48:07 +00:00
Diego Biurrun
d3a4b4e09c Add check whether the compiler/assembler supports 10 or more operands.
thanks to Loren for some help with the asm statements

Originally committed as revision 17151 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-11 11:16:00 +00:00
Loren Merritt
3daa434a40 ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.

Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-08 17:45:30 +00:00
David Conrad
137ae32760 Workaround for gcc 3.4 to align sh properly
Originally committed as revision 16797 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-26 03:40:48 +00:00
Diego Biurrun
406792e7b0 cosmetics: Remove pointless period after copyright statement non-sentences.
Originally committed as revision 16684 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-19 15:46:40 +00:00
Aurelien Jacobs
49fb20cb8a replace all occurrence of ENABLE_ by the corresponding CONFIG_, HAVE_ or ARCH_
and remove all ENABLE_ definitions.

Originally committed as revision 16600 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-14 17:19:17 +00:00