ffmpeg

Author	SHA1	Message	Date
Jason Garrett-Glaser	19fb234e4a	H.264: split luma dc idct out and implement MMX/SSE2 versions About 2.5x the speed. NOTE: the way that the asm code handles large qmuls is a bit suboptimal. If x264-style dequant was used (separate shift and qmul values), it might be possible to get some extra speed. Originally committed as revision 26336 to svn://svn.ffmpeg.org/ffmpeg/trunk	2011-01-14 21:34:25 +00:00
Ronald S. Bultje	8d147f1f60	For rounding in chroma MC SSSE3, use 16-byte pw_3/4 instead of reading 8 bytes and then using movlhps to dup it into the higher half of the register. Originally committed as revision 26086 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-12-24 17:23:22 +00:00
Baptiste Coudurier	90f1f3bf00	In yadif filter, declare asm constants directly to avoid dependency on libavcodec Originally committed as revision 25895 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-12-06 00:14:15 +00:00
Baptiste Coudurier	9e95999e2a	10l, add ff_pw_1 to dsputil_mmx for yadif sse2 Originally committed as revision 25881 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-12-04 13:06:06 +00:00
İsmail Dönmez	80e33d2451	dsputil: Use explicit movzbl instead of movzx This fixes compilation with the latest clang trunk version. Patch by İsmail Dönmez, ismail at namtrac dot org Originally committed as revision 25628 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-11-01 19:35:51 +00:00
Ramiro Polla	153ca56b38	xmm_clobbers: list xmm registers first in clobber list suncc does not like the leading commas inside the macro, but it has no problem with trailing commas. Originally committed as revision 25615 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-10-31 18:14:48 +00:00
Ramiro Polla	5d543a3d13	dsputil_mmx: add xmm registers to clobber list Originally committed as revision 25611 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-10-31 13:57:58 +00:00
Ramiro Polla	559738eff3	dsputil_mmx: prefer xmm registers below xmm6 when they are available Originally committed as revision 25606 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-10-31 13:13:53 +00:00
Ronald S. Bultje	dd68d4db43	MMX, MMX2, SSE2 and SSSE3 optimizations for pred16x16/8x8_plane H264 intra prediction (plus some with different rounding for svq3/rv40). Speedup (for SSSE3) about ~6-fold, 3.6% faster overall with cathedral sample. Originally committed as revision 25361 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-10-05 22:06:18 +00:00
Eli Friedman	329d689f75	Use sse2 variant of put_pixels16() for no_rnd also. Provides a minor speed increase to e.g. vc1, snow and mpeg decoding. Patch by Eli Friedman <eli dot friedman gmail com>. Originally committed as revision 25259 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-29 15:34:43 +00:00
Stefano Sabatini	c6c98d0897	Move mm_support() from libavcodec to libavutil, make it a public function and rename it to av_get_cpu_flags(). Originally committed as revision 25076 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-08 15:07:14 +00:00
Stefano Sabatini	7160bb716b	Rename FF_MM_ symbols related to CPU features flags as AV_CPU_FLAG_ symbols, and move them from libavcodec/avcodec.h to libavutil/cpu.h. Originally committed as revision 25040 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-04 09:59:08 +00:00
Ronald S. Bultje	2c166c3af1	Port latest x264 deblock asm (before they moved to using NV12 as internal format), LGPL'ed with permission from Jason and Loren. This includes mmx2 code, so remove inline asm from h264dsp_mmx.c accordingly. Originally committed as revision 25031 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-03 16:52:46 +00:00
Ronald S. Bultje	14bc1f2485	Split h264dsp_mmx.c (which was #included in dsputil_mmx.c) in h264_qpel_mmx.c, still #included in dsputil_mmx.c and is part of DSPContext, and h264dsp_mmx.c, which represents H264DSPContext and is now compiled on its own. Originally committed as revision 25018 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-09-01 20:48:59 +00:00
Ronald S. Bultje	79ce0f002e	Fix compilation failure if yasm is disabled (missing vp3 symbols). Originally committed as revision 24992 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-30 20:30:40 +00:00
Ronald S. Bultje	d0eb5a1174	Move H264 chroma MC from inline asm to yasm. This fixes VP3/5/6 and VC-1 fate failures on Win64. Originally committed as revision 24989 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-30 16:31:04 +00:00
Ronald S. Bultje	e9f5f020c6	Move VP3 IDCT functions from inline ASM to YASM. This fixes part of the VP3/5/6 issues on Win64. Originally committed as revision 24988 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-30 16:25:46 +00:00
Ronald S. Bultje	7e7c4b6008	Put ff_ prefix on non-static {put_signed,put,add}_pixels_clamped_mmx() functions. Originally committed as revision 24987 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-30 16:22:27 +00:00
Ronald S. Bultje	3a0885146c	Move vp6_filter_diag4() from DSPContext to VP56DSPContext. Originally committed as revision 24921 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-25 13:42:28 +00:00
Måns Rullgård	c0ec9918b0	Remove global mm_flags variable Originally committed as revision 24909 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-24 17:47:05 +00:00
Eli Friedman	c12d6955e2	H.264: SSE2/SSSE3 weighted prediction asm Patch by Eli Friedman <eli.friedman at gmail dot com> Originally committed as revision 24702 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-05 00:13:38 +00:00
Måns Rullgård	f079a64aea	Move cavs dsp functions to their own struct Originally committed as revision 24685 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-08-03 20:59:00 +00:00
Loren Merritt	c7b1d9768c	relicense h264 deblock sse2 to lgpl Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-22 00:39:49 +00:00
David Conrad	c7eec58170	Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c Should fix compilation with icc and should help prevent any future duplicates Originally committed as revision 24380 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-21 10:02:03 +00:00
Ronald S. Bultje	e9e456d850	VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16) and chroma (width=8). Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-20 22:58:56 +00:00
Ronald S. Bultje	a711eb4829	VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations. Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-15 23:02:34 +00:00
David Conrad	7af8fbd348	Make ff_pw_4 128 bits Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-11 22:52:55 +00:00
Ronald S. Bultje	f2a30bd840	Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros). Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-07-03 19:26:30 +00:00
Eli Friedman	b3858964d6	Add const to some pointer parameters. Patch by Eli Friedman, eli D friedman A gmail Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-06-27 15:11:38 +00:00
Jason Garrett-Glaser	4af8cdfc3f	16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264 Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-06-25 18:25:49 +00:00
David Conrad	413abbe164	Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-06-04 04:46:26 +00:00
David Conrad	eb6a6cd788	vp3: DC-only IDCT 2-4% faster overall decode Originally committed as revision 22896 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-04-17 02:04:30 +00:00
Måns Rullgård	4693b031a3	Move H264 dsputil functions into their own struct This moves the H264-specific functions from DSPContext to the new H264DSPContext. The code is made conditional on CONFIG_H264DSP which is set by the codecs requiring it. The qpel and chroma MC functions are not moved as these are used by non-h264 code. Originally committed as revision 22565 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-16 01:17:00 +00:00
Måns Rullgård	05aec7bb87	Separate DWT from snow and dsputil This moves the DWT functions from snow.c and dsputil.c to a file of their own. A new struct, DWTContext, holds the function pointers previously part of DSPContext. Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-14 17:50:12 +00:00
Måns Rullgård	f49747e904	x86: move function prototypes to header files Originally committed as revision 22266 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-06 22:37:08 +00:00
Måns Rullgård	84dc2d8afa	Remove DECLARE_ALIGNED_{8,16} macros These macros are redundant. All uses are replaced with the generic DECLARE_ALIGNED macro instead. Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-03-06 14:24:59 +00:00
David Conrad	19530266a5	Enable SSE2 (put|avg)_pixels_16_sse2 SVQ1 chroma has been special-cased aligned to 16-bytes since at least r15466 Other architectures also assume 16-byte alignment here too but set STRIDE_ALIGN to 16. Originally committed as revision 21736 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-02-10 02:02:06 +00:00
Alex Converse	3deb53849e	Implement an sse version of scalarproduct_float(). Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-22 23:07:58 +00:00
Måns Rullgård	c67278098d	Move array specifiers outside DECLARE_ALIGNED() invocations Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-22 03:25:11 +00:00
Gwenole Beauchesne	5716aec3f9	Fix XvMC. XvMCCreateBlocks() may not allocate 16-byte aligned blocks, so we can't use SSE-optimized routines. Originally committed as revision 21011 to svn://svn.ffmpeg.org/ffmpeg/trunk	2010-01-04 09:19:32 +00:00
Diego Biurrun	4052cbf161	Get rid of pointless CONFIG_ANY_H263 preprocessor definition. Originally committed as revision 20975 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-12-30 11:33:59 +00:00
Loren Merritt	91e644ff77	r20739 broke compilation on systems without yasm Originally committed as revision 20742 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-12-05 17:51:57 +00:00
Loren Merritt	b1159ad928	refactor and optimize scalarproduct 29-105% faster apply_filter, 6-90% faster ape decoding on core2 (Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.) 9-123% faster ape decoding on G4. Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-12-05 15:09:10 +00:00
Loren Merritt	b10fa1bb8b	port ape dsp functions from sse2 to mmx now requires yasm Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-12-03 18:53:12 +00:00
Loren Merritt	e17ccf60fe	huffyuv: add some const qualifiers Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-10-18 20:47:25 +00:00
Loren Merritt	2f77923d72	simd add_hfyu_left_prediction 2.2x faster than C on conroe, 3.6x on penryn. 4-6% faster huffyuv decoding if using left or plane mode and yuv Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-10-18 20:10:10 +00:00
Måns Rullgård	35de5d2412	cosmetics: fix indentation after previous commit Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-09-27 16:52:00 +00:00
Måns Rullgård	952e872198	Drop unused args from vector_fmul_add_add, simplify code, and rename The src3 and step arguments to vector_fmul_add_add() are always zero and one, respectively. This removes these arguments from the function, simplifies the code accordingly, and renames the function to better match the new operation. Originally committed as revision 20061 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-09-27 16:51:54 +00:00
Vitor Sessak	9263a05aab	Mark "i" parameter of vector_clipf_sse() as early-clobber Originally committed as revision 19731 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-08-27 15:52:44 +00:00
Vitor Sessak	50e23ae9d3	Mark parameter src of vector_clipf() as const Originally committed as revision 19729 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-08-27 15:38:59 +00:00
Vitor Sessak	0a68cd876e	SSE optimized vector_clipf(). 10% faster TwinVQ decoding. Originally committed as revision 19728 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-08-27 14:49:36 +00:00
Diego Biurrun	9be6f0d2f8	Do not check for both CONFIG_VC1_DECODER and CONFIG_WMV3_DECODER, the former depends upon the latter. Originally committed as revision 19533 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-07-29 09:54:49 +00:00
Diego Biurrun	99e5a9d1ea	Do not redundantly check for both CONFIG_THEORA_DECODER and CONFIG_VP3_DECODER. The Theora decoder depends on the VP3 decoder. Originally committed as revision 19492 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-07-22 22:27:10 +00:00
Carl Eugen Hoyos	36904c4c9f	Icc 11.1 still does not align the stack pointer, disable some x264 functions. Originally committed as revision 19454 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-07-17 09:07:38 +00:00
Jason Garrett-Glaser	73b02e2460	SSE version of clear_blocks Originally committed as revision 19206 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-06-16 17:33:57 +00:00
David Conrad	c21c835b8d	avg_ pixel functions need to use (dst+pix+1)>>1 to average with existing pixels, not (dst+pix)>>1. This makes the mmx functions bitexact with the C functions. Originally committed as revision 18527 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-15 19:10:16 +00:00
David Conrad	9bf0fdf378	VC1: extend MMX qpel MC to include MMX2 avg qpel Originally committed as revision 18519 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-15 02:25:42 +00:00
David Conrad	8013da7364	VC1: add and use avg_no_rnd chroma MC functions Originally committed as revision 18518 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-14 23:56:10 +00:00
David Conrad	c374691b28	Rename put_no_rnd_h264_chroma* to reflect its usage in VC1 only Originally committed as revision 18517 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-14 23:55:39 +00:00
Stefano Sabatini	6b4343616c	Rename FF_MM_MMXEXT to FF_MM_MMX2, for both clarity and consistency with libswscale. Originally committed as revision 18330 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-04 13:20:53 +00:00
Reimar Döffinger	0be9e73e38	Mark line_skip3 asm argument as output-only instead of using av_uninit. Originally committed as revision 18327 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-03 14:03:49 +00:00
Reimar Döffinger	d7460a9cac	Mark put_signed_pixels_clamped_mmx output operands as early-clobber because they are. Hopefully fixes some FATE errors, too. Originally committed as revision 18326 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-03 14:02:34 +00:00
Reimar Döffinger	531a3d2721	Use DECLARE_ASM_CONST for non-global ff_vector128 constant used via MANGLE Originally committed as revision 18325 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-03 14:01:24 +00:00
Alex Converse	3dd6531208	Rewrite put_signed_pixels_clamped_mmx() to eliminate mmx.h from dsputil_mmx.c. Originally committed as revision 18319 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-04-02 21:02:42 +00:00
Zuxy Meng	ecb24904fe	add SSE2 version of vp6_filter_diag original patch by Zuxy Meng zuxy.meng _at_ gmail _dot_ com Originally committed as revision 17195 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-02-13 00:02:33 +00:00
Sebastien Lucas	6af3c226c3	add MMX version of vp6_filter_diag original patch by Sebastien Lucas sebastien.lucas _at_ gmail _dot_ com Originally committed as revision 17194 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-02-12 23:52:52 +00:00
Aurelien Jacobs	5110b25e1e	convert ff_pw_64 into an xmm_reg for future use in vp6 sse code Originally committed as revision 17192 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-02-12 23:48:07 +00:00
Diego Biurrun	d3a4b4e09c	Add check whether the compiler/assembler supports 10 or more operands. thanks to Loren for some help with the asm statements Originally committed as revision 17151 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-02-11 11:16:00 +00:00
Loren Merritt	3daa434a40	ff_add_hfyu_median_prediction_mmx2 overall ffvhuff decoding speedup: 28% on core2, 25% on k8. Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-02-08 17:45:30 +00:00
David Conrad	137ae32760	Workaround for gcc 3.4 to align sh properly Originally committed as revision 16797 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-26 03:40:48 +00:00
Diego Biurrun	406792e7b0	cosmetics: Remove pointless period after copyright statement non-sentences. Originally committed as revision 16684 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-19 15:46:40 +00:00
Aurelien Jacobs	49fb20cb8a	replace all occurrence of ENABLE_ by the corresponding CONFIG_, HAVE_ or ARCH_ and remove all ENABLE_ definitions. Originally committed as revision 16600 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-14 17:19:17 +00:00
Aurelien Jacobs	b250f9c66d	Change semantic of CONFIG_, HAVE_ and ARCH_*. They are now always defined to either 0 or 1. Originally committed as revision 16590 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-13 23:44:16 +00:00
Diego Biurrun	c47d146be8	Add missing 'void' keyword to parameterless function declarations. Originally committed as revision 16436 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-05 13:57:43 +00:00
Mathieu Velten	21ff7689da	Use H264 MMX chroma functions to accelerate RV40 decoding. Patch by Mathieu Velten (matmaul A gmail) Originally committed as revision 16419 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-04 01:36:11 +00:00
Jason Garrett-Glaser	37fed10087	Add x264 SSE2 iDCT functions to H.264 decoder. Originally committed as revision 16409 to svn://svn.ffmpeg.org/ffmpeg/trunk	2009-01-03 00:46:17 +00:00
Carl Eugen Hoyos	2c67c65963	Fix h264 decoding on SSE2 cores with icc compilation. Originally committed as revision 16373 to svn://svn.ffmpeg.org/ffmpeg/trunk	2008-12-28 19:40:13 +00:00
Jason Garrett-Glaser	c1fc70362f	Fix compilation without optimization under 64-bit with x264 deblock asm enabled. Originally committed as revision 16313 to svn://svn.ffmpeg.org/ffmpeg/trunk	2008-12-26 00:19:08 +00:00
Diego Biurrun	a6493a8fbd	Rename libavcodec/i386/ --> libavcodec/x86/. It contains optimizations that are not specific to i386 and libavutil uses this naming scheme already. Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk	2008-12-22 09:12:42 +00:00

379 Commits