142 Commits

Author SHA1 Message Date
Alex Converse
3deb53849e Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 23:07:58 +00:00
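For orientation, the operation named in this commit is a plain dot product of two float arrays. The following is a minimal scalar sketch, not the SSE code from the commit; the function name `scalarproduct_float_ref` and its signature are assumptions made for illustration.

```c
/* Hypothetical scalar reference (not the committed SSE routine):
 * returns the sum of element-wise products of v1 and v2. */
float scalarproduct_float_ref(const float *v1, const float *v2, int len)
{
    float sum = 0.0f;
    for (int i = 0; i < len; i++)
        sum += v1[i] * v2[i];
    return sum;
}
```

A SIMD version would typically process several floats per iteration and then sum the partial results, which can change the rounding of the total compared with this loop.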
Måns Rullgård
c67278098d Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 03:25:11 +00:00
Gwenole Beauchesne
5716aec3f9 Fix XvMC. XvMCCreateBlocks() may not allocate 16-byte aligned blocks,
so we can't use SSE-optimized routines.

Originally committed as revision 21011 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-04 09:19:32 +00:00
Diego Biurrun
4052cbf161 Get rid of pointless CONFIG_ANY_H263 preprocessor definition.
Originally committed as revision 20975 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-30 11:33:59 +00:00
Loren Merritt
91e644ff77 r20739 broke compilation on systems without yasm
Originally committed as revision 20742 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 17:51:57 +00:00
Loren Merritt
b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
Loren Merritt
b10fa1bb8b port ape dsp functions from sse2 to mmx
now requires yasm

Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-03 18:53:12 +00:00
Loren Merritt
e17ccf60fe huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:47:25 +00:00
Loren Merritt
2f77923d72 simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv

Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:10:10 +00:00
Måns Rullgård
35de5d2412 cosmetics: fix indentation after previous commit
Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:52:00 +00:00
Måns Rullgård
952e872198 Drop unused args from vector_fmul_add_add, simplify code, and rename
The src3 and step arguments to vector_fmul_add_add() are always zero
and one, respectively.  This removes these arguments from the function,
simplifies the code accordingly, and renames the function to better
match the new operation.

Originally committed as revision 20061 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-09-27 16:51:54 +00:00
Vitor Sessak
9263a05aab Mark "i" parameter of vector_clipf_sse() as early-clobber
Originally committed as revision 19731 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 15:52:44 +00:00
Vitor Sessak
50e23ae9d3 Mark parameter src of vector_clipf() as const
Originally committed as revision 19729 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 15:38:59 +00:00
Vitor Sessak
0a68cd876e SSE optimized vector_clipf(). 10% faster TwinVQ decoding.
Originally committed as revision 19728 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-08-27 14:49:36 +00:00
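As a rough picture of what a float clipping routine computes, here is a scalar sketch that clamps each sample to a range. The name `vector_clipf_ref` and the parameter order are assumptions for illustration only, and the code is not taken from the commit.

```c
/* Hypothetical scalar reference: copy src to dst, clamping every
 * element to the closed interval [min, max]. */
void vector_clipf_ref(float *dst, const float *src,
                      float min, float max, int len)
{
    for (int i = 0; i < len; i++) {
        float v = src[i];
        if (v < min)
            v = min;
        else if (v > max)
            v = max;
        dst[i] = v;
    }
}
```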
Diego Biurrun
9be6f0d2f8 Do not check for both CONFIG_VC1_DECODER and CONFIG_WMV3_DECODER,
the former depends upon the latter.

Originally committed as revision 19533 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-29 09:54:49 +00:00
Diego Biurrun
99e5a9d1ea Do not redundantly check for both CONFIG_THEORA_DECODER and CONFIG_VP3_DECODER.
The Theora decoder depends on the VP3 decoder.

Originally committed as revision 19492 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-22 22:27:10 +00:00
Carl Eugen Hoyos
36904c4c9f Icc 11.1 still does not align the stack pointer, disable some x264 functions.
Originally committed as revision 19454 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-07-17 09:07:38 +00:00
Jason Garrett-Glaser
73b02e2460 SSE version of clear_blocks
Originally committed as revision 19206 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-06-16 17:33:57 +00:00
David Conrad
c21c835b8d avg_ pixel functions need to use (dst+pix+1)>>1 to average with existing
pixels, not (dst+pix)>>1.
This makes the mmx functions bitexact with the C functions.

Originally committed as revision 18527 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-15 19:10:16 +00:00
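The difference between the two expressions in this message can be seen in a small scalar sketch. The helper names `avg_round` and `avg_trunc` are hypothetical and only illustrate the arithmetic; they are not functions from the codebase.

```c
#include <stdint.h>

/* Rounding average, as the message says the avg_ functions should use. */
static inline uint8_t avg_round(uint8_t dst, uint8_t pix)
{
    return (uint8_t)((dst + pix + 1) >> 1);
}

/* Truncating average, the form the message says is wrong here. */
static inline uint8_t avg_trunc(uint8_t dst, uint8_t pix)
{
    return (uint8_t)((dst + pix) >> 1);
}
```

The two agree whenever the sum is even and differ by one when it is odd: averaging 1 and 2 gives 2 with rounding and 1 with truncation, which is enough to break bit-exactness between implementations.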
David Conrad
9bf0fdf378 VC1: extend MMX qpel MC to include MMX2 avg qpel
Originally committed as revision 18519 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-15 02:25:42 +00:00
David Conrad
8013da7364 VC1: add and use avg_no_rnd chroma MC functions
Originally committed as revision 18518 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-14 23:56:10 +00:00
David Conrad
c374691b28 Rename put_no_rnd_h264_chroma* to reflect its usage in VC1 only
Originally committed as revision 18517 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-14 23:55:39 +00:00
Stefano Sabatini
6b4343616c Rename FF_MM_MMXEXT to FF_MM_MMX2, for both clarity and consistency
with libswscale.

Originally committed as revision 18330 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-04 13:20:53 +00:00
Reimar Döffinger
0be9e73e38 Mark line_skip3 asm argument as output-only instead of using av_uninit.
Originally committed as revision 18327 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:03:49 +00:00
Reimar Döffinger
d7460a9cac Mark put_signed_pixels_clamped_mmx output operands as early-clobber because
they are. Hopefully fixes some FATE errors, too.

Originally committed as revision 18326 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:02:34 +00:00
Reimar Döffinger
531a3d2721 Use DECLARE_ASM_CONST for non-global ff_vector128 constant used via MANGLE
Originally committed as revision 18325 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-03 14:01:24 +00:00
Alex Converse
3dd6531208 Rewrite put_signed_pixels_clamped_mmx() to eliminate mmx.h from dsputil_mmx.c.
Originally committed as revision 18319 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-04-02 21:02:42 +00:00
Zuxy Meng
ecb24904fe add SSE2 version of vp6_filter_diag
original patch by Zuxy Meng  zuxy.meng _at_ gmail _dot_ com

Originally committed as revision 17195 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-13 00:02:33 +00:00
Sebastien Lucas
6af3c226c3 add MMX version of vp6_filter_diag
original patch by Sebastien Lucas  sebastien.lucas _at_ gmail _dot_ com

Originally committed as revision 17194 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-12 23:52:52 +00:00
Aurelien Jacobs
5110b25e1e convert ff_pw_64 into an xmm_reg for future use in vp6 sse code
Originally committed as revision 17192 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-12 23:48:07 +00:00
Diego Biurrun
d3a4b4e09c Add check whether the compiler/assembler supports 10 or more operands.
thanks to Loren for some help with the asm statements

Originally committed as revision 17151 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-11 11:16:00 +00:00
Loren Merritt
3daa434a40 ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.

Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-08 17:45:30 +00:00
David Conrad
137ae32760 Workaround for gcc 3.4 to align sh properly
Originally committed as revision 16797 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-26 03:40:48 +00:00
Diego Biurrun
406792e7b0 cosmetics: Remove pointless period after copyright statement non-sentences.
Originally committed as revision 16684 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-19 15:46:40 +00:00
Aurelien Jacobs
49fb20cb8a replace all occurrence of ENABLE_ by the corresponding CONFIG_, HAVE_ or ARCH_
and remove all ENABLE_ definitions.

Originally committed as revision 16600 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-14 17:19:17 +00:00
Aurelien Jacobs
b250f9c66d Change semantic of CONFIG_*, HAVE_* and ARCH_*.
They are now always defined to either 0 or 1.

Originally committed as revision 16590 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-13 23:44:16 +00:00
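One consequence of macros that are always defined to 0 or 1 is that they can be tested by value, both in the preprocessor and in ordinary C expressions. The sketch below assumes a hypothetical flag `CONFIG_EXAMPLE_DECODER`, defined locally so the snippet stands alone; in a real build a generated configuration header would be expected to provide it.

```c
/* Hypothetical flag, defined here only so the sketch is self-contained. */
#ifndef CONFIG_EXAMPLE_DECODER
#define CONFIG_EXAMPLE_DECODER 0
#endif

/* Preprocessor test by value: #if rather than #ifdef, because the
 * macro is defined even when the feature is disabled. */
int example_decoder_enabled(void)
{
#if CONFIG_EXAMPLE_DECODER
    return 1;
#else
    return 0;
#endif
}

/* The same flag used in a plain C condition; the compiler can drop
 * the dead branch while still type-checking it. */
int example_decoder_enabled_c(void)
{
    if (CONFIG_EXAMPLE_DECODER)
        return 1;
    return 0;
}
```

With this scheme an `#ifdef CONFIG_EXAMPLE_DECODER` check would always succeed, so existing `#ifdef` tests have to become `#if` tests.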
Diego Biurrun
c47d146be8 Add missing 'void' keyword to parameterless function declarations.
Originally committed as revision 16436 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-05 13:57:43 +00:00
Mathieu Velten
21ff7689da Use H264 MMX chroma functions to accelerate RV40 decoding.
Patch by Mathieu Velten (matmaul A gmail)

Originally committed as revision 16419 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-04 01:36:11 +00:00
Jason Garrett-Glaser
37fed10087 Add x264 SSE2 iDCT functions to H.264 decoder.
Originally committed as revision 16409 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-01-03 00:46:17 +00:00
Carl Eugen Hoyos
2c67c65963 Fix h264 decoding on SSE2 cores with icc compilation.
Originally committed as revision 16373 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-28 19:40:13 +00:00
Jason Garrett-Glaser
c1fc70362f Fix compilation without optimization under 64-bit with x264 deblock asm enabled.
Originally committed as revision 16313 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-26 00:19:08 +00:00
Diego Biurrun
a6493a8fbd Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 09:12:42 +00:00