Commit Graph

12 Commits

Author SHA1 Message Date
Ronald S. Bultje
baffa091af Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.
(cherry picked from commit 81f2a3f4ff)
2011-02-02 03:40:49 +01:00
Jason Garrett-Glaser
2966cc1849 Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longoer necessary for PIC-compliant loads.

Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-23 19:20:46 +00:00
Alex Converse
3deb53849e Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-01-22 23:07:58 +00:00
Loren Merritt
758c7455f1 fix a crash in ape decoding on x86_32 sse2
Originally committed as revision 20777 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-08 21:24:01 +00:00
Loren Merritt
a4605efdf5 slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe
Originally committed as revision 20743 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 17:53:11 +00:00
Loren Merritt
b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
Loren Merritt
b10fa1bb8b port ape dsp functions from sse2 to mmx
now requires yasm

Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-03 18:53:12 +00:00
Loren Merritt
b07781b6e4 fix linking on systems with a function name prefix (10l in r20287)
Originally committed as revision 20294 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 21:44:03 +00:00
Loren Merritt
e17ccf60fe huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:47:25 +00:00
Loren Merritt
2f77923d72 simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv

Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-10-18 20:10:10 +00:00
Loren Merritt
3daa434a40 ff_add_hfyu_median_prediction_mmx2
overall ffvhuff decoding speedup: 28% on core2, 25% on k8.

Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-02-08 17:45:30 +00:00
Diego Biurrun
a6493a8fbd Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2008-12-22 09:12:42 +00:00