Commit Graph

152 Commits

Author SHA1 Message Date
Ronald S. Bultje
c60ed66dbe Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's
wrong with it tomorrow or so, then re-submit.

Originally committed as revision 24341 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 23:57:09 +00:00
Ronald S. Bultje
6526976f0c Remove FF_MM_SSE2/3 flags for CPUs where this is generally not faster than
regular MMX code. Examples of this are the Core1 CPU. Instead, set a new flag,
FF_MM_SSE2/3SLOW, which can be checked for particular SSE2/3 functions that
have been checked specifically on such CPUs and are actually faster than
their MMX counterparts.

In addition, use this flag to enable particular VP8 and LPC SSE2 functions
that are faster than their MMX counterparts.

Based on a patch by Loren Merritt <lorenm AT u washington edu>.

Originally committed as revision 24340 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 22:38:23 +00:00
Ronald S. Bultje
1878f685c0 Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions.
Originally committed as revision 24339 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 21:53:28 +00:00
Ronald S. Bultje
fb9bdf048c Be more efficient with registers or stack memory. Saves 8/16 bytes stack
for x86-32, or 2 MM registers on x86-64.

Originally committed as revision 24338 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 21:45:36 +00:00
Ronald S. Bultje
3facfc99da Change function prototypes for width=8 inner and mbedge loopfilter functions
so that it does both U and V planes at the same time. This will have speed
advantages when using SSE2 (or higher) optimizations, since we can do both
the U and V rows together in a single xmm register.

This also renames filter16 to filter16y and filter8 to filter8uv so that it's
more obvious what each function is used for.

Originally committed as revision 24337 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-19 21:18:04 +00:00
Loren Merritt
1ee076b1b1 more credits to D. J. Bernstein for fft
Originally committed as revision 24308 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-18 20:06:42 +00:00
Ronald S. Bultje
819b2dd2b1 Attempt to fix x86-64 testsuite on fate.
Originally committed as revision 24275 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 21:35:30 +00:00
Ronald S. Bultje
6f323f1251 Remove duplicate define.
Originally committed as revision 24272 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 19:54:47 +00:00
Ronald S. Bultje
889b2c26ee Revert 24270, it contained some stuff that shouldn't have been in there.
Originally committed as revision 24271 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 19:54:25 +00:00
Ronald S. Bultje
2356a7834b Remove duplicate define.
Originally committed as revision 24270 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 19:42:32 +00:00
Ronald S. Bultje
ede1b9665a Give x86 r%d registers names, this will simplify implementation of the chroma
inner loopfilter, and it also allows us to save one register on x86-64/sse2.

Originally committed as revision 24269 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 19:38:10 +00:00
Ronald S. Bultje
526e831a46 Change return statement, the REP_RET is a mistake since the else case (x86-64,
sse2) doesn't actually loop, so REP_RET isn't necessary.

Originally committed as revision 24268 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-16 18:29:14 +00:00
Ronald S. Bultje
a711eb4829 VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations.
Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-15 23:02:34 +00:00
David Conrad
faa26db28b MMX/SSE VC1 loop filter
Originally committed as revision 24208 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-11 22:53:01 +00:00
David Conrad
7af8fbd348 Make ff_pw_4 128 bits
Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-11 22:52:55 +00:00
Vitor Sessak
881fd7a62f Move SSE optimized 32-point DCT to its own file. Should fix breakage with YASM
disabled.

Originally committed as revision 24078 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-06 17:48:23 +00:00
Vitor Sessak
4dcc4f8eaa SSE optimized 32-point DCT
Originally committed as revision 24077 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-06 16:58:54 +00:00
Ronald S. Bultje
f2a30bd840 Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-03 19:26:30 +00:00
Jason Garrett-Glaser
b06855f18a SSSE3 versions of vp8 width4 bilinear MC functions
Originally committed as revision 24013 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-03 00:48:12 +00:00
Jason Garrett-Glaser
dcc602d802 SSSE3 versions of width4 VP8 6-tap MC functions
Also make some small changes to saturation order of 4-tap SSSE3 MC to fix a
non-bitexactness bug.

Patch mostly by Eli Friedman <eli.friedman AT gmail DOT com>.

Originally committed as revision 23965 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-02 05:27:41 +00:00
Jason Garrett-Glaser
8434fc26eb Fix 100L in vp8dsp asm init
Originally committed as revision 23946 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 22:09:22 +00:00
Jason Garrett-Glaser
17dc7c7a60 Fix h264/vp8 intra pred on Athlon XP
Whose idea was it to have a CPU that didn't SIGILL on an invalid instruction?

Originally committed as revision 23927 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-07-01 10:29:47 +00:00
Måns Rullgård
49bd8e4b84 Fix grammar errors in documentation
Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-30 15:38:06 +00:00
Jason Garrett-Glaser
82a8d0f114 Use add instead of lshift in mmxext vp8 idct
Originally committed as revision 23891 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 17:23:17 +00:00
Ronald S. Bultje
565344e7e4 Remove unused macros (duplicates from the now-LGPL x86util.asm).
Originally committed as revision 23890 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 17:04:29 +00:00
Ronald S. Bultje
2dd2f71692 MMX idct_add for VP8.
Originally committed as revision 23886 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 14:43:11 +00:00
Jason Garrett-Glaser
29e719377f Add missing mm_support call toff_h264_pred_init_x86.
I'm not sure if this is supposed to be here, but it can't hurt.

Originally committed as revision 23885 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 12:28:06 +00:00
Jason Garrett-Glaser
004cda8e79 Add mmxext version of VP8 DC Hadamard transform
Originally committed as revision 23878 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 01:41:59 +00:00
Jason Garrett-Glaser
37355fe823 Make x86util.asm LGPL so we can use it in LGPL asm
Strip out most x264-specific stuff (not used anywhere in ffmpeg).

Originally committed as revision 23877 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 00:40:12 +00:00
Jason Garrett-Glaser
bc14f04b2f MMXEXT version of vp8 4x4 vertical pred
Originally committed as revision 23876 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-29 00:23:52 +00:00
Jason Garrett-Glaser
fb9927ad7d Add mmx/mmxext/ssse3 4x4 TM intra pred functions for vp8
Originally committed as revision 23875 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 23:53:07 +00:00
Jason Garrett-Glaser
8b746bb473 Add missing comment header for predict_4x4_dc_mmxext
Originally committed as revision 23874 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 23:37:24 +00:00
Jason Garrett-Glaser
270a85d259 Fix some intra pred MMX functions that used MMXEXT instructions
Also add predict_4x4_dc MMXEXT function for vp8/h264.

Originally committed as revision 23873 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 23:35:17 +00:00
Jason Garrett-Glaser
a912da761d Fix VP8 bilinear mc on x86_64
Originally committed as revision 23872 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 22:13:14 +00:00
Baptiste Coudurier
50f70541d3 Change MMXEXT to MMX2, MMXEXT is deprecated
Originally committed as revision 23865 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 21:12:00 +00:00
Jason Garrett-Glaser
0fecad09fe Add x86 asm functions for VP8 put_pixels
Originally committed as revision 23858 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 19:14:40 +00:00
Jason Garrett-Glaser
a173aa8940 Add MMX, SSE2, SSSE3 asm for VP8 bilinear MC
Originally committed as revision 23857 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 18:56:24 +00:00
Måns Rullgård
1f65b67c46 Fix x86 build with h264dsp disabled
Originally committed as revision 23844 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-28 10:02:15 +00:00
Eli Friedman
b3858964d6 Add const to some pointer parameters.
Patch by Eli Friedman,  eli D friedman A gmail

Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 15:11:38 +00:00
David Conrad
30bdefd1de Fix build without yasm
Originally committed as revision 23816 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 02:52:43 +00:00
Jason Garrett-Glaser
0178d14fe5 First shot at VP8 optimizations:
- MMXEXT, SSE2 and SSSE3 MC functions
- MMX and SSE4 IDCT dc_add functions

Patch by Jason Garrett-Glaser <darkshikari gmail com> and myself.

Originally committed as revision 23815 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-27 02:01:45 +00:00
Måns Rullgård
0912db0206 Make vp8 select h264dsp and use this to pull in mmx intrapred
Originally committed as revision 23790 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-25 19:10:08 +00:00
Carl Eugen Hoyos
0c59074868 Fix compilation without --enable-gpl.
Originally committed as revision 23789 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-25 19:06:29 +00:00
Carl Eugen Hoyos
96da2a6967 Cosmetics: Fix indentation.
Originally committed as revision 23785 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-25 18:34:03 +00:00
Jason Garrett-Glaser
4af8cdfc3f 16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264
Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-25 18:25:49 +00:00
Vitor Sessak
89c7d8058c Fix compilation on x64.
Originally committed as revision 23753 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-24 08:53:32 +00:00
Vitor Sessak
57dbd12b6d Fix asm constraints in apply_window()
Originally committed as revision 23752 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-24 08:46:47 +00:00
Vitor Sessak
bc2b368215 SSE-optimized MP3 floating point windowing functions
Originally committed as revision 23750 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-24 07:44:50 +00:00
Jason Garrett-Glaser
2966cc1849 Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longoer necessary for PIC-compliant loads.

Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-23 19:20:46 +00:00
David Conrad
413abbe164 Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora
Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk
2010-06-04 04:46:26 +00:00