Loren Merritt b1159ad928 refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
9-123% faster ape decoding on G4.

Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
2009-12-05 15:09:10 +00:00
..
2009-05-26 05:00:51 +00:00
2009-08-24 10:59:14 +00:00
2009-05-25 22:19:35 +00:00