Christophe Gisquet 08e3ea60ff x86: synth filter float: implement SSE2 version
Timings for Arrandale:
          C    SSE
win32:  2108   334
win64:  1152   322

Factorizing the inner loop with a call/jmp is a >15 cycles cost, even with
the jmp destination being aligned.

Unrolling for ARCH_X86_64 is a 20 cycles gain.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-28 13:00:48 +01:00

Libav README
------------

1) Documentation
----------------

* Read the documentation in the doc/ directory.

2) Licensing
------------

* See the LICENSE file.