Christophe GISQUET 34454c761f SBR DSP x86: implement SSE sbr_sum_square_sse
The 32bits targets have been compiled with -mfpmath=sse for proper reference.
sbr_sum_square C  /32bits: 82c (unrolled)/102c
               C  /64bits: 69c (unrolled)/82c
               SSE/32bits: 42c
               SSE/64bits: 31c

Use of SSE4.1 dpps to perform the final sum is slower.
Not unrolling to perform 8 operations in a loop yields 10 more cycles.
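
For readers unfamiliar with the operation being timed, here is a minimal scalar sketch in C of a sum of squares and of the kind of unrolling the message refers to. It is not the committed code. The function names, the interleaved (re, im) float layout, and the reading of "8 operations" as eight floats per iteration are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical reference: sum of squares over n complex samples,
 * assumed to be stored as 2*n interleaved (re, im) floats. */
static float sum_square_ref(const float *x, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < 2 * n; i++)
        sum += x[i] * x[i];
    return sum;
}

/* Hypothetical unrolled variant: consumes 8 floats per iteration using
 * four independent accumulators, mimicking the four lanes of an SSE
 * register fed by two 4-float loads. The accumulators are combined once
 * at the end, which stands in for the final horizontal sum. */
static float sum_square_unrolled(const float *x, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t len = 2 * n; /* total number of floats */
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        acc0 += x[i + 0] * x[i + 0] + x[i + 4] * x[i + 4];
        acc1 += x[i + 1] * x[i + 1] + x[i + 5] * x[i + 5];
        acc2 += x[i + 2] * x[i + 2] + x[i + 6] * x[i + 6];
        acc3 += x[i + 3] * x[i + 3] + x[i + 7] * x[i + 7];
    }

    /* Horizontal reduction of the four partial sums. */
    float sum = (acc0 + acc1) + (acc2 + acc3);

    /* Scalar tail for lengths that are not a multiple of 8 floats. */
    for (; i < len; i++)
        sum += x[i] * x[i];

    return sum;
}
```

Because the unrolled variant adds the terms in a different order, its result can differ from the reference in the last bits for general floating-point input. The two agree exactly only when every intermediate value is exactly representable.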

Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
2012-02-23 15:50:06 -08:00

Libav README
------------

1) Documentation
----------------

* Read the documentation in the doc/ directory.

2) Licensing
------------

* See the LICENSE file.