Christophe Gisquet 133b34207c x86: float dsp: unroll SSE versions
vector_fmul and vector_fmac_scalar are guaranteed to be called with a length
that is a multiple of 16 elements, but their SSE versions only process 8 at a time.

Therefore, unroll them a bit.
299 to 261 cycles for 256 elements in vector_fmac_scalar on Arrandale/Win64.
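
As an illustration only, the unrolling idea can be sketched in C with SSE intrinsics. This is not the yasm code touched by the commit: the function name fmac_scalar_unrolled16 is hypothetical, unaligned loads and stores are used so the sketch makes no alignment assumption, and a plain C loop stands in when SSE is unavailable. It assumes, as stated above, that the length is a multiple of 16.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper (not FFmpeg's symbol): dst[i] += src[i] * mul.
 * Assumption: len is a multiple of 16, so no tail loop is needed. */
void fmac_scalar_unrolled16(float *dst, const float *src, float mul, size_t len)
{
    assert(len % 16 == 0);
#if defined(__SSE__)
    const __m128 m = _mm_set1_ps(mul);   /* broadcast the scalar to 4 lanes */
    for (size_t i = 0; i < len; i += 16) {
        /* Four independent 4-float multiplies per iteration (16 floats). */
        __m128 p0 = _mm_mul_ps(_mm_loadu_ps(src + i),      m);
        __m128 p1 = _mm_mul_ps(_mm_loadu_ps(src + i + 4),  m);
        __m128 p2 = _mm_mul_ps(_mm_loadu_ps(src + i + 8),  m);
        __m128 p3 = _mm_mul_ps(_mm_loadu_ps(src + i + 12), m);
        /* Accumulate into dst and store back. */
        _mm_storeu_ps(dst + i,      _mm_add_ps(_mm_loadu_ps(dst + i),      p0));
        _mm_storeu_ps(dst + i + 4,  _mm_add_ps(_mm_loadu_ps(dst + i + 4),  p1));
        _mm_storeu_ps(dst + i + 8,  _mm_add_ps(_mm_loadu_ps(dst + i + 8),  p2));
        _mm_storeu_ps(dst + i + 12, _mm_add_ps(_mm_loadu_ps(dst + i + 12), p3));
    }
#else
    /* Portable fallback when SSE is not available at compile time. */
    for (size_t i = 0; i < len; i++)
        dst[i] += src[i] * mul;
#endif
}
```

Handling 16 floats per iteration halves the loop-counter and branch overhead compared with an 8-float loop, which is the likely source of a gain like the one measured; the actual saving depends on the CPU.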

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-02-15 18:54:21 +01:00