ffmpeg/libavutil/x86
Christophe Gisquet 996697e266 x86: float dsp: unroll SSE versions
vector_fmul and vector_fmac_scalar are guaranteed to be able to process in
batches of 16 elements, but their SSE versions only do 8 at a time.

Therefore, unroll them a bit.
From 299 to 261 cycles for 256 elements in vector_fmac_scalar on Arrandale/Win64.

Signed-off-by: Janne Grunau <janne-libav@jannau.net>
2014-02-20 14:18:05 +01:00
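
The unrolling described in this commit can be illustrated with a minimal C sketch. This is not the code from float_dsp.asm, which is hand-written assembly. It is an approximation using SSE intrinsics, and it assumes that len is a multiple of 16 and that the buffers may be unaligned, hence the unaligned loads and stores. The function names fmac_scalar_ref and fmac_scalar_sse_x16 are hypothetical and exist only for this illustration.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86 targets only */

/* Plain C reference for the operation: dst[i] += src[i] * mul. */
static void fmac_scalar_ref(float *dst, const float *src, float mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] += src[i] * mul;
}

/* Hypothetical sketch of an SSE loop unrolled to 16 floats per iteration
 * (four 4-float XMM registers) rather than 8.
 * Assumption: len is a positive multiple of 16. */
static void fmac_scalar_sse_x16(float *dst, const float *src, float mul, int len)
{
    const __m128 m = _mm_set1_ps(mul);  /* broadcast the scalar to all lanes */

    for (int i = 0; i < len; i += 16) {
        /* Four independent multiplies, one per group of 4 floats. */
        __m128 p0 = _mm_mul_ps(_mm_loadu_ps(src + i),      m);
        __m128 p1 = _mm_mul_ps(_mm_loadu_ps(src + i + 4),  m);
        __m128 p2 = _mm_mul_ps(_mm_loadu_ps(src + i + 8),  m);
        __m128 p3 = _mm_mul_ps(_mm_loadu_ps(src + i + 12), m);

        /* Accumulate each product into the destination. */
        _mm_storeu_ps(dst + i,      _mm_add_ps(_mm_loadu_ps(dst + i),      p0));
        _mm_storeu_ps(dst + i + 4,  _mm_add_ps(_mm_loadu_ps(dst + i + 4),  p1));
        _mm_storeu_ps(dst + i + 8,  _mm_add_ps(_mm_loadu_ps(dst + i + 8),  p2));
        _mm_storeu_ps(dst + i + 12, _mm_add_ps(_mm_loadu_ps(dst + i + 12), p3));
    }
}
```

Processing 16 elements per iteration halves the number of loop-counter updates and branches compared with an 8-element loop, and gives the CPU four independent multiply/add chains to overlap. That is presumably where the cycle reduction quoted above comes from, though the sketch itself makes no performance claim.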
asm.h dsputil: Make dsputil selectable 2013-04-10 11:04:05 +03:00
bswap.h x86: place some inline asm under #if HAVE_INLINE_ASM 2012-06-25 13:23:12 +01:00
cpu.c libavutil: x86: Add AVX2 capable CPU detection. 2013-10-25 19:36:55 +01:00
cpu.h libavutil: x86: Add AVX2 capable CPU detection. 2013-10-25 19:36:55 +01:00
cpuid.asm x86: include x86inc.asm in x86util.asm 2012-10-31 00:37:42 +01:00
emms.asm x86: Add a Yasm-based emms() replacement 2013-01-18 22:02:13 +01:00
emms.h avutil: Ensure that emms_c is always defined, even on non-x86 2013-02-14 19:29:04 +01:00
float_dsp_init.c Consistently use "cpu_flags" as variable/parameter name for CPU flags 2013-07-18 00:31:35 +02:00
float_dsp.asm x86: float dsp: unroll SSE versions 2014-02-20 14:18:05 +01:00
intreadwrite.h Replace FFmpeg with Libav in licence headers 2011-03-19 13:33:20 +00:00
lls_init.c x86: lpc: simd av_evaluate_lls 2013-06-29 13:23:57 +02:00
lls.asm lls/x86: use 3-operator vaddpd in ADDPD_MEM 2013-07-02 10:15:09 +02:00
Makefile x86: lpc: simd av_update_lls 2013-06-29 13:23:57 +02:00
timer.h avutil: Fix compilation with inline asm disabled on mingw 2013-09-22 00:50:32 +03:00
w64xmmtest.h Add more missing includes after removing the implicit common.h 2012-08-16 10:49:54 +03:00
x86inc.asm x86inc: Speed up assembling with Yasm 2014-01-26 18:40:08 +01:00
x86util.asm x86inc: FMA3/4 Support 2013-10-14 12:41:54 +01:00