FFT: factor a shuffle out of the inner loop and merge it into fft_permute.

6% faster SSE FFT on Conroe, 2.5% on Penryn.

Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
(cherry picked from commit e6b1ed693a)
Loren Merritt
2011-02-12 11:48:16 +00:00
committed by Michael Niedermayer
parent 709946b34c
commit 11ab1e409f
6 changed files with 45 additions and 38 deletions

@@ -30,6 +30,7 @@ av_cold void ff_fft_init_mmx(FFTContext *s)
         s->imdct_half = ff_imdct_half_sse;
         s->fft_permute = ff_fft_permute_sse;
         s->fft_calc = ff_fft_calc_sse;
+        s->fft_permutation = FF_FFT_PERM_SWAP_LSBS;
     } else if (has_vectors & AV_CPU_FLAG_3DNOWEXT && HAVE_AMD3DNOWEXT) {
         /* 3DNowEx for K7 */
         s->imdct_calc = ff_imdct_calc_3dn2;
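
The idea in the commit message, folding a fixed shuffle into the permutation table so the inner loop no longer has to perform it, can be illustrated with a small standalone sketch. This is not the FFmpeg code: the helper names (swap_lsbs, bitrev4, merged_matches_two_pass), the 16-point size, and the use of a plain bit-reversal table are assumptions made for illustration only. The sketch checks that permuting and then swapping the two low index bits gives the same result as a single pass through a table with the swap already applied.

```c
#include <string.h>

#define N 16 /* illustrative transform size, 4 index bits */

/* Hypothetical helper: swap the two least significant bits of an index. */
static unsigned swap_lsbs(unsigned j)
{
    return (j & ~3u) | ((j >> 1) & 1u) | ((j << 1) & 2u);
}

/* Hypothetical helper: plain bit reversal of a 4-bit index. */
static unsigned bitrev4(unsigned i)
{
    unsigned r = 0;
    for (int b = 0; b < 4; b++)
        r |= ((i >> b) & 1u) << (3 - b);
    return r;
}

/* Returns 1 if one pass through the merged table matches
 * "permute, then shuffle" done as two separate passes. */
static int merged_matches_two_pass(void)
{
    int in[N], tmp[N], two_pass[N], one_pass[N];
    unsigned revtab[N], merged[N];

    for (unsigned i = 0; i < N; i++) {
        in[i]     = 100 + (int)i;
        revtab[i] = bitrev4(i);            /* original permutation  */
        merged[i] = swap_lsbs(revtab[i]);  /* shuffle folded in     */
    }

    /* Two passes: permute first, then apply the shuffle separately. */
    for (unsigned i = 0; i < N; i++)
        tmp[revtab[i]] = in[i];
    for (unsigned i = 0; i < N; i++)
        two_pass[swap_lsbs(i)] = tmp[i];

    /* One pass: the shuffle costs nothing extra at run time. */
    for (unsigned i = 0; i < N; i++)
        one_pass[merged[i]] = in[i];

    return memcmp(two_pass, one_pass, sizeof two_pass) == 0;
}
```

Calling merged_matches_two_pass() from a test harness should return 1. The design point is that the table is built once at init time, so any fixed reordering applied to it is free on every later transform.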