vpx/vp9/common/x86
Jingning Han 9d67495f72 Optimize 32x32 2D inverse DCT for speed-up
This commit exploits the sparsity of the quantized coefficient matrix.
It checks each 32x8 array and skips the corresponding inverse
transformation if all entries are zero.
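
The zero-band check described above can be sketched in portable C. This is only an illustration, not the code from vp9_idct_intrin_sse2.c: it assumes a row-major 32x32 block split into four bands of 8 rows, and the names `band_is_zero`, `transform_band` and `inverse_pass_sparse` are hypothetical. The placeholder transform just copies its input; because the inverse DCT is linear, an all-zero band always produces all-zero output, so the transform can be skipped for such a band.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SIZE = 32, BAND_ROWS = 8, BAND_LEN = SIZE * BAND_ROWS };

/* Returns 1 when all 256 coefficients of one band are zero.
 * OR-ing the values together avoids a branch per coefficient. */
static int band_is_zero(const int16_t *band) {
  int acc = 0;
  for (int i = 0; i < BAND_LEN; ++i) acc |= band[i];
  return acc == 0;
}

/* Hypothetical placeholder for the real per-band inverse transform.
 * It only copies its input so that the sketch stays short. */
static void transform_band(const int16_t *in, int16_t *out) {
  memcpy(out, in, BAND_LEN * sizeof(*out));
}

/* Processes the block band by band and skips the transform for
 * all-zero bands. Returns the number of bands actually transformed. */
static int inverse_pass_sparse(const int16_t *coeffs, int16_t *out) {
  assert(coeffs != NULL && out != NULL);
  int transformed = 0;
  for (int b = 0; b < SIZE / BAND_ROWS; ++b) {
    const int16_t *in_band = coeffs + b * BAND_LEN;
    int16_t *out_band = out + b * BAND_LEN;
    if (band_is_zero(in_band)) {
      /* Zero input gives zero output, so write zeros directly. */
      memset(out_band, 0, BAND_LEN * sizeof(*out_band));
    } else {
      transform_band(in_band, out_band);
      ++transformed;
    }
  }
  return transformed;
}
```

An SSE2 version would likely OR whole 128-bit registers together and test the result once per band, but the control flow stays the same.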

For ped1080p at 8000 kbps, this reduces the average runtime of the
32x32 inverse 2D-DCT SSE2 function from 6256 cycles to 5200
cycles. It makes the overall encoding process about 2% faster at
speed 0. The speed-up is more pronounced for the decoding process.

Change-Id: If20056c3566bd117642a76f8884c83e8bc8efbcf
2013-07-31 17:13:31 -07:00
vp9_asm_stubs.c convolve8 optimizations for neon 2013-07-11 11:08:19 -07:00
vp9_copy_sse2.asm Replace copy_memNxM functions with a generic copy/avg function. 2013-07-10 18:27:24 -07:00
vp9_idct_intrin_sse2.c Optimize 32x32 2D inverse DCT for speed-up 2013-07-31 17:13:31 -07:00
vp9_intrapred_sse2.asm SSE/SSE2 assembly for 4x4/8x8/16x16/32x32 TM intra prediction. 2013-07-10 09:28:03 -07:00
vp9_intrapred_ssse3.asm d45 intra prediction SSSE3 optimizations. 2013-07-26 13:30:02 -07:00
vp9_loopfilter_intrin_sse2.c vp9_loopfilter_intrin_sse2: cosmetics: fix indent 2013-07-16 13:09:16 -07:00
vp9_loopfilter_mmx.asm Removing unused simple loopfilter code. 2013-05-10 11:04:43 -07:00
vp9_postproc_mmx.asm Code cleanup: lower case variable names. 2013-03-20 16:41:30 -07:00
vp9_postproc_sse2.asm Code cleanup: lower case variable names. 2013-03-20 16:41:30 -07:00
vp9_postproc_x86.h google style guide include guards 2012-11-30 07:30:59 -08:00
vp9_subpixel_8t_ssse3.asm convolve: support larger blocks, fix asm saturation bug 2013-04-18 13:57:59 -07:00