3068d7d944
Optimizing all SSSE3 assembly for convolution: 1. vp9_filter_block1d4_h8_sse2 2. vp9_filter_block1d8_h8_sse2 3. vp9_filter_block1d16_h8_sse2 4. vp9_filter_block1d4_v8_sse2 5. vp9_filter_block1d8_v8_sse2 6. vp9_filter_block1d16_v8_sse2 My optimizations include: - processing 2x8 elements in one 128-bit register instead of 8 elements in one 128-bit register. - removing unnecessary loads. This optimization gives a user-level gain of 2.4% for 480p input and 1.6% for 720p input. This optimization is done only for 64-bit. Change-Id: Ic07fce2f9360329b4f2d956efda1480ae958766b |
||
---|---|---|
vp9_asm_stubs.c | ||
vp9_copy_sse2.asm | ||
vp9_idct_intrin_sse2.c | ||
vp9_intrapred_sse2.asm | ||
vp9_intrapred_ssse3.asm | ||
vp9_loopfilter_intrin_avx2.c | ||
vp9_loopfilter_intrin_sse2.c | ||
vp9_loopfilter_mmx.asm | ||
vp9_postproc_mmx.asm | ||
vp9_postproc_sse2.asm | ||
vp9_postproc_x86.h | ||
vp9_subpixel_8t_intrin_avx2.c | ||
vp9_subpixel_8t_intrin_ssse3.c | ||
vp9_subpixel_8t_sse2.asm | ||
vp9_subpixel_8t_ssse3.asm | ||
vp9_subpixel_bilinear_sse2.asm | ||
vp9_subpixel_bilinear_ssse3.asm |