ed36720b66
This patch follows the "Add filter_selectively_vert_row2 to enable parallel loopfiltering" commit and adds an x86 SSE2 optimization that filters 16 pixels in parallel. For the other optimizations (neon and dspr2), the current 16-pixel functions call the 8-pixel functions twice; real 16-pixel functions can be added later.

Decoder speedup:
tulip clip: 2% speed gain
old_town_cross: 1.2% speed gain
bus: 2% speed gain

Change-Id: I4818a0c72f84b34f5fe678e496cf4a10238574b7
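The "call the 8-pixel function twice" fallback mentioned in the commit message can be sketched roughly as below. This is an illustration only: `toy_filter_edge_8` and `toy_filter_edge_16` are hypothetical names, and the smoothing rule is a made-up stand-in, not the VP9 loop filter. The sketch also assumes that each 8-pixel half takes its own threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an 8-pixel edge filter. It smooths the pixel
 * directly above and directly below a horizontal edge for 8 adjacent columns.
 * This is NOT the VP9 loop filter; it only gives the wrapper something to
 * call. `s` points at the first pixel below the edge. */
static void toy_filter_edge_8(uint8_t *s, int pitch, int thresh) {
  assert(pitch > 0);
  for (int i = 0; i < 8; ++i) {
    const int p0 = s[i - pitch]; /* pixel above the edge */
    const int q0 = s[i];         /* pixel below the edge */
    const int diff = q0 - p0;
    /* Smooth only small steps; larger steps are treated as real edges. */
    if (diff >= -thresh && diff <= thresh) {
      s[i - pitch] = (uint8_t)(p0 + diff / 4);
      s[i] = (uint8_t)(q0 - diff / 4);
    }
  }
}

/* 16-pixel entry point built from two 8-pixel calls, one per half.
 * Assumption: each half carries its own threshold. */
static void toy_filter_edge_16(uint8_t *s, int pitch, int thresh0,
                               int thresh1) {
  toy_filter_edge_8(s, pitch, thresh0);
  toy_filter_edge_8(s + 8, pitch, thresh1);
}
```

A wrapper of this shape lets callers use a single 16-pixel entry point on every target, while a platform-specific version (such as the SSE2 one added here) can replace the two calls with one wide operation.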
||
---|---|---|
vp9_asm_stubs.c | ||
vp9_copy_sse2.asm | ||
vp9_idct_intrin_sse2.c | ||
vp9_intrapred_sse2.asm | ||
vp9_intrapred_ssse3.asm | ||
vp9_loopfilter_intrin_avx2.c | ||
vp9_loopfilter_intrin_sse2.c | ||
vp9_loopfilter_mmx.asm | ||
vp9_postproc_mmx.asm | ||
vp9_postproc_sse2.asm | ||
vp9_postproc_x86.h | ||
vp9_subpixel_8t_sse2.asm | ||
vp9_subpixel_8t_ssse3.asm |