f1342a7b07
This function now has an AVX intrinsics version which is about 80% faster than the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters.

The function relies on 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.

For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]

For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i] (cost function is even)

These must hold, otherwise the AVX version of the function cannot be used.

Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
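
The three properties above could be verified with a check along the following lines. This is only a sketch, not code from the commit: the helper name `sad_cost_tables_allow_avx`, its parameters, and the assumption that each component cost pointer refers to the centre of a table valid for indices -max_mv..max_mv are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of the commit): returns true when the
 * three lookup-table properties listed above hold.
 *
 * Assumptions made for this sketch:
 *   - joint_cost has at least 4 entries;
 *   - comp_cost0 and comp_cost1 each point at the centre of a table,
 *     so indices -max_mv..max_mv are valid. */
static bool sad_cost_tables_allow_avx(const int *joint_cost,
                                      const int *comp_cost0,
                                      const int *comp_cost1,
                                      int max_mv) {
  /* Joint cost: entries 1, 2 and 3 must be identical. */
  if (joint_cost[1] != joint_cost[2] || joint_cost[2] != joint_cost[3])
    return false;

  for (int i = -max_mv; i <= max_mv; ++i) {
    /* Both components must share the same cost. */
    if (comp_cost0[i] != comp_cost1[i]) return false;
    /* The cost function must be even: cost(i) == cost(-i). */
    if (comp_cost0[i] != comp_cost0[-i]) return false;
  }
  return true;
}
```

A caller could use such a check to fall back to the C implementation whenever one of the properties fails.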
vp9_avg_intrin_sse2.c
vp9_dct_mmx.asm
vp9_dct_sse2.c
vp9_dct_ssse3_x86_64.asm
vp9_dct_ssse3.c
vp9_denoiser_sse2.c
vp9_diamond_search_sad_avx.c
vp9_error_intrin_avx2.c
vp9_error_sse2.asm
vp9_highbd_block_error_intrin_sse2.c
vp9_highbd_error_avx.asm
vp9_highbd_error_sse2.asm
vp9_quantize_sse2.c
vp9_quantize_ssse3_x86_64.asm
vp9_temporal_filter_apply_sse2.asm