generic-library/vpx

Author	SHA1	Message	Date
Johann	eb88b172fe	Make vp9 subpixel match vp8. The only difference between the two was that the vp9 function allowed for every step in the bilinear filter (16 steps) while vp8 only allowed for half of those. Since all the call sites in vp9 shift the input left by one (<< 1), vp9 only ever used the same steps as vp8. This will allow moving the subpel variance to vpx_dsp with the rest of the variance functions. Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75	2015-06-03 22:10:51 -07:00
James Zern	330fba41e2	vp9 intrinsics: add vp9_rtcd include. Silences a missing declaration warning. Change-Id: I59a34e1a1377cf3529b678d7ec0122bd43ab1bf1	2015-05-15 10:43:47 -07:00
levytamar82	efdfdf5787	32 Align Load bug. In sub_pixel_avg_variance, the parameter sec was also read with an aligned load; it is now loaded unaligned. Change-Id: I4d4966e0291059ea4d705baed1503dc58444fcb7	2014-08-14 14:07:28 -07:00
levytamar82	69a5f5ecf7	Fix bug 807. In the sub_pixel_variance function, dst is aligned to 16 bytes, not 32 bytes, so the data is now loaded unaligned. Change-Id: I2e0b9745543697efc56fefa32857ea10117af135	2014-08-07 18:51:02 -07:00
levytamar82	ea14909687	AVX2 SubPixel AVG Variance Optimization. Optimizing 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_avg_variance64x64 2. vp9_sub_pixel_avg_variance32x32. Both of those functions were calling vp9_sub_pixel_avg_variance16xh_ssse3; instead of calling that function, they now call vp9_sub_pixel_avg_variance32xh_avx2, which is written in AVX2 and processes 32 elements in parallel. This optimization gave an 80% function-level gain and a 2% user-level gain. Change-Id: Iea694654e1b7612dc6ed11e2626208c2179502c8	2014-02-28 22:51:04 -07:00
James Zern	d12b39daab	vp9_subpel_variance_impl_intrin_avx2.c: make some tables static + fix formatting Change-Id: I7b4ec11b7b46d8926750e0b69f7a606f3ab80895	2014-02-18 20:42:49 -08:00
levytamar82	52dac5d1cb	AVX2 SubPixel Variance Optimization. Optimizing 2 functions to process 32 elements in parallel instead of 16: 1. vp9_sub_pixel_variance64x64 2. vp9_sub_pixel_variance32x32. Both of those functions were calling vp9_sub_pixel_variance16xh_ssse3; instead of calling that function, they now call vp9_sub_pixel_variance32xh_avx2, which is written in AVX2 and processes 32 elements in parallel. This optimization gave a 70% function-level gain and a 2% user-level gain. Change-Id: I4f5cb386b346ff6c878a094e1c3b37e418e50bde	2014-02-14 16:59:11 -07:00

7 Commits
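
The reasoning in commit eb88b172fe (a 16-step bilinear filter indexed only with offsets shifted left by one behaves like an 8-step filter) can be checked with a small sketch. The helper names below are hypothetical and are not taken from the libvpx source. The tap values assume two-tap filters whose weights sum to 128, with the second tap growing by 8 per step in the 16-step case and by 16 per step in the 8-step case.

```c
#include <stdint.h>

/* Hypothetical helper: taps for a 16-step bilinear filter.
 * Assumption: the two taps sum to 128 and the second tap is 8 * step. */
static void bilinear_taps_16(int step, int16_t taps[2]) {
  taps[0] = (int16_t)(128 - 8 * step);
  taps[1] = (int16_t)(8 * step);
}

/* Hypothetical helper: taps for an 8-step bilinear filter.
 * Assumption: the two taps sum to 128 and the second tap is 16 * step. */
static void bilinear_taps_8(int step, int16_t taps[2]) {
  taps[0] = (int16_t)(128 - 16 * step);
  taps[1] = (int16_t)(16 * step);
}

/* Returns 1 if every 8-step offset x selects the same taps as the
 * 16-step filter indexed with (x << 1), and 0 otherwise. */
int doubled_steps_match(void) {
  for (int x = 0; x < 8; ++x) {
    int16_t wide[2];
    int16_t narrow[2];
    bilinear_taps_16(x << 1, wide);
    bilinear_taps_8(x, narrow);
    if (wide[0] != narrow[0] || wide[1] != narrow[1]) return 0;
  }
  return 1;
}
```

Under these assumptions, the odd entries of the 16-step filter are never reached when callers always pass a doubled offset, which is why the two variants could be merged without changing results.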