generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	7e89c102c4	vp9-highbitdepth -> vpx-highbitdepth Change-Id: I1e90cf7ab4bb02c0ef119b0bd1596771edefedff	2016-08-05 15:41:33 -07:00
Yaowu Xu	3bd709fafe	Remove vp8, vp9 folders Change-Id: I09b8acd22d031ece52e1fee18b998349bf1cf06b	2016-07-28 14:33:21 +00:00
Yi Luo	81ad95363a	Convolution vertical filter SSSE3 optimization - Apply 8-pixel vertical filtering direction parallelism. - Add unit tests to verify bit exact. - Encoder speed improves ~29% (enable EXT_INTERP) on Xeon E5-2680. - Combinational cycle count of vp10_convolve() drops from 26.06% to 6.73%. Change-Id: Ic1ae48f8fb1909991577947a8c00d07832737e57	2016-06-23 12:56:47 -07:00
Yi Luo	229690a95c	Convolution horizontal filter SSSE3 optimization - Apply signal direction/4-pixel vertical/8-pixel vertical parallelism. - Add unit test to verify the bit exact result. - Overall encoding time improves ~24% on Xeon E5-2680 CPU. Change-Id: I104dcbfd43451476fee1f94cd16ca5f965878e59	2016-06-20 11:10:30 -07:00
Jingning Han	9de916eb20	Fix dual filter type for high bit-depth This commit fixes the compiler error in high bit-depth inter predictor when dual filter type experiment is turned on. Change-Id: I404a76a246477f2fcffc38a3275007d5dfe229cd	2016-05-09 02:14:48 +00:00
Jingning Han	bd33326372	Dual prediction filter type for motion compensated reference Make the bit-stream level support per direction filter type coding for motion compensated reference. Change-Id: I61a2360b301075f6734cfd9711b7ae68f214174d	2016-05-07 03:03:04 +00:00
Angie Chiang	c0f708c03a	Merge "convolve8 sse2 test" into nextgenv2	2016-03-11 19:57:30 +00:00
Debargha Mukherjee	bab2912b5e	Some refactoring and cleanups of interp filter Includes various cosmetic changes and refactoring including naming the sharp filters differently (since they are no longer 8-tap). Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a	2016-02-26 15:42:49 -08:00
Angie Chiang	8878fa4f9a	convolve8 sse2 test This experiment shows that when the frame size is 64x64, vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2 run at similar speeds. However, when the frame size becomes 1024x1024, vpx_highbd_convolve8_sse2 is around 50% slower than vpx_convolve8_sse2; we think the bottleneck is memory IO. VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms) VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms) VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms) Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943	2016-02-24 17:01:20 -08:00
Angie Chiang	1e403064b9	Fix 12 TAP convolution bug Previously, we did 12-tap interpolation even when there was no sub-pixel offset. This could cause a bug because the decoder doesn't extend the border when there is no sub-pixel offset. In this situation, if we still interpolate, we access a border extension that doesn't exist and cause a memory error. Change-Id: I55b879722f0a10c5d13261bd9617a75c826a2418	2016-02-19 19:31:38 -08:00
Angie Chiang	d5349112e8	add convolution function with adjustable length Change-Id: I1a5b1e15a188ef11594d0c6ac0dbd42aac59cfca	2016-02-05 17:33:19 -08:00

11 Commits
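
The message for commit 1e403064b9 describes skipping interpolation when there is no sub-pixel offset, so the filter never reads a border that was not extended. A minimal sketch of that kind of guard follows. It is not the code from the commit: `convolve_row`, its parameters, and the `FILTER_BITS` value of 7 are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: kernel coefficients sum to 1 << FILTER_BITS. */
#define FILTER_BITS 7

static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: filter one row of w pixels with a kernel of
 * `taps` coefficients. `subpel` is the sub-pixel offset; 0 means the
 * motion vector points at a full-pixel position. */
static void convolve_row(const uint8_t *src, uint8_t *dst, int w,
                         int subpel, const int16_t *kernel, int taps) {
  if (subpel == 0) {
    /* No interpolation needed: copy the row and never read pixels
     * outside [src, src + w), so no border extension is required. */
    memcpy(dst, src, (size_t)w);
    return;
  }
  /* The filter window starts (taps / 2 - 1) pixels left of each output
   * pixel, so it relies on an extended border around the row. */
  const uint8_t *start = src - (taps / 2 - 1);
  for (int x = 0; x < w; ++x) {
    int sum = 0;
    for (int k = 0; k < taps; ++k) sum += start[x + k] * kernel[k];
    /* Round to nearest and scale back to pixel range. */
    dst[x] = clip_pixel((sum + (1 << (FILTER_BITS - 1))) >> FILTER_BITS);
  }
}
```

With `subpel == 0` the function can be called on a buffer that has no padding at all. Any other offset assumes the caller has extended the border by at least `taps / 2 - 1` pixels on the left and `taps / 2` on the right.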