generic-library/vpx

Author	SHA1	Message	Date
Yi Luo	7317200002	Hybrid inverse transforms 16x16 AVX2 optimization - Add unit tests to verify the bit-exact result. - User level time reduction (EXT_TX): encoder: 3.63% decoder: 2.36% - Also add tx_type=V_DCT...H_FLIPADST SSE2 for 16x16 inv txfm. Change-Id: Idc6d9e8254aa536e5f18a87fa0d37c6bd551c083	2016-11-01 13:38:20 -07:00
Yi Luo	1a0f27aaa6	Fix avx2 16x16/32x32 fwd txfm coeff output on HBD Change-Id: Ida036defe5688894a63007a31aa2dd0b3f0b5d59	2016-10-21 14:14:00 -07:00
Yi Luo	157e45a44b	Fix the overflow of av1_fht32x32() in 2D DCT_DCT - Use a range check function to avoid DCT_DCT overflow. The column txfm side scaling/rounding needs to be re-developed; for now, we prefer to maintain the current BDRate level. - Encoder user level time reduction <1% owing to av1_fht32x32_avx2. - Add MemCheck unit test and fdct32() unit test. Change-Id: I1e67030f67bc637859798ebe2f6698afffb8531c	2016-10-20 09:22:24 -07:00
Yi Luo	fed8e1c06d	Hybrid forward transform 32x32 AVX2 optimization - av1_fht32x32 AVX2 function level time reduction ~89% compared to C. - av1_fht32x32_avx2() on DCT_DCT improves 42.62% over aom_fdct32x32_avx2(), but function replacement must go with the corresponding inverse txfm. - No obvious user level time reduction due to 32x32 TX_TYPE selection. - Zero high 128b YMM to avoid AVX-SSE transition penalties (fix 16x16 case). - Added 32x32 AVX2 unit tests to verify bit-exact results. - AVX2 optimization summary: On CPU i7-6700, based on 16x16/32x32 fwd txfm optimization results: C to AVX2: function level time reduction, ~86-89%. SSE2 to AVX2: function level time reduction, ~51%. Change-Id: Idd0cd8bf066a61c7117140ef15ab6c1f8eb4b036	2016-10-12 14:19:53 -07:00
Yi Luo	e8e8cd8f1b	Hybrid forward transforms 16x16 AVX2 optimization - Unit tests are added for AVX2 SIMD. - Encoder speed improvement: AV1 baseline and EXT_TX, three 1080p sequences at bitrate: 800 Kbps, 2 Mbps, 6 Mbps, on i7-6700 CPU, average user level time reduction: 3.86%. Change-Id: Ibbd7837ee3a831c6b1e4e471bf6c8d3fa3a19ff4	2016-10-06 15:33:15 -07:00

5 Commits
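
The commit messages above report gains as "time reduction" percentages, which can be harder to compare than speedup factors. The following is a minimal sketch, not part of the repository, that converts a time reduction into a speedup factor and derives the C to SSE2 reduction implied by the two figures quoted in fed8e1c06d. The function names are hypothetical, and the derivation assumes the C to AVX2 and SSE2 to AVX2 percentages were measured on the same functions and machine.

```python
def speedup_from_reduction(reduction_pct):
    """Convert a run-time reduction (percent) into a speedup factor.

    A reduction of r percent leaves (1 - r/100) of the original time,
    so the speedup is the reciprocal of that remaining fraction.
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - reduction_pct / 100.0)


def implied_first_stage_reduction(total_pct, second_stage_pct):
    """Reduction (percent) of a first stage, given the total reduction and
    the reduction of a second stage applied on top of the first.

    Remaining time fractions multiply: total = first * second.
    """
    remaining_total = 1.0 - total_pct / 100.0
    remaining_second = 1.0 - second_stage_pct / 100.0
    return 100.0 * (1.0 - remaining_total / remaining_second)


if __name__ == "__main__":
    # Figures quoted in the fed8e1c06d commit message.
    for label, pct in [("C to AVX2 (low)", 86.0),
                       ("C to AVX2 (high)", 89.0),
                       ("SSE2 to AVX2", 51.0)]:
        print(f"{label}: {pct}% reduction = "
              f"{speedup_from_reduction(pct):.2f}x speedup")

    # Hypothetical derived figure: C to SSE2 reduction implied by the above.
    for total in (86.0, 89.0):
        print(f"implied C to SSE2 reduction for {total}% total: "
              f"{implied_first_stage_reduction(total, 51.0):.1f}%")
```

Under these assumptions, the quoted ~86-89% reduction corresponds to roughly a 7.1x to 9.1x speedup over C, and the ~51% reduction to about 2x over SSE2.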