generic-library/vpx

Author	SHA1	Message	Date
Linfeng Zhang	f95686895b	Merge changes I08b562b6,Ia275940a,I51106e90 * changes: Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} Update highbd idct x86 optimizations. Update 32x32 idct sse2 and ssse3 optimizations.	2017-08-16 16:36:37 +00:00
Jerome Jiang	6b9c691daf	Merge "Clean up writing YUV files for debug purpose."	2017-08-15 18:28:54 +00:00
Jerome Jiang	a153080b55	Clean up writing YUV files for debug purpose. Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files. Delete some flags that can be enabled during build. To enable writing denoised YUV, use the following command line: CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure --enable-vp9-temporal-denoising For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP' Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528	2017-08-15 10:44:03 -07:00
Johann	77ed4414d6	quantize: silence unsigned overflow warning The result of the xor operation is unsigned. If coeff was negative, this results in an unsigned value - INT_MIN. Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0	2017-08-15 09:48:24 -07:00
Linfeng Zhang	d72e20b123	Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1} BUG=webm:1412 Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f	2017-08-14 17:05:22 -07:00
Linfeng Zhang	69775d2f40	Update highbd idct x86 optimizations. BUG=webm:1412 Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069	2017-08-14 16:59:50 -07:00
Linfeng Zhang	3f05a70c41	Update 32x32 idct sse2 and ssse3 optimizations. Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70	2017-08-14 16:59:31 -07:00
Linfeng Zhang	15193ce51f	Merge "Clean highbd idct x86 code with inline functions"	2017-08-10 20:25:18 +00:00
Johann Koenig	9bb8ce5efb	Merge "neon: vpx_quantize_b_32x32"	2017-08-10 15:42:49 +00:00
Johann Koenig	0b393ae505	Merge "quantize: copy ssse3 optimizations to intrinsics"	2017-08-10 15:42:20 +00:00
Linfeng Zhang	39da7fb786	Clean highbd idct x86 code with inline functions Created inline functions highbd_butterfly_cospi16_sse2() and highbd_butterfly_cospi16_sse4_1() BUG=webm:1412 Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a	2017-08-08 17:53:28 -07:00
Johann	93166c5e51	neon: vpx_quantize_b_32x32 With skip block the neon is about twice as fast as C. The neon has no shortcut for coeff < zbin so it always takes the same amount of time. Even if the C can take the shortcut, it is over twice as fast in neon. If it can't, that gap increases to over 10x. BUG=webm:1426 Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6	2017-08-08 14:05:18 -07:00
Johann	d52cb59729	quantize: copy ssse3 optimizations to intrinsics Fairly minor differences from sse2. pabsw and psignw are the big gains. Also re-uses some values in eob calculation to avoid an extra pcmp. Fixes test failures in HBD and OS X builds. Allows using it in 32bit builds, where it is about 40% faster than sse2. Substantially faster than the assembly for skip_block. 10-20% faster the rest of the time. Change-Id: If783bb3567e561e47667e10133b9c84414a334e2	2017-08-08 12:22:14 -07:00
Linfeng Zhang	853165ba39	Update 32x32 idct sse2 funcs, add partial case 135 Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a	2017-08-07 17:37:02 -07:00
Linfeng Zhang	d670678f26	Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx() in idct x86 code Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083	2017-08-04 15:33:37 -07:00
Linfeng Zhang	fa829e0e5a	Replace multiplication_and_add() with butterfly() in idct x86 code Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e	2017-08-04 15:33:34 -07:00
Linfeng Zhang	c9fb719ee1	Update butterfly() in idct x86 optimizations. Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701	2017-08-04 15:33:28 -07:00
Linfeng Zhang	7f20c3ac44	Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1 BUG=webm:1412 Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca	2017-08-04 15:31:17 -07:00
Linfeng Zhang	22b6dc9fdf	Update for loop increment of idct x86 functions Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4	2017-08-04 15:29:19 -07:00
Linfeng Zhang	0c61331244	Update high bitdepth 16x16 idct x86 code Prepare for high bitdepth 16x16 idct sse4.1 code. Just functions moving and renaming. BUG=webm:1412 Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a	2017-08-04 15:12:33 -07:00
Scott LaVarnway	c42517568d	vpx_dsp: merge avx2 variance files BUG=webm:1404 Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4	2017-08-04 07:49:30 -07:00
Linfeng Zhang	e921c7ba8d	Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function"	2017-08-04 01:16:35 +00:00
Scott LaVarnway	f6c6f37e0c	Merge "vpx_dsp: Use correct check for halfpel in"	2017-08-03 23:17:09 +00:00
Linfeng Zhang	563d58ab84	Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function BUG=webm:1412 Change-Id: I945f0fb6807b8948747243794dc7352b959221f7	2017-08-03 13:59:47 -07:00
Linfeng Zhang	6624f20785	Merge changes I76727df0,I66297d78,I1d000c6b * changes: Extract inlined 16x16 idct sse2 code into header file Add transpose_32bit_8x4() sse2 optimization Update x86 idct optimization	2017-08-03 20:51:02 +00:00
Scott LaVarnway	8334a48d3a	vpx_dsp: Use correct check for halfpel in vpx_sub_pixel_variance32xh_avx2() and vpx_sub_pixel_avg_variance32xh_avx2 see: `17fae3a` Change to use correct check for halfpel Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c	2017-08-03 06:57:40 -07:00
Linfeng Zhang	15a47db730	Extract inlined 16x16 idct sse2 code into header file Will be called by high bitdepth functions. Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c	2017-08-02 16:17:43 -07:00
Linfeng Zhang	8c0ab7607e	Add transpose_32bit_8x4() sse2 optimization Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955	2017-08-02 16:15:58 -07:00
Scott LaVarnway	698e56f26c	Merge "vpxdsp: variance_impl_avx2.c cleanup"	2017-08-02 19:08:10 +00:00
Scott LaVarnway	632fe8286a	vpxdsp: variance_impl_avx2.c cleanup BUG=webm:1404 Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02	2017-08-02 05:57:39 -07:00
Linfeng Zhang	6738ad7aaf	Update x86 idct optimization Move constant coefficients preparation into inline function. Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1	2017-08-01 14:40:12 -07:00
Linfeng Zhang	c0490b52b1	Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"	2017-08-01 21:39:39 +00:00
Johann Koenig	847394fe77	Merge "neon: vpx_quantize_b"	2017-08-01 16:44:31 +00:00
Linfeng Zhang	bf14d468c1	Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2 This replaces commit `aa1c4cd`, which has a bug and was reverted in commit `3c73e58`. The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d(). Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8	2017-07-31 16:36:13 -07:00
Johann	2d6b5df657	neon: vpx_quantize_b With skip block or coeff < zbin it is about twice as fast as C. If most coeff values are > zbin it is about 10-15x as fast as C. BUG=webm:1426 Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7	2017-07-31 10:38:46 -07:00
James Zern	78155b7ed5	highbd_inv_txfm_sse4: make << of neg. val a multiply left shifting a negative value is undefined; quiets a ubsan warning. this is applied to a constant, no change in the generated code. Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254	2017-07-30 12:48:28 -07:00
James Zern	d35b627340	Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2" This reverts commit `aa1c4cd140`. This fails the following tests with extreme input coefficients: SSE2/InvTrans8x8DCT.CompareReference/0 SSE2/InvTrans8x8DCT.CompareReference/2 previously the optimized path was skipped in this range Change-Id: I9af015a46eba96208834a219fafd651d37556a80	2017-07-29 11:12:27 -07:00
Linfeng Zhang	75653b7032	Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34 * changes: Update vpx_idct16x16_10_add_sse2() Add vpx_idct16x16_38_add_sse2() Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2 Refactor highbd idct 4x4 and 8x8 x86 functions	2017-07-28 22:43:00 +00:00
James Zern	3c73e587d1	Revert "quantize ssse3: declare all variables" This reverts commit `03f5e300d6`. This causes test failures under OSX: SSSE3/VP9QuantizeTest.EOBCheck/0 SSSE3/VP9QuantizeTest.OperationCheck/0 Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b	2017-07-28 01:22:16 -07:00
Linfeng Zhang	5232e35bc2	Update vpx_idct16x16_10_add_sse2() Change-Id: Ia0e20f5fa47382af5785221eebb05212b40bd35c	2017-07-27 18:03:25 -07:00
Linfeng Zhang	7f4acf8700	Add vpx_idct16x16_38_add_sse2() Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898	2017-07-27 18:02:43 -07:00
Linfeng Zhang	aa1c4cd140	Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2 BUG=webm:1412 Change-Id: I35df041b757d42278ac7a5cdbd909e8ffcee1455	2017-07-27 18:02:36 -07:00
Linfeng Zhang	9c43d81bc2	Refactor highbd idct 4x4 and 8x8 x86 functions BUG=webm:1412 Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec	2017-07-27 18:01:03 -07:00
Johann Koenig	a83e1f1d53	Merge "quantize ssse3: declare all variables"	2017-07-27 21:18:35 +00:00
James Zern	1c666465af	inv_txfm_{sse2,ssse3}: clear conversion warnings visual studio reports tran_high_t (int64) -> short in calls to _mm_set1_epi16 Change-Id: Icb8d1baee77ad3d45edb1477a443d3e648f0b745	2017-07-25 20:13:49 -07:00
James Zern	62682ac8ad	highbd_idct_sse.c: clear conversion warnings visual studio reports tran_high_t (int64) -> int in calls to _mm_setr_epi32 Change-Id: Ic2247c8e3800991202151790d78bd94c4f4aed05	2017-07-25 20:11:09 -07:00
James Zern	85736e616e	vpx_variance16x16_sse2: correct cast order allow the right shift to operate on 64-bits, this matches the rest of the implementations previously: `b0f1ae147` vpx_get16x16var_avx2: correct cast order Change-Id: I632ee5e418f3f9b30e79ecd05588eb172b0783aa	2017-07-25 16:45:40 -07:00
James Zern	b0f1ae1475	vpx_get16x16var_avx2: correct cast order allow the right shift to operate on 64-bits, this matches the rest of the implementations missed in: `6acd061aa` variance_avx2: sync variance functions with c-code Change-Id: Icae436b881251ccb9f9ed64fcbf8d358c58a4617	2017-07-24 16:29:44 -07:00
Johann	4b9a848bb3	variance: call C comp_avg_pred Keep optimized code out of the reference implementation. This matches the style of the other sub calls. Change-Id: I3da6acd4f2c647b029c420e22ac9410a18259689	2017-07-18 20:22:53 +00:00
Johann	03f5e300d6	quantize ssse3: declare all variables Copy missing line from avx implementation. Change-Id: I9755c5b4d4034867de6fa9f741c24bf49dce3a27	2017-07-18 12:32:57 -07:00
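
The cast-order fixes in `85736e616e` and `b0f1ae1475` ("allow the right shift to operate on 64-bits") can be illustrated with a minimal C sketch. The helper names, parameters, and example values are assumptions for illustration and are not taken from the vpx source. The sketch only shows why narrowing the product before shifting can lose high bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shift the 64-bit product first, then narrow to 32 bits. */
static uint32_t variance_shift_then_cast(uint32_t sse, int sum, int log2_count) {
  assert(log2_count >= 0 && log2_count < 32);
  return sse - (uint32_t)(((int64_t)sum * sum) >> log2_count);
}

/* Hypothetical helper: narrow to 32 bits first, then shift.
 * Any bits of sum * sum above bit 31 are discarded before the shift. */
static uint32_t variance_cast_then_shift(uint32_t sse, int sum, int log2_count) {
  assert(log2_count >= 0 && log2_count < 32);
  return sse - ((uint32_t)((int64_t)sum * sum) >> log2_count);
}
```

For a flat block of 1024 8-bit pixels all equal to 255 (sum = 261120, sse = 66585600, log2_count = 10), the first helper returns 0, as expected for a block with no variation. The second returns a nonzero value because sum * sum no longer fits in 32 bits.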

847 Commits
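
Commit `78155b7ed5` replaces a left shift of a negative value with a multiply, because left-shifting a negative signed integer is undefined behavior in C. A minimal sketch of that pattern follows. The helper name and its parameters are hypothetical and do not come from the vpx source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scale a possibly negative value by 2^shift.
 * Writing `value << shift` would be undefined for negative `value`,
 * so the power of two is built from a positive 1 and then multiplied.
 * The caller is assumed to keep the result within the int32_t range. */
static int32_t scale_by_pow2(int32_t value, int shift) {
  assert(shift >= 0 && shift < 31);
  return value * (int32_t)(1 << shift);
}
```

As the commit message notes for its own constant, a compiler will typically emit the same code for a power-of-two multiply as for the shift, so the change is about well-defined semantics rather than performance.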