generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	6a94d6ad8e	Merge "Enable sse2 version of inverse wht for hbd build"	2016-01-31 04:38:39 +00:00
James Zern	8faccb709a	Merge changes If13946e4,I61a1814d,I2ca9aa3c,I44d91eaa * changes: intrapred: protect functions w/CONFIG check vp9_noise_estimate: protect copy_frame w/CONFIG check vp8_cx_iface: delete 3 unused functions vp8: mark intra_prediction_down_copy inline	2016-01-30 00:17:16 +00:00
Yaowu Xu	0aef1bc898	Enable sse2 version of inverse wht for hbd build Change-Id: If8f5efd701a11c8a7ad3078d10ec3cd0fe27667e	2016-01-29 14:47:56 -08:00
Yaowu Xu	b229710811	SSSE3 idct8x8 functions for highbitdepth build This commit changes SSSE3 optimized idct8x8 functions to work with highbitdepth build. With this commit and the previous one that enabled SSSE3 idct32x32 functions, tests showed virtually no difference on decoding speed for file fdJc1_IBKJA.248.webm for the build with --enable-vp9-highbitdepth option and the build without the option. Change-Id: Ibe0634149ec70e8b921e6b30171664b8690a9c45	2016-01-29 12:36:53 -08:00
Yaowu Xu	aac1ef7f80	Enable hbd_build to use SSSE3 optimized functions This commit changes the SSSE3 assembly functions for idct32x32 to support highbitdepth build. On test clip fdJc1_IBKJA.248.webm, this cuts the speed difference between hbd and lbd build from between 3-4% to 1-2%. Change-Id: Ic3390e0113bc1ca5bba8ec80d1795ad31b484fca	2016-01-29 01:30:43 +00:00
James Zern	fea27ccca0	intrapred: protect functions w/CONFIG check d207e, d63e, d45e are only used with CONFIG_MISC_FIXES Change-Id: If13946e483c4d0ccaa3e1d60dc14216c06d5a219	2016-01-26 20:13:57 -08:00
James Zern	3a2ad10de2	Merge "Code clean of sad4xNx4D_sse"	2016-01-25 20:57:15 +00:00
Alex Converse	ed3df445d9	Revert "Merge "Change highbd variance rounding to prevent negative variance."" This reverts commit `ea48370a50`, reversing changes made to `15939cb2d7`. The commit was insufficiently tested and causes failures. Change-Id: I623d6fc2cd3ae6fd42d0abab1f8eada465ae57a7	2016-01-13 11:19:06 -08:00
Alex Converse	ea48370a50	Merge "Change highbd variance rounding to prevent negative variance."	2016-01-13 00:25:54 +00:00
Jian Zhou	26a6ce4c6d	Code clean of highbd_tm_predictor_32x32 Remove the ARCH_X86_64 constraint. No performance hit on both big core and small core. Change-Id: I39860b62b7a0ae4acaafdca7d68f3e5820133a81	2015-12-22 16:51:57 -08:00
Jian Zhou	355bfa2193	Code clean of highbd_tm_predictor_16x16 Remove the ARCH_X86_64 constraint. Change-Id: I0139f8e998cc5525df55161c2054008d21ac24d4	2015-12-22 16:34:40 -08:00
Jian Zhou	a4c265f1b7	Code clean of highbd_dc_predictor_32x32 Remove the ARCH_X86_64 constraint. Change-Id: I7d2545fc4f24eb352cf3e03082fc4d48d46fbb09	2015-12-22 16:06:54 -08:00
James Zern	cedb1db594	Merge "Code clean of highbd_tm_predictor_4x4"	2015-12-22 16:45:01 +00:00
James Zern	a097963f80	Merge "Code clean of highbd_dc_predictor_4x4"	2015-12-22 16:30:37 +00:00
Jian Zhou	52e7f4153b	Merge "Code clean of highbd_v_predictor_4x4"	2015-12-21 18:07:48 +00:00
Yunqing Wang	b597e3e188	Merge "Fix for issue 1114 compile error"	2015-12-19 04:29:39 +00:00
James Zern	8b2ddbc728	sad_sse2: fix sad4xN(_avg) on windows reduce the register count by 1 to avoid xmm6 and unnecessarily penalizing the other users of the base macro Change-Id: I59605c9a41a31c1b74f67ec06a40d1a7f92c4699	2015-12-18 19:19:32 -08:00
Jian Zhou	db11307502	Code clean of highbd_tm_predictor_4x4 Replace MMX with SSE2, reduce mem access to left neighbor, loop unrolled. Change-Id: I941be915af809025f121ecc6c6443f73c9903e70	2015-12-18 18:43:41 -08:00
Jian Zhou	c91dd55eda	Code clean of highbd_v_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: I2ab8f30a71e5fadbbc172fb385093dec1e11a696	2015-12-18 15:25:27 -08:00
Jian Zhou	8366b414dd	Code clean of highbd_dc_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: Ic57855254e26757191933c948fac6aa047fadafc	2015-12-18 12:45:23 -08:00
Peter de Rivaz	7361ef732b	Fix for issue 1114 compile error In 32-bit build with --enable-shared, there is a lot of register pressure and register src_strideq is reused. The code needs to use the stack based version of src_stride, but this doesn't compile when used in an lea instruction. This patch also fixes a related segmentation fault caused by the implementation using src_strideq even though it has been reused. This patch also fixes the HBD subpel variance tests that fail when compiled without disable-optimizations. These failures were caused by local variables in the assembler routines colliding with the caller's stack frame. Change-Id: Ice9d4dafdcbdc6038ad5ee7c1c09a8f06deca362	2015-12-18 09:43:22 +00:00
Jian Zhou	789dbb3131	Code clean of sad4xNx4D_sse Replace MMX with SSE2. Change-Id: I948ca1be6ed9b8e67f16555e226f1203726b7da6	2015-12-17 17:43:46 -08:00
Jian Zhou	b158d9a649	Code clean of sad4xN(_avg)_sse Replace MMX with SSE2, reduce psadbw ops which may help Silvermont. Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d	2015-12-17 11:10:42 -08:00
James Zern	b81f04a0cc	Merge "move vp9_avg to vpx_dsp"	2015-12-15 03:41:22 +00:00
James Zern	d36659cec7	move vp9_avg to vpx_dsp Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f	2015-12-14 14:42:12 -08:00
Jian Zhou	2404e3290e	Merge "Code clean of tm_predictor_32x32"	2015-12-14 17:56:01 +00:00
Jian Zhou	6e87880e7f	Merge "Speed up tm_predictor_16x16"	2015-12-11 18:55:46 +00:00
Jian Zhou	88120481a4	Code clean of tm_predictor_32x32 Reallocate the xmm register usage so that no ARCH_X86_64 required. Reduce memory access to the left neighbor by half. Speed up by single digit on big core machine. Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990	2015-12-11 10:32:08 -08:00
Jian Zhou	62f986265f	Merge "SSE2 based h_predictor_32x32"	2015-12-11 18:02:34 +00:00
James Zern	ecb8dff768	Merge "dc_left_pred[48]: fix pic builds"	2015-12-11 02:48:11 +00:00
Jian Zhou	5604924945	Merge "Code clean of dc_left/top_predictor_16x16"	2015-12-11 01:53:44 +00:00
James Zern	40ee78bc19	dc_left_pred[48]: fix pic builds GET_GOT modifies the stack pointer so the offset for left's address will be wrong if loaded afterward. Change-Id: Iff9433aec45f5f6fe1a59ed8080c589bad429536	2015-12-10 15:44:31 -08:00
Yunqing Wang	322ea7ff5b	Fix the win32 crash when GET_GOT is not defined This patch continues to fix the win32 crash issue: https://bugs.chromium.org/p/webm/issues/detail?id=1105 Johann's patch is here: https://chromium-review.googlesource.com/#/c/316446/2 Change-Id: I7fe191c717e40df8602e229371321efb0d689375	2015-12-10 14:25:01 -08:00
Jian Zhou	4ec5953080	Code clean of dc_left/top_predictor_16x16 Remove some redundant code. Change-Id: Ida2e8c0ce28770f7a9545ca014fe792b04295260	2015-12-10 11:59:58 -08:00
Jian Zhou	c90a8a1a43	SSE2 based h_predictor_32x32 Relocate the function from SSSE3 to SSE2, Unroll loop from 16 to 8, and reduce mem access to left. Speed up by single digit in ./test_intra_pred_speed on big core machines. Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960	2015-12-10 10:09:58 -08:00
Johann Koenig	420b9f5bd3	Merge "fix null pointer crash in Win32 because esp register is broken"	2015-12-09 19:31:12 +00:00
Jian Zhou	aa5b517a39	Re-enable SSE2 based intra 4x4 prediction 4x4 Intra predictor implemented with MMX is replaced with SSE2. Segfault in change 315561 when decoding vp8 is taken care of. Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76	2015-12-07 18:50:37 -08:00
Scott LaVarnway	c7e557b82c	Merge "VP9: Add ssse3 version of vpx_idct32x32_135_add()"	2015-12-07 21:13:35 +00:00
Sergey Kolomenkin	5fc9688792	fix null pointer crash in Win32 because esp register is broken https://bugs.chromium.org/p/webm/issues/detail?id=1105 Change-Id: I304ea85ea1f6474e26f074dc39dc0748b90d4d3d	2015-12-07 12:57:06 -08:00
James Zern	79a9add666	Revert "MMX in intra 4x4 prediction replaced with SSE2" This reverts commit `89a1efa4c4`. This causes a segfault when decoding vp8, in both 32 and 64-bit Change-Id: Idbb9bb28ab897e1d055340497c47b49a12231367	2015-12-05 10:20:39 -08:00
Jian Zhou	e86c7c863e	Speed up h_predictor_16x16 Relocate the function from SSSE3 to SSE2, Unroll loop from 8 to 4, and reduce mem access to left. Speed up by >20% in ./test_intra_pred_speed. Change-Id: Ie48229c2e32404706b722442942c84983bda74cc	2015-12-04 12:12:55 -08:00
Jian Zhou	da3f08fac3	Speed up h_predictor_8x8 Relocate the function from SSSE3 to SSE2, Unroll loop from 4 to 2, and reduce mem access to left. Speed up by >20% in ./test_intra_pred_speed. Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e	2015-12-04 11:36:44 -08:00
Jian Zhou	aa2764abdd	MMX in intra 8x8 prediction replaced with SSE2 8x8 Intra predictor implemented with MMX is replaced with SSE2. Change-Id: I0c90e7c1e1e6942489ac2bfe58903b728aac7a52	2015-12-03 18:11:06 -08:00
Jian Zhou	89a1efa4c4	MMX in intra 4x4 prediction replaced with SSE2 4x4 Intra predictor implemented with MMX is replaced with SSE2. Change-Id: Id57da2a7c38832d0356bc998790fc1989d39eafc	2015-12-03 16:40:23 -08:00
Jian Zhou	623e988add	Merge "SSE2 speed up of h_predictor_4x4"	2015-12-02 18:49:00 +00:00
Scott LaVarnway	f0b0b1fe62	VP9: Add ssse3 version of vpx_idct32x32_135_add() Change-Id: I9a780131efaad28cf1ad233ae64c5c319a329727	2015-12-02 04:50:46 -08:00
Jian Zhou	c7fae5d893	Speed up tm_predictor_16x16 Reduce mem access to left. Speed up by 10% in ./test_intra_pred_speed with the same instruction size. Change-Id: Ia33689d62476972cc82ebb06b50415aeccc95d15	2015-11-30 17:46:40 -08:00
Scott LaVarnway	2669e05949	Merge "VPX: x86 asm version of vpx_idct32x32_1024_add()"	2015-11-30 23:28:27 +00:00
Jian Zhou	9d29d76280	SSE2 speed up of h_predictor_4x4 Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers. Speed up by ~25% in ./test_intra_pred_speed. Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d	2015-11-30 10:08:05 -08:00
Scott LaVarnway	0148e20c3c	VPX: x86 asm version of vpx_idct32x32_1024_add() Change-Id: I3ba4ede553e068bf116dce59d1317347988b3542	2015-11-25 10:11:29 -08:00

216 Commits