generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	9d24fe60f1	Merge "Code clean of sub_pixel_variance4xh -- 2"	2016-05-26 13:20:24 +00:00
Scott LaVarnway	a4f3751be5	Code clean of sub_pixel_variance4xh -- 2 Replace MMX with SSE2. Change-Id: Id8482d2589131f9427e7f36bc64413f058caf31f	2016-05-24 04:44:05 -07:00
James Zern	3fb55d24e8	Revert "Code clean of sub_pixel_variance4xh" This reverts commit `2468163e07`. causes valgrind errors for overread of buffer in SubpelVarianceTest Change-Id: I448e52c76f815ac199305b71f7d169f2bc167679	2016-05-19 23:37:27 -07:00
James Zern	146ccd304f	Merge "Code clean of sub_pixel_variance4xh"	2016-05-18 23:18:35 +00:00
Johann Koenig	36b610d8c1	Merge "neon hadamard 8x8"	2016-05-18 20:11:16 +00:00
Scott LaVarnway	2468163e07	Code clean of sub_pixel_variance4xh Replace MMX with SSE2. Change-Id: Ia8fcba755952804e347d7d7736f57d1f90c988a0	2016-05-18 04:24:41 -07:00
Johann	9b54e812f7	neon hadamard 8x8 Runs about 30% faster than the C BUG=webm:1021 Change-Id: I6809d6d84c3077ab619c53298296950e976bdaba	2016-05-16 11:58:02 -07:00
Linfeng Zhang	2f55beb355	Merge "remove mmx variance functions"	2016-05-11 22:21:23 +00:00
Linfeng Zhang	d0ffae825d	remove mmx variance functions there are sse2 equivalents which is a reasonable modern baseline Removed mmx variance functions: vpx_get_mb_ss_mmx() vpx_get8x8var_mmx() vpx_get4x4var_mmx() vpx_variance4x4_mmx() vpx_variance8x8_mmx() vpx_mse16x16_mmx() vpx_variance16x16_mmx() vpx_variance16x8_mmx() vpx_variance8x16_mmx() Change-Id: Iffaf85344c6676a3dd337c0645a2dd5deb2f86a1	2016-05-11 12:39:42 -07:00
Linfeng Zhang	d0e687bf8c	remove mmx sad functions there are sse2 equivalents which is a reasonable modern baseline Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba	2016-05-11 10:50:04 -07:00
Jim Bankoski	fce3cee8dd	Move vpx_add_plane from codec to vpx_dsp and dedup. Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7	2016-05-02 12:17:39 -07:00
Johann	2f5840de3e	vpx_minmax_8x8_neon and test BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156 Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6	2016-04-21 21:40:25 -07:00
Johann Koenig	c59c5cbeff	Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm"	2016-04-15 16:00:51 +00:00
Martin Storsjo	d8b3e29ee7	Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm This was never hooked up for the 32x32_34 case as the neon_asm version in `3f7c12da`, when the intrinsics version was added. Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320	2016-04-15 10:25:47 +03:00
Johann	26faa3ec7a	Apply 'const' to data not pointer Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298	2016-04-14 14:47:16 -07:00
Scott LaVarnway	67c4c8244a	VPX: loopfilter_mmx.asm using x86inc 2 This reverts commit `9aa083d164`. Fixes a decoder mismatch with 32bit PIC builds. Change-Id: I94717df662834810302fe3594b38c53084a4e284	2016-03-08 04:24:47 -08:00
James Zern	9aa083d164	Revert "VPX: loopfilter_mmx.asm using x86inc" This reverts commit `15ecdc3970`. breaks 32-bit pic builds Change-Id: I8bb1b9471a293f05ac7423aaba0339d408931b7a	2016-03-04 18:23:45 -08:00
Scott LaVarnway	15ecdc3970	VPX: loopfilter_mmx.asm using x86inc Change-Id: Idcf29281d617b275e3ca50f77e6d00c60992a36d	2016-02-18 15:34:58 -08:00
James Zern	9b44d9d00f	split vpx_highbd_lpf_horizontal_16 in two replace with vpx_highbd_lpf_horizontal_edge_16 and vpx_highbd_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I551f8cec0fce57032cb2652584bb802e2248644d	2016-02-16 23:13:58 -08:00
James Zern	1b519fb666	split vpx_lpf_horizontal_16 in two replace with vpx_lpf_horizontal_edge_16 and vpx_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I848c95c02a3c6ebaa6c2bdf0983dce05cd645271	2016-02-16 22:57:45 -08:00
James Zern	e7a23d703b	vpx_highbd_lpf_horizontal_4: remove unused count param Change-Id: I655a771e1b1a8753be5669ef9348a312ba6cfdbc	2016-02-16 22:57:45 -08:00
James Zern	5171857329	vpx_highbd_lpf_horizontal_8: remove unused count param Change-Id: Iaca71ea3796115d4c2d43563b4e6f3914e21f1bf	2016-02-16 22:57:44 -08:00
James Zern	3c1019e49d	vpx_highbd_lpf_vertical_4: remove unused count param Change-Id: Ic6da723c5cf3cd8127db1f476c3e46ea134cb774	2016-02-16 22:57:44 -08:00
James Zern	72a9f06ac2	vpx_highbd_lpf_vertical_8: remove unused count param Change-Id: Id16f7259897654831d31642c2d5e0bbe5e13416c	2016-02-16 22:57:44 -08:00
James Zern	b1e97c6a25	vpx_lpf_horizontal_4: remove unused count param Change-Id: Iec7d8eda343991f7d7d46931dca17af23c821d11	2016-02-16 22:57:27 -08:00
James Zern	bd5a5bb561	vpx_lpf_horizontal_8: remove unused count param Change-Id: I48741e167a7b09b7c9ad3bfc1c4b88ef1029ae46	2016-02-16 22:54:40 -08:00
James Zern	109a47b342	vpx_lpf_vertical_4: remove unused count param Change-Id: I43a191cb3d42e51e7bca266adfa11c6239a8064c	2016-02-16 14:59:00 -08:00
James Zern	37225744db	vpx_lpf_vertical_8: remove unused count param Change-Id: Ic69406da00afb0f06588e8c0deb2b043952b078c	2016-02-16 14:59:00 -08:00
Yaowu Xu	0aef1bc898	Enable sse2 version of inverse wht for hbd build Change-Id: If8f5efd701a11c8a7ad3078d10ec3cd0fe27667e	2016-01-29 14:47:56 -08:00
Yaowu Xu	b229710811	SSSE3 idct8x8 functions for highbitdepth build This commit changes SSSE3 optimized idct8x8 functions to work with highbitdepth build. With this commit and the previous one that enabled SSSE3 idct32x32 functions, tests showed virtually no difference on decoding speed for file fdJc1_IBKJA.248.webm for the build with --enable-vp9-highbitdepth option and the build without the option. Change-Id: Ibe0634149ec70e8b921e6b30171664b8690a9c45	2016-01-29 12:36:53 -08:00
Yaowu Xu	aac1ef7f80	Enable hbd_build to use SSSE3 optimized functions This commit changes the SSSE3 assembly functions for idct32x32 to support highbitdepth build. On test clip fdJc1_IBKJA.248.webm, this cuts the speed difference between hbd and lbd build from between 3-4% to 1-2%. Change-Id: Ic3390e0113bc1ca5bba8ec80d1795ad31b484fca	2016-01-29 01:30:43 +00:00
James Zern	3a2ad10de2	Merge "Code clean of sad4xNx4D_sse"	2016-01-25 20:57:15 +00:00
Jian Zhou	26a6ce4c6d	Code clean of highbd_tm_predictor_32x32 Remove the ARCH_X86_64 constraint. No performance hit on both big core and small core. Change-Id: I39860b62b7a0ae4acaafdca7d68f3e5820133a81	2015-12-22 16:51:57 -08:00
Jian Zhou	355bfa2193	Code clean of highbd_tm_predictor_16x16 Remove the ARCH_X86_64 constraint. Change-Id: I0139f8e998cc5525df55161c2054008d21ac24d4	2015-12-22 16:34:40 -08:00
Jian Zhou	a4c265f1b7	Code clean of highbd_dc_predictor_32x32 Remove the ARCH_X86_64 constraint. Change-Id: I7d2545fc4f24eb352cf3e03082fc4d48d46fbb09	2015-12-22 16:06:54 -08:00
James Zern	cedb1db594	Merge "Code clean of highbd_tm_predictor_4x4"	2015-12-22 16:45:01 +00:00
James Zern	a097963f80	Merge "Code clean of highbd_dc_predictor_4x4"	2015-12-22 16:30:37 +00:00
Jian Zhou	db11307502	Code clean of highbd_tm_predictor_4x4 Replace MMX with SSE2, reduce mem access to left neighbor, loop unrolled. Change-Id: I941be915af809025f121ecc6c6443f73c9903e70	2015-12-18 18:43:41 -08:00
Jian Zhou	c91dd55eda	Code clean of highbd_v_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: I2ab8f30a71e5fadbbc172fb385093dec1e11a696	2015-12-18 15:25:27 -08:00
Jian Zhou	8366b414dd	Code clean of highbd_dc_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: Ic57855254e26757191933c948fac6aa047fadafc	2015-12-18 12:45:23 -08:00
Jian Zhou	789dbb3131	Code clean of sad4xNx4D_sse Replace MMX with SSE2. Change-Id: I948ca1be6ed9b8e67f16555e226f1203726b7da6	2015-12-17 17:43:46 -08:00
Jian Zhou	b158d9a649	Code clean of sad4xN(_avg)_sse Replace MMX with SSE2, reduce psadbw ops which may help Silvermont. Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d	2015-12-17 11:10:42 -08:00
James Zern	b81f04a0cc	Merge "move vp9_avg to vpx_dsp"	2015-12-15 03:41:22 +00:00
James Zern	d36659cec7	move vp9_avg to vpx_dsp Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f	2015-12-14 14:42:12 -08:00
Jian Zhou	88120481a4	Code clean of tm_predictor_32x32 Reallocate the xmm register usage so that no ARCH_X86_64 required. Reduce memory access to the left neighbor by half. Speed up by single digit on big core machine. Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990	2015-12-11 10:32:08 -08:00
Jian Zhou	c90a8a1a43	SSE2 based h_predictor_32x32 Relocate the function from SSSE3 to SSE2, Unroll loop from 16 to 8, and reduce mem access to left. Speed up by single digit in ./test_intra_pred_speed on big core machines. Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960	2015-12-10 10:09:58 -08:00
Jian Zhou	aa5b517a39	Re-enable SSE2 based intra 4x4 prediction 4x4 Intra predictor implemented with MMX is replaced with SSE2. Segfault in change 315561 when decoding vp8 is taken care of. Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76	2015-12-07 18:50:37 -08:00
Scott LaVarnway	c7e557b82c	Merge "VP9: Add ssse3 version of vpx_idct32x32_135_add()"	2015-12-07 21:13:35 +00:00
James Zern	79a9add666	Revert "MMX in intra 4x4 prediction replaced with SSE2" This reverts commit `89a1efa4c4`. This causes a segfault when decoding vp8, in both 32 and 64-bit Change-Id: Idbb9bb28ab897e1d055340497c47b49a12231367	2015-12-05 10:20:39 -08:00
Jian Zhou	e86c7c863e	Speed up h_predictor_16x16 Relocate the function from SSSE3 to SSE2, Unroll loop from 8 to 4, and reduce mem access to left. Speed up by >20% in ./test_intra_pred_speed. Change-Id: Ie48229c2e32404706b722442942c84983bda74cc	2015-12-04 12:12:55 -08:00

117 Commits