generic-library/vpx

Author	SHA1	Message	Date
James Zern	aa68a8301e	vpx_dsp_common.h: fix include guard Change-Id: I1ad41c096ec86870f9aecab6fdbc3af03e972afc	2016-05-04 14:54:32 -07:00
James Bankoski	89f905e5e5	Merge "libvpx: add a unit test for plane_add_noise."	2016-05-04 13:09:05 +00:00
Jim Bankoski	34d5aff747	libvpx: add a unit test for plane_add_noise. In so doing this fixes a couple of bugs: vpx_plane_add_noise.c needed to subtract a clamp instead of add. And the assembly (mmx sse) had assumptions that parameters were contiguous in memory which was not true. Change-Id: I76f2c43cf54bfc838eb2edf8a443eaaa7565d7b5	2016-05-03 16:23:06 -07:00
James Bankoski	e755a283dd	Merge "Move vpx_add_plane from codec to vpx_dsp and dedup."	2016-05-03 14:11:57 +00:00
Jim Bankoski	fce3cee8dd	Move vpx_add_plane from codec to vpx_dsp and dedup. Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7	2016-05-02 12:17:39 -07:00
Alex Converse	f6d13e7be5	Merge "bitreader: remove an unsigned overflow."	2016-04-28 16:26:37 +00:00
Alex Converse	a68b24fdee	Tweak casts on vpx_sub_pixel_variance to avoid implicit overflow. Change-Id: I481eb271b082fa3497b0283f37d9b4d1f6de270c	2016-04-27 16:37:18 -07:00
Alex Converse	36a0c7ffe3	bitreader: remove an unsigned overflow. bits_left is in the range [0, 64 (= BD_VALUE_SIZE)] , so the narrowing conversion should be safe. Change-Id: I943fcd359eaad76249ee1e1fb03a2ac16945d2fd	2016-04-27 15:31:35 -07:00
Alex Converse	6c4007be1c	Be explicit about overflow in vpx_variance16x16_sse2. The product always fits in uint32_t, but the operands don't. An optimizing compiler should generate the wraparound code. (Verified with clang). Change-Id: I25eb64df99152992bc898b8ccbb01d55c8d16e3c	2016-04-27 15:22:17 -07:00
Alex Converse	ccb894ce73	Remove casts on < 16x16 variance. These blocks will never overflow since max sum is +/-255wh. Change-Id: Ia2c630339fd9cfb411b56b6040ff402095f12a2e	2016-04-27 15:21:58 -07:00
Johann	2f5840de3e	vpx_minmax_8x8_neon and test BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156 Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6	2016-04-21 21:40:25 -07:00
Johann	8c02a36953	hadamard 8x8 test The order of the output structure is not currently important. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1021 Change-Id: Ibc0006d569675db6c5060c4529f5d9e73f2e96a6	2016-04-21 22:28:21 +00:00
Johann Koenig	c59c5cbeff	Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm"	2016-04-15 16:00:51 +00:00
Martin Storsjo	d8b3e29ee7	Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm This was never hooked up for the 32x32_34 case as the neon_asm version in `3f7c12da`, when the intrinsics version was added. Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320	2016-04-15 10:25:47 +03:00
Johann	26faa3ec7a	Apply 'const' to data not pointer Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298	2016-04-14 14:47:16 -07:00
James Zern	5ab46e0ecd	Merge changes I7a1c0cba,Ie02b5caf,I2cbd85d7,I644f35b0 * changes: vpx_fdct16x16_1_sse2: improve load pattern vpx_fdct16x16_1_c/msa: fix accumulator overflow vpx_fdctNxN_1_sse2: reduce store size dct32x32_test: add PartialTrans32x32Test, Random	2016-04-06 02:51:53 +00:00
James Zern	38bc1d0f4b	vpx_fdct16x16_1_sse2: improve load pattern load the full row rather than doing 2 8-wide columns Change-Id: I7a1c0cba06b0dc1ae86046410922b1efccb95c95	2016-04-04 16:03:42 -07:00
James Zern	eb64ea3e89	vpx_fdct16x16_1_c/msa: fix accumulator overflow tran_low_t is only signed 16-bits in non-high-bitdepth mode Change-Id: Ie02b5caf2658e8e71f995c17dd5ce666a4d64918	2016-04-04 16:03:41 -07:00
James Zern	3735def667	vpx_fdctNxN_1_sse2: reduce store size only output[0] needs to be set, store_output is more involved than a movdqa in the high bitdepth case Change-Id: I2cbd85d7cf74688bdf47eb767934fe42e02bff67	2016-04-04 16:02:06 -07:00
James Zern	c21d437052	vpx_fdct32x32_1_msa: fix accumulator overflow Change-Id: I33a5432eda3416382e1cea06b45082c0c65faa75	2016-04-02 11:04:38 -07:00
James Zern	f4cae05cd4	vpx_fdctNxN_1_c: remove unnecessary store only output[0] needs to be set, the other values will be ignored in this case. Change-Id: I8e9692fc0d6d85700ba46f70c2e899a956023910	2016-04-01 12:21:59 -07:00
James Zern	0269df41c1	vpx_fdct32x32_1_c: fix accumulator overflow tran_low_t is only 16-bits in non-high-bitdepth mode Change-Id: Ifc06110c95e86e6d790c44250d52a538b2e9713b	2016-03-30 15:20:20 -07:00
Scott LaVarnway	67c4c8244a	VPX: loopfilter_mmx.asm using x86inc 2 This reverts commit `9aa083d164`. Fixes a decoder mismatch with 32bit PIC builds. Change-Id: I94717df662834810302fe3594b38c53084a4e284	2016-03-08 04:24:47 -08:00
James Zern	9aa083d164	Revert "VPX: loopfilter_mmx.asm using x86inc" This reverts commit `15ecdc3970`. breaks 32-bit pic builds Change-Id: I8bb1b9471a293f05ac7423aaba0339d408931b7a	2016-03-04 18:23:45 -08:00
Scott LaVarnway	dd6729f826	VPX: Remove pmin/pmax from subpixel functions. These instructions are unnecessary if the adds are done in the correct order. Change-Id: I4e533b8267c32e610a4b94203ad052dc9fdabd71	2016-02-27 05:47:56 -08:00
Scott LaVarnway	51beb29f52	Merge "VPX: vpx_filter_block1d16_(v8, v8_avg)"	2016-02-27 13:31:18 +00:00
James Zern	654d2163c9	x86/convolve.h: remove redundant check in FUN_CONV_2D the filter will be the same in this case Change-Id: I95159bcb05bbfb71b57da741393e80cc7ffc5cff	2016-02-25 23:31:50 -08:00
James Zern	6d8c8c6201	x86/convolve.h: replace while w/if for w < 16 in non-hbd configurations; any high-bitdepth changes will be done in a follow-up Change-Id: Ia74e30971b744c1faab68c92fdeda1a053988c77	2016-02-25 21:44:06 -08:00
Scott LaVarnway	1f736e400f	VPX: vpx_filter_block1d16_(v8, v8_avg) Store result with one 16 byte store instead of two 8 byte stores. Change-Id: I43acbc5edfd6d6055a926f9b9605d47127400f09	2016-02-25 06:15:24 -08:00
James Zern	b3ceb629ba	x86/convolve.h: change filter[] || chains to | Change-Id: I661f64390f232826857b259e7a67e77f5a3a91ad	2016-02-24 19:47:43 -08:00
Scott LaVarnway	06d0e2fe6c	BUG FIX: vpx_filter_block1d(8,4)_(v8, v8_avg) Change-Id: Ic7ea79988ed0864e7ddbfeb312516bcf77eaaac1	2016-02-23 12:23:41 -08:00
Scott LaVarnway	15ecdc3970	VPX: loopfilter_mmx.asm using x86inc Change-Id: Idcf29281d617b275e3ca50f77e6d00c60992a36d	2016-02-18 15:34:58 -08:00
James Zern	9b44d9d00f	split vpx_highbd_lpf_horizontal_16 in two replace with vpx_highbd_lpf_horizontal_edge_16 and vpx_highbd_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I551f8cec0fce57032cb2652584bb802e2248644d	2016-02-16 23:13:58 -08:00
James Zern	1b519fb666	split vpx_lpf_horizontal_16 in two replace with vpx_lpf_horizontal_edge_16 and vpx_lpf_horizontal_edge_8 to avoid passing a count parameter Change-Id: I848c95c02a3c6ebaa6c2bdf0983dce05cd645271	2016-02-16 22:57:45 -08:00
James Zern	e7a23d703b	vpx_highbd_lpf_horizontal_4: remove unused count param Change-Id: I655a771e1b1a8753be5669ef9348a312ba6cfdbc	2016-02-16 22:57:45 -08:00
James Zern	5171857329	vpx_highbd_lpf_horizontal_8: remove unused count param Change-Id: Iaca71ea3796115d4c2d43563b4e6f3914e21f1bf	2016-02-16 22:57:44 -08:00
James Zern	3c1019e49d	vpx_highbd_lpf_vertical_4: remove unused count param Change-Id: Ic6da723c5cf3cd8127db1f476c3e46ea134cb774	2016-02-16 22:57:44 -08:00
James Zern	72a9f06ac2	vpx_highbd_lpf_vertical_8: remove unused count param Change-Id: Id16f7259897654831d31642c2d5e0bbe5e13416c	2016-02-16 22:57:44 -08:00
James Zern	b1e97c6a25	vpx_lpf_horizontal_4: remove unused count param Change-Id: Iec7d8eda343991f7d7d46931dca17af23c821d11	2016-02-16 22:57:27 -08:00
James Zern	bd5a5bb561	vpx_lpf_horizontal_8: remove unused count param Change-Id: I48741e167a7b09b7c9ad3bfc1c4b88ef1029ae46	2016-02-16 22:54:40 -08:00
James Zern	109a47b342	vpx_lpf_vertical_4: remove unused count param Change-Id: I43a191cb3d42e51e7bca266adfa11c6239a8064c	2016-02-16 14:59:00 -08:00
James Zern	37225744db	vpx_lpf_vertical_8: remove unused count param Change-Id: Ic69406da00afb0f06588e8c0deb2b043952b078c	2016-02-16 14:59:00 -08:00
James Zern	26c6fbdcda	vpx_ve_predictor_4x4_c: quiet unused param warning Change-Id: I62234260e2d2de94d602c6d8095c8f8124334052	2016-02-11 19:22:29 -08:00
James Zern	05437805f7	intrapred/d135: flatten border results before storing the results along the top and left border are then stored with a moving window into the vector. ~40-67% faster on ARM, ~40-77+% on x86 depending on the block size. Change-Id: Iab369aa2946a3ae4eb7290d512868fe5db92dbc8	2016-02-05 12:31:48 -08:00
James Zern	cdf1077d5a	intrapred: protect functions w/CONFIG check x2 high-bitdepth version d207e, d63e, d45e are only used with CONFIG_MISC_FIXES Change-Id: I77292e11f51fd76d4127fd0027f876866bcf8675	2016-02-02 19:38:37 -08:00
Yaowu Xu	6a94d6ad8e	Merge "Enable sse2 version of inverse wht for hbd build"	2016-01-31 04:38:39 +00:00
James Zern	8faccb709a	Merge changes If13946e4,I61a1814d,I2ca9aa3c,I44d91eaa * changes: intrapred: protect functions w/CONFIG check vp9_noise_estimate: protect copy_frame w/CONFIG check vp8_cx_iface: delete 3 unused functions vp8: mark intra_prediction_down_copy inline	2016-01-30 00:17:16 +00:00
Yaowu Xu	0aef1bc898	Enable sse2 version of inverse wht for hbd build Change-Id: If8f5efd701a11c8a7ad3078d10ec3cd0fe27667e	2016-01-29 14:47:56 -08:00
Yaowu Xu	b229710811	SSSE3 idct8x8 functions for highbitdepth build This commit changes SSSE3 optimized idct8x8 functions to work with highbitdepth build. With this commit and the previous one that enabled SSSE3 idct32x32 functions, tests showed virtually no difference on decoding speed for file fdJc1_IBKJA.248.webm for the build with --enable-vp9-highbitdepth option and the build without the option. Change-Id: Ibe0634149ec70e8b921e6b30171664b8690a9c45	2016-01-29 12:36:53 -08:00
Yaowu Xu	aac1ef7f80	Enable hbd_build to use SSSE3 optimized functions This commit changes the SSSE3 assembly functions for idct32x32 to support highbitdepth build. On test clip fdJc1_IBKJA.248.webm, this cuts the speed difference between hbd and lbd build from between 3-4% to 1-2%. Change-Id: Ic3390e0113bc1ca5bba8ec80d1795ad31b484fca	2016-01-29 01:30:43 +00:00
James Zern	fea27ccca0	intrapred: protect functions w/CONFIG check d207e, d63e, d45e are only used with CONFIG_MISC_FIXES Change-Id: If13946e483c4d0ccaa3e1d60dc14216c06d5a219	2016-01-26 20:13:57 -08:00
James Zern	3a2ad10de2	Merge "Code clean of sad4xNx4D_sse"	2016-01-25 20:57:15 +00:00
Alex Converse	ed3df445d9	Revert "Merge "Change highbd variance rounding to prevent negative variance."" This reverts commit `ea48370a50`, reversing changes made to `15939cb2d7`. The commit was insufficiently tested and causes failures. Change-Id: I623d6fc2cd3ae6fd42d0abab1f8eada465ae57a7	2016-01-13 11:19:06 -08:00
Alex Converse	ea48370a50	Merge "Change highbd variance rounding to prevent negative variance."	2016-01-13 00:25:54 +00:00
Jian Zhou	26a6ce4c6d	Code clean of highbd_tm_predictor_32x32 Remove the ARCH_X86_64 constraint. No performance hit on both big core and small core. Change-Id: I39860b62b7a0ae4acaafdca7d68f3e5820133a81	2015-12-22 16:51:57 -08:00
Jian Zhou	355bfa2193	Code clean of highbd_tm_predictor_16x16 Remove the ARCH_X86_64 constraint. Change-Id: I0139f8e998cc5525df55161c2054008d21ac24d4	2015-12-22 16:34:40 -08:00
Jian Zhou	a4c265f1b7	Code clean of highbd_dc_predictor_32x32 Remove the ARCH_X86_64 constraint. Change-Id: I7d2545fc4f24eb352cf3e03082fc4d48d46fbb09	2015-12-22 16:06:54 -08:00
James Zern	cedb1db594	Merge "Code clean of highbd_tm_predictor_4x4"	2015-12-22 16:45:01 +00:00
James Zern	a097963f80	Merge "Code clean of highbd_dc_predictor_4x4"	2015-12-22 16:30:37 +00:00
Jian Zhou	52e7f4153b	Merge "Code clean of highbd_v_predictor_4x4"	2015-12-21 18:07:48 +00:00
Yunqing Wang	b597e3e188	Merge "Fix for issue 1114 compile error"	2015-12-19 04:29:39 +00:00
James Zern	8b2ddbc728	sad_sse2: fix sad4xN(_avg) on windows reduce the register count by 1 to avoid xmm6 and unnecessarily penalizing the other users of the base macro Change-Id: I59605c9a41a31c1b74f67ec06a40d1a7f92c4699	2015-12-18 19:19:32 -08:00
Jian Zhou	db11307502	Code clean of highbd_tm_predictor_4x4 Replace MMX with SSE2, reduce mem access to left neighbor, loop unrolled. Change-Id: I941be915af809025f121ecc6c6443f73c9903e70	2015-12-18 18:43:41 -08:00
Jian Zhou	c91dd55eda	Code clean of highbd_v_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: I2ab8f30a71e5fadbbc172fb385093dec1e11a696	2015-12-18 15:25:27 -08:00
Jian Zhou	8366b414dd	Code clean of highbd_dc_predictor_4x4 MMX replaced with SSE2, same performance. Change-Id: Ic57855254e26757191933c948fac6aa047fadafc	2015-12-18 12:45:23 -08:00
Peter de Rivaz	7361ef732b	Fix for issue 1114 compile error In 32-bit build with --enable-shared, there is a lot of register pressure and register src_strideq is reused. The code needs to use the stack based version of src_stride, but this doesn't compile when used in an lea instruction. This patch also fixes a related segmentation fault caused by the implementation using src_strideq even though it has been reused. This patch also fixes the HBD subpel variance tests that fail when compiled without disable-optimizations. These failures were caused by local variables in the assembler routines colliding with the caller's stack frame. Change-Id: Ice9d4dafdcbdc6038ad5ee7c1c09a8f06deca362	2015-12-18 09:43:22 +00:00
Jian Zhou	789dbb3131	Code clean of sad4xNx4D_sse Replace MMX with SSE2. Change-Id: I948ca1be6ed9b8e67f16555e226f1203726b7da6	2015-12-17 17:43:46 -08:00
Jian Zhou	b158d9a649	Code clean of sad4xN(_avg)_sse Replace MMX with SSE2, reduce psadbw ops which may help Silvermont. Change-Id: Ic7aec15245c9e5b2f3903dc7631f38e60be7c93d	2015-12-17 11:10:42 -08:00
James Zern	b81f04a0cc	Merge "move vp9_avg to vpx_dsp"	2015-12-15 03:41:22 +00:00
James Zern	d36659cec7	move vp9_avg to vpx_dsp Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f	2015-12-14 14:42:12 -08:00
Jian Zhou	2404e3290e	Merge "Code clean of tm_predictor_32x32"	2015-12-14 17:56:01 +00:00
Jian Zhou	6e87880e7f	Merge "Speed up tm_predictor_16x16"	2015-12-11 18:55:46 +00:00
Jian Zhou	88120481a4	Code clean of tm_predictor_32x32 Reallocate the xmm register usage so that no ARCH_X86_64 required. Reduce memory access to the left neighbor by half. Speed up by single digit on big core machine. Change-Id: I392515ed8e8aeb02e6a717b3966b1ba13f5be990	2015-12-11 10:32:08 -08:00
Jian Zhou	62f986265f	Merge "SSE2 based h_predictor_32x32"	2015-12-11 18:02:34 +00:00
James Zern	ecb8dff768	Merge "dc_left_pred[48]: fix pic builds"	2015-12-11 02:48:11 +00:00
Jian Zhou	5604924945	Merge "Code clean of dc_left/top_predictor_16x16"	2015-12-11 01:53:44 +00:00
James Zern	40ee78bc19	dc_left_pred[48]: fix pic builds GET_GOT modifies the stack pointer so the offset for left's address will be wrong if loaded afterward. Change-Id: Iff9433aec45f5f6fe1a59ed8080c589bad429536	2015-12-10 15:44:31 -08:00
Yunqing Wang	322ea7ff5b	Fix the win32 crash when GET_GOT is not defined This patch continues to fix the win32 crash issue: https://bugs.chromium.org/p/webm/issues/detail?id=1105 Johann's patch is here: https://chromium-review.googlesource.com/#/c/316446/2 Change-Id: I7fe191c717e40df8602e229371321efb0d689375	2015-12-10 14:25:01 -08:00
Jian Zhou	4ec5953080	Code clean of dc_left/top_predictor_16x16 Remove some redundant code. Change-Id: Ida2e8c0ce28770f7a9545ca014fe792b04295260	2015-12-10 11:59:58 -08:00
Jian Zhou	c90a8a1a43	SSE2 based h_predictor_32x32 Relocate the function from SSSE3 to SSE2, Unroll loop from 16 to 8, and reduce mem access to left. Speed up by single digit in ./test_intra_pred_speed on big core machines. Change-Id: I2b7fc95ffc0c42145be2baca4dc77116dff1c960	2015-12-10 10:09:58 -08:00
Johann Koenig	420b9f5bd3	Merge "fix null pointer crash in Win32 because esp register is broken"	2015-12-09 19:31:12 +00:00
Jian Zhou	aa5b517a39	Re-enable SSE2 based intra 4x4 prediction 4x4 Intra predictor implemented with MMX is replaced with SSE2. Segfault in change 315561 when decoding vp8 is taken care of. Change-Id: I083a7cb4eb8982954c20865160f91ebec777ec76	2015-12-07 18:50:37 -08:00
Scott LaVarnway	c7e557b82c	Merge "VP9: Add ssse3 version of vpx_idct32x32_135_add()"	2015-12-07 21:13:35 +00:00
Sergey Kolomenkin	5fc9688792	fix null pointer crash in Win32 because esp register is broken https://bugs.chromium.org/p/webm/issues/detail?id=1105 Change-Id: I304ea85ea1f6474e26f074dc39dc0748b90d4d3d	2015-12-07 12:57:06 -08:00
James Zern	79a9add666	Revert "MMX in intra 4x4 prediction replaced with SSE2" This reverts commit `89a1efa4c4`. This causes a segfault when decoding vp8, in both 32 and 64-bit Change-Id: Idbb9bb28ab897e1d055340497c47b49a12231367	2015-12-05 10:20:39 -08:00
Jian Zhou	e86c7c863e	Speed up h_predictor_16x16 Relocate the function from SSSE3 to SSE2, Unroll loop from 8 to 4, and reduce mem access to left. Speed up by >20% in ./test_intra_pred_speed. Change-Id: Ie48229c2e32404706b722442942c84983bda74cc	2015-12-04 12:12:55 -08:00
Jian Zhou	da3f08fac3	Speed up h_predictor_8x8 Relocate the function from SSSE3 to SSE2, Unroll loop from 4 to 2, and reduce mem access to left. Speed up by >20% in ./test_intra_pred_speed. Change-Id: Ib9f1846819783b6e05e2a310c930eb844b2b4d2e	2015-12-04 11:36:44 -08:00
Jian Zhou	aa2764abdd	MMX in intra 8x8 prediction replaced with SSE2 8x8 Intra predictor implemented with MMX is replaced with SSE2. Change-Id: I0c90e7c1e1e6942489ac2bfe58903b728aac7a52	2015-12-03 18:11:06 -08:00
Jian Zhou	89a1efa4c4	MMX in intra 4x4 prediction replaced with SSE2 4x4 Intra predictor implemented with MMX is replaced with SSE2. Change-Id: Id57da2a7c38832d0356bc998790fc1989d39eafc	2015-12-03 16:40:23 -08:00
Jian Zhou	623e988add	Merge "SSE2 speed up of h_predictor_4x4"	2015-12-02 18:49:00 +00:00
Scott LaVarnway	f0b0b1fe62	VP9: Add ssse3 version of vpx_idct32x32_135_add() Change-Id: I9a780131efaad28cf1ad233ae64c5c319a329727	2015-12-02 04:50:46 -08:00
Jian Zhou	c7fae5d893	Speed up tm_predictor_16x16 Reduce mem access to left. Speed up by 10% in ./test_intra_pred_speed with the same instruction size. Change-Id: Ia33689d62476972cc82ebb06b50415aeccc95d15	2015-11-30 17:46:40 -08:00
Scott LaVarnway	2669e05949	Merge "VPX: x86 asm version of vpx_idct32x32_1024_add()"	2015-11-30 23:28:27 +00:00
Jian Zhou	9d29d76280	SSE2 speed up of h_predictor_4x4 Relocate h_predictor_4x4 from SSSE3 to SSE2 with XMM registers. Speed up by ~25% in ./test_intra_pred_speed. Change-Id: I64e14c13b482a471449be3559bfb0da45cf88d9d	2015-11-30 10:08:05 -08:00
Scott LaVarnway	0148e20c3c	VPX: x86 asm version of vpx_idct32x32_1024_add() Change-Id: I3ba4ede553e068bf116dce59d1317347988b3542	2015-11-25 10:11:29 -08:00
Jian Zhou	901d20369a	Merge "Speed up tm_predictor_8x8"	2015-11-25 02:34:07 +00:00
Alex Converse	022c848b4d	Change highbd variance rounding to prevent negative variance. Always round sum error and sum square error toward zero in variance calculations. This prevents variance from becoming negative. Avoiding rounding variance at all might be better but would be far more invasive. Change-Id: Icf24e0e75ff94952fc026ba6a4d26adf8d373f1c	2015-11-24 16:32:01 -08:00
Jian Zhou	f4621c5c8d	Speed up tm_predictor_8x8 Left neighbor read from memory only once. Speed up by ~20% in ./test_intra_pred_speed. Change-Id: Ia1388630df6fed0dce9a6eeded6cb855bbc43505	2015-11-24 16:07:06 -08:00
Alex Converse	b84fa548fb	Merge "bitreader/writer: Change shift to signed"	2015-11-24 18:33:45 +00:00
Scott LaVarnway	97e6cc6198	VPX: Removed unnecessary pmulhrsw in IDCT32X32_34 and fixed macro name. Change-Id: I306b98a2b4ec80b130ae80290b4cd9c7a5363311	2015-11-23 10:24:09 -08:00
James Zern	16eba81f69	Revert "Speed up h_predictor_4x4" This reverts commit `d76032ae87`. breaks 32-bit builds Change-Id: If6266ec2a405b5a21d615112f0f37e8a71193858	2015-11-20 22:25:29 -08:00
James Zern	1b10753ad7	Merge "Speed up h_predictor_4x4"	2015-11-21 01:12:42 +00:00
Alex Converse	612e3c8a0e	Merge "Fix a signed shift overflow in vpx_rb_read_inv_signed_literal."	2015-11-20 17:42:05 +00:00
Scott LaVarnway	e7fc39fdf5	Merge "VPX: x86 asm version of vpx_idct32x32_34_add()"	2015-11-20 15:11:00 +00:00
Alex Converse	6aa2163b69	bitreader/writer: Change shift to signed Silences several legal but suspicious unsigned overflows found with clang -fsanitize=integer. Change-Id: I69399751492a183167932b0a10751c433c32ca7b	2015-11-19 15:13:39 -08:00
Alex Converse	42b7c44b2f	Fix a signed shift overflow in vpx_rb_read_inv_signed_literal. Found with clang -fsanitize=integer Change-Id: I17cb2166c06ff463abfaf9b0e6bc749d0d6fdf94	2015-11-19 15:04:20 -08:00
Jian Zhou	d76032ae87	Speed up h_predictor_4x4 Modify h_predictor_4x4 with XMM registers. Speed up by ~25% in ./test_intra_pred_speed. Change-Id: Id01c34c48e75b9d56dfc2e93af12cf0c0326a279	2015-11-19 11:34:22 -08:00
Jian Zhou	79b68626ae	Speed up tm_predictor_4x4 tm_predictor_4x4 is implemented with SSE2 using XMM registers. Speed up by ~25% in ./test_intra_pred_speed. Change-Id: I25074b78d476a2cb17f81cf654bdfd80df2070e0	2015-11-18 16:44:25 -08:00
Scott LaVarnway	ed833048c2	VPX: x86 asm version of vpx_idct32x32_34_add() Change-Id: Ic81f38998fb1b8d33f5a5d7424c2c41002786cef	2015-11-17 17:42:24 -08:00
James Zern	0ccad4d649	Revert "VPX: x86 asm version of vpx_idct32x32_34_add()" This reverts commit `9aeaa2016e`. This causes some test vectors to fail. Change-Id: I3659a2068404ec5a0591fba5c88b1bec0c9059a4	2015-11-11 11:12:38 -08:00
James Zern	e3efed7f4c	Merge "convolve_copy_sse2: replace SSE w/SSE2 code"	2015-11-10 22:35:12 +00:00
Scott LaVarnway	f48321974b	Merge "VPX: x86 asm version of vpx_idct32x32_34_add()"	2015-11-10 21:40:11 +00:00
Scott LaVarnway	9aeaa2016e	VPX: x86 asm version of vpx_idct32x32_34_add() Change-Id: I8a933c63b7fbf3c65e2c06dbdca9646cadd0b7cb	2015-11-10 11:54:56 -08:00
James Zern	40dab58941	convolve_copy_sse2: replace SSE w/SSE2 code this should be neutral or slightly faster on modern (P4+) architectures Change-Id: Iec4c080275941eb8c9e05a66a2daf0405d86a69b	2015-11-09 23:45:16 -08:00
Debargha Mukherjee	65dd056e41	Merge "Optimize vpx_quantize_{b,b_32x32} assembler."	2015-10-26 18:04:49 +00:00
Ronald S. Bultje	53dc9fd0a0	vp10: merge ext_ipred_bltr experiment into misc_fixes. Change-Id: I2f2deb700748408b8278b7f5c29ee1f2e39785ec	2015-10-21 22:27:34 -04:00
Geza Lore	9cfba09ac0	Optimize vpx_quantize_{b,b_32x32} assembler. Added optimization of the 8 bit assembly quantizer routines. This makes these functions up to 100% faster, depending on encoding parameters. This patch makes the encoder faster in both the high bitdepth and 8bit configurations. In the high bitdepth configuration, it affects profile 0 only. Based on my profiling using 1080p input the net gain is between 1-3% for the 8 bit config, and around 2.5-4.5% for the high bitdepth config, depending on target bitrate. The difference between the 8 bit and high bitdepth configurations for the same encoder run is reduced by 1% in all cases I have profiled. Change-Id: I86714a6b7364da20cd468cd784247009663a5140	2015-10-20 10:11:19 +01:00
Ronald S. Bultje	c7dc1d78bf	vp10: add extended-intra prediction edges experiment. This experiment allows using full above/right edges for all transform sizes whenever available (for d45/d63), and adds bottom/left edges for d207. See issue 1043. Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b	2015-10-16 19:30:39 -04:00
Johann	ec623a0bb7	Upstream Mozilla fix for older Apple clang builds Also use the _mm_broadcastsi128_si256 intrinsic for Apple clang versions 4.[012] https://bugzilla.mozilla.org/show_bug.cgi?id=1085607 https://code.google.com/p/webm/issues/detail?id=1082 Change-Id: I6bc821d8163387194ef663e94bfed91fa7281d88	2015-10-14 07:41:23 -07:00
hui su	6f31722950	Fix compiler warnings Change-Id: I761256a8100d83abf1b937f3739580237e3fad2a	2015-10-13 10:33:17 -07:00
Alex Converse	0c00af126d	Add vpx_highbd_convolve_{copy,avg}_sse2 single-threaded: swanky (silvermont): ~1% faster overall peppy (celeron,haswell): ~1.5% faster overall Change-Id: Ib74f014374c63c9eaf2d38191cbd8e2edcc52073	2015-10-09 11:50:25 -07:00
Geza Lore	cbada4a982	Remove 4 mova insts from quantize_ssse3_x86_64.asm Change-Id: If3cb9345b44162e600e6c74873e0cb4c207fc7fb	2015-10-09 07:52:04 -07:00
Julia Robson	37c68efee2	SSSE3 optimisation for quantize in high bit depth When configured with high bit depth enabled, the 8bit quantize function stopped using optimised code. This made 8bit content decode slowly. This commit re-enables the SSSE3 optimisations. Change-Id: I194b505dd3f4c494e5c5e53e020f5d94534b16b5	2015-10-06 13:32:02 +01:00
Scott LaVarnway	b212094839	Merge "VPX: refactor vpx_idct32x32_1_add_sse2()"	2015-10-06 11:35:15 +00:00
Julia Robson	5e6533e707	SSE2 optimisation for quantize in high bit depth When configured with high bit depth enabled, the 8bit quantize function stopped using optimised code. This made 8bit content decode slowly. This commit re-enables the SSE2 optimisation (but not the SSSE3 optimisation). Change-Id: Id015fe3c1c44580a4bff3f4bd985170f2806a9d9	2015-10-05 10:59:16 -07:00
Scott LaVarnway	23d1c06268	VPX: refactor vpx_idct32x32_1_add_sse2() Change-Id: Ia1a2cac0e9dc05f3207b3433a6c1589fa7f2aee3	2015-10-05 06:33:42 -07:00
Ronald S. Bultje	3fedf4a59b	Merge "vp10: reimplement d45/4x4 to match vp8 instead of vp9."	2015-10-02 17:15:59 +00:00
Debargha Mukherjee	cb5c47f20d	Merge "Accelerated transform in high bit depth"	2015-10-02 06:55:55 +00:00
Ronald S. Bultje	62a1579525	vp10: reimplement d45/4x4 to match vp8 instead of vp9. This is more a proof of concept than anything else. The problem here isn't so much how to code it, but rather where to place the resulting code. All intrapred DSP code lives in vpx_dsp, so do we want the vp10 specific intra pred functions to live there, or in vp10/? See issue 1015. Change-Id: I675f7badcc8e18fd99a9553910ecf3ddf81f0a05	2015-10-01 10:11:54 -04:00
Ronald S. Bultje	c26a9ecaa2	vp8: change build_intra4x4_predictors() to use vpx_dsp. I've added a few new functions (d45e, d63e, he, ve) to cover the filtered h/v 4x4 predictors that are vp8-specific, the "correct" d45 with the correctly filtered bottom-right pixel (as opposed to the unfiltered version in vp9), and the "broken" d63 with weirdly filtered bottom-right pixels (which is correctly filtered in vp9). There may be a minor performance impact on all systems because we have to do an extra copy of the Above pixel array to incorporate the topleft pixel in the same array (thus fitting the vpx_dsp API). In addition, armv6 will have a more serious performance impact b/c I removed the armv6/vp8-specific assembly. I'm not sure anyone cares... Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86	2015-09-30 18:45:49 -04:00
Ronald S. Bultje	54d48955f6	vp8: change build_intra_predictors_mby_s to use vpx_dsp. Change-Id: I2000820e0c04de2c975d370a0cf7145330289bb2	2015-09-30 18:45:40 -04:00
Julia Robson	406030d1b0	Accelerated transform in high bit depth When configured with high bitdepth enabled, the 8bit transform stopped using optimised code. This made 8bit content decode slowly. Change-Id: I67d91f9b212921d5320f949fc0a0d3f32f90c0ea	2015-09-28 21:09:16 -07:00
Johann	dd4f953350	Remove vpx_filter_block1d16_v8_intrin_ssse3 This was rewritten and moved to vpx_dsp/x86/vpx_subpixel_8t_ssse3.asm in `195883023b` Change-Id: I117ce983dae12006e302679ba7f175573dd9e874	2015-09-18 16:05:43 -07:00
James Zern	683b5a3161	vpx_subpixel_8t_ssse3: fix reg counts/access fixes build on windows x64; previously 'heightq' i.e., the 64-bit register was accessed when only the 32-bit value was needed. given this is from a stack variable the upper bits were undefined. + bump register/xmm counts; users of SETUP_LOCAL_VARS touch xmm13 in 64-bit builds and filter_block1d16_v* uses one extra temp variable Change-Id: I9c768c0b2047481d1d3b11c2e16b2f8de6eb0d80	2015-09-17 12:27:34 -07:00
Ronald S. Bultje	a3df343cda	vp10: code sign bit before absolute value in non-arithcoded header. For reading, this makes the operation branchless, although it still requires two shifts. For writing, this makes the operation as fast as writing an unsigned value, branchlessly. This is also how other codecs typically code signed, non-arithcoded bitstream elements. See issue 1039. Change-Id: I6a8182cc88a16842fb431688c38f6b52d7f24ead	2015-09-16 19:35:03 -04:00
Debargha Mukherjee	1c8567ff09	Remove some trailing whitespaces Change-Id: Icf06d35ca347713253d1eba341a894b51efa81a9	2015-09-08 01:31:04 -07:00
Scott LaVarnway	195883023b	VPX: subpixel_8t_ssse3 asm using x86inc This is based on the original patch optimized for 32bit platforms by Tamar/Ilya and now uses the x86inc style asm. The assembly was also modified to support 64bit platforms. Change-Id: Ice12f249bbbc162a7427e3d23fbf0cbe4135aff2	2015-09-03 20:35:51 -07:00
Johann	c5f11912ae	Include vpx_dsp_common.h when using VPXMIN/MAX Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee	2015-08-31 14:36:35 -07:00
Angie Chiang	45db71d0ac	Expand the idct4_c() function in idct8_c() Change-Id: I5afa3c351ba7c5e7deb3889f7471619ac60af255	2015-08-28 10:53:11 -07:00
Johann Koenig	5c245a46d8	Merge changes I53b5bdc5,Ib81168a7,Ie0113945 * changes: Only build ssse3 filter functions on 64 bit Clean up unused function warnings in vp8 encoder Clean up unused function warnings in vp8 onyx_if.c	2015-08-27 20:58:53 +00:00
Johann Koenig	18ea2a7e0c	Merge "Add sse2 versions of halfpix variance"	2015-08-27 20:56:32 +00:00
Johann	a28b2c6ff0	Add sse2 versions of halfpix variance These were lost in the great sub pixel variance move of `6a82f0d7fb` Not having these functions caused a ~10% performance regression in some realtime vp8 encodes. Change-Id: I50658483d9198391806b27899f2c0d309233c4b5	2015-08-27 11:58:38 -07:00
James Zern	5e16d397bd	vpx_dsp_common: add VPX prefix to MIN/MAX prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c	2015-08-26 20:11:32 -07:00
Johann	f5507b514c	Only build ssse3 filter functions on 64 bit Avoid an unused function warning by only building the functions when they will be used. Change-Id: I53b5bdc5a180c79d63b34e4c8921d679bbc54009	2015-08-26 10:32:18 -07:00
Scott LaVarnway	6c0f6dd817	Merge "VPX: scaled convolve : fix windows build errors"	2015-08-21 12:06:34 +00:00
Scott LaVarnway	acf24cc1b8	VPX: scaled convolve : fix windows build errors Change-Id: Ic81d435ea928183197040cdf64b6afd7dbaf57e4	2015-08-20 13:09:27 -07:00
Scott LaVarnway	6a21ca20cc	Merge "VPX ssse3 scaled convolve"	2015-08-19 22:12:21 +00:00
Jingning Han	b1339751b9	Merge "Rename inv_txfm_sse2.asm to inv_wht_sse2.asm"	2015-08-19 18:26:30 +00:00
Jingning Han	49f6ff1103	Rename inv_txfm_sse2.asm to inv_wht_sse2.asm Change-Id: I43bcc70680503e4c18d8f021097307778cf9ea70	2015-08-19 10:29:53 -07:00
Scott LaVarnway	2030c49cf8	VPX ssse3 scaled convolve Change-Id: I71d5994e21813554a927d35ebcc26bf7a68984fd	2015-08-18 15:13:02 -07:00
Jingning Han	5de049b067	Turn on dspr2 loop filter functions in vpx_dsp Add the dspr2 files to vpx_dsp.mk and enable these functions in vpx_dsp_rtcd_defs.pl file. Change-Id: I79feb5af24f174f4a0788dc6f3b6df7f4e1fa467	2015-08-17 16:15:24 -07:00
James Zern	1794624c18	Merge changes I2fe52bfb,I5e5084eb * changes: VPX: removed filter == 128 checks from mips convolve code VPX: removed step checks from mips convolve code	2015-08-14 19:45:27 +00:00
James Zern	78629508f2	Merge "VPX: removed step checks from neon convolve code"	2015-08-14 19:23:46 +00:00
Yaowu Xu	94ba3939cd	vpx_highbd_ssim_parms_8x8: make parameter types consistent Change-Id: Ie1fe6603232adc22dbe4d51bd1008c856a6d40ca	2015-08-14 09:18:07 -07:00
Scott LaVarnway	89dcc13939	VPX: removed filter == 128 checks from mips convolve code The check is handled by the predictor table. Change-Id: I2fe52bfbbfccb2edd13ba250986e3a4b4b589459	2015-08-13 12:57:01 -07:00
Scott LaVarnway	aeea00cc4f	VPX: removed step checks from mips convolve code The check is handled by the predictor table. Change-Id: I5e5084ebb46be8087c8c9d80b5f76e919a1cd05b	2015-08-13 11:27:04 -07:00
Scott LaVarnway	fa47212933	VPX: removed step checks from neon convolve code The check is handled by the predictor table. Change-Id: I42479f843e77a2d40cdcdfc9e2e6c48a05a36561	2015-08-12 16:46:53 -07:00
Scott LaVarnway	6cf95bd1e7	Merge "VPX: remove step == 16 and filter[3] != 128 checks"	2015-08-12 20:13:33 +00:00
James Zern	345b11cd73	Merge "fix build w/only mmx+sse enabled"	2015-08-12 02:26:08 +00:00
Jingning Han	3ee6db6c81	Fork VP9 and VP10 codebase This commit forks the VP9 and VP10 codebase and makes libvpx support VP8, VP9, and VP10. Change-Id: I81782e0b809acb3c9844bee8c8ec8f4d5e8fa356	2015-08-11 17:05:28 -07:00
James Zern	23532eb7b6	fix build w/only mmx+sse enabled many _sse2.asm have sse implementations as well Change-Id: Idfa1f5cab593e4913aaad37f7223e8430188c44a	2015-08-11 15:52:43 -07:00
Scott LaVarnway	b04dad328c	Merge "VPX: remove scaled calls from FUN_CONV_1D"	2015-08-11 21:46:50 +00:00
Scott LaVarnway	4ef08dcec8	Merge "VPX: Add rtcd support for scaling."	2015-08-11 13:19:00 +00:00
Aℓex Converse	b152472ba7	Merge "Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h"	2015-08-11 01:18:39 +00:00
Alex Converse	a8a08ce57e	Move vp9_systemdependent.h to vpx_ports bitops.h and system_state.h Use system_state.h in vpx_dsp and remove unneeded includes of vp9_systemdependent.h. Change-Id: I92557ec6dd5aa790160b4f31fe7967db0d7ec3c4	2015-08-10 15:37:14 -07:00
James Zern	9265bad906	Merge changes from topic 'x86inc' * changes: Only use .text sections for aout Use newer x86inc.asm Use .text instead of .rodata on macho Copy PIC handling code from x86_abi_support Set 'private_extern' visibility for macho targets Avoid 'amdnop' when building with nasm Catch all elf formats Expand PIC default to macho64 and respect CONFIG_PIC from libvpx Use libvpx defines to set name mangling rules Customize x86inc.asm for libvpx	2015-08-10 21:20:38 +00:00
Scott LaVarnway	a229dbc1f0	VPX: remove step == 16 and filter[3] != 128 checks from FUN_CONV_1D and FUN_CONV_2D macros. The functions will not be called with these inputs. Change-Id: I67ec75e4edafc0acee70190521a80ea85dfa521b	2015-08-10 13:44:32 -07:00
Alex Converse	4ea7f2be43	fastssim: Add some missing consts Change-Id: Id36f180032c8a92c686da6f716a7468332b23b94	2015-08-10 09:48:25 -07:00
Johann	41a0a0cb35	Use newer x86inc.asm Rename updated version of x86inc.asm Use "private_prefix" instead of "program_name" and make vpx the default prefix. Change-Id: I4883a99b2aee8e5dc9f2c16a2e6f4b5d6e4de458	2015-08-07 16:44:44 -07:00
Alex Converse	26f4f2dc8e	ssim: Add missing statics and consts Change-Id: I2aa2a545bd2f8f170c66c2e267ea9d617ff10d87	2015-08-07 12:01:19 -07:00
Alex Converse	c1f911a2ea	psnrhvs: Add missing consts and static consts. Change-Id: I63932edaef4c4d4d0a57e6f7d3e4aa42651a5c47	2015-08-07 12:01:14 -07:00
Alex Converse	c65e79d2e5	ssim: Replace unsigned long with uint32_t. The assembly only writes the low 4 bytes, and the HBD version only uses uint32_t bytes. Change-Id: Ie3694ecda511c231e55870df814cbae30e588073	2015-08-07 11:48:31 -07:00
Alex Converse	17cfee3cb5	fastssim: Add stdlib.h for malloc/free Change-Id: I4d734febc14c534dba20b67cf6bd628996cc9ab7	2015-08-07 11:20:05 -07:00
Alex Converse	c7b7011b9b	Move VP9 SSIM metrics to vpx_dsp. Change-Id: I20c7b42631b579fade6cf7ebf6d4c69b2fcb5e5e	2015-08-06 18:25:25 -07:00
Aℓex Converse	7ac505c726	Merge "Narrow a load in iwht4x4_16_add."	2015-08-06 22:21:16 +00:00
Alex Converse	0572052725	Narrow a load in iwht4x4_16_add. The top half is unused. Change-Id: I29b2f6a93e20ea43aff4ad0bd2d52257e1e752b6	2015-08-05 12:16:12 -07:00
Scott LaVarnway	4e6b5079c6	VPX: remove scaled calls from FUN_CONV_1D and FUN_CONV_2D macros. The predict lut now handles this case. The encoder now calls vpx_scaled_2d() instead of vpx_convolve8() for scaling. Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7	2015-08-05 10:47:06 -07:00
James Zern	afd2f68dae	Revert "VP9_COPY_CONVOLVE_SSE2 optimization" This reverts commit `a5e97d874b`. Additionally: Revert "vpx_convolve_copy_sse2: fix win64" This reverts commit `22a8474fe7`. This change performs poorly on various x86_64 devices affecting performance by 1-3% at 1080P. Performance on chromebook like devices was mixed neutral to slightly negative, so there should be minimal change there. Change-Id: I95831233b4b84ee96369baa192a2d4cc7639658c	2015-08-04 17:57:01 -07:00
Jingning Han	d621de7e8d	Change vp9_quantize to vpx_quantize This commit clears all the vp9_ prefix use case in vpx_dsp. It gets the vp9 folder ready to branch out vp10. Change-Id: I2906eec179ee792b4af8c9b4161313653050e931	2015-08-04 15:31:49 -07:00
Jingning Han	3ad75fc623	Merge "Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names"	2015-08-04 22:30:36 +00:00
Jingning Han	08a453b9de	Replace vp9_ prefix with vpx_ prefix in vpx_dsp function names This commit clears the function naming convention in vpx_dsp. It replaces vp9_ prefix of global functions with vpx_ prefix. It also removes the vp9_ prefix from static functions. Change-Id: I6394359a63b71a51dda01342eec6a3cc08dfeedf	2015-08-04 13:46:11 -07:00
Jingning Han	5f138986fc	Exclude inv_txfm dspr2 files from make file when highbd is on Add a guard to exclude dspr2 inverse transform files from vpx_dsp make file, when high bit-depth is turned on. This fixes the jenkins nightly build. Change-Id: Ibacd86563af1ec4810c550905b3fa0397baeeafc	2015-08-04 09:47:31 -07:00
Parag Salasakar	814e1346a6	Merge "mips msa vpx convolve optimzation"	2015-08-04 04:30:22 +00:00
Parag Salasakar	cc4c5de22f	Merge "mips msa vpx subpel variance optimization"	2015-08-04 04:30:11 +00:00
Jingning Han	bfad9d2fe6	Move inverse transfrom dspr2 functions from vp9 to vpx_dsp Change-Id: Ia9cf7c31cab4ba3dd6b9bb668c4b3e84bd55cf69	2015-08-03 11:59:50 -07:00
Jingning Han	92b08f516a	Add common_dspr2.c file to vpx_dsp/mips Move the declaration of commonly referenced variable to vpx_dsp/mips/common_dspr2.c. Change-Id: Ia51287b02e2ac5cfae0fba98c721f0810618f28e	2015-08-03 10:53:47 -07:00
Jingning Han	a68356202d	Remove vpx_ prefix from the dspr2 file name in vpx_dsp/mips Make it consistent with other formats. Change-Id: I28f0d05ff7c5bf2b815989b3f1bd6c6b25608677	2015-08-03 09:59:14 -07:00
Scott LaVarnway	8f6b943100	VPX: Add rtcd support for scaling. Change-Id: If34bfb0d918967445aea7dc30cd7b55ebfedb1f2	2015-08-03 09:43:34 -07:00
Jingning Han	d10fc5af8f	Merge "Add vpx_dsp_rtcd.h to inv_txfm_sse2.c"	2015-08-03 16:03:09 +00:00
Jingning Han	b096db5ad4	Merge "Remove vp9_common.h from idct16x16_neon.c"	2015-08-03 16:03:02 +00:00
Parag Salasakar	1579bb88c5	mips msa vpx convolve optimzation Removed redundant clip/saturate code from 2tap filter functions average improvement 10%-40% Change-Id: I1dafb5f7d2ce7a021d883d8af30fb93cd9ace173	2015-08-03 14:03:40 +05:30
Parag Salasakar	9b375871db	mips msa vpx subpel variance optimization Removed redundant clip/saturate code from 2tap filter functions average improvement 20%-40% Change-Id: I362540b0c7d5d3d69932c39d61b7d2a44da533d2	2015-08-03 13:00:55 +05:30
Jingning Han	da7dc59837	Merge "Factor out mips/msa inverse transform implementations"	2015-08-03 03:18:39 +00:00
Jingning Han	0fcfc613c6	Merge "Add x86inc flag guard to inv_txfm_sse2.asm"	2015-08-02 21:56:09 +00:00
Jingning Han	6eabf229e2	Remove vp9_common.h from idct16x16_neon.c Change-Id: I3df35a99900ef8ce549d315866849a10db1a4c7b	2015-08-02 09:57:25 -07:00
Jingning Han	4f7a7d29fa	Add x86inc flag guard to inv_txfm_sse2.asm Fix the VS build failure. Change-Id: I4fb9d1c83980c4b52d5a848a9cb02ec72493dccb	2015-08-02 08:43:51 -07:00
Jingning Han	80ae856c8b	Add vpx_dsp_rtcd.h to inv_txfm_sse2.c Change-Id: Ibab434fb4bd6da02dba087582ed74811f555c3ed	2015-08-02 08:25:13 -07:00
James Zern	22a8474fe7	vpx_convolve_copy_sse2: fix win64 xmm6-7 need to be stored Change-Id: I6c51559598d335946ec91be6246b49589c63b724	2015-08-01 11:45:49 -07:00
Jingning Han	44849516d4	Factor out mips/msa inverse transform implementations Move mips/msa inverse transform implementations from vp9 folder to vpx_dsp. Change-Id: Ic4cf3f05247c3c63db7b532a0e5000017a962391	2015-08-01 09:25:12 -07:00
Jingning Han	b4c7d0523a	Merge "Factor inverse transform functions into vpx_dsp"	2015-08-01 16:20:24 +00:00

411 Commits