generic-library/vpx

Author	SHA1	Message	Date
James Zern	462e0ff88b	vpx_dsp,add_noise: remove mmx implementation a sse2 version exists, this is a reasonable modern baseline. Change-Id: If31d36c8412d25b53f41b4a93cf02f46802c0c33	2016-06-02 23:51:22 -07:00
James Zern	eea8ea88ab	vpx_dsp: remove mmx variance implementations there are sse2 equivalents for all remaining variance implementations Change-Id: I10b947e73fc0067688181f819b59e47966bec3d2	2016-06-02 23:46:16 -07:00
Linfeng Zhang	ad0646cb84	Slow pshufb removal in 3 intra prediction functions. Replaced vpx_d45_predictor_4x4_ssse3(), vpx_d45_predictor_8x8_ssse3() and vpx_d207_predictor_4x4_ssse3() with created vpx_d45_predictor_4x4_sse2(), vpx_d45_predictor_8x8_sse2() and vpx_d207_predictor_4x4_sse2() respectively. It's mostly neutral or slightly worse than ssse3 in good cases and better than ssse3 in the bad cases (but still worse than using the mmx regs). Change-Id: Ib0237ceb71d2c57b8a93fd3170330cfed9d56bdd	2016-06-02 10:55:58 -07:00
Yaowu Xu	46ff1072b3	variance_avx2.c: UBSAN/IOC fix BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1222 Change-Id: Ifb3bedf9b4e1b007b21aebaa4beb9ba50424efef	2016-05-31 16:44:35 -07:00
Linfeng Zhang	0ba9b299e9	Merge "Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2"	2016-05-27 15:47:28 +00:00
Linfeng Zhang	4b5e462d08	Upgrade vpx_lpf_{vertical,horizontal}_4 mmx to sse2 Followed the code style of other lpf functions. These 2 functions put 2 rows of data in a single xmm register, so they have similar but not identical filter operations, and cannot share the same macros. Change-Id: I3bab55a5d1a1232926ac8fd1f03251acc38302bc	2016-05-26 14:55:18 -07:00
Scott LaVarnway	9d24fe60f1	Merge "Code clean of sub_pixel_variance4xh -- 2"	2016-05-26 13:20:24 +00:00
Scott LaVarnway	a4f3751be5	Code clean of sub_pixel_variance4xh -- 2 Replace MMX with SSE2. Change-Id: Id8482d2589131f9427e7f36bc64413f058caf31f	2016-05-24 04:44:05 -07:00
James Zern	3fb55d24e8	Revert "Code clean of sub_pixel_variance4xh" This reverts commit 2468163e0770108f5216b65445ce05a8241bca21. causes valgrind errors for overread of buffer in SubpelVarianceTest Change-Id: I448e52c76f815ac199305b71f7d169f2bc167679	2016-05-19 23:37:27 -07:00
Yaowu Xu	d1f0f4cc63	Merge "Clarify integer value ranges"	2016-05-18 23:55:05 +00:00
James Zern	146ccd304f	Merge "Code clean of sub_pixel_variance4xh"	2016-05-18 23:18:35 +00:00
Johann Koenig	36b610d8c1	Merge "neon hadamard 8x8"	2016-05-18 20:11:16 +00:00
Yaowu Xu	a564b18d7f	Clarify integer value ranges This commit clarifies the integer value range for variables used in several variance functions, and also changes to use proper type conversion to reflect the value ranges. Change-Id: Ic3234b83a912ce1ad12d1b254f3378763e15cc5c	2016-05-18 10:25:12 -07:00
Scott LaVarnway	2468163e07	Code clean of sub_pixel_variance4xh Replace MMX with SSE2. Change-Id: Ia8fcba755952804e347d7d7736f57d1f90c988a0	2016-05-18 04:24:41 -07:00
Johann	9b54e812f7	neon hadamard 8x8 Runs about 30% faster than the C BUG=webm:1021 Change-Id: I6809d6d84c3077ab619c53298296950e976bdaba	2016-05-16 11:58:02 -07:00
Yaowu Xu	c1e4f5a80d	Merge "Change to use correct check for halfpel"	2016-05-13 01:27:47 +00:00
Linfeng Zhang	2f55beb355	Merge "remove mmx variance functions"	2016-05-11 22:21:23 +00:00
Yaowu Xu	17fae3ad0a	Change to use correct check for halfpel In motion estimation stage for subpel motion, subpel variance is computed using bilinear interpolation. The motion vector precision used is at 1/8 pel and three bits are used to represent the x and y subpel offsets. Based on this, the half pel check should be against 4, not 8. Change-Id: I1f56fa1fa3f2f5e19a20d27983efe628557f170e	2016-05-11 13:52:59 -07:00
Linfeng Zhang	d0ffae825d	remove mmx variance functions there are sse2 equivalents which is a reasonable modern baseline Removed mmx variance functions: vpx_get_mb_ss_mmx() vpx_get8x8var_mmx() vpx_get4x4var_mmx() vpx_variance4x4_mmx() vpx_variance8x8_mmx() vpx_mse16x16_mmx() vpx_variance16x16_mmx() vpx_variance16x8_mmx() vpx_variance8x16_mmx() Change-Id: Iffaf85344c6676a3dd337c0645a2dd5deb2f86a1	2016-05-11 12:39:42 -07:00
Linfeng Zhang	d0e687bf8c	remove mmx sad functions there are sse2 equivalents which is a reasonable modern baseline Change-Id: Ibbe536a5ad1c2cccef6bdcc75c13b3dde35a56ba	2016-05-11 10:50:04 -07:00
Jim Bankoski	da33728f48	vpx_dsp: Rename postproc.c add_noise. Change-Id: I4906d1b79a2951e659995202b9fa97e2ea5cfba0	2016-05-10 06:52:58 -07:00
Scott LaVarnway	c2c5297595	Merge "VPX: refactor vpx_idct16x16_1_add_sse2()"	2016-05-09 22:15:17 +00:00
James Bankoski	7cced7b3ea	Merge "libvpx: vpx_add_plane_noise make c match assembly"	2016-05-09 20:17:38 +00:00
Johann Koenig	9e5811f485	Merge changes Id13b97f4,I1d342725 * changes: The subfunctions are only defined for sse2 Unlike non-hbd variance, opt2 is never used	2016-05-09 18:38:59 +00:00
Scott LaVarnway	1490342be5	VPX: refactor vpx_idct16x16_1_add_sse2() Change-Id: I431ea0d9abe764d110a1ba32a8cb15e2fdac8805	2016-05-09 09:50:00 -07:00
Jim Bankoski	7a91d21d69	libvpx: vpx_add_plane_noise make c match assembly This change makes the c match the assembly and removes the todo's associated with getting this to work. Change-Id: Ie32e9ebb584a9d60399662d8bcb71b74fbd19d1e	2016-05-07 12:47:49 -07:00
Johann	7e4c306981	Use canonical avg_pred functions Change-Id: Ibe0cc388226622561d2b4a00e5bdc1016a3c4a94	2016-05-06 19:06:03 -07:00
Johann	b23bd2360f	The subfunctions are only defined for sse2 See highbd_subpel_variance_impl_sse2.asm Change-Id: Id13b97f4f6d189ed71cdc6d52b3c4ea63dc1da05	2016-05-06 18:58:49 -07:00
Johann	a761197fbd	Unlike non-hbd variance, opt2 is never used Change-Id: I1d342725df332c4efc6006d9e3dcb7372c41f448	2016-05-06 18:38:04 -07:00
James Zern	5e679848e8	Merge changes from topic 'missing-proto' * changes: vp9_frame_scale_ssse3.c: make 2 functions static vp9_pickmode.c: make function static vp9_noise_estimate.c: make function static vp9_aq_360.c: add missing include vp9_idct_intrin_sse2: add missing vp9_rtcd.h include vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include	2016-05-06 02:25:29 +00:00
James Zern	2184692c07	vpx_dsp/*.[hc]: add missing vpx_dsp_rtcd.h include Change-Id: I103be7eee36492f8619144ce8325bc916d4975c7	2016-05-04 15:06:44 -07:00
James Zern	4f69f741d8	vpx_dsp_common.h: remove circular include Change-Id: I05b3028a38bbc062c388eeb95e99a3fee583ae6b	2016-05-04 14:54:53 -07:00
James Zern	aa68a8301e	vpx_dsp_common.h: fix include guard Change-Id: I1ad41c096ec86870f9aecab6fdbc3af03e972afc	2016-05-04 14:54:32 -07:00
James Bankoski	89f905e5e5	Merge "libvpx: add a unit test for plane_add_noise."	2016-05-04 13:09:05 +00:00
Jim Bankoski	34d5aff747	libvpx: add a unit test for plane_add_noise. In so doing this fixes a couple of bugs: vpx_plane_add_noise.c needed to subtract a clamp instead of add. And the assembly (mmx sse) had assumptions that parameters were continuous in memory which was not true. Change-Id: I76f2c43cf54bfc838eb2edf8a443eaaa7565d7b5	2016-05-03 16:23:06 -07:00
James Bankoski	e755a283dd	Merge "Move vpx_add_plane from codec to vpx_dsp and dedup."	2016-05-03 14:11:57 +00:00
Jim Bankoski	fce3cee8dd	Move vpx_add_plane from codec to vpx_dsp and dedup. Change-Id: I12218d8331c0558c0587a66321e3ca46da7e5cc7	2016-05-02 12:17:39 -07:00
Alex Converse	f6d13e7be5	Merge "bitreader: remove an unsigned overflow."	2016-04-28 16:26:37 +00:00
Alex Converse	a68b24fdee	Tweak casts on vpx_sub_pixel_variance to avoid implicit overflow. Change-Id: I481eb271b082fa3497b0283f37d9b4d1f6de270c	2016-04-27 16:37:18 -07:00
Alex Converse	36a0c7ffe3	bitreader: remove an unsigned overflow. bits_left is in the range [0, 64 (= BD_VALUE_SIZE)] , so the narrowing conversion should be safe. Change-Id: I943fcd359eaad76249ee1e1fb03a2ac16945d2fd	2016-04-27 15:31:35 -07:00
Alex Converse	6c4007be1c	Be explicit about overflow in vpx_variance16x16_sse2. The product always fits in uint32_t, but the operands don't. An optimizing compiler should generate the wraparound code. (Verified with clang). Change-Id: I25eb64df99152992bc898b8ccbb01d55c8d16e3c	2016-04-27 15:22:17 -07:00
Alex Converse	ccb894ce73	Remove casts on < 16x16 variance. These blocks will never overflow since max sum is +/-255wh. Change-Id: Ia2c630339fd9cfb411b56b6040ff402095f12a2e	2016-04-27 15:21:58 -07:00
Johann	2f5840de3e	vpx_minmax_8x8_neon and test BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1156 Change-Id: Ief0ad8d6255b0ef0f233cda153799e3c72d3dbc6	2016-04-21 21:40:25 -07:00
Johann	8c02a36953	hadamard 8x8 test The order of the output structure is not currently important. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1021 Change-Id: Ibc0006d569675db6c5060c4529f5d9e73f2e96a6	2016-04-21 22:28:21 +00:00
Johann Koenig	c59c5cbeff	Merge "Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm"	2016-04-15 16:00:51 +00:00
Martin Storsjo	d8b3e29ee7	Enable vpx_idct32x32_1024_add_neon for neon as well, not only for neon_asm This was never hooked up for the 32x32_34 case as the neon_asm version in 3f7c12da, when the intrinsics version was added. Change-Id: Ic7db4ce5850c637315f9fe9e2de93a4f8cf9e320	2016-04-15 10:25:47 +03:00
Johann	26faa3ec7a	Apply 'const' to data not pointer Change-Id: Ic6b695442e319f7582a7ee8e52a47ae3e38c7298	2016-04-14 14:47:16 -07:00
James Zern	5ab46e0ecd	Merge changes I7a1c0cba,Ie02b5caf,I2cbd85d7,I644f35b0 * changes: vpx_fdct16x16_1_sse2: improve load pattern vpx_fdct16x16_1_c/msa: fix accumulator overflow vpx_fdctNxN_1_sse2: reduce store size dct32x32_test: add PartialTrans32x32Test, Random	2016-04-06 02:51:53 +00:00
James Zern	38bc1d0f4b	vpx_fdct16x16_1_sse2: improve load pattern load the full row rather than doing 2 8-wide columns Change-Id: I7a1c0cba06b0dc1ae86046410922b1efccb95c95	2016-04-04 16:03:42 -07:00
James Zern	eb64ea3e89	vpx_fdct16x16_1_c/msa: fix accumulator overflow tran_low_t is only signed 16-bits in non-high-bitdepth mode Change-Id: Ie02b5caf2658e8e71f995c17dd5ce666a4d64918	2016-04-04 16:03:41 -07:00
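
Commits ccb894ce73 and 6c4007be1c above reason about when `sum * sum` can overflow in the variance functions: for blocks smaller than 16x16 the sum is bounded by +/-255wh, and for 16x16 the product fits in `uint32_t` even though it no longer fits in a signed 32-bit `int`. The sketch below checks that arithmetic. The helper names `variance_from_sums` and `max_sum_squared` are hypothetical and not part of libvpx, and the 64-bit multiply is a choice made here for clarity rather than the approach taken in those commits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not a libvpx function.
 * Computes variance as sse - sum^2 / (w * h) for blocks up to 16x16. */
static uint32_t variance_from_sums(int sum, uint32_t sse, int w, int h) {
  assert(w > 0 && h > 0 && w * h <= 256);
  /* Square in 64 bits so the multiplication itself cannot overflow. */
  const int64_t sum_sq = (int64_t)sum * sum;
  /* With |sum| <= 255 * 256, sum_sq is below 2^32, so the quotient
   * narrows to uint32_t without loss. */
  return sse - (uint32_t)(sum_sq / (w * h));
}

/* Largest possible sum * sum for a w x h block of 8-bit pixel differences,
 * using the bound |sum| <= 255 * w * h from the commit message. */
static int64_t max_sum_squared(int w, int h) {
  const int64_t max_sum = 255LL * w * h;
  return max_sum * max_sum;
}
```

Under this bound, a 16x8 block gives at most 32640 squared, which stays below `INT32_MAX`, while a 16x16 block gives 65280 squared, which exceeds `INT32_MAX` but is still below `UINT32_MAX`.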

493 Commits
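
Commit 17fae3ad0a above states that motion vectors are stored at 1/8 pel precision with three bits of subpel offset, so the half-pel position is offset 4 and not 8. A minimal sketch of that check follows. The names `MV_FRAC_BITS`, `MV_FRAC_MASK`, `subpel_offset` and `is_half_pel` are hypothetical and chosen for illustration; they are not the identifiers used in libvpx.

```c
#include <assert.h>

/* Hypothetical constants: three fractional bits give offsets 0..7 (1/8 pel). */
enum { MV_FRAC_BITS = 3, MV_FRAC_MASK = (1 << MV_FRAC_BITS) - 1 };

/* Extract the fractional (subpel) part of one motion vector component.
 * The cast to unsigned makes the masking well defined for negative values. */
static int subpel_offset(int mv_component) {
  const int off = (int)((unsigned)mv_component & (unsigned)MV_FRAC_MASK);
  assert(off >= 0 && off <= MV_FRAC_MASK);
  return off;
}

/* Half pel is 4/8. An offset of 8 can never occur with three bits,
 * which is why a comparison against 8 would never match. */
static int is_half_pel(int mv_component) {
  return subpel_offset(mv_component) == (1 << (MV_FRAC_BITS - 1));
}
```

With these definitions, components 4 and 12 are half-pel positions, while 0 and 8 are full-pel positions.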