The original commit never set any 'specialize' line:
61311e6103
It appears the sadx4 version of the function uses sdx4df calls to
speed up the search. There are no sse3 versions of the sdx4df
functions, but there are sse2 and msa versions.
There is a neon version of vpx_sad16x16x4d but none for the smaller
sizes. Perhaps if they existed, this function could be expanded to use
them.
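For context, the x4d ("sdx4df") functions compute the SAD of one source
block against four candidate references in a single call, which is what
speeds up the search. A minimal C sketch of the pattern (the shape
mirrors vpx_sad16x16x4d; real SIMD versions amortize the source loads
across all four references):

    #include <stdint.h>
    #include <stdlib.h>

    /* Sketch of the x4d pattern: four SADs in one call. */
    static void sad16x16x4d(const uint8_t *src, int src_stride,
                            const uint8_t *const ref[4], int ref_stride,
                            uint32_t sad[4]) {
      for (int i = 0; i < 4; ++i) {
        uint32_t total = 0;
        for (int r = 0; r < 16; ++r) {
          for (int c = 0; c < 16; ++c) {
            total += abs(src[r * src_stride + c] - ref[i][r * ref_stride + c]);
          }
        }
        sad[i] = total;
      }
    }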
Change-Id: I936d7d6b1a3ff6dcd5a4d2322272708c47cdec13
This restores d9dce2f48e
Switched to using a signed shift-and-narrow. The unsigned version was
saturating negative results to 255 instead of clamping them to 0.
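A minimal illustration of the difference, assuming a 16-bit
intermediate filter sum and an illustrative shift of 7 (not the exact
libvpx code):

    #include <arm_neon.h>

    /* Unsigned narrow: a negative s16 reinterpreted as u16 becomes a
       large value, so it saturates to 255. */
    static uint8x8_t narrow_unsigned(int16x8_t sum) {
      return vqrshrn_n_u16(vreinterpretq_u16_s16(sum), 7); /* -1 -> 255 */
    }

    /* Signed saturating narrow to unsigned: negative results clamp
       to 0 as intended. */
    static uint8x8_t narrow_signed(int16x8_t sum) {
      return vqrshrun_n_s16(sum, 7); /* -1 -> 0 */
    }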
BUG=webm:817
BUG=webm:1273
Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49
This reverts commit d9dce2f48e.
Appears to be failing the SixtapPredict tests in some configurations,
and possibly the test vectors as well.
Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7
This function was removed when clang started introducing alignment
hints, which caused the 32-bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load has been made safe with a _u8-based implementation that
over-reads just a touch; the performance difference is nearly
indiscernible. It is still ~5x faster than C in the unaligned case
when running both filters.
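A sketch of the idea (simplified, not the exact code): replace the u32
lane load, which clang may annotate with a 4-byte alignment hint, with
a u8 load that carries no alignment assumption and reads 8 bytes where
only 4 are needed:

    #include <arm_neon.h>

    static uint8x8_t load4_safe(const uint8_t *src) {
      /* Previously: vld1_lane_u32((const uint32_t *)src, v, 0), which
         can fail once clang assumes 4-byte alignment. */
      return vld1_u8(src); /* lanes 0-3 hold the 4 bytes of interest */
    }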
BUG=webm:892
BUG=webm:1273
Change-Id: Icf7167189391b46202f47233bb585c24c42bcc36
This function was removed when clang started introducing alignment
hints, which caused the 32-bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421
The load has been made safe with a _u8-based implementation that
over-reads just a touch; the performance difference is nearly
indiscernible.
The store, when unaligned, now has a safe version that is ~25% slower
when xoffset = 0 (second-pass filter only). When the first-pass filter
(or both) is in play, the new version is almost identical in speed.
Worst-case performance (both filters, unaligned stores) is still
roughly 3-4x faster than C.
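One way to make a 4-byte store alignment-safe (a sketch under the same
assumptions as above, not necessarily the exact approach taken) is to
store byte lanes individually rather than casting dst to uint32_t *:

    #include <arm_neon.h>

    static void store4_safe(uint8_t *dst, uint8x8_t v) {
      vst1_lane_u8(dst + 0, v, 0);
      vst1_lane_u8(dst + 1, v, 1);
      vst1_lane_u8(dst + 2, v, 2);
      vst1_lane_u8(dst + 3, v, 3);
    }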
BUG=webm:817
BUG=webm:1273
Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
These implementations rely on casting the pointers to load the data.
Clang implemented optimizations which automatically add alignment hints
to such loads. The 4x4 filters do not guarantee the necessary alignment
so the resulting assembly is broken.
https://llvm.org/bugs/show_bug.cgi?id=24421
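The problematic pattern looks roughly like this (illustrative): the
cast to uint32_t * is what licenses clang to emit the alignment hint:

    #include <arm_neon.h>

    static uint32x2_t load4_unsafe(const uint8_t *src) {
      uint32x2_t v = vdup_n_u32(0);
      /* The cast tells the compiler src is 4-byte aligned; the 4x4
         filters make no such guarantee. */
      v = vld1_lane_u32((const uint32_t *)src, v, 0);
      return v;
    }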
BUG=webm:817
BUG=webm:892
Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe
I've added a few new functions (d45e, d63e, he, ve): he and ve cover
the filtered h/v 4x4 predictors that are vp8-specific, d45e is the
"correct" d45 with the correctly filtered bottom-right pixel (as
opposed to the unfiltered version in vp9), and d63e is the "broken"
d63 with weirdly filtered bottom-right pixels (which are correctly
filtered in vp9).
There may be a minor performance impact on all systems because we
have to do an extra copy of the Above pixel array to incorporate
the topleft pixel in the same array (thus fitting the vpx_dsp API).
In addition, armv6 will take a more serious performance hit because
I removed the armv6/vp8-specific assembly. I'm not sure anyone
cares...
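The extra copy comes from the vpx_dsp predictor API, which takes a
single above[] array and reads the top-left pixel at above[-1]. A
rough sketch of the adaptation (the wrapper and parameter names here
are hypothetical):

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    typedef void (*intra_pred_fn)(uint8_t *dst, ptrdiff_t stride,
                                  const uint8_t *above, const uint8_t *left);

    /* Hypothetical wrapper: vp8 keeps top-left separate, so build one
       contiguous buffer with top-left at index 0 and the above pixels
       after it, then pass buf + 1 so the predictor sees top-left at
       above[-1]. */
    static void predict_4x4(uint8_t *dst, ptrdiff_t stride,
                            const uint8_t *above_row, uint8_t top_left,
                            const uint8_t *left, intra_pred_fn pred) {
      uint8_t buf[1 + 8]; /* 4 above + 4 above-right */
      buf[0] = top_left;
      memcpy(buf + 1, above_row, 8);
      pred(dst, stride, buf + 1, left);
    }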
Change-Id: I7f9e5ebee11d8e21aca2cd517a69eefc181b2e86
This commit replaces the vp8_-prefixed subtract function with the
common vpx_subtract_block function. It removes redundant SIMD
optimization code and unit tests.
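For reference, the shared function has this shape (prototype as in
vpx_dsp; diff = src - pred, element-wise), shown here with a 4x4 call:

    #include <stddef.h>
    #include <stdint.h>

    /* Prototype of the shared vpx_dsp function. */
    void vpx_subtract_block(int rows, int cols, int16_t *diff,
                            ptrdiff_t diff_stride, const uint8_t *src,
                            ptrdiff_t src_stride, const uint8_t *pred,
                            ptrdiff_t pred_stride);

    static void subtract_b_4x4(int16_t diff[4 * 4], const uint8_t *src,
                               int src_stride, const uint8_t *pred,
                               int pred_stride) {
      vpx_subtract_block(4, 4, diff, 4, src, src_stride, pred, pred_stride);
    }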
Change-Id: I42e086c32c93c6125e452dcaa6ed04337fe028d9
fails unit tests:
[ FAILED ] NEON/VP8SubpelVarianceTest.ExtremeRef/0, where GetParam() = (3, 3, 0x14e36d, 0)
[ FAILED ] NEON/VP8SubpelVarianceTest.Ref/0, where GetParam() = (3, 3, 0x14e36d, 0)
The tests were recently enabled in:
eb88b17 Make vp9 subpixel match vp8
The functions likely haven't changed since being converted from
assembly.
Change-Id: I6141717b111b8f735f436c160d74270af53ef722
Clang adds alignment hints when the loads/stores are cast up to wider
types. Although this should be safe for most paths, it's causing some
crashes. Either
the source of the misalignment needs to be determined and adjusted or
the intrinsics need to be rewritten to avoid using the cast to load the
data.
BUG=817,892
Change-Id: Ia3aa824d6a4cd97e14325ff49dc730b6f85ec7e8
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
The obj_int_extract code is no longer worth maintaining. It creates
significant issues when adapting for different build systems and no
longer offers as significant a performance benefit due to
improvements in intrinsics.
Source files will remain until the various third-party builds are updated.
The neon fast quantizer has been moved to intrinsics. The armv6 version
has been removed because so few remaining targets require it.
Compilers and processors have improved significantly since the
pack_tokens code was written. The assembly is no longer faster than the
C code.
pack_tokens was the only optimized code for the armv5te targets, so
those targets will be removed after the test infrastructure has been
updated.
BUG=710
Change-Id: Ic785b167cd9f95eeff31c7c76b7b736c07fb30eb
Use intrinsics for neon quantization. There is a slight (<5%)
performance loss compared to the assembly, but it is roughly 10x
faster on arm64, which was previously running C code.
Change-Id: I7cf5242d8f29b7eab5bca6a1c20c89c9fc9ca66d
vp8_build_intra_predictors_mbuv_s().
This patch replaces the assembly version with an intrinsic
version.
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~2.6%.
Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f