generic-library/vpx

Author	SHA1	Message	Date
Johann	2057d3ef75	use memcpy for unaligned neon stores Advise the compiler that the store is eventually going to a uint8_t buffer. This helps avoid getting alignment hints which would cause the memory access to fail. Originally added as a workaround for clang: https://bugs.llvm.org//show_bug.cgi?id=24421 Change-Id: Ie9854b777cfb2f4baaee66764f0e51dcb094d51e	2017-05-17 12:11:31 -07:00
Johann	1d14e42df7	Un-Revert "Restore vp8_sixtap_predict4x4_neon" This restores `d9dce2f48e` Switched to using signed shift-and-narrow. Instead of saturating negative results to 0, it was saturating them to 255. BUG=webm:817 BUG=webm:1273 Change-Id: I571095336aa4182e3288b17924fcaaece42b0a49	2016-09-23 14:58:57 -07:00
Johann Koenig	7795e99296	Revert "Restore vp8_sixtap_predict4x4_neon" This reverts commit `d9dce2f48e`. Appears to be failing the SixtapPredict tests in some configurations and possibly test vectors as well. Change-Id: Ica6aa83ebac47d0a76e451846e7da67b1c17a7d7	2016-09-16 06:12:49 +00:00
Johann	d9dce2f48e	Restore vp8_sixtap_predict4x4_neon This function was removed when clang started introducing alignment hints which caused the 32 bit vld1_lane_u32/vst1_lane_u32 to fail: https://llvm.org/bugs/show_bug.cgi?id=24421 The load has been rendered safe with an implementation ~indiscernible performance-wise that uses _u8 and over-reads just a touch. The store, when unaligned, has a version that is ~25% slower but safe when xoffset = 0 (second pass filter only). When the first pass filter (or both) are in play, the new version is almost identical in speed. Worst case performance (both filters, unaligned stores) is roughly 3-4x faster than C. BUG=webm:817 BUG=webm:1273 Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d	2016-09-15 14:56:47 -07:00
Jim Bankoski	3e04114f3d	prepend ++ instead of post in for loops. Applied the following regex : search for: (for.\(.;.;) ([a-zA-Z_])\+\+\) replace with: \1 ++\2) This misses some for loops: ie : for (mb_col = 0; mb_col < oci->mb_cols; mb_col++, mi++) Change-Id: Icf5f6fb93cced0992e0bb71d2241780f7fb1f0a8	2016-07-18 06:54:50 -07:00
clang-format	81a6739533	vp8: apply clang-format Change-Id: I7605b6678014a5426ceb45c27b54885e0c4e06ed	2016-07-15 19:28:44 -07:00
Johann	ce11055d57	Remove sixtap/bilinear 4x4 neon implementations These implementations rely on casting the pointers to load the data. Clang implemented optimizations which automatically add alignment hints to such loads. The 4x4 filters do not guarantee the necessary alignment so the resulting assembly is broken. https://llvm.org/bugs/show_bug.cgi?id=24421 BUG=webm:817 BUG=webm:892 Change-Id: I608885299f1f86ff83653b65e0e40d0ae87fb3fe	2016-05-06 17:20:15 -07:00
Jia Jia	0ae866bd19	vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h. fix compiling warning. Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa	2014-09-14 01:43:54 +08:00
James Zern	5e30127c7a	vp8_sixtap_predict4x4_neon: init src vectors quiets uninitialized warnings on the first load. Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32	2014-07-10 23:48:47 -07:00
Martin Storsjo	d5d82a5e1a	arm: Add a no-op define of __builtin_prefetch for MSVC Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC doesn't. Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08	2014-05-07 10:43:24 +03:00
James Yu	08e38f06db	VP8 for ARMv8 by using NEON intrinsics 14 Add sixtappredict_neon.c - vp8_sixtap_predict16x16_neon - vp8_sixtap_predict8x8_neon - vp8_sixtap_predict8x4_neon - vp8_sixtap_predict4x4_neon Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 19:07:12 -07:00

11 Commits
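
Note on the unaligned-store pattern: several of the commits above (`2057d3ef75`, `d9dce2f48e`, `ce11055d57`) deal with the same problem. Casting a `uint8_t` pointer to a wider type lets the compiler assume alignment that the 4x4 predictors do not guarantee. The following is a minimal scalar sketch of the memcpy approach named in `2057d3ef75`, not the NEON code from the commit itself. The helper names (`store_u32_unaligned`, `load_u32_unaligned`, `store_4x4`) are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write 4 bytes to a destination that is only
 * guaranteed to be byte-aligned. Because the copy goes through memcpy,
 * the compiler cannot assume 32-bit alignment of dst, unlike
 * *(uint32_t *)dst = value. */
static void store_u32_unaligned(uint8_t *dst, uint32_t value) {
  memcpy(dst, &value, sizeof(value));
}

/* Hypothetical helper: the matching byte-aligned 4-byte load. */
static uint32_t load_u32_unaligned(const uint8_t *src) {
  uint32_t value;
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Hypothetical helper: write a 4x4 block one row at a time. The stride is
 * arbitrary, so individual rows may start at any byte offset. */
static void store_4x4(uint8_t *dst, int stride, const uint32_t rows[4]) {
  int i;
  assert(dst != NULL);
  for (i = 0; i < 4; ++i) {
    store_u32_unaligned(dst + i * stride, rows[i]);
  }
}
```

For a fixed 4-byte size, optimizing compilers typically lower the memcpy call to a single store instruction that tolerates unaligned addresses, so the pattern usually adds little or no overhead. The exact code generated depends on the compiler and target.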