vpx/vp8/common/arm/neon
Johann d9dce2f48e Restore vp8_sixtap_predict4x4_neon
This function was removed when clang started introducing alignment hints,
which caused the 32-bit vld1_lane_u32/vst1_lane_u32 to fail:
https://llvm.org/bugs/show_bug.cgi?id=24421

The load is now safe: the new implementation uses _u8 and over-reads
just a touch, with performance roughly indistinguishable from before.
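
As an illustration of the over-reading load described above, here is a minimal portable C sketch. It is not the code from this commit: the helper name load4_via_8byte_read is hypothetical, and the 8-byte memcpy only stands in for a byte-aligned 8-lane load such as vld1_u8 (an assumption about what "_u8" refers to). The idea is to read 8 bytes with no alignment requirement and keep only the first 4.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Fetches 4 pixels starting at src by reading 8 bytes with byte alignment
 * and keeping the first 4. The caller must guarantee that 8 bytes are
 * readable at src; the extra 4 bytes are the over-read. */
static uint32_t load4_via_8byte_read(const uint8_t *src) {
  uint8_t tmp[8];
  uint32_t out;
  memcpy(tmp, src, sizeof(tmp)); /* stands in for an 8-byte _u8 load */
  memcpy(&out, tmp, sizeof(out)); /* keep only the first 4 bytes */
  return out;
}
```

The trade-off is that src needs 4 readable bytes beyond the pixels actually used, which is why the over-read has to be known to be harmless at every call site.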

The store, when unaligned, uses a safe version. It is ~25% slower when
xoffset = 0 (second pass filter only). When the first pass filter (or
both) is in play, the new version is almost identical in speed.
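
A rough portable C sketch of choosing between a fast and a safe store based on alignment follows. The function name store4x4_rows, its signature, and the 4-byte alignment test are assumptions made for illustration, not the commit's code. Both branches use memcpy so the sketch stays portable; in NEON code the aligned branch is where a 32-bit lane store would be a candidate.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Writes four 4-byte rows to dst, dst_stride bytes apart.
 * Returns 1 if the aligned path was taken, 0 for the unaligned path. */
static int store4x4_rows(uint8_t *dst, int dst_stride,
                         const uint32_t rows[4]) {
  int i;
  if (((uintptr_t)dst & 3) == 0 && (dst_stride & 3) == 0) {
    /* Every row start is 4-byte aligned: a 32-bit store per row is safe. */
    for (i = 0; i < 4; ++i) memcpy(dst + i * dst_stride, &rows[i], 4);
    return 1;
  }
  /* At least one row is unaligned: copy bytes with no alignment assumption. */
  for (i = 0; i < 4; ++i) memcpy(dst + i * dst_stride, &rows[i], 4);
  return 0;
}
```

Both dst and dst_stride are checked because an aligned first row does not keep later rows aligned when the stride is not a multiple of 4.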

Worst case performance (both filters, unaligned stores) is roughly 3-4x
faster than C.

BUG=webm:817
BUG=webm:1273

Change-Id: I1e490e94453e0872151fe0dafb05557463f6247d
2016-09-15 14:56:47 -07:00
..
bilinearpredict_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
copymem_neon.c prepend ++ instead of post in for loops. 2016-07-18 06:54:50 -07:00
dc_only_idct_add_neon.c prepend ++ instead of post in for loops. 2016-07-18 06:54:50 -07:00
dequant_idct_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
dequantizeb_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
idct_blk_neon.c prepend ++ instead of post in for loops. 2016-07-18 06:54:50 -07:00
idct_dequant_0_2x_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
idct_dequant_full_2x_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
iwalsh_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
loopfiltersimplehorizontaledge_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
loopfiltersimpleverticaledge_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
mbloopfilter_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
shortidct4x4llm_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00
sixtappredict_neon.c Restore vp8_sixtap_predict4x4_neon 2016-09-15 14:56:47 -07:00
vp8_loopfilter_neon.c vp8: apply clang-format 2016-07-15 19:28:44 -07:00