vpx/vpx_dsp/arm
Johann f3c97ed32e subpel variance neon: reduce stack usage
Unlike x86, arm does not impose additional alignment restrictions on
vector loads. For incoming values to the first pass, the code uses
vld1_u32(), which typically does impose a 4 byte alignment. However, as
the first pass operates on user-supplied values, we must prepare for
unaligned values anyway (and have, see mem_neon.h).

For the local temporary values, however, there is no stride, and the
load will use vld1_u8, which does not require 4 byte alignment.
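
The alignment concern above can be illustrated in portable C, without
NEON. This is only a sketch: load_u32_unaligned() and load_two_rows_4()
are hypothetical names chosen for illustration, not the helpers in
mem_neon.h. It assumes 4-pixel rows that are `stride` bytes apart in a
caller-supplied buffer, so neither row start is guaranteed to be 4 byte
aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the libvpx API): read 4 bytes from a pointer
 * that may not be 4 byte aligned. memcpy avoids dereferencing a
 * misaligned uint32_t pointer, which is undefined behavior in C. */
static uint32_t load_u32_unaligned(const uint8_t *buf) {
  uint32_t v;
  assert(buf != NULL);
  memcpy(&v, buf, sizeof(v));
  return v;
}

/* Hypothetical helper: gather two 4-pixel rows, `stride` bytes apart,
 * into one packed 8 byte block. The packed block has no stride, so it
 * can later be read with plain byte loads. */
static void load_two_rows_4(const uint8_t *src, int stride, uint8_t out[8]) {
  const uint32_t r0 = load_u32_unaligned(src);
  const uint32_t r1 = load_u32_unaligned(src + stride);
  memcpy(out, &r0, sizeof(r0));
  memcpy(out + 4, &r1, sizeof(r1));
}
```

Once the rows are packed this way, the temporary buffer is contiguous,
which is the situation the paragraph above describes for the byte-wise
load.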

There are 3 temporary structures. In the C code, one is uint16_t. The
arm code saturates between passes but still passes the tests. If this
becomes an issue, new functions will be needed.
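
As a rough illustration of the two-pass structure with an 8-bit
temporary, here is a portable C sketch. It is not the code in
subpel_variance_neon.c: the function names are hypothetical, the block
width is fixed at 4, and it assumes two non-negative filter taps that
sum to 128 with a rounding shift of 7. Under that assumption the rounded
result of each pass never exceeds 255, so storing it in uint8_t loses
nothing.

```c
#include <assert.h>
#include <stdint.h>

#define FILTER_BITS 7 /* assumed: taps sum to 1 << FILTER_BITS = 128 */

/* Blend two pixels with a 2-tap filter and round to 8 bits. */
static uint8_t bilinear_px(uint8_t a, uint8_t b, const uint8_t f[2]) {
  const unsigned sum = (unsigned)a * f[0] + (unsigned)b * f[1];
  assert(f[0] + f[1] == (1 << FILTER_BITS));
  return (uint8_t)((sum + (1u << (FILTER_BITS - 1))) >> FILTER_BITS);
}

/* Hypothetical two-pass filter for a 4 x h block.
 * src: at least (h + 1) rows of 5 readable pixels, src_stride apart.
 * tmp: packed 4 * (h + 1) bytes (no stride), holds first-pass output.
 * dst: packed 4 * h bytes. */
static void bilinear_2pass_4xh(const uint8_t *src, int src_stride, int h,
                               const uint8_t fx[2], const uint8_t fy[2],
                               uint8_t *tmp, uint8_t *dst) {
  int r, c;
  /* First pass: horizontal filter, one extra row for the second pass. */
  for (r = 0; r < h + 1; ++r) {
    for (c = 0; c < 4; ++c) {
      const uint8_t *p = src + r * src_stride + c;
      tmp[r * 4 + c] = bilinear_px(p[0], p[1], fx);
    }
  }
  /* Second pass: vertical filter over the packed 8-bit temporary. */
  for (r = 0; r < h; ++r) {
    for (c = 0; c < 4; ++c) {
      dst[r * 4 + c] = bilinear_px(tmp[r * 4 + c], tmp[(r + 1) * 4 + c], fy);
    }
  }
}
```

Keeping the intermediate in uint8_t halves the temporary compared with a
uint16_t buffer, which is one way stack usage of such a filter can be
reduced.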

Change-Id: I3c9d4701bfeb14b77c783d0164608e621bfecfb1
2017-05-24 13:28:13 -07:00
avg_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
deblock_neon.c postproc: vpx_mbpost_proc_down_neon 2017-01-09 10:21:56 -08:00
fdct_neon.c neon fdct: 4x4 implementation 2017-05-17 07:38:18 -07:00
fwd_txfm_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
hadamard_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
highbd_idct4x4_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct8x8_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct16x16_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct32x32_34_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct32x32_135_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct32x32_1024_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_idct32x32_add_neon.c Update highbd idct functions arguments to use uint16_t dst 2017-05-03 13:59:16 -07:00
highbd_intrapred_neon.c Add high bitdepth intra prediction NEON optimization (mode tm) 2016-11-15 14:19:46 -08:00
highbd_loopfilter_neon.c cosmetics,*loopfilter_neon.c: s/tranpose/transpose/ 2016-10-12 16:12:56 -07:00
highbd_vpx_convolve8_neon.c Update highbd convolve functions arguments to use uint16_t src/dst 2017-04-25 14:22:19 -07:00
highbd_vpx_convolve_avg_neon.c Update highbd convolve functions arguments to use uint16_t src/dst 2017-04-25 14:22:19 -07:00
highbd_vpx_convolve_copy_neon.c Update highbd convolve functions arguments to use uint16_t src/dst 2017-04-25 14:22:19 -07:00
highbd_vpx_convolve_neon.c Update highbd convolve functions arguments to use uint16_t src/dst 2017-04-25 14:22:19 -07:00
idct4x4_1_add_neon.asm Cosmetics by unifying dest_stride to stride in idct 2016-12-12 15:13:22 -08:00
idct4x4_1_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct4x4_add_neon.asm Cosmetics by unifying dest_stride to stride in idct 2016-12-12 15:13:22 -08:00
idct4x4_add_neon.c neon 4 byte helper functions 2017-05-15 13:42:31 -07:00
idct8x8_1_add_neon.c Clean DC only idct NEON intrinsics 2016-12-28 13:51:44 -08:00
idct8x8_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct16x16_1_add_neon.c Clean DC only idct NEON intrinsics 2016-12-28 13:51:44 -08:00
idct16x16_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct32x32_1_add_neon.c Clean DC only idct NEON intrinsics 2016-12-28 13:51:44 -08:00
idct32x32_34_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct32x32_135_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct32x32_add_neon.c move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
idct_neon.asm enable vpx_idct16x16_10_add_neon in hbd builds 2016-12-06 16:09:19 -08:00
idct_neon.h move neon load/stores to a new file 2017-05-15 08:29:43 -07:00
intrapred_neon_asm.asm Replace prefix vp9_ with vpx_ for intra prediction functions 2015-07-27 13:42:06 -07:00
intrapred_neon.c Merge "Add 32x32 d45 and 8x8, 16x16, 32x32 d135 NEON intra prediction" 2016-11-22 23:20:53 +00:00
loopfilter_4_neon.asm Check in vpx_lpf_vertical_4_dual_neon() assembly 2016-12-02 15:54:30 -08:00
loopfilter_8_neon.asm NEON asm of vpx_lpf_{horizontal,vertical}_8_dual_neon() 2016-08-16 08:50:57 -07:00
loopfilter_16_neon.asm Refactor vpx lpf NEON files (step 2/2) 2016-09-30 09:56:28 -07:00
loopfilter_neon.c cosmetics,*loopfilter_neon.c: s/tranpose/transpose/ 2016-10-12 16:12:56 -07:00
mem_neon.h sub pel variance neon: 4x block sizes 2017-05-22 14:40:01 -07:00
sad4d_neon.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
sad_neon.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
save_reg_neon.asm replace by VSTM/VLDM to reduce one of VST1/VLD1 2016-07-28 23:01:38 +00:00
subpel_variance_neon.c subpel variance neon: reduce stack usage 2017-05-24 13:28:13 -07:00
subtract_neon.c vpx_dsp: apply clang-format 2016-07-25 14:14:19 -07:00
transpose_neon.h Add vpx_highbd_idct32x32_135_add_neon() 2017-03-16 22:37:55 -07:00
variance_neon.c variance neon: assert overflow conditions 2017-05-22 11:25:06 -07:00
vpx_convolve8_avg_neon_asm.asm VPX: removed step checks from neon convolve code 2015-08-12 16:46:53 -07:00
vpx_convolve8_neon_asm.asm VPX: removed step checks from neon convolve code 2015-08-12 16:46:53 -07:00
vpx_convolve8_neon.c add vpx high bitdepth convolve8 NEON intrinsics optimization 2016-10-17 15:23:54 -07:00
vpx_convolve_avg_neon_asm.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_convolve_avg_neon.c Refine vpx_convolve_copy_neon() and vpx_convolve_avg_neon() 2016-09-29 16:19:39 -07:00
vpx_convolve_copy_neon_asm.asm Code refactor on InterpKernel 2015-07-31 10:27:33 -07:00
vpx_convolve_copy_neon.c Refine vpx_convolve_copy_neon() and vpx_convolve_avg_neon() 2016-09-29 16:19:39 -07:00
vpx_convolve_neon.c add vpx high bitdepth convolve8 NEON intrinsics optimization 2016-10-17 15:23:54 -07:00