vpx/arm at a047fee606fa32f6a434bc7b08b46787ba2a787d - vpx

Timothy B. Terriberry 18dc92fd66 Add 4-tap version of 2nd-pass ARMv6 MC filter.

The existing code applied a 6-tap filter with 0's on either end.
We're already paying the branch penalty to avoid computing the two
 extra columns needed as input to this filter.
We might as well save time computing the filter as well.
This reduces the inner loop from 21 instructions to 16, the number
 of loads per iteration from 4 to 1, and the number of multiplies
 from 7 to 4.
The gain in overall decoding performance, however, is small (less
 than 1%).
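
As a rough illustration of why the 4-tap path gives the same result, here is a
minimal C sketch (not the ARMv6 assembly from this commit). The function names,
the tap values, the sample column, and the 7-bit rounding are assumptions chosen
for the example. When the two outer taps are 0, skipping them leaves the output
unchanged and avoids reading the two outermost input samples at all.

```c
#include <assert.h>
#include <stdint.h>

/* Round, shift down by 7 bits, and clamp to the 8-bit pixel range.
 * The 7-bit precision is an assumption made for this sketch. */
static uint8_t round_shift_clamp(int sum) {
    sum += 64;
    if (sum < 0) return 0;
    sum >>= 7;
    return (uint8_t)(sum > 255 ? 255 : sum);
}

/* 6-tap filter centered on src[0]: reads src[-2*stride] .. src[3*stride]. */
static uint8_t filter6(const uint8_t *src, int stride, const int16_t taps[6]) {
    int sum = 0;
    for (int k = 0; k < 6; k++)
        sum += taps[k] * src[(k - 2) * stride];
    return round_shift_clamp(sum);
}

/* 4-tap variant: uses only taps[1..4] and reads src[-stride] .. src[2*stride].
 * It is only valid when the outer taps are zero. */
static uint8_t filter4(const uint8_t *src, int stride, const int16_t taps[6]) {
    int sum = 0;
    assert(taps[0] == 0 && taps[5] == 0);
    for (int k = 1; k < 5; k++)
        sum += taps[k] * src[(k - 2) * stride];
    return round_shift_clamp(sum);
}

/* Hypothetical self-check: returns 1 when both filters agree on a sample column. */
static int filters_agree(void) {
    static const int16_t taps[6] = { 0, -6, 123, 12, -1, 0 }; /* illustrative */
    static const uint8_t column[6] = { 10, 20, 30, 40, 50, 60 };
    const uint8_t *center = column + 2;
    return filter6(center, 1, taps) == filter4(center, 1, taps);
}
```

In this sketch filter4 never touches column[0] or column[5], which corresponds
to the inputs the commit message says the code already avoids computing.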

This change also means we are now valgrind-clean on ARMv6, which is
 its real purpose.
The errors reported here were valgrind's fault (it does not detect
 that 0 times an uninitialized value is initialized), but Julian
 Seward says it would slow down valgrind considerably to make such
 checks.
Speeding up libvpx instead, even by a small amount, seems a much
 better idea, if only to enable proper valgrind checking of the
 rest of the codec.

Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16

2010-09-27 18:25:45 -07:00

armv6

Add 4-tap version of 2nd-pass ARMv6 MC filter.

2010-09-27 18:25:45 -07:00

neon

combine max values and compare once

2010-09-24 15:42:50 -04:00

bilinearfilter_arm.c

Use WebM in copyright notice for consistency

2010-09-09 10:01:21 -04:00

filter_arm.c

Add 4-tap version of 2nd-pass ARMv6 MC filter.