Commit Graph

2 Commits

Author SHA1 Message Date
levytamar82
af10457e02 Fix bug 806
in the function sad32x32x4d and sad64x64x4d the source is aligned to 16 bytes
and not to 32 bytes - the load is now unaligned.

Change-Id: I922fdba56d0936b5cf72e4503519f185645a168c
2014-08-07 14:13:30 -07:00
levytamar82
0fa8b668c1 AVX2 SAD Optimization:
2 functions were optimized for avx2 by using full 256 bit register
In order to handle 32 elements in parallel instead of only 16 in parallel:
1. vp9_sad32x32x4d
2. vp9_sad64x64x4d

The function level gain is 66% and the user level gain is ~1%.

Change-Id: I4efbb3bc7d8bc03b64b6c98f5cd5c4a9dd3212cb
2014-03-21 13:53:32 -07:00