5 Commits

Author SHA1 Message Date
Sindre Aamås
93db6511a8 [UT] Test VAA routines with a wider variety of resolutions
Test even and odd multiples of 32 width because some AVX2 routines
have conditional logic based on that.
2016-04-11 16:40:36 +02:00
Sindre Aamås
57fc3e9917 [Processing] Add AVX2 VAA routines
Process 8 lines at a time rather than 16 lines at a time because
this appears to give more reliable memory subsystem performance on
Haswell.

Speedup is > 2x as compared to SSE2 when not memory-bound on Haswell.
On my Haswell MBP, VAACalcSadSsdBgd is about ~3x faster when uncached,
which appears to be related to processing 8 lines at a time as opposed
to 16 lines at a time. The other routines are also faster as compared
to the SSE2 routines in this case but to a lesser extent.
2016-04-11 16:09:56 +02:00
Martin Storsjö
dd913ef878 Don't use tabs for indentation in multi-line macros
The astyle configuration makes sure normal code is indented consistently
with 2 spaces, but astyle doesn't seem to touch the indentation in
these multi-line macros.
2015-05-13 22:06:54 +03:00
ruil2
3ff145e839 rename namespace and funciton name to avoid conflicts with old library 2014-09-17 15:50:59 +08:00
zhiliang wang
0163eb520d Add UT for VaaCalc Functions. 2014-08-27 13:53:18 +08:00