Johann f7d1486f48 neon variance: process 16 values at a time
Read in a Q register. Works on blocks of 16 and larger.

Improvement of about 20% for 64x64. The smaller blocks are faster, but
don't have quite the same level of improvement. 16x32 is only about 5%

BUG=webm:1422

Change-Id: Ie11a877c7b839e66690a48117a46657b2ac82d4b
2017-05-08 18:48:55 +00:00
..
2016-07-25 14:14:19 -07:00
2016-07-25 14:14:19 -07:00
2016-07-25 14:14:19 -07:00