In the variance calculations the difference is summed and later squared.
When the sum exceeds sqrt(2^31) the value is treated as a negative when
it is shifted which gives incorrect results.
To fix this we cast the result of the multiplication as unsigned.
The alternative fix is to shift sum down by 4 before multiplying.
However that will reduce precision.
For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
change).
PPC change is untested.
Change-Id: I1bad27ea0720067def6d71a6da5f789508cec265
Added preload instructions to armv6 encoder optimizations.
About 5% average speed-up on Tegra2 for VGA@30fps sequence.
Change-Id: I41d74737720fb71ce7a316f07555357822f3347e
Half pixel interpolations optimized in variance calculations. Separate
function calls to vp8_filter_block2d_bil_x_pass_armv6 are avoided.On
average, performance improvement is 6-7% for VGA@30fps sequences.
Change-Id: Idb5f118a9d51548e824719d2cfe5be0fa6996628