The NEON intrinsics cannot load only the 4 values that are used. In
vpx_dsp/arm/intrapred_neon.c, dc_4x4 loads 8 values for each of the
'above' and 'left' computations, but only the sum of the first 4 values
is used.
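
As a rough illustration of the over-read, here is a portable scalar
sketch. It is not the code in intrapred_neon.c: load8(), sum_first4()
and dc_4x4_value() are hypothetical names, and load8() only stands in
for an 8-byte vector load such as vld1_u8(). The rounding assumes the
usual DC average over 4 'above' plus 4 'left' pixels.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an 8-byte vector load such as vld1_u8():
 * it always reads 8 bytes from src, even if the caller needs fewer. */
static void load8(uint8_t out[8], const uint8_t *src) {
  memcpy(out, src, 8);
}

/* Sum the first 4 of the 8 loaded bytes. Lanes 4..7 are read from
 * memory but never contribute to the result. */
static unsigned sum_first4(const uint8_t *src) {
  uint8_t lanes[8];
  unsigned sum = 0;
  int i;
  load8(lanes, src);
  for (i = 0; i < 4; ++i) sum += lanes[i];
  return sum;
}

/* Assumed DC value for a 4x4 block: rounded average of the 4 'above'
 * and 4 'left' pixels, i.e. (sum + 4) / 8. */
static uint8_t dc_4x4_value(const uint8_t *above, const uint8_t *left) {
  return (uint8_t)((sum_first4(above) + sum_first4(left) + 4) >> 3);
}
```

In this sketch both 'above' and 'left' must point at 8 readable bytes,
although bytes 4..7 have no effect on the returned value.
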
BUG=webm:1268
Change-Id: I937113d7e3a21e25bebde3593de0446bf6b0115a