e7cd80718b
vp9_sad3x16_sse2() is called heavily in the decoder, where its unaligned reads consume a lot of CPU cycles. When CONFIG_SUBPELREFMV is off, the unaligned offset is 1. In this case we can adjust src_ptr to be 4-byte aligned and then do aligned reads, which reduces the read time significantly. Tests on a 1080p clip showed over 2% decoder performance gain with CONFIG_SUBPELREFMV off.

Change-Id: I953afe3ac5406107933ef49d0b695eafba9a6507
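
For illustration only, the pointer adjustment described above can be sketched in portable C. This is not the SSE2 code from this change: the function names sad3x16_ref and sad3x16_aligned_src are hypothetical, and the sketch assumes that src sits exactly 1 byte past a 4-byte boundary, that src_stride is a multiple of 4 so every row keeps that alignment, and that the byte just before each row is readable (it lies in the same aligned word).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reference: plain byte-by-byte SAD over a 3-wide, 16-high block. */
static unsigned int sad3x16_ref(const uint8_t *src, int src_stride,
                                const uint8_t *ref, int ref_stride) {
  unsigned int sad = 0;
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 3; ++c)
      sad += (unsigned int)abs(src[c] - ref[c]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Hypothetical sketch: src is assumed to be 1 byte past a 4-byte boundary.
 * Step back one byte so each row starts on the boundary, load the whole
 * 4-byte word, and ignore its first byte. */
static unsigned int sad3x16_aligned_src(const uint8_t *src, int src_stride,
                                        const uint8_t *ref, int ref_stride) {
  const uint8_t *base = src - 1; /* aligned start of the row's word */
  unsigned int sad = 0;

  /* Preconditions assumed by this sketch. */
  assert(((uintptr_t)base & 3) == 0);
  assert((src_stride & 3) == 0);

  for (int r = 0; r < 16; ++r) {
    uint8_t word[4];
    memcpy(word, base, 4); /* one 4-byte load from an aligned address */
    for (int c = 0; c < 3; ++c)
      sad += (unsigned int)abs(word[c + 1] - ref[c]); /* skip byte 0 */
    base += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

Both functions return the same sum; the second only changes where each row's load starts, which is the part that the aligned reads in this change rely on.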