Jingning Han af31b27aae Optimze inv 16x16 DCT with 10 non-zero coeffs - P2
This commit further optimizes SSE2 operations in the second 1-D
inverse 16x16 DCT, with (<10) non-zero coefficients. The average
runtime of this module goes down from 779 cycles -> 725 cycles.

Change-Id: Iac31b123640d9b1e8f906e770702936b71f0ba7f
2014-01-09 12:46:09 -08:00
..
2014-01-06 14:19:44 -08:00
2013-07-12 15:25:48 -07:00
2014-01-08 12:05:53 -07:00