Sindre Aamås fc16010583 [Common/x86] DeblockChromaLt4V_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

Avoid spills.

~2.68x speedup on Haswell (x86-64).
~2.38x speedup on Haswell (x86 32-bit).
2016-02-15 02:07:25 +01:00
..
2015-10-15 10:04:00 -07:00