Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
Speed: 3.9x to 9.6x faster than the C code, plus small gains (up to 15%) over the existing MMX code, particularly for bigger filters.