Andrew Russell a46f5459c3 Improved speed of 4x4 SSE2 fdct.
* speed improvement of 30 percent achieved
* multiplies and adds remain the same
* non-arithmetic instructions minimized by hand, by:
   - expanding the 2-pass loop
   - removing irrelevant "shuffles"
   - combining the last two rounding steps
* further improvements may be possible
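
A note on why combining two rounding steps can be done without changing results: for round-then-shift steps built on floor (arithmetic) shifts, `((x + r1) >> s1) + r2) >> s2` equals `(x + r1 + (r2 << s1)) >> (s1 + s2)`. The scalar sketch below illustrates that identity only. The shift amounts (14, then 2) and all function names are assumptions chosen for illustration; the commit message does not say which constants or which two steps were merged, and this is not the SSE2 code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Floor shift that is portable for negative inputs
 * (plain >> on negative signed values is implementation-defined in C). */
static int32_t asr(int32_t x, int s) {
  return x >= 0 ? (x >> s) : ~((~x) >> s);
}

/* Hypothetical two-step version: round and shift by 14, then (v + 1) >> 2. */
static int32_t round_two_steps(int32_t x) {
  int32_t v = asr(x + (1 << 13), 14);
  return asr(v + 1, 2);
}

/* Hypothetical merged version: a single add and a single shift by 16.
 * The second rounding term (1) is scaled up by the first shift (1 << 14). */
static int32_t round_one_step(int32_t x) {
  return asr(x + (1 << 13) + (1 << 14), 16);
}

/* Counts inputs in [lo, hi] where the two versions disagree. */
static int count_mismatches(int32_t lo, int32_t hi) {
  int mismatches = 0;
  for (int32_t x = lo; x <= hi; ++x) {
    if (round_two_steps(x) != round_one_step(x)) ++mismatches;
  }
  return mismatches;
}

/* Self-check over a small range; aborts if the identity fails. */
static void self_check(void) {
  assert(count_mismatches(-100000, 100000) == 0);
}
```

Calling `self_check()` from a test harness exercises both versions over a range of positive and negative inputs. In a SIMD setting the merged form saves one add and one shift per vector, which matches the stated goal of keeping multiplies and adds in the transform unchanged while trimming the surrounding instructions.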

Change-Id: Idec2c3f52910c48e6a0e0f9aefed5cae31b0b8c0
2014-03-03 14:25:42 -08:00