Christophe Gisquet 81aa0f4604 x86: hpeldsp: implement SSSE3 version of _xy2
Loading pb_1 rather than pw_8192 was benchmarked to be more efficient.
Loading both constants yields no advantage; loading only one saves ~11 cycles.
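
For readers unfamiliar with the two constants, here is a minimal scalar sketch in C of what they are presumably used for. It assumes (the commit message does not say so) that pb_1 is a vector of byte 1s fed to PMADDUBSW to sum horizontally adjacent pixels, and that pw_8192 is a vector of 16-bit 8192s fed to PMULHRSW to perform the rounded divide by 4. The function names below are hypothetical and are not FFmpeg symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of PMULHRSW: ((a * b >> 14) + 1) >> 1.
 * Only non-negative inputs are used here. */
static int16_t pmulhrsw_lane(int16_t a, int16_t b)
{
    int32_t t = ((int32_t)a * b) >> 14;
    return (int16_t)((t + 1) >> 1);
}

/* Plain reference for the _xy2 half-pel case: rounded average of a
 * 2x2 neighbourhood. Reads 9 columns and h + 1 rows from src. */
static void put_pixels8_xy2_ref(uint8_t *dst, const uint8_t *src,
                                ptrdiff_t stride, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++) {
            int sum = src[x] + src[x + 1] +
                      src[x + stride] + src[x + stride + 1];
            dst[x] = (uint8_t)((sum + 2) >> 2);
        }
        src += stride;
        dst += stride;
    }
}

/* Same result, written the way the assumed SSSE3 sequence works:
 * multiply-add with 1s (pb_1) to sum pixel pairs, add the two rows,
 * then a rounding high multiply by 8192 (pw_8192) for (sum + 2) >> 2. */
static void put_pixels8_xy2_model(uint8_t *dst, const uint8_t *src,
                                  ptrdiff_t stride, int h)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++) {
            int16_t top = (int16_t)(src[x] * 1 + src[x + 1] * 1);
            int16_t bot = (int16_t)(src[x + stride] * 1 +
                                    src[x + stride + 1] * 1);
            dst[x] = (uint8_t)pmulhrsw_lane((int16_t)(top + bot), 8192);
        }
        src += stride;
        dst += stride;
    }
}
```

Under these assumptions both constants are needed once per row, which would explain why it matters which of the two is kept in a register and which is read from memory.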

decicycles count:
put8:  3223(mmx)    -> 2387
avg8:  2863(mmxext) -> 2125
put16: 4356(sse2)   -> 3553
avg16: 4481(sse2)   -> 3513

Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
2014-05-24 15:15:56 +02:00