Sindre Aamås
c8c74903f8
[Encoder] Add single-block AVX2 4x4 DCT/IDCT routines
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.
~3.15x speedup over MMX for the DCT on Haswell.
~2.94x speedup over MMX for the IDCT on Haswell.
Returns diminish with increasing vector length because a larger
proportion of the time is spent on load/store/shuffling.
2016-02-02 17:22:49 +01:00
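For orientation, here is a minimal scalar sketch of what a single-block 4x4 forward transform computes. It assumes the "DCT" in the commit message is the H.264-style 4x4 integer core transform; the commit text does not spell this out. The name `dct4x4_ref` and the flat 16-element residual layout are hypothetical and chosen only for illustration. This is not the AVX2 or MMX routine from the commit.

```c
#include <stdint.h>

/* Hypothetical scalar reference, NOT the routine added by this commit.
 * Assumption: the transform is the H.264-style 4x4 integer core transform
 * Y = C * X * C^T with
 *   C = [ 1  1  1  1 ]
 *       [ 2  1 -1 -2 ]
 *       [ 1 -1 -1  1 ]
 *       [ 1 -2  2 -1 ]
 * `in` holds one 4x4 block of residuals in row-major order. */
void dct4x4_ref(int16_t out[16], const int16_t in[16]) {
    int16_t tmp[16];

    /* Horizontal pass: one butterfly per row. */
    for (int r = 0; r < 4; r++) {
        const int16_t *x = in + 4 * r;
        int s0 = x[0] + x[3];
        int s1 = x[1] + x[2];
        int d0 = x[0] - x[3];
        int d1 = x[1] - x[2];
        tmp[4 * r + 0] = (int16_t)(s0 + s1);
        tmp[4 * r + 1] = (int16_t)(2 * d0 + d1);
        tmp[4 * r + 2] = (int16_t)(s0 - s1);
        tmp[4 * r + 3] = (int16_t)(d0 - 2 * d1);
    }

    /* Vertical pass: the same butterfly applied down each column. */
    for (int c = 0; c < 4; c++) {
        int s0 = tmp[c] + tmp[12 + c];
        int s1 = tmp[4 + c] + tmp[8 + c];
        int d0 = tmp[c] - tmp[12 + c];
        int d1 = tmp[4 + c] - tmp[8 + c];
        out[c]      = (int16_t)(s0 + s1);
        out[4 + c]  = (int16_t)(2 * d0 + d1);
        out[8 + c]  = (int16_t)(s0 - s1);
        out[12 + c] = (int16_t)(d0 - 2 * d1);
    }
}
```

One block is only 16 coefficients of 16 bits each, so it fits in a single 256-bit register. That is consistent with the commit's remark that a larger share of the time goes to load/store/shuffling as vectors get wider: the butterflies above are a handful of adds and subtracts, while a vector version still has to load, transpose and store the block.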