Commit Graph

6 Commits

Author SHA1 Message Date
Yi Luo
f62dcc9c33 Replace idct32x32_1024_add_ssse3 assembly with intrinsics
- Encoding/decoding test, BQTerrace_1920x1080_60.y4m, on
  i7-6700, no obvious user-level speed performance downgrade.
- Passed unit tests.

Change-Id: I20688e0dd3731021ec8fb4404734336f1a426bfc
2017-02-16 16:10:40 -08:00
Yi Luo
72a43e2378 Add idct32x32_135_add SSSE3 intrinsics
- Replace the corresponding assembly code.
- No user level speed performance degrade.
- Unit tests passed.

Change-Id: Idd0c5a4bad4976f1617c34100cb46e75e3b961e5
2017-02-16 11:29:34 -08:00
clang-format
4b402746ca apply clang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14 12:45:52 -08:00
Yi Luo
bd86de1ac8 Replace idct32x32_34_add_ssse3 assembly with intrinsics
- No user-level speed performance change.
- Pass unit tests.

Change-Id: Idfc598e00f354265e41f6b3219f4734216c115c6
2017-02-14 10:38:36 -08:00
Yi Luo
ac04d11abc Replace idct8x8_12_add_ssse3 assembly code with intrinsics
- Performance achieves the same as assembly.
- Unit tests pass.

Change-Id: I6eacfbbd826b3946c724d78fbef7948af6406ccd
2017-02-08 10:07:45 -08:00
Jingning Han
8f95389742 Add SSSE3 intrinsic 8x8 inverse 2D-DCT
The intrinsic version reduces the average cycles from 183 to 175.

Change-Id: I7c1bcdb0a830266e93d8347aed38120fb3be0e03
2017-02-01 14:47:53 -08:00