Commit Graph

6 Commits

Author SHA1 Message Date
Sindre Aamås
fe4a47a979 [UT] Add comment on X86_ASM checksum ifdef 2016-06-08 21:53:30 +02:00
Sindre Aamås
8a0af4a3f2 [Processing/x86] DyadicBilinearDownsample optimizations
Average vertically before horizontally; horizontal averaging is more
worksome. Doing the vertical averaging first reduces the number of
horizontal averages by half.

Use pmaddubsw and pavgw to do the horizontal averaging for a slight
performance improvement.

Minor tweaks.

Improve the SSSE3 dyadic downsample routines and drop the SSE4 routines.
The non-temporal loads used in the SSE4 routines do nothing for cache-
backed memory AFAIK.

Adjust tests because averaging vertically first gives slightly different
output.

~2.39x speedup for the widthx32 routine on Haswell when not memory-bound.
~2.20x speedup for the widthx16 routine on Haswell when not memory-bound.

Note that the widthx16 routine can be unrolled for further speedup.
2016-06-02 13:44:28 +02:00
huili2
a8d9576297 Merge pull request #2405 from HaiboZhu/Fix_UT_decoder_init_fail
Fix the decoder init failed case in UT
2016-03-16 16:28:14 +08:00
Haibo Zhu
46f42ec5f3 Fix the decoder init failed case in UT 2016-03-14 17:06:58 +08:00
Karina
f84f2315ab change downsampling logic that downsampling source is from the nearest layer instead of the highest layer 2016-03-14 09:55:36 +08:00
sijchen
46667588e3 moving test cases to specific files to avoid the too long encode_decode_api_test.cpp 2015-11-30 10:47:10 -08:00