openh264

Sindre Aamås 8a0af4a3f2 [Processing/x86] DyadicBilinearDownsample optimizations	2016-06-02 13:44:28 +02:00

Average vertically before horizontally; horizontal averaging is more costly. Doing the vertical averaging first reduces the number of horizontal averages by half.

Use pmaddubsw and pavgw to do the horizontal averaging for a slight performance improvement.

Minor tweaks.

Improve the SSSE3 dyadic downsample routines and drop the SSE4 routines. The non-temporal loads used in the SSE4 routines do nothing for cache-backed memory AFAIK.

Adjust tests because averaging vertically first gives slightly different output.

~2.39x speedup for the widthx32 routine on Haswell when not memory-bound.
~2.20x speedup for the widthx16 routine on Haswell when not memory-bound.

Note that the widthx16 routine can be unrolled for further speedup.
..
ProcessUT_AdaptiveQuantization.cpp	Add checks for cpu features in tests	2015-01-24 22:47:23 +02:00
ProcessUT_DownSample.cpp	[Processing/x86] DyadicBilinearDownsample optimizations	2016-06-02 13:44:28 +02:00
ProcessUT_ScrollDetection.cpp	rename namespace and function name to avoid conflicts with old library	2014-09-17 15:50:59 +08:00
ProcessUT_VaaCalc.cpp	[UT] Test VAA routines with a wider variety of resolutions	2016-04-11 16:40:36 +02:00
targets.mk	improve py, and change mk according to mk	2014-09-12 10:25:46 +08:00