670 Commits

Author SHA1 Message Date
sijchen
aaa25160ec Merge pull request #2353 from saamas/encoder-x86-dct-opt2
[Encoder] x86 DCT optimizations
2016-02-08 15:00:12 -08:00
sijchen
e5e7013b73 Merge pull request #2350 from sijchen/th00
[Common] Add sink to IWelsTask
2016-02-08 14:59:38 -08:00
Sindre Aamås
c8c74903f8 [Encoder] Add single-block AVX2 4x4 DCT/IDCT routines
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.

~3.15x speedup over MMX for the DCT on Haswell.
~2.94x speedup over MMX for the IDCT on Haswell.

Returns diminish with increasing vector length because a larger
proportion of the time is spent on load/store/shuffling.
2016-02-02 17:22:49 +01:00
Sindre Aamås
f90960983c [Encoder] Add single-block SSE2 4x4 DCT/IDCT routines
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.

~2.31x speedup over MMX for the DCT on Haswell.
~1.92x speedup over MMX for the IDCT on Haswell.
2016-02-02 17:22:48 +01:00
unknown
3873addc3d fix frame size constraints for width and height 2016-02-01 15:55:53 +08:00
HaiboZhu
1030820ec4 Merge pull request #2342 from sijchen/enh_ut_tem
[UT] correct and enhance the ut template and trace improvement
2016-02-01 09:08:05 +08:00
sijchen
47e3f4c45c correct and enhance the ut template 2016-01-19 17:16:39 -08:00
Sindre Aamås
cc8d541432 [UT] Utilize DCT function pointer typedefs 2016-01-19 22:00:24 +01:00
Sindre Aamås
a45c10cf91 [UT] Only run AVX2 tests if host supports AVX2 2016-01-19 14:27:46 +01:00
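The guard described in this commit can be sketched as follows. This is a minimal illustration, not the project's test code: `HostSupportsAvx2` and `RunIfAvx2` are hypothetical names, and the check uses the GCC/Clang builtin `__builtin_cpu_supports` in place of whatever CPU detection the test suite actually uses. On other compilers or non-x86 targets the sketch conservatively reports no support.

```cpp
#include <cstdio>
#include <utility>

// Hypothetical helper: true when the host CPU reports AVX2 support.
// Assumption: GCC or Clang on x86; anything else falls back to false.
bool HostSupportsAvx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("avx2") != 0;
#else
  return false;
#endif
}

// Hypothetical wrapper: runs the test body only on AVX2-capable hosts.
// Returns true if the body ran, false if it was skipped.
template <typename Fn>
bool RunIfAvx2(Fn&& testBody) {
  if (!HostSupportsAvx2()) {
    std::puts("AVX2 not supported on this host, skipping test");
    return false;
  }
  std::forward<Fn>(testBody)();
  return true;
}
```

Skipping rather than failing keeps the suite usable on older hosts, where calling an AVX2 routine would raise an illegal-instruction fault.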
Sindre Aamås
3088d96978 [Encoder] Add an AVX2 4x4 IDCT implementation
~2.03x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Sindre Aamås
b267163f10 [Encoder] Add an AVX2 4x4 DCT implementation
~2.52x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Sindre Aamås
b9adbcf37c [UT] Add missing SSE2 4x4 IDCT test
IDCT input is defined in such a way that the intermediate values
cannot legally overflow an int16_t. The use of random values
as input causes such overflows. This results in implementation-
dependent output depending on which type is used to hold
intermediate results. Use a template for the test reference
implementation to test implementations with different
intermediate representation.
2016-01-19 13:12:28 +01:00
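To illustrate why the reference is templated, here is a minimal sketch of a 4x4 inverse transform whose intermediates are held in a caller-chosen type `T`. It follows the standard H.264 inverse core transform (butterflies, `(x + 32) >> 6` rounding, add to prediction, clip to 8 bits). The name `InverseDct4x4Ref`, the row-major coefficient layout, and the signature are assumptions for illustration, not the project's test code.

```cpp
#include <cstdint>

// Hypothetical reference IDCT. T is the type used to hold intermediates,
// e.g. int16_t to mirror 16-bit SIMD lanes or int32_t to mirror a scalar
// implementation. Out-of-range conversions to T wrap modulo 2^N (guaranteed
// since C++20, and the behaviour of mainstream compilers before that).
template <typename T>
void InverseDct4x4Ref(uint8_t* dst, int stride, const int16_t coeff[16]) {
  T tmp[16];
  // Horizontal pass over each row of coefficients.
  for (int y = 0; y < 4; ++y) {
    const int16_t* c = coeff + 4 * y;
    const T e0 = static_cast<T>(c[0] + c[2]);
    const T e1 = static_cast<T>(c[0] - c[2]);
    const T e2 = static_cast<T>((c[1] >> 1) - c[3]);
    const T e3 = static_cast<T>(c[1] + (c[3] >> 1));
    tmp[4 * y + 0] = static_cast<T>(e0 + e3);
    tmp[4 * y + 1] = static_cast<T>(e1 + e2);
    tmp[4 * y + 2] = static_cast<T>(e1 - e2);
    tmp[4 * y + 3] = static_cast<T>(e0 - e3);
  }
  // Vertical pass, then round, add to the prediction in dst, and clip.
  for (int x = 0; x < 4; ++x) {
    const T e0 = static_cast<T>(tmp[x] + tmp[8 + x]);
    const T e1 = static_cast<T>(tmp[x] - tmp[8 + x]);
    const T e2 = static_cast<T>((tmp[4 + x] >> 1) - tmp[12 + x]);
    const T e3 = static_cast<T>(tmp[4 + x] + (tmp[12 + x] >> 1));
    const T r[4] = {static_cast<T>(e0 + e3), static_cast<T>(e1 + e2),
                    static_cast<T>(e1 - e2), static_cast<T>(e0 - e3)};
    for (int y = 0; y < 4; ++y) {
      const int v = dst[y * stride + x] + (static_cast<T>(r[y] + 32) >> 6);
      dst[y * stride + x] = static_cast<uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
  }
}
```

For in-range input, `InverseDct4x4Ref<int16_t>` and `InverseDct4x4Ref<int32_t>` agree. For out-of-range input such as random coefficients they can diverge, so each implementation under test would be compared against the instantiation that matches its intermediate width.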
Sindre Aamås
8764231784 [UT] Improve DCT tests
Initialize input arrays with different random values.

Otherwise, the input to the DCT routines is effectively
all zero values after taking the difference.

Reduce duplication.
2016-01-19 13:12:28 +01:00
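The reason identical inputs make a weak test can be seen in a scalar sketch of the transform. The routine below applies the standard H.264 4x4 forward core transform to the difference of two pixel blocks. The name `ForwardDct4x4Diff`, the signature, and the row-major output layout are assumptions for illustration and need not match the project's routines.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference: 4x4 forward core transform of (a - b).
void ForwardDct4x4Diff(int16_t out[16],
                       const uint8_t* a, int strideA,
                       const uint8_t* b, int strideB) {
  assert(strideA >= 4 && strideB >= 4);
  int tmp[16];
  // Horizontal pass: take the residual of each row and transform it.
  for (int y = 0; y < 4; ++y) {
    int d[4];
    for (int x = 0; x < 4; ++x)
      d[x] = a[y * strideA + x] - b[y * strideB + x];
    const int s03 = d[0] + d[3], d03 = d[0] - d[3];
    const int s12 = d[1] + d[2], d12 = d[1] - d[2];
    tmp[4 * y + 0] = s03 + s12;
    tmp[4 * y + 1] = 2 * d03 + d12;
    tmp[4 * y + 2] = s03 - s12;
    tmp[4 * y + 3] = d03 - 2 * d12;
  }
  // Vertical pass: the same butterfly applied down each column.
  for (int x = 0; x < 4; ++x) {
    const int s03 = tmp[x] + tmp[12 + x], d03 = tmp[x] - tmp[12 + x];
    const int s12 = tmp[4 + x] + tmp[8 + x], d12 = tmp[4 + x] - tmp[8 + x];
    out[x]      = static_cast<int16_t>(s03 + s12);
    out[4 + x]  = static_cast<int16_t>(2 * d03 + d12);
    out[8 + x]  = static_cast<int16_t>(s03 - s12);
    out[12 + x] = static_cast<int16_t>(d03 - 2 * d12);
  }
}
```

When `a` and `b` hold the same values, every residual is zero and so is every coefficient, which means almost any implementation would pass. Filling the two inputs with different random values exercises the butterflies with nonzero data.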
sijchen
d46cd07511 fix the problem that occurs when the task uID is too big 2016-01-15 16:06:09 -08:00
sijchen
5eb18b101e change the output way of debug trace 2016-01-13 22:13:43 -08:00
Karina
0f0d54ef51 using independent encoder control logic for SAVC case 2016-01-14 09:16:12 +08:00
sijchen
cce1c29844 add sink to IWelsTask (for further enhancements) 2016-01-13 16:24:54 -08:00
sijchen
5cad0f9bba enhance a UT to cover more cases 2016-01-11 22:01:02 -08:00
huade
0f24b80af8 remove one test sequence and set the number of travis jobs to 4 to reduce test time 2015-12-14 17:18:21 +08:00
HaiboZhu
ee01b3afaf Merge pull request #2307 from huili2/fix_decstat
fix iAvgLumaQp in decStat
2015-12-14 10:26:16 +08:00
huili2
b2d4a95537 fix iAvgLumaQp in decStat 2015-12-11 14:14:42 +08:00
sijchen
0c820f4c06 adjust encoder test case to cover multi-thread without loadbalancing 2015-12-09 09:58:03 -08:00
sijchen
76ca56498a Add tasks and thread pool call for SM_SIZELIMITED_SLICE mode 2015-12-09 09:55:04 -08:00
HaiboZhu
7e9fdc181f Merge pull request #2301 from huili2/simple_parseonly_ctx
remove parseonly in decoder ctx
2015-12-09 10:33:29 +08:00
sijchen
f38d24f036 fix the conflict with the current master 2015-11-30 23:42:26 -08:00
Guangwei Wang
c917d09263 fix bug in UT code 2015-12-01 08:55:00 +08:00
sijchen
420778f4d8 add valid adjustment in test to avoid outputting warning trace 2015-11-30 11:33:13 -08:00
sijchen
42ac53b5fc update win UT project after UT structure change 2015-11-30 11:29:47 -08:00
sijchen
46667588e3 move test cases to specific files to avoid an overly long encode_decode_api_test.cpp 2015-11-30 10:47:10 -08:00
huili2
926fc67451 remove parseonly in decoder ctx 2015-11-27 08:56:20 +08:00
sijchen
05c89b75f0 remove duplicated operation after thread pool and rename a task for clearer meaning 2015-11-25 13:46:21 -08:00
sijchen
67dab5d70e Merge pull request #2266 from sijchen/ut0
[UT] put class notification to header file
2015-11-25 09:57:43 -08:00
HaiboZhu
404315ab19 Merge pull request #2270 from huili2/parseonly_api_bugfix
disable wrongly calling for parseonly related
2015-11-25 09:00:54 +08:00
huili2
9fade10d77 disable wrongly calling for parseonly related 2015-11-24 11:11:27 +08:00
sijchen
5d03a8a692 put class notification to header file 2015-11-23 15:55:24 -08:00
sijchen
f3c4b878ff update the usage of flag and MD5 value 2015-11-23 11:54:43 -08:00
Martin Storsjö
eaf4798119 Readd a test for GetOption in TestInitUninit
In dc2cbe4, the previous test for GetOption that succeeds when the
decoder is initialized was removed. Add a GetOption call for a different
option, now that DECODER_OPTION_DATAFORMAT is removed.
2015-11-20 00:17:43 +02:00
Martin Storsjö
b3b083c883 Fully initialize m_sDecParam in TestInitUninit
Before dc2cbe4, the DecoderConfigParam function returned early
since DecoderSetCsp signaled a failure, which is why the uninitialized
parameters weren't read before.

This fixes valgrind warnings about conditional jumps depending on
uninitialized values.
2015-11-20 00:13:42 +02:00
huili2
dc2cbe4a22 remove API data format in decoder in 1.6 2015-11-17 13:58:57 +08:00
sijchen
b5d890c1ea Merge pull request #2224 from sijchen/thp73
[Encoder] put the logic related to multiple D layer into a class …
2015-11-13 11:57:07 -08:00
Haibo Zhu
628befe8be Revert "Merge pull request #2217 from huili2/simply_dec_ctx"
This reverts commit 27172bafd7ff2cc80b08768a32a23470f3d6d3fd, reversing
changes made to 24916a652ee5d3e36d931c222df20966f7c158fa.
2015-11-13 20:16:03 +08:00
Karina
7c1fbad53a fix crash 2015-11-13 17:16:26 +08:00
sijchen
e508c86dac fix the missing loadbalancing part 2015-11-12 13:15:07 -08:00
sijchen
aeb5ab4b99 [Encoder] put the logic related to multiple D layer into a class for better structure 2015-11-11 22:55:16 -08:00
HaiboZhu
1a2606f45d Merge pull request #2219 from sijchen/api3
[Encoder] change API for slicing part for easier usage
2015-11-11 09:19:03 +08:00
HaiboZhu
27172bafd7 Merge pull request #2217 from huili2/simply_dec_ctx
remove bParseonly in ctx using that in param, and slightly modify the…
2015-11-11 09:18:04 +08:00
sijchen
33c378f7b7 change API for slicing part for easier usage (the UseLoadBalancing flag is still a work in progress) 2015-11-10 09:50:06 -08:00
Karina
e20ce63778 do GOM rate control for I frame 2015-11-06 16:08:52 +08:00
sijchen
59779539e7 add autolock in ThreadPoolTest to avoid possible conflict 2015-11-04 10:29:08 -08:00
HaiboZhu
f13f502203 Merge pull request #2208 from sijchen/fixslc
[Encoder] Fix for a slicing and multi-threading setting
2015-11-04 09:24:08 +08:00