1184 Commits

Author SHA1 Message Date
Martin Storsjö
350cafd69a Silence warnings about conversions in MSVC
The kind of cast (static_cast vs C style cast) is different to
match the closely surrounding code.
2016-07-20 12:10:34 +03:00
HaiboZhu
1f770c488c Merge pull request #2531 from GuangweiWang/enable-disable-AVX2
add option for enable/disable AVX2
2016-07-20 13:49:31 +08:00
Guangwei Wang
7d00e8bc42 add option for enable/disable AVX2 2016-07-15 12:15:57 +08:00
Karina
ffe11835fc change FrameQP control and complexity calculation 2016-07-11 10:20:17 +08:00
Karina
35e073714d fix QP issue when adaptive quant turns on 2016-06-22 13:33:50 +08:00
Sindre Aamås
0f7b8365b9 [Encoder] Avoid valgrind downsampling false positives
X86 SIMD downsampling routines may, for convenience, read slightly beyond
the input data and into the alignment padding area beyond each line. This
causes valgrind to warn about uninitialized values even if these values
only affect lanes of a SIMD vector that are effectively never used.

Avoid these false positives by zero-initializing the padding area beyond
each line of the source buffer used for downsampling.
2016-06-16 21:19:17 +02:00
HaiboZhu
d35647ec3b Merge pull request #2491 from ruil2/nalsize
add nalsize checking UT and fix nalsize control when cabac on
2016-06-15 10:24:18 +08:00
HaiboZhu
151a7ff643 Merge pull request #2490 from sijchen/refactor_ref4
[Encoder] refactor: to avoid only use idx0 in syntax writing, for now it has no impact on bs
2016-06-15 10:23:38 +08:00
Karina
b5cef5d49c modify reserved nal header size and change source frame in NalSizeChecking UT 2016-06-08 10:12:27 +08:00
Karina
2171d84f1e add nalsize checking UT and fix nalsize control when cabac on 2016-06-03 17:36:14 +08:00
ruil2
3eba80765c Merge pull request #2487 from sijchen/refactor_ref31
[Encoder] Preprocess: refactor to improve code readability
2016-06-03 13:39:04 +08:00
Karina
4f41c3a5bf fix codingIdx update issue 2016-06-02 21:17:31 +08:00
sijchen@cisco.com
a7ae1efc3a add back the missing part after merging and formatting 2016-06-01 21:33:33 -07:00
sijchen@cisco.com
8bacc3d4d0 Preprocess: refactor to improve code readability 2016-06-01 21:26:24 -07:00
sijchen@cisco.com
8537a9274d fix a prob 2016-06-01 09:21:12 -07:00
sijchen@cisco.com
a9601cdc59 refactor to avoid only use idx0 in syntax writing, for now it has no impact on bs, may benefit future usage 2016-06-01 09:21:12 -07:00
Karina
268a0eb6f4 remove redundant initialization 2016-06-01 10:52:51 +08:00
HaiboZhu
515eeb41e4 Merge pull request #2481 from ruil2/maxbitrate1
fix iContinualSkipFrames calculation
2016-06-01 09:03:57 +08:00
ruil2
2d3fc37a07 Merge pull request #2484 from sijchen/refactor_preprocess13
[Encoder] Refactor: add class for diff preprocess strategy
2016-06-01 08:31:02 +08:00
Karina
87e81a7a40 use the same name to avoid confusing. 2016-06-01 08:21:03 +08:00
sijchen@cisco.com
03863ae4c6 different preprocess actually used diff source picture management 2016-05-31 14:36:21 -07:00
sijchen@cisco.com
a1cae49732 add class for diff preprocess strategy 2016-05-31 13:48:45 -07:00
Karina
dd021b6ca8 fix iContinualSkipFrames calculation 2016-05-31 21:01:11 +08:00
Karina
64ad70b0ea get the correct did for savc case 2016-05-31 17:35:20 +08:00
Karina
4fc2b1f636 refine RC 2016-05-31 16:44:04 +08:00
Karina
e3c306608c fix dependency ID mapping issue 2016-05-30 15:03:39 +08:00
HaiboZhu
c17a58efdf Merge pull request #2473 from ruil2/update_interface
modify the interface that use a independent subseqID for each layer
2016-05-25 10:00:13 +08:00
Karina
2ef9613e55 avoid overflow 2016-05-24 13:25:05 +08:00
ruil2
c96c8b05a8 Merge pull request #2468 from sijchen/refactor_pre
[Encoder] Refactor: create diff func for diff case to make logic clean
2016-05-23 13:21:40 +08:00
Karina
9b2dd55324 add GetBsPostion for cabac and cavlc 2016-05-20 14:29:48 +08:00
sijchen
27e803f6f4 refactor to make logic clean 2016-05-19 09:42:39 -07:00
Karina
ac37666cf1 modify the interface that use a independent subseqID for each layer 2016-05-19 17:17:17 +08:00
Karina
8a341070f2 fix overflow issue 2016-05-19 12:00:49 +08:00
sijchen
1ac02f3002 fix conflict with master 2016-05-18 10:57:39 -07:00
Karina
c298d66d48 fix temporal layer skip issue 2016-05-18 09:47:49 +08:00
Haibo Zhu
85f4beb9a8 Fix the wrong variable name which casue the build error 2016-05-17 13:46:04 +08:00
ruil2
0ec686f7ec Merge pull request #2452 from sijchen/refactor_sps2
Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class
2016-05-17 09:19:14 +08:00
sijchen
00747540fb move strategy related pointer to class 2016-05-16 10:55:13 -07:00
Karina
3b55d64902 fix crash when temporal layer is skipped, the frame should not be encoded 2016-05-16 14:43:13 +08:00
sijchen
ffb85046b4 Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class, to improve code readability 2016-05-04 15:06:02 -07:00
HaiboZhu
c30cc41261 Merge pull request #2448 from saamas/encoder-getnonzerocount-sse42
[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
2016-05-04 09:49:47 +08:00
ruil2
e9dc97803d Merge pull request #2447 from saamas/encoder-cavlcparamcal-sse42
[Encoder] Add an SSE4.2 implementation of CavlcParamCal
2016-04-28 09:08:44 +08:00
ruil2
7d65687284 Merge pull request #2441 from saamas/encoder-add-avx2-4x4-quantization-routines
[Encoder] Add AVX2 4x4 quantization routines
2016-04-28 09:08:31 +08:00
Sindre Aamås
fb0b2b3f41 [Encoder/x86] Drop unneeded LOAD_4_PARA in CavlcParamCal_sse42 2016-04-24 22:59:35 +02:00
Sindre Aamås
d1c7713191 [Encoder/x86] Minor CavlcParamCal_sse42 tweak
Do more elaborate register allocation to avoid a few mov instructions.
2016-04-24 22:36:23 +02:00
Sindre Aamås
f56bdc3aa4 [Encoder/x86] Minor CavlcParamCal_sse42 tweak
Avoid loading single-use parameter.
2016-04-21 16:29:02 +02:00
Sindre Aamås
2eb8800712 [Encoder/x86] Remove a leftover mov instruction in CavlcParamCal_sse42 2016-04-21 15:53:33 +02:00
Sindre Aamås
4645bd26aa [Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
Avoid touching some cache lines by using popcnt instead of table
lookups.

Also gives a speedup of ~1.4x on Haswell as compared with SSE2.
2016-04-20 19:10:24 +02:00
Sindre Aamås
3f31aff4dc [Encoder] Add an SSE4.2 implementation of CavlcParamCal
Use a combination of table lookups and pshufb to convert coefficients
to zero run/level format. Two 16-entry lookup tables are used for a
total of 192 bytes worth of tables. (The existing SSE2 version uses a
table of size 2048 bytes.)

Speedup is ~1.5x-3x as compared with the SSE2 version on Haswell (the
speedup is greater for input with many trailing zeros).

The use of popcnt makes it require SSE4.2. This can be replaced with
a small LUT and accumulation which would reduce the requirement to
SSSE3.
2016-04-20 18:37:08 +02:00
Sindre Aamås
502b16925e [UT] Add tests for CavlcParamCal_c and CavlcParamCal_sse2 2016-04-20 18:37:08 +02:00