sijchen
27e803f6f4
refactor to make logic clean
2016-05-19 09:42:39 -07:00
Karina
8a341070f2
fix overflow issue
2016-05-19 12:00:49 +08:00
sijchen
1ac02f3002
fix conflict with master
2016-05-18 10:57:39 -07:00
Karina
c298d66d48
fix temporal layer skip issue
2016-05-18 09:47:49 +08:00
Haibo Zhu
85f4beb9a8
Fix the wrong variable name which causes the build error
2016-05-17 13:46:04 +08:00
ruil2
0ec686f7ec
Merge pull request #2452 from sijchen/refactor_sps2
...
Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class
2016-05-17 09:19:14 +08:00
sijchen
00747540fb
move strategy related pointer to class
2016-05-16 10:55:13 -07:00
Karina
3b55d64902
fix crash when temporal layer is skipped, the frame should not be encoded
2016-05-16 14:43:13 +08:00
sijchen
ffb85046b4
Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class, to improve code readability
2016-05-04 15:06:02 -07:00
HaiboZhu
c30cc41261
Merge pull request #2448 from saamas/encoder-getnonzerocount-sse42
...
[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
2016-05-04 09:49:47 +08:00
ruil2
e9dc97803d
Merge pull request #2447 from saamas/encoder-cavlcparamcal-sse42
...
[Encoder] Add an SSE4.2 implementation of CavlcParamCal
2016-04-28 09:08:44 +08:00
ruil2
7d65687284
Merge pull request #2441 from saamas/encoder-add-avx2-4x4-quantization-routines
...
[Encoder] Add AVX2 4x4 quantization routines
2016-04-28 09:08:31 +08:00
Sindre Aamås
fb0b2b3f41
[Encoder/x86] Drop unneeded LOAD_4_PARA in CavlcParamCal_sse42
2016-04-24 22:59:35 +02:00
Sindre Aamås
d1c7713191
[Encoder/x86] Minor CavlcParamCal_sse42 tweak
...
Do more elaborate register allocation to avoid a few mov instructions.
2016-04-24 22:36:23 +02:00
Sindre Aamås
f56bdc3aa4
[Encoder/x86] Minor CavlcParamCal_sse42 tweak
...
Avoid loading single-use parameter.
2016-04-21 16:29:02 +02:00
Sindre Aamås
2eb8800712
[Encoder/x86] Remove a leftover mov instruction in CavlcParamCal_sse42
2016-04-21 15:53:33 +02:00
Sindre Aamås
4645bd26aa
[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount
...
Avoid touching some cache lines by using popcnt instead of table
lookups.
Also gives a speedup of ~1.4x on Haswell as compared with SSE2.
2016-04-20 19:10:24 +02:00
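The idea in the commit above, counting non-zero coefficients with a population count rather than a lookup table, can be sketched in portable C++. This is only an illustration under stated assumptions: the helper names (`NonZeroMask16`, `CountViaTable`, `CountViaPopcount`) are hypothetical, the block size of 16 coefficients is assumed, and the real routine is x86 assembly whose details are not shown in this log.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical helper: bit i of the result is set when coeffs[i] != 0.
// A 4x4 block of 16 coefficients is assumed for this sketch.
static uint32_t NonZeroMask16(const int16_t* coeffs) {
  assert(coeffs != nullptr);
  uint32_t mask = 0;
  for (int i = 0; i < 16; ++i) {
    mask |= (coeffs[i] != 0 ? 1u : 0u) << i;
  }
  return mask;
}

// Table-based count: every call reads from a lookup table in memory.
static int CountViaTable(uint32_t mask) {
  static const uint8_t kNibbleBits[16] = {0, 1, 1, 2, 1, 2, 2, 3,
                                          1, 2, 2, 3, 2, 3, 3, 4};
  return kNibbleBits[mask & 0xF] + kNibbleBits[(mask >> 4) & 0xF] +
         kNibbleBits[(mask >> 8) & 0xF] + kNibbleBits[(mask >> 12) & 0xF];
}

// Population-count variant: no table is read. Compilers can typically lower
// this to a single popcnt instruction when the target supports it.
static int CountViaPopcount(uint32_t mask) {
  return static_cast<int>(std::bitset<16>(mask).count());
}
```

Both functions return the same count; the second avoids the table reads, which is the cache-line argument made in the commit message.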
Sindre Aamås
3f31aff4dc
[Encoder] Add an SSE4.2 implementation of CavlcParamCal
...
Use a combination of table lookups and pshufb to convert coefficients
to zero run/level format. Two 16-entry lookup tables are used for a
total of 192 bytes worth of tables. (The existing SSE2 version uses a
table of size 2048 bytes.)
Speedup is ~1.5x-3x as compared with the SSE2 version on Haswell (the
speedup is greater for input with many trailing zeros).
The use of popcnt makes it require SSE4.2. This can be replaced with
a small LUT and accumulation which would reduce the requirement to
SSSE3.
2016-04-20 18:37:08 +02:00
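For readers unfamiliar with the zero run/level format mentioned above, here is a minimal scalar sketch of the conversion. It is not the SSE4.2 routine and not the project's C reference; the `RunLevel` struct and `ToRunLevel` function are hypothetical names, and the reverse scan order (last coefficient first) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct RunLevel {
  std::vector<int16_t> levels;  // non-zero coefficients, last one first
  std::vector<uint8_t> runs;    // zeros preceding each level in scan order
  int total_zeros = 0;          // sum of all runs
};

// Converts coeffs[0..last_index] to run/level form, scanning backwards.
RunLevel ToRunLevel(const int16_t* coeffs, int last_index) {
  assert(coeffs != nullptr || last_index < 0);
  RunLevel out;
  int i = last_index;
  // Trailing zeros after the last non-zero coefficient are not coded.
  while (i >= 0 && coeffs[i] == 0) --i;
  while (i >= 0) {
    out.levels.push_back(coeffs[i--]);
    int run = 0;
    // Count the zeros that sit directly before this coefficient.
    while (i >= 0 && coeffs[i] == 0) {
      ++run;
      --i;
    }
    out.runs.push_back(static_cast<uint8_t>(run));
    out.total_zeros += run;
  }
  return out;
}
```

The first loop only skips trailing zeros, so the scalar version does less work on inputs with many trailing zeros; the vectorized routine described above reports its largest speedup on that same kind of input.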
Sindre Aamås
502b16925e
[UT] Add tests for CavlcParamCal_c and CavlcParamCal_sse2
2016-04-20 18:37:08 +02:00
HaiboZhu
3b68840d5f
Merge pull request #2444 from GuangweiWang/fix-assembly-arm64
...
Fix assembly arm64
Code review at: https://rbcommons.com/s/OpenH264/r/1594/
2016-04-20 10:03:01 +08:00
Guangwei Wang
cc407b4b21
fix code style
2016-04-17 19:47:55 +08:00
Guangwei Wang
0b8cdcaff8
extend 32-bit parameters to 64-bit in arm64 assembly functions
2016-04-17 19:41:57 +08:00
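Some background on why such an extension is needed, as a hedged sketch: under the AArch64 procedure call standard, a 32-bit argument arrives in a 64-bit register whose upper 32 bits are not guaranteed to hold anything meaningful, so hand-written assembly has to sign-extend it before using it as a 64-bit offset. The C++ below only models that situation; `SignExtend32` is a hypothetical helper, and the actual fix in the assembly sources is not shown in this log.

```cpp
#include <cstdint>

// Models a 64-bit register that carries a 32-bit signed argument in its low
// half. The upper half is treated as garbage and ignored.
int64_t SignExtend32(uint64_t reg) {
  const uint32_t low = static_cast<uint32_t>(reg);  // keep the low 32 bits
  // Two's-complement sign extension written without implementation-defined
  // narrowing casts.
  return (low & 0x80000000u)
             ? static_cast<int64_t>(low) - (int64_t{1} << 32)
             : static_cast<int64_t>(low);
}
```

Using the raw register value as a stride or offset without this step would pick up the garbage upper bits, which is the class of bug the commit title points at.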
Karina
dd340b7fe7
modify neon comment
2016-04-14 14:49:11 +08:00
Karina
d34e209266
fix 32-bit parameters issue on arm64 assembly function
2016-04-13 19:30:08 +08:00
Sindre Aamås
bb49e23719
[Encoder] Add AVX2 4x4 quantization routines
...
WelsQuantFour4x4Max_avx2 (~2.06x speedup over SSE2)
WelsQuantFour4x4_avx2 (~2.32x speedup over SSE2)
WelsQuant4x4Dc_avx2 (~1.49x speedup over SSE2)
WelsQuant4x4_avx2 (~1.42x speedup over SSE2)
2016-04-13 11:56:47 +02:00
Karina
7e14400b0b
refine the workflow for encode one frame
2016-03-31 16:58:20 +08:00
Karina
3927d91b85
fix temporal layer issue when output frame rate is different from input frame rate
2016-03-29 15:48:06 +08:00
Karina
10dfb2670b
fix skip frames statistics issue
2016-03-24 14:17:47 +08:00
sijchen
47d310539f
Squashed commit of the following:
...
commit c8111942e07437034a74b33887c33b5ad78e476a
Author: Karina <ruil2@cisco.com>
Date: Wed Mar 23 14:31:18 2016 +0800
update SHA table
commit f36a25344c25a131581dcbcd2d103fc4b131012e
Author: Karina <ruil2@cisco.com>
Date: Wed Mar 23 13:45:58 2016 +0800
fix bitrate overflow issue when adaptive quality turns on
2016-03-23 10:23:33 -07:00
Forrest Shi
47ad929c25
fix bug for debug mode
2016-03-23 11:14:40 +08:00
huade
a7a5b7b0f4
refactor for slice buffer init/allocate/free
2016-03-22 13:51:20 +08:00
sijchen
33bb96f604
Merge pull request #2420 from sijchen/fix_sps
...
[Encoder] fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on
2016-03-21 21:51:07 -07:00
sijchen
8103988cde
Merge pull request #2418 from ruil2/refine_init
...
fix preprocessing initialization logic
2016-03-21 11:32:24 -07:00
Karina
228cdeba1b
refine reset function
2016-03-21 10:48:41 +08:00
Karina
7c15d68e24
fix preprocessing initialization logic
2016-03-18 16:43:11 +08:00
Karina
316ab31882
fix bitrate update issue
2016-03-18 14:28:32 +08:00
zhilwang
d7570bfa52
Merge pull request #2401 from saamas/decoder-use-encoder-x86-idct-routines
...
[Decoder] Use encoder x86 IDCT routines
2016-03-18 08:50:33 +08:00
Haibo Zhu
43f767d06e
Disable assert in release with -DNDEBUG macro
...
Update the code to avoid the unused-function warning
2016-03-17 11:24:01 +08:00
sijchen
90deb80b50
rename the functions
2016-03-14 21:41:08 -07:00
sijchen
c009183e97
fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on
2016-03-14 11:28:44 -07:00
Karina
f84f2315ab
change downsampling logic so that the downsampling source is the nearest layer instead of the highest layer
2016-03-14 09:55:36 +08:00
Sindre Aamås
98042f1600
[Decoder] Use encoder x86 IDCT routines
...
Move asm routines to common. Delete obsolete decoder routines.
Use wider routines where applicable.
~1.07x overall faster decode on a quick 720p30 4Mbps test on Haswell.
2016-03-09 10:41:42 +01:00
Sindre Aamås
48a520915a
[Encoder/x86] Add AVX2 SATD routines
...
WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell).
WelsSampleSatd16x8_avx2 (~2.19x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x16_avx2 (~1.68x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x8_avx2 (~1.53x speedup over SSE4.1 on Haswell).
2016-03-08 11:31:17 +01:00
HaiboZhu
d9bfc9204b
Merge pull request #2394 from sijchen/th021
...
[Common] remove sink in WelsThreadPool and hide the constructor to finish the s…
2016-03-08 16:29:40 +08:00
Karina
fee9d502bb
format update and fix build issue when the STAT_OUTPUT macro is turned on
2016-03-04 13:55:14 +08:00
sijchen
4db9c32976
remove sink in WelsThreadPool and hide the constructor to finish the singleton
2016-03-02 17:08:09 -08:00
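Hiding the constructor is the usual last step in making a class a singleton. A generic sketch follows, assuming nothing about the real class: `ThreadPoolSketch`, `GetInstance`, and `max_threads` are hypothetical names, and the project's own thread pool may manage its instance differently (for example with explicit reference counting).

```cpp
// Generic singleton sketch; not the project's thread pool class.
class ThreadPoolSketch {
 public:
  // The only way to reach the instance. Initialization of a function-local
  // static is thread-safe since C++11.
  static ThreadPoolSketch& GetInstance() {
    static ThreadPoolSketch instance;
    return instance;
  }

  // Copying would create a second instance, so it is disabled.
  ThreadPoolSketch(const ThreadPoolSketch&) = delete;
  ThreadPoolSketch& operator=(const ThreadPoolSketch&) = delete;

  int max_threads() const { return max_threads_; }

 private:
  // Hidden constructor: callers cannot create their own pool.
  ThreadPoolSketch() : max_threads_(4) {}  // 4 is an arbitrary placeholder

  int max_threads_;
};
```

With the constructor private, every caller shares the same object, so `&ThreadPoolSketch::GetInstance()` always yields the same address.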
sijchen
d4f09d9048
make CWelsThreadPool a singleton for future usage (including adding a sink for IWelsTask)
2016-02-29 11:40:25 -08:00
Martin Storsjö
69e3fac093
Avoid reading iCountMbNumInSlice out of bounds on slice realloc
...
Prior to 7bcb3ba4f4abf18a,
pCurLayer->sLayerInfo.pSliceInLayer[uiSliceIdx].iCountMbNumInSlice
was read after setting pCurLayer->sLayerInfo.pSliceInLayer to
the newly allocated, larger array. After this commit, it is read
before the array has been switched, and thus is read from the
old array (which only holds elements up to iMaxSliceNumOld, not
up to iMaxSliceNum).
This fixes reads out of bounds, and crashes in the test suite.
2016-02-25 10:31:58 +02:00
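The hazard described above is a general one when growing an array: once the capacity has been raised, indices in the new range must not be used to read from the old storage. A minimal sketch, assuming a simplified slice record; `SliceSketch` and `GrowSlices` are hypothetical and do not mirror the encoder's actual structures or the exact fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-slice record.
struct SliceSketch {
  int mb_count;
};

// Grows the slice array. Only indices below the OLD size are read from the
// old storage; the new tail is default-initialized instead.
std::vector<SliceSketch> GrowSlices(const std::vector<SliceSketch>& old_slices,
                                    std::size_t new_count) {
  assert(new_count >= old_slices.size());
  std::vector<SliceSketch> grown(new_count, SliceSketch{0});
  const std::size_t copy_count = std::min(old_slices.size(), new_count);
  for (std::size_t i = 0; i < copy_count; ++i) {
    grown[i] = old_slices[i];  // i < old size, so this read is in bounds
  }
  return grown;
}
```

Looping up to `new_count` while still reading `old_slices[i]` would reproduce the kind of out-of-bounds read the commit message describes.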
HaiboZhu
040974f735
Merge pull request #2378 from shihuade/MultiThread_V4.9_V5
...
add thread-based slice buffer and refactor reallocate process
2016-02-25 14:40:56 +08:00
HaiboZhu
321c772536
Merge pull request #2372 from ruil2/refine_trace
...
update trace for ENCODER_OPTION_TRACE_CALLBACK
2016-02-25 10:50:12 +08:00