sijchen
41b4ecb06b
Avoid memory problem when mem alloc failed during initializing pRefList
2016-02-17 09:52:30 -08:00
sijchen
4b97dcb367
avoid memory problem when mem alloc failed during initializing pRefList
2016-02-16 10:05:49 -08:00
Karina
18728a4876
trace cleanup
2016-02-16 10:52:37 +08:00
ruil2
a26955e444
Merge pull request #2358 from sijchen/fix_free2
...
[Encoder] avoid memory problem if mem alloc failed in the middle of InitDqLayer
2016-02-16 10:47:23 +08:00
sijchen
855d1cf8c2
add input parameter check as protection for an encoder interface
2016-02-15 11:54:51 -08:00
sijchen
b76a79c726
move the rc free to the correct condition to avoid access to invalid memory
2016-02-15 10:13:50 -08:00
sijchen
025500d5aa
move the assigning m_uiSpatialPicNum earlier to cover the memory leak if error in allocating pic
2016-02-15 10:13:23 -08:00
sijchen
36722c553b
use WelsMallocz instead of WelsMalloc to avoid non-null pointer at init
2016-02-15 10:12:44 -08:00
sijchen
71aa533038
move the printing of the MEMORY_CHECK part to a more reasonable place
2016-02-15 10:12:34 -08:00
sijchen
6a0f0811ae
use WelsUninitEncoderExt in all free process in WelsInitEncoderExt
2016-02-15 10:06:43 -08:00
sijchen
408b7cad17
use WelsUninitEncoderExt rather than FreeMemorySvc which correctly deals with release of vpp memory
2016-02-15 10:04:52 -08:00
Sindre Aamås
e96a7b5c92
[Common/x86] DeblockChromaEq4V_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~2.07x speedup on Haswell (x86-64).
~2.12x speedup on Haswell (x86 32-bit).
2016-02-15 02:08:03 +01:00
Sindre Aamås
fc16010583
[Common/x86] DeblockChromaLt4V_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~2.68x speedup on Haswell (x86-64).
~2.38x speedup on Haswell (x86 32-bit).
2016-02-15 02:07:25 +01:00
Sindre Aamås
62fb37d096
[Common/x86] DeblockLumaEq4_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Minimize spills.
~2.31x speedup on Haswell (x86-64).
~2.40x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:39 +01:00
Sindre Aamås
732e1c5f78
[Common/x86] DeblockLumaLt4_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~1.97x speedup on Haswell (x86-64).
~3.09x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:18 +01:00
sijchen
2b9a250fbd
include the free-ing of pointer into FreeDqLayer
2016-02-12 16:23:57 -08:00
sijchen
a1a3873a62
improve the code structure
2016-02-10 22:25:41 -08:00
sijchen
43fdf74fa6
fix a missing assignment and remove an unused line
2016-02-10 21:54:53 -08:00
sijchen
914302a462
avoid memory problem if mem alloc failed in the middle of InitDqLayer
2016-02-10 21:54:53 -08:00
sijchen
aaa25160ec
Merge pull request #2353 from saamas/encoder-x86-dct-opt2
...
[Encoder] x86 DCT optimizations
2016-02-08 15:00:12 -08:00
sijchen
e5e7013b73
Merge pull request #2350 from sijchen/th00
...
[Common] Add sink to IWelsTask
2016-02-08 14:59:38 -08:00
HaiboZhu
ad9ca3824f
Merge pull request #2354 from ruil2/remove_trace
...
fix error width and height issue
2016-02-04 12:00:20 +08:00
Karina
ae508b9724
fix error width and height issue
2016-02-04 10:25:03 +08:00
sijchen
f5fd7420a9
Merge pull request #2351 from huili2/fix_width_height_enc_constraint
...
fix frame size constraints for width and height
2016-02-02 16:31:05 -08:00
Sindre Aamås
c8c74903f8
[Encoder] Add single-block AVX2 4x4 DCT/IDCT routines
...
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.
~3.15x speedup over MMX for the DCT on Haswell.
~2.94x speedup over MMX for the IDCT on Haswell.
Returns diminish with increasing vector length because a larger
proportion of the time is spent on load/store/shuffling.
2016-02-02 17:22:49 +01:00
Sindre Aamås
f90960983c
[Encoder] Add single-block SSE2 4x4 DCT/IDCT routines
...
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.
~2.31x speedup over MMX for the DCT on Haswell.
~1.92x speedup over MMX for the IDCT on Haswell.
2016-02-02 17:22:48 +01:00
Sindre Aamås
7486de2844
[Encoder] AVX2 DCT tweaks
...
Do some shuffling in load/store unpack/pack to save some
work in horizontal DCTs.
Use a few 128-bit broadcasts to compact data vectors a bit.
~1.04x speedup for the DCT case on Haswell.
~1.12x speedup for the IDCT case on Haswell.
2016-02-02 17:22:48 +01:00
Karina
2d4cbcf060
remove trace
2016-02-02 17:34:59 +08:00
unknown
3873addc3d
fix frame size constraints for width and height
2016-02-01 15:55:53 +08:00
HaiboZhu
1030820ec4
Merge pull request #2342 from sijchen/enh_ut_tem
...
[UT] correct and enhance the ut template and trace improvement
2016-02-01 09:08:05 +08:00
sijchen
ef329e33c3
add simulcastAvc setting in setting trace
2016-01-20 14:24:16 -08:00
Sindre Aamås
e22d731f26
[Encoder] yasm-compatible vinserti128 syntax in DCT asm
2016-01-19 21:48:23 +01:00
Sindre Aamås
144ff0fd51
[Encoder] SSE2 4x4 IDCT optimizations
...
Use a combination of instruction types that distributes more
evenly across execution ports on common architectures.
Do the horizontal IDCT without transposing back and forth.
Minor tweaks.
~1.14x faster on Haswell. Should be faster on other architectures
as well.
2016-01-19 13:12:29 +01:00
Sindre Aamås
991e344d8c
[Encoder] SSE2 4x4 DCT optimizations
...
Use a combination of instruction types that distributes more
evenly across execution ports on common architectures.
Do the horizontal DCT without transposing back and forth.
Minor tweaks.
~1.54x faster on Haswell. Should be faster on other architectures
as well.
2016-01-19 13:12:28 +01:00
Sindre Aamås
3088d96978
[Encoder] Add an AVX2 4x4 IDCT implementation
...
~2.03x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
Sindre Aamås
b267163f10
[Encoder] Add an AVX2 4x4 DCT implementation
...
~2.52x faster on Haswell as compared to the SSE2 version.
2016-01-19 13:12:28 +01:00
HaiboZhu
8eb4de10a2
Merge pull request #2337 from HaiboZhu/Add_Protection_wrong_API_call
...
Add protection for wrong API call without initialize
2016-01-19 13:42:49 +08:00
HaiboZhu
5e3e975ffb
Merge pull request #2331 from ruil2/return_value
...
add return value judgment
2016-01-19 12:25:10 +08:00
Haibo Zhu
6d7bd2daf4
Add protection for wrong API call without initialize
2016-01-19 12:00:54 +08:00
Martin Storsjö
fbe35cffca
Avoid warnings in MSVC about implicitly casting floats to integers
2016-01-16 11:10:25 +02:00
Karina
559e786fa4
add return value judgment
2016-01-15 10:30:41 +08:00
HaiboZhu
d11f12db54
Merge pull request #2330 from ruil2/mt_build_1
...
fix build issue when some macros are turned on
2016-01-15 09:28:07 +08:00
sijchen
5eb18b101e
change the output way of debug trace
2016-01-13 22:13:43 -08:00
Karina
67f4dcf2e2
fix build issue when some macros are turned on
2016-01-14 09:40:20 +08:00
Karina
0f0d54ef51
using independent encoder control logic for SAVC case
2016-01-14 09:16:12 +08:00
sijchen
cce1c29844
add sink to IWelsTask (for further enhancements)
2016-01-13 16:24:54 -08:00
sijchen
5cad0f9bba
enhance a UT to cover more cases
2016-01-11 22:01:02 -08:00
sijchen
bf35b6fee7
add a debug trace if encoder returns error
2016-01-11 22:00:24 -08:00
sijchen
19f5eb0932
complete a debug trace in load-balancing task
2016-01-11 22:00:14 -08:00
sijchen
7a8da6a468
remove unneeded code after new task management
2016-01-11 21:59:49 -08:00
sijchen
dcdd496082
fix a bug in multi-layer case in task-management
2016-01-11 21:58:10 -08:00
HaiboZhu
b940e2cdf8
Merge pull request #2325 from ruil2/trace1
...
separate each layer trace output
2016-01-11 14:05:55 +08:00
ruil2
c32263e06b
Merge pull request #2322 from HaiboZhu/Fix_Encoder_Info_Output
...
Fix the build errors when the encoder info output is enabled
2016-01-08 17:15:15 +08:00
Karina
d4f979c495
separate each layer trace output
2016-01-05 14:02:58 +08:00
Karina
57c87f1845
update format
2016-01-05 11:40:59 +08:00
HaiboZhu
cd75541c8f
Merge pull request #2323 from ruil2/rc_timestamp
...
resolve abnormal timestamp (rollback or jump case)
2015-12-31 09:55:58 +08:00
Haibo Zhu
a6a504f944
Fix the build errors when the encoder info output is enabled
2015-12-31 09:06:59 +08:00
huili2
740968d1f6
modify EC method comment in API
2015-12-30 13:41:29 +08:00
Karina
0d5db3d986
resolve abnormal timestamp (rollback or jump case)
2015-12-29 15:05:42 +08:00
huade
f161566458
remove pSliceBs from ctx
2015-12-15 17:10:52 +08:00
huade
ef38c2abf8
refactor threadIdc and CPU cores logic in init module
2015-12-15 11:27:00 +08:00
sijchen
406f89ec54
Merge pull request #2309 from shihuade/MultiThread_V4.4_ThreadSliceNum_V3_Pull
...
remove iCountThreadsNum and unify with iMultipleThreadIdc
2015-12-14 09:44:13 -08:00
huade
549a1b9bf4
fixed layer size update bugs
2015-12-14 14:56:09 +08:00
huade
e8536c6b73
remove iCountThreadsNum and unify with iMultipleThreadIdc
2015-12-14 12:26:02 +08:00
HaiboZhu
ee01b3afaf
Merge pull request #2307 from huili2/fix_decstat
...
fix iAvgLumaQp in decStat
2015-12-14 10:26:16 +08:00
HaiboZhu
762d1812bb
Merge pull request #2306 from shihuade/MultiThread_V4.4_ThreadSliceNum_V2_Pull
...
refactor validate and init logic for fixed sliceMode
2015-12-14 09:44:38 +08:00
HaiboZhu
92637b4912
Merge pull request #2304 from sijchen/th21
...
[Encoder] Add tasks and thread pool call for SM_SIZELIMITED_SLICE mode
2015-12-11 16:16:16 +08:00
huili2
b2d4a95537
fix iAvgLumaQp in decStat
2015-12-11 14:14:42 +08:00
huade
14d89eb48c
refactor validate and init logic for fixed sliceMode
2015-12-11 13:08:05 +08:00
Karina
fde8bd2554
update temporal layer quant
2015-12-10 15:07:19 +08:00
sijchen
76ca56498a
Add tasks and thread pool call for SM_SIZELIMITED_SLICE mode
2015-12-09 09:55:04 -08:00
HaiboZhu
7e9fdc181f
Merge pull request #2301 from huili2/simple_parseonly_ctx
...
remove parseonly in decoder ctx
2015-12-09 10:33:29 +08:00
huade
dcfe76d1ff
unify slice bs writing for multi-thread (sliceindex==0 is the same as the others)
2015-12-08 14:09:43 +08:00
HaiboZhu
6dc3f72ef8
Merge pull request #2291 from sijchen/api5
...
[Encoder] Console: update help info in console to sync with recent api change
2015-12-07 14:26:54 +08:00
Karina
5ac58e8dc9
add parameter output trace
2015-12-03 16:47:57 +08:00
Karina
fd43759fc2
change output interface
2015-12-02 09:58:56 +08:00
sijchen
c55e5f6130
update help info in console to sync with recent api change
2015-12-01 16:45:37 -08:00
HaiboZhu
ece95c815c
Merge pull request #2286 from sijchen/ut3
...
[Encoder] adjust the input para judgement of iMaxNalSize
2015-12-01 15:24:02 +08:00
sijchen
89752ff62f
Refactor: remove CWelsTaskManageMultiD
2015-11-30 10:32:48 -08:00
HaiboZhu
f679da900f
Merge pull request #2281 from sijchen/th11
...
[Encoder] remove duplicated operation after thread pool
2015-11-27 12:13:33 +08:00
HaiboZhu
b749fe7160
Merge pull request #2273 from sijchen/th0
...
[Encoder] use different task when load-balancing or not, to save computation
2015-11-27 09:29:22 +08:00
HaiboZhu
921443ead8
Merge pull request #2272 from sijchen/rf0
...
[Encoder] put duplicated codes into one function
2015-11-27 09:27:37 +08:00
huili2
926fc67451
remove parseonly in decoder ctx
2015-11-27 08:56:20 +08:00
huade
436da21ccf
initial for iReturn and refact PPS Sps bs write function
2015-11-26 14:06:01 +08:00
huade
4a4ade1201
refactor WriteSliceBs()
2015-11-26 09:32:33 +08:00
sijchen
8667452940
adjust the input para judgement of iMaxNalSize
2015-11-25 14:21:32 -08:00
sijchen
05c89b75f0
remove duplicated operation after thread pool and rename a task for clearer meaning
2015-11-25 13:46:21 -08:00
HaiboZhu
a422180695
Merge pull request #2277 from ruil2/qp_trace
...
add minqp and maxqp parameters in console
2015-11-25 15:05:12 +08:00
Karina
ab7eb1535d
add minqp and maxqp parameters in console
2015-11-25 14:21:44 +08:00
huade
d02addd90f
remove pCountMbNumInSlice from SSliceCtx
2015-11-25 13:36:37 +08:00
HaiboZhu
60f36eb25a
Merge pull request #2275 from HaiboZhu/Fix_Emulation_Prevention_Bytes_Profiles_Bugs
...
Add protection for emulation prevention bytes and profile_id
2015-11-25 12:30:51 +08:00
HaiboZhu
f47be08065
Merge pull request #2271 from sijchen/rf1
...
[Encoder] refactor multi-thread logic and add error-dealing
2015-11-25 12:04:00 +08:00
unknown
cc6b409f12
Add protection for emulation prevention bytes and profile_id
2015-11-25 11:48:07 +08:00
HaiboZhu
d85b1f6863
Merge pull request #2274 from shihuade/MultiThread_V4.2_SSliceCtx_PFirstMBInSlice_Pull_BugFixed
...
fixed bug for firsMbIndex in multi-thread-slice encoding with slicemo…
2015-11-25 11:12:24 +08:00
HaiboZhu
404315ab19
Merge pull request #2270 from huili2/parseonly_api_bugfix
...
disable wrongly calling for parseonly related
2015-11-25 09:00:54 +08:00
sijchen
13cb84e695
use different task when load-balancing or not to save computation
2015-11-24 14:19:15 -08:00
sijchen
1247006cbb
remove unneeded variable
2015-11-24 13:39:27 -08:00
sijchen
2df092bcae
refactor multi-thread logic
2015-11-24 13:35:55 -08:00
sijchen
2fc9c08710
put duplicated codes into one function
2015-11-24 11:14:58 -08:00
huade
29dd5e71be
fixed bug for firsMbIndex in multi-thread-slice encoding with slicemode==SM_SIZELIMITED_SLICE
2015-11-24 17:55:30 +08:00