Karina
65218a3c35
update trace for ENCODER_OPTION_TRACE_CALLBACK
2016-02-22 14:33:10 +08:00
ruil2
2754129064
Merge pull request #2360 from saamas/common-x86-deblock-optimizations
...
[Common/x86] Deblocking optimizations
2016-02-19 09:52:39 +08:00
ruil2
13586a3dfc
Merge pull request #2366 from sijchen/fix_free6
...
[Encoder] add error handling for the memory allocation failure case in multi-threading
2016-02-18 10:25:19 +08:00
ruil2
f791ac28ec
Merge pull request #2365 from sijchen/fix_free42
...
[Encoder] avoid memory problem when mem alloc failed during initializing pRefList
2016-02-18 10:25:07 +08:00
ruil2
de1a70d164
Merge pull request #2363 from sijchen/fix_free5
...
[Encoder] add input parameter check as protection for an encoder interface
2016-02-18 10:24:55 +08:00
sijchen
4537682042
Merge pull request #2362 from ruil2/trace1
...
trace cleanup
2016-02-17 14:52:46 -08:00
sijchen
e07ee9c096
use WELS_DELETE_OP for deleting
2016-02-17 10:07:33 -08:00
sijchen
74955c877f
set pointers to null and call uninit
2016-02-17 10:07:33 -08:00
sijchen
cc675f9fd1
add error handling for the memory allocation failure case
2016-02-17 10:07:33 -08:00
sijchen
41b4ecb06b
Avoid memory problem when mem alloc failed during initializing pRefList
2016-02-17 09:52:30 -08:00
sijchen
4b97dcb367
avoid memory problem when mem alloc failed during initializing pRefList
2016-02-16 10:05:49 -08:00
Karina
18728a4876
trace cleanup
2016-02-16 10:52:37 +08:00
ruil2
a26955e444
Merge pull request #2358 from sijchen/fix_free2
...
[Encoder] avoid memory problem if mem alloc failed in the middle of InitDqLayer
2016-02-16 10:47:23 +08:00
ruil2
6cf240237b
Merge pull request #2361 from sijchen/fix_free00
...
[Encoder] multiple protection if memory allocation failed
2016-02-16 10:47:02 +08:00
sijchen
855d1cf8c2
add input parameter check as protection for an encoder interface
2016-02-15 11:54:51 -08:00
sijchen
b76a79c726
move the rc free to the correct condition to avoid access to invalid memory
2016-02-15 10:13:50 -08:00
sijchen
025500d5aa
move the assignment of m_uiSpatialPicNum earlier to cover the memory leak if allocating pic fails
2016-02-15 10:13:23 -08:00
sijchen
36722c553b
use WelsMallocz instead of WelsMalloc to avoid non-null pointer at init
2016-02-15 10:12:44 -08:00
sijchen
71aa533038
move the printing of the MEMORY_CHECK part to a more reasonable place
2016-02-15 10:12:34 -08:00
sijchen
6a0f0811ae
use WelsUninitEncoderExt for all freeing in WelsInitEncoderExt
2016-02-15 10:06:43 -08:00
sijchen
408b7cad17
use WelsUninitEncoderExt, which correctly deals with release of vpp memory, rather than FreeMemorySvc
2016-02-15 10:04:52 -08:00
Sindre Aamås
e96a7b5c92
[Common/x86] DeblockChromaEq4V_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~2.07x speedup on Haswell (x86-64).
~2.12x speedup on Haswell (x86 32-bit).
2016-02-15 02:08:03 +01:00
Sindre Aamås
fc16010583
[Common/x86] DeblockChromaLt4V_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~2.68x speedup on Haswell (x86-64).
~2.38x speedup on Haswell (x86 32-bit).
2016-02-15 02:07:25 +01:00
Sindre Aamås
62fb37d096
[Common/x86] DeblockLumaEq4_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Minimize spills.
~2.31x speedup on Haswell (x86-64).
~2.40x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:39 +01:00
Sindre Aamås
732e1c5f78
[Common/x86] DeblockLumaLt4_ssse3 optimizations
...
Use packed 8-bit operations rather than unpack to 16-bit.
Avoid spills.
~1.97x speedup on Haswell (x86-64).
~3.09x speedup on Haswell (x86 32-bit).
2016-02-15 02:06:18 +01:00
sijchen
8b1206001c
Merge pull request #2355 from pra85/patch-1
...
Fix a typo
2016-02-12 16:42:02 -08:00
sijchen
2b9a250fbd
include the freeing of the pointer in FreeDqLayer
2016-02-12 16:23:57 -08:00
Prayag Verma
2d378b9db8
Fix a typo
...
`Availabe` → `Available`
2016-02-11 12:04:58 +05:30
sijchen
a1a3873a62
improve the code structure
2016-02-10 22:25:41 -08:00
sijchen
43fdf74fa6
fix a missing assignment and remove an unused line
2016-02-10 21:54:53 -08:00
sijchen
914302a462
avoid memory problem if mem alloc failed in the middle of InitDqLayer
2016-02-10 21:54:53 -08:00
sijchen
aaa25160ec
Merge pull request #2353 from saamas/encoder-x86-dct-opt2
...
[Encoder] x86 DCT optimizations
2016-02-08 15:00:12 -08:00
sijchen
e5e7013b73
Merge pull request #2350 from sijchen/th00
...
[Common] Add sink to IWelsTask
2016-02-08 14:59:38 -08:00
HaiboZhu
ad9ca3824f
Merge pull request #2354 from ruil2/remove_trace
...
fix error width and height issue
2016-02-04 12:00:20 +08:00
Karina
ae508b9724
fix error width and height issue
2016-02-04 10:25:03 +08:00
sijchen
f5fd7420a9
Merge pull request #2351 from huili2/fix_width_height_enc_constraint
...
fix frame size constraints for width and height
2016-02-02 16:31:05 -08:00
sijchen
fb901269ef
Merge pull request #2352 from ruil2/remove_trace
...
remove trace
2016-02-02 16:30:50 -08:00
Sindre Aamås
db9fa9154c
Update README.md nasm version requirement
...
Version 2.10.06 has some RIP-relative relocation fixes for macho64
that are needed to generate correct code on 64-bit OS X with recent
code changes.
2016-02-02 17:22:49 +01:00
Sindre Aamås
c8c74903f8
[Encoder] Add single-block AVX2 4x4 DCT/IDCT routines
...
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.
~3.15x speedup over MMX for the DCT on Haswell.
~2.94x speedup over MMX for the IDCT on Haswell.
Returns diminish with increasing vector length because a larger
proportion of the time is spent on load/store/shuffling.
2016-02-02 17:22:49 +01:00
Sindre Aamås
f90960983c
[Encoder] Add single-block SSE2 4x4 DCT/IDCT routines
...
We do four blocks at a time when possible, but need to handle
single blocks at a time for intra prediction.
~2.31x speedup over MMX for the DCT on Haswell.
~1.92x speedup over MMX for the IDCT on Haswell.
2016-02-02 17:22:48 +01:00
Sindre Aamås
7486de2844
[Encoder] AVX2 DCT tweaks
...
Do some shuffling in load/store unpack/pack to save some
work in horizontal DCTs.
Use a few 128-bit broadcasts to compact data vectors a bit.
~1.04x speedup for the DCT case on Haswell.
~1.12x speedup for the IDCT case on Haswell.
2016-02-02 17:22:48 +01:00
Karina
2d4cbcf060
remove trace
2016-02-02 17:34:59 +08:00
unknown
3873addc3d
fix frame size constraints for width and height
2016-02-01 15:55:53 +08:00
HaiboZhu
1030820ec4
Merge pull request #2342 from sijchen/enh_ut_tem
...
[UT] correct and enhance the ut template and trace improvement
2016-02-01 09:08:05 +08:00
zhilwang
c420d72443
Merge pull request #2341 from saamas/encoder-x86-dct-opt
...
[Encoder] x86 DCT optimizations
2016-01-28 10:33:34 +08:00
HaiboZhu
51f3bbdfde
Merge pull request #2345 from shihuade/WP8ScriptUpdate
...
update build script for wp8 under multi-vc version
2016-01-24 07:56:23 +08:00
Forrest Shi
21402ca419
update build script for wp8 under multi-vc version
2016-01-23 16:56:53 +08:00
HaiboZhu
3174e2a220
Merge pull request #2344 from mstorsjo/cleanup-map
...
Ignore the MSVC generated map file, remove it on make clean
2016-01-22 09:45:57 +08:00
Martin Storsjö
fa52fbfc9d
Ignore the MSVC generated map file, remove it on make clean
2016-01-21 10:23:34 +02:00
HaiboZhu
77c40e09e0
Merge pull request #2343 from HaiboZhu/Add_map_file_msvc
...
Generate map file for msvc build
2016-01-21 14:34:50 +08:00