4180 Commits

Author SHA1 Message Date
sijchen
30da4f196e Merge pull request #2426 from ruil2/fix_trace
fix skip frames statistics issue
2016-03-24 09:49:48 -07:00
zhilwang
25818b0fc2 Merge pull request #2428 from mstorsjo/sign-extension
Add missing sign extension for x86_64 in mb_copy.asm
2016-03-24 17:07:42 +08:00
Martin Storsjö
a4e71d6662 Add missing sign extension for x86_64 in mb_copy.asm
This fixes running the code built for x86_64 OS X with Xcode 7.3.
2016-03-24 10:20:42 +02:00
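As a rough illustration of the class of bug this commit message describes: on x86_64, a 32-bit signed value used in 64-bit address arithmetic has to be sign-extended first, otherwise a negative value becomes a large positive offset. The sketch below shows the difference in C. The helper names, and the idea that the affected argument is a stride, are assumptions made for illustration and are not taken from mb_copy.asm.

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative, not from mb_copy.asm. */

/* Widen by zero extension: the 32-bit pattern is reinterpreted as unsigned,
 * so a negative stride turns into a huge positive offset. */
int64_t widen_zero_extended(int32_t stride) {
    return (int64_t)(uint32_t)stride;
}

/* Widen by sign extension: the sign bit is replicated into the upper
 * 32 bits, which is what an instruction such as movsxd does. */
int64_t widen_sign_extended(int32_t stride) {
    return (int64_t)stride;
}
```

For non-negative values the two helpers agree, which is why a missing sign extension can go unnoticed until a negative value or a different toolchain exposes it.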
Karina
10dfb2670b fix skip frames statistics issue 2016-03-24 14:17:47 +08:00
sijchen
22bec09507 Merge pull request #2425 from sijchen/ruil_rc_update
fix bitrate overflow issue when adaptive quality turns on
2016-03-23 10:54:51 -07:00
sijchen
47d310539f Squashed commit of the following:
commit c8111942e07437034a74b33887c33b5ad78e476a
Author: Karina <ruil2@cisco.com>
Date:   Wed Mar 23 14:31:18 2016 +0800

    update SHA table

commit f36a25344c25a131581dcbcd2d103fc4b131012e
Author: Karina <ruil2@cisco.com>
Date:   Wed Mar 23 13:45:58 2016 +0800

    fix bitrate overflow issue when adaptive quality turns on
2016-03-23 10:23:33 -07:00
HaiboZhu
c0641f40d9 Merge pull request #2423 from shihuade/SPSUpdate
fix bug for debug mode
2016-03-23 13:27:35 +08:00
HaiboZhu
e52c6eacb0 Merge pull request #2422 from HaiboZhu/Bugfix_level_check_error_fmo_return_value
Fix the level limit check bug and fmo return overflow bug
2016-03-23 12:14:20 +08:00
Haibo Zhu
e4a4fb6577 (1) Fix the level limit check wrong condition
(2) Fix the FMO return value overflow bug
2016-03-23 11:15:26 +08:00
Forrest Shi
47ad929c25 fix bug for debug mode 2016-03-23 11:14:40 +08:00
sijchen
22d6a94919 Merge pull request #2414 from ksb2go/master
Google has deprecated using SVN. Move over to GitHub
2016-03-22 16:20:58 -07:00
sijchen
40e1a69fae Merge pull request #2421 from shihuade/MultiThread_V5.2_Pull_V2
refactor for slice buffer init/allocate/free
2016-03-22 16:20:37 -07:00
huade
a7a5b7b0f4 refactor for slice buffer init/allocate/free 2016-03-22 13:51:20 +08:00
sijchen
33bb96f604 Merge pull request #2420 from sijchen/fix_sps
[Encoder] fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on
2016-03-21 21:51:07 -07:00
sijchen
8103988cde Merge pull request #2418 from ruil2/refine_init
fix preprocessing initialization logic
2016-03-21 11:32:24 -07:00
Karina
228cdeba1b refine reset function 2016-03-21 10:48:41 +08:00
sijchen
38313b913d Merge pull request #2419 from ruil2/bitrate_update
fix bitrate update issue
2016-03-18 16:07:59 -07:00
Karina
7c15d68e24 fix preprocessing initialization logic 2016-03-18 16:43:11 +08:00
Karina
316ab31882 fix bitrate update issue 2016-03-18 14:28:32 +08:00
zhilwang
d7570bfa52 Merge pull request #2401 from saamas/decoder-use-encoder-x86-idct-routines
[Decoder] Use encoder x86 IDCT routines
2016-03-18 08:50:33 +08:00
David Chen
7112938a28 Google has deprecated using SVN. Move over to GitHub 2016-03-17 17:25:22 -07:00
HaiboZhu
a8ab4afe5b Merge pull request #2410 from HaiboZhu/Add_disable_assert_in_release
Disable assert in release with -DNDEBUG macro
2016-03-17 15:46:25 +08:00
HaiboZhu
c441f6f390 Merge pull request #2411 from huili2/memory_leak_fix
fix memory leak when alloc failed in decoder
2016-03-17 15:46:12 +08:00
Haibo Zhu
43f767d06e Disable assert in release with -DNDEBUG macro
Update the code to avoid the function unused warning
2016-03-17 11:24:01 +08:00
unknown
693fd14272 fix memory leak when alloc failed in decoder 2016-03-17 10:31:25 +08:00
Sindre Aamås
b6c4a5447c [Decoder/x86] IDCT one block at a time with SSE2
At lower bitrates, it is overall faster to conditionally do one block
at a time with SSE2 on Haswell and likely other common architectures.
At higher bitrates, it is faster to use the wider routine that IDCTs
four blocks at a time. To avoid potential performance regressions
as compared to MMX, stick with single-block IDCTs with SSE2. There
is still a performance advantage as compared to MMX because the
single-block SSE2 routine is faster than the corresponding MMX
routine.

Stick with four blocks at a time with AVX2 for which that appears
to be consistently faster on Haswell.
2016-03-16 19:55:11 +01:00
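The "conditionally do one block at a time" idea can be sketched in scalar C: transform each 4x4 block only if it has nonzero coefficients, rather than always running a wider routine over four blocks. This is a minimal sketch, not the SSE2 code from the commit. The function names, the flat 4 x 16 coefficient layout, the 2x2 block arrangement, and the assumption that the skip condition is "all coefficients are zero" are all hypothetical.

```c
#include <stdint.h>

static uint8_t clip_u8(int v) {
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* H.264-style 4x4 inverse integer transform, added onto the prediction
 * already stored in dst. Assumes >> on negative ints is an arithmetic
 * shift, as on mainstream compilers. */
void idct4x4_add(uint8_t *dst, int stride, const int16_t *coef) {
    int tmp[16];
    for (int i = 0; i < 4; i++) {            /* horizontal pass */
        const int16_t *c = coef + 4 * i;
        int e0 = c[0] + c[2], e1 = c[0] - c[2];
        int e2 = (c[1] >> 1) - c[3], e3 = c[1] + (c[3] >> 1);
        tmp[4 * i + 0] = e0 + e3;
        tmp[4 * i + 1] = e1 + e2;
        tmp[4 * i + 2] = e1 - e2;
        tmp[4 * i + 3] = e0 - e3;
    }
    for (int j = 0; j < 4; j++) {            /* vertical pass */
        int e0 = tmp[j] + tmp[8 + j], e1 = tmp[j] - tmp[8 + j];
        int e2 = (tmp[4 + j] >> 1) - tmp[12 + j];
        int e3 = tmp[4 + j] + (tmp[12 + j] >> 1);
        int out[4] = { e0 + e3, e1 + e2, e1 - e2, e0 - e3 };
        for (int i = 0; i < 4; i++) {
            int p = dst[i * stride + j] + ((out[i] + 32) >> 6);
            dst[i * stride + j] = clip_u8(p);
        }
    }
}

/* Returns 1 if any of the 16 coefficients is nonzero. */
int block_has_coefs(const int16_t *coef) {
    for (int i = 0; i < 16; i++)
        if (coef[i] != 0) return 1;
    return 0;
}

/* Hypothetical layout: four 4x4 blocks arranged 2x2 in an 8x8 region,
 * coefficients stored as 4 consecutive groups of 16. Empty blocks are
 * skipped. Returns the number of blocks actually transformed. */
int idct_blocks_conditional(uint8_t *dst, int stride, const int16_t *coef) {
    int done = 0;
    for (int b = 0; b < 4; b++) {
        const int16_t *c = coef + 16 * b;
        if (!block_has_coefs(c)) continue;   /* nothing to add */
        idct4x4_add(dst + (b >> 1) * 4 * stride + (b & 1) * 4, stride, c);
        done++;
    }
    return done;
}
```

At low bitrates most blocks are empty, so the per-block check skips most of the work. That is one plausible reading of why the commit reports the single-block path as faster overall in that regime.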
huili2
a8d9576297 Merge pull request #2405 from HaiboZhu/Fix_UT_decoder_init_fail
Fix the decoder init failed case in UT
2016-03-16 16:28:14 +08:00
HaiboZhu
7a3b3fdbe7 Merge pull request #2403 from ruil2/downsampling1
change downsampling logic
2016-03-16 09:48:08 +08:00
sijchen
90deb80b50 rename the functions 2016-03-14 21:41:08 -07:00
sijchen
c009183e97 fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on 2016-03-14 11:28:44 -07:00
Haibo Zhu
46f42ec5f3 Fix the decoder init failed case in UT 2016-03-14 17:06:58 +08:00
Karina
f84f2315ab change downsampling logic so that the downsampling source is the nearest layer instead of the highest layer 2016-03-14 09:55:36 +08:00
HaiboZhu
25f53a2e3d Merge pull request #2399 from saamas/encoder-x86-add-avx2-satd-routines
[Encoder/x86] Add AVX2 SATD routines
2016-03-10 09:59:33 +08:00
Sindre Aamås
98042f1600 [Decoder] Use encoder x86 IDCT routines
Move asm routines to common. Delete obsolete decoder routines.

Use wider routines where applicable.

~1.07x overall faster decode on a quick 720p30 4Mbps test on Haswell.
2016-03-09 10:41:42 +01:00
HaiboZhu
bffda9ec02 Merge pull request #2397 from HaiboZhu/Remove_level_limit_check
Change the level limit check behavior for compatibility
2016-03-09 09:50:44 +08:00
Haibo Zhu
31de8bb3a0 Change the level limit check behavior for compatibility 2016-03-09 08:34:07 +08:00
Sindre Aamås
48a520915a [Encoder/x86] Add AVX2 SATD routines
WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell).
WelsSampleSatd16x8_avx2  (~2.19x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x16_avx2  (~1.68x speedup over SSE4.1 on Haswell).
WelsSampleSatd8x8_avx2   (~1.53x speedup over SSE4.1 on Haswell).
2016-03-08 11:31:17 +01:00
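For readers unfamiliar with the metric these routines compute, SATD (sum of absolute transformed differences) can be sketched in scalar C for a single 4x4 block: take the pixel differences, apply a 2-D Hadamard transform, and sum the absolute values. This is a minimal reference sketch, not the AVX2 code from the commit. The function name is hypothetical, and no normalization is applied, although encoders often scale the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: unnormalized 4x4 SATD. */
int satd4x4(const uint8_t *a, int stride_a, const uint8_t *b, int stride_b) {
    int d[16], t[16], sum = 0;
    assert(stride_a >= 4 && stride_b >= 4);

    /* Pixel differences. */
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            d[4 * i + j] = a[i * stride_a + j] - b[i * stride_b + j];

    /* Horizontal Hadamard butterflies on each row. */
    for (int i = 0; i < 4; i++) {
        int s0 = d[4 * i] + d[4 * i + 1], s1 = d[4 * i] - d[4 * i + 1];
        int s2 = d[4 * i + 2] + d[4 * i + 3], s3 = d[4 * i + 2] - d[4 * i + 3];
        t[4 * i + 0] = s0 + s2;
        t[4 * i + 1] = s1 + s3;
        t[4 * i + 2] = s0 - s2;
        t[4 * i + 3] = s1 - s3;
    }

    /* Vertical butterflies on each column, accumulating absolute values. */
    for (int j = 0; j < 4; j++) {
        int s0 = t[j] + t[4 + j], s1 = t[j] - t[4 + j];
        int s2 = t[8 + j] + t[12 + j], s3 = t[8 + j] - t[12 + j];
        sum += abs(s0 + s2) + abs(s1 + s3) + abs(s0 - s2) + abs(s1 - s3);
    }
    return sum;
}
```

Larger block sizes such as 16x16 are commonly built by summing a smaller kernel over sub-blocks. Whether the routines named above do so is not stated in the commit message.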
volvet
d4c68527b1 Merge pull request #2389 from saamas/common-x86-deblock-chroma-horizontal-ssse3-optimizations
[Common/x86] Deblock chroma horizontal ssse3 optimizations
2016-03-08 17:09:08 +08:00
HaiboZhu
d9bfc9204b Merge pull request #2394 from sijchen/th021
[Common] remove sink in WelsThreadPool and hide the constructor to finish the s…
2016-03-08 16:29:40 +08:00
HaiboZhu
74b8a66140 Merge pull request #2395 from ruil2/stat_output
format update and fix build issue when turn on STAT_OUTPUT macro
2016-03-07 13:46:27 +08:00
Karina
fee9d502bb format update and fix build issue when turn on STAT_OUTPUT macro 2016-03-04 13:55:14 +08:00
sijchen
316f740630 Merge pull request #2390 from sijchen/th012
[Common] put CWelsThreadPool to singleTon for future usage
2016-03-03 09:47:20 -08:00
huili2
ac6cf877d6 Merge pull request #2392 from mstorsjo/decoder-error-return
Fix a return value check
2016-03-03 16:40:55 +08:00
Martin Storsjö
7f53c29302 Fix a return value check
In 9cb4f4e8e21af, the error code returned from CheckIntraNxNPredMode
was changed - therefore, these return value checks, that look
for a specific error code, need to be updated accordingly.

This fixes crashes in DecodeCrashTestAPI.DecoderCrashTest
with some seeds.
2016-03-03 10:15:34 +02:00
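The failure mode described in this commit message, where a caller compares against one specific error code after the callee has started returning a different one, can be sketched as follows. All names and values here are hypothetical and chosen only for illustration. They do not reflect the actual signature or error codes of CheckIntraNxNPredMode.

```c
/* Hypothetical error codes; not the decoder's real values. */
enum {
    ERR_NONE = 0,
    ERR_OLD_PRED_MODE = 1,   /* code the callee used to return */
    ERR_NEW_PRED_MODE = 2    /* code the callee returns after the change */
};

/* Hypothetical callee: valid modes are 0..8, anything else is an error. */
int check_pred_mode(int mode) {
    return (mode >= 0 && mode <= 8) ? ERR_NONE : ERR_NEW_PRED_MODE;
}

/* Stale caller: still looks for the old code, so the error is missed. */
int is_failure_stale(int mode) {
    return check_pred_mode(mode) == ERR_OLD_PRED_MODE;
}

/* Updated caller: looks for the code the callee now returns. */
int is_failure_updated(int mode) {
    return check_pred_mode(mode) == ERR_NEW_PRED_MODE;
}
```

With the stale check, an invalid mode is treated as success and processing continues with bad data, which is consistent with the crashes the commit message mentions.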
sijchen
4db9c32976 remove sink in WelsThreadPool and hide the constructor to finish the singleTon 2016-03-02 17:08:09 -08:00
sijchen
d4f09d9048 put CWelsThreadPool to singleTon for future usage (including add sink for IWelsTask) 2016-02-29 11:40:25 -08:00
HaiboZhu
52d25f544a Merge pull request #2386 from huili2/return_info_change
modify return value check inside decoder
2016-02-29 09:21:31 +08:00
sijchen
7e88b13809 Merge pull request #2380 from mstorsjo/fix-slice-realloc
Avoid reading iCountMbNumInSlice out of bounds on slice realloc
2016-02-26 09:46:13 -08:00
Sindre Aamås
a009153741 [Common/x86] DeblockChromaEq4H_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

~5.80x speedup on Haswell (x86-64).
~1.69x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
Sindre Aamås
9909c306f1 [Common/x86] DeblockChromaLt4H_ssse3 optimizations
Use packed 8-bit operations rather than unpack to 16-bit.

~5.72x speedup on Haswell (x86-64).
~1.85x speedup on Haswell (x86 32-bit).
2016-02-26 10:58:16 +01:00
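The two deblocking commits above both replace "unpack to 16-bit" arithmetic with packed 8-bit operations. The general idea can be sketched in portable C with a rounded average over eight bytes held in one 64-bit word, compared against a reference that widens each byte. This is only an illustration of the technique. The function names are hypothetical, and the commit messages do not say which packed operations the SSSE3 routines actually use.

```c
#include <stdint.h>

/* Rounded average of eight unsigned bytes at once, staying in 8-bit lanes.
 * Uses (x + y + 1) >> 1 == (x | y) - ((x ^ y) >> 1) per byte; the mask
 * keeps bits from leaking between neighbouring bytes during the shift. */
uint64_t avg_u8x8(uint64_t a, uint64_t b) {
    return (a | b) - (((a ^ b) & 0xFEFEFEFEFEFEFEFEULL) >> 1);
}

/* Reference version: widen each byte before adding, one lane at a time. */
uint64_t avg_u8x8_widened(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        unsigned x = (unsigned)((a >> (8 * i)) & 0xFF);
        unsigned y = (unsigned)((b >> (8 * i)) & 0xFF);
        r |= (uint64_t)((x + y + 1) >> 1) << (8 * i);
    }
    return r;
}
```

Staying in 8-bit lanes lets a vector register hold twice as many pixels as 16-bit lanes would, and it avoids the unpack and repack steps. Both are plausible contributors to the speedups reported above.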