openh264

Author	SHA1	Message	Date
ruil2	56618249d7	Merge pull request #2436 from saamas/processing-add-avx2-vaa-routines [Processing] Add AVX2 VAA routines	2016-04-28 09:08:03 +08:00
HaiboZhu	98c6c6de11	Merge pull request #2446 from HaiboZhu/Reduce_log_size_for_parse_only_mode Add the log reduce logic into parse only mode	2016-04-20 10:48:57 +08:00
HaiboZhu	3b68840d5f	Merge pull request #2444 from GuangweiWang/fix-assembly-arm64 Fix assembly arm64 Code review at: https://rbcommons.com/s/OpenH264/r/1594/	2016-04-20 10:03:01 +08:00
Haibo Zhu	3ccecfbdbe	Add the log reduce logic into parse only mode	2016-04-20 09:58:12 +08:00
Guangwei Wang	cc407b4b21	fix code style	2016-04-17 19:47:55 +08:00
Guangwei Wang	0b8cdcaff8	extension 32-bit parameters to 64-bit on arm64 assembly function	2016-04-17 19:41:57 +08:00
Karina	1ecb9582df	update arm assembly comments	2016-04-14 14:57:21 +08:00
Karina	dd340b7fe7	modify neon comment	2016-04-14 14:49:11 +08:00
Karina	525dbe7093	add 32-bit parameter sign-extentions for block_add_aarch64_neon.S	2016-04-14 10:06:57 +08:00
Karina	d34e209266	fix 32-bit parameters issue on arm64 assembly function	2016-04-13 19:30:08 +08:00
Karina	7943764869	add missing sign extension for arm64	2016-04-12 16:27:58 +08:00
Sindre Aamås	57fc3e9917	[Processing] Add AVX2 VAA routines Process 8 lines at a time rather than 16 lines at a time because this appears to give more reliable memory subsystem performance on Haswell. Speedup is > 2x as compared to SSE2 when not memory-bound on Haswell. On my Haswell MBP, VAACalcSadSsdBgd is about ~3x faster when uncached, which appears to be related to processing 8 lines at a time as opposed to 16 lines at a time. The other routines are also faster as compared to the SSE2 routines in this case but to a lesser extent.	2016-04-11 16:09:56 +02:00
Karina	7e14400b0b	refine the workflow for encode one frame	2016-03-31 16:58:20 +08:00
Karina	3927d91b85	fix temporal layer issue when output frame rate is different from input frame rate	2016-03-29 15:48:06 +08:00
sijchen	30da4f196e	Merge pull request #2426 from ruil2/fix_trace fix skip frames statistics issue	2016-03-24 09:49:48 -07:00
Martin Storsjö	a4e71d6662	Add missing sign extension for x86_64 in mb_copy.asm This fixes running the code built for x86_64 OS X with Xcode 7.3.	2016-03-24 10:20:42 +02:00
Karina	10dfb2670b	fix skip frames statistics issue	2016-03-24 14:17:47 +08:00
sijchen	47d310539f	Squashed commit of the following: commit c8111942e07437034a74b33887c33b5ad78e476a Author: Karina <ruil2@cisco.com> Date: Wed Mar 23 14:31:18 2016 +0800 update SHA table commit f36a25344c25a131581dcbcd2d103fc4b131012e Author: Karina <ruil2@cisco.com> Date: Wed Mar 23 13:45:58 2016 +0800 fix bitrate overflow issue when adaptive quality turns on	2016-03-23 10:23:33 -07:00
HaiboZhu	c0641f40d9	Merge pull request #2423 from shihuade/SPSUpdate fix bug for debug mode	2016-03-23 13:27:35 +08:00
Haibo Zhu	e4a4fb6577	(1) Fix the level limit check wrong condition (2) Fix the FMO return value overflow bug	2016-03-23 11:15:26 +08:00
Forrest Shi	47ad929c25	fix bug for debug mode	2016-03-23 11:14:40 +08:00
huade	a7a5b7b0f4	refactor for slice buffer init/allocate/free	2016-03-22 13:51:20 +08:00
sijchen	33bb96f604	Merge pull request #2420 from sijchen/fix_sps [Encoder] fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on	2016-03-21 21:51:07 -07:00
sijchen	8103988cde	Merge pull request #2418 from ruil2/refine_init fix preprocessing initialization logic	2016-03-21 11:32:24 -07:00
Karina	228cdeba1b	refine reset function	2016-03-21 10:48:41 +08:00
Karina	7c15d68e24	fix preprocessing initialization logic	2016-03-18 16:43:11 +08:00
Karina	316ab31882	fix bitrate update issue	2016-03-18 14:28:32 +08:00
zhilwang	d7570bfa52	Merge pull request #2401 from saamas/decoder-use-encoder-x86-idct-routines [Decoder] Use encoder x86 IDCT routines	2016-03-18 08:50:33 +08:00
HaiboZhu	a8ab4afe5b	Merge pull request #2410 from HaiboZhu/Add_disable_assert_in_release Diable assert in release with -DNDEBUG macro	2016-03-17 15:46:25 +08:00
Haibo Zhu	43f767d06e	Diable assert in release with -DNDEBUG macro Update the code to avoid the function unused warning	2016-03-17 11:24:01 +08:00
unknown	693fd14272	fix memory leak when alloc failed in decoder	2016-03-17 10:31:25 +08:00
Sindre Aamås	b6c4a5447c	[Decoder/x86] IDCT one block at a time with SSE2 At lower bitrates, it is overall faster to conditionally do one block at a time with SSE2 on Haswell and likely other common architectures. At higher bitrates, it is faster to use the wider routine that IDCTs four blocks at a time. To avoid potential performance regressions as compared to MMX, stick with single-block IDCTs with SSE2. There is still a performance advantage as compared to MMX because the single-block SSE2 routine is faster than the corresponding MMX routine. Stick with four blocks at a time with AVX2 for which that appears to be consistently faster on Haswell.	2016-03-16 19:55:11 +01:00
sijchen	90deb80b50	rename the functions	2016-03-14 21:41:08 -07:00
sijchen	c009183e97	fix the lack of eSpsPpsIdStrategy==INCREASING_ID under simulcast avc on	2016-03-14 11:28:44 -07:00
Karina	f84f2315ab	change downsampling logic that downsampling source is from the nearest layer instead of the highest layer	2016-03-14 09:55:36 +08:00
HaiboZhu	25f53a2e3d	Merge pull request #2399 from saamas/encoder-x86-add-avx2-satd-routines [Encoder/x86] Add AVX2 SATD routines	2016-03-10 09:59:33 +08:00
Sindre Aamås	98042f1600	[Decoder] Use encoder x86 IDCT routines Move asm routines to common. Delete obsolete decoder routines. Use wider routines where applicable. ~1.07x overall faster decode on a quick 720p30 4Mbps test on Haswell.	2016-03-09 10:41:42 +01:00
Haibo Zhu	31de8bb3a0	Change the level limit check behavior to make the compatibility	2016-03-09 08:34:07 +08:00
Sindre Aamås	48a520915a	[Encoder/x86] Add AVX2 SATD routines WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell). WelsSampleSatd16x8_avx2 (~2.19x speedup over SSE4.1 on Haswell). WelsSampleSatd8x16_avx2 (~1.68x speedup over SSE4.1 on Haswell). WelsSampleSatd8x8_avx2 (~1.53x speedup over SSE4.1 on Haswell).	2016-03-08 11:31:17 +01:00
volvet	d4c68527b1	Merge pull request #2389 from saamas/common-x86-deblock-chroma-horizontal-ssse3-optimizations [Common/x86] Deblock chroma horizontal ssse3 optimizations	2016-03-08 17:09:08 +08:00
HaiboZhu	d9bfc9204b	Merge pull request #2394 from sijchen/th021 [Common] remove sink in WelsThreadPool and hide the construtor to finish the s…	2016-03-08 16:29:40 +08:00
Karina	fee9d502bb	format update and fix build issue when turn on STAT_OUTPUT macro	2016-03-04 13:55:14 +08:00
sijchen	316f740630	Merge pull request #2390 from sijchen/th012 [Common] put CWelsThreadPool to singleTon for future usage	2016-03-03 09:47:20 -08:00
Martin Storsjö	7f53c29302	Fix a return value check In 9cb4f4e8e21af, the error code returned from CheckIntraNxNPredMode was changed - therefore, these return value checks, that look for a specific error code, need to be updated accordingly. This fixes crashes in DecodeCrashTestAPI.DecoderCrashTest with some seeds.	2016-03-03 10:15:34 +02:00
sijchen	4db9c32976	remove sink in WelsThreadPool and hide the construtor to finish the singleTon	2016-03-02 17:08:09 -08:00
sijchen	d4f09d9048	put CWelsThreadPool to singleTon for future usage (including add sink for IWelsTask)	2016-02-29 11:40:25 -08:00
HaiboZhu	52d25f544a	Merge pull request #2386 from huili2/return_info_change modify return value check inside decoder	2016-02-29 09:21:31 +08:00
Sindre Aamås	a009153741	[Common/x86] DeblockChromaEq4H_ssse3 optimizations Use packed 8-bit operations rather than unpack to 16-bit. ~5.80x speedup on Haswell (x86-64). ~1.69x speedup on Haswell (x86 32-bit).	2016-02-26 10:58:16 +01:00
Sindre Aamås	9909c306f1	[Common/x86] DeblockChromaLt4H_ssse3 optimizations Use packed 8-bit operations rather than unpack to 16-bit. ~5.72x speedup on Haswell (x86-64). ~1.85x speedup on Haswell (x86 32-bit).	2016-02-26 10:58:16 +01:00
unknown	9cb4f4e8e2	modify return value check inside decoder	2016-02-26 16:29:35 +08:00

2613 Commits