openh264

Author	SHA1	Message	Date
volvet	6714b8ae99	Merge pull request #463 from mstorsjo/dont-clobber-neon-registers Avoid clobbering the neon registers q4-q7 Review and verified by zhilwang	2014-03-14 10:28:55 +08:00
volvet	8962b7c98b	Merge pull request #482 from sijchen/me_refactor1 mv range setting refactor	2014-03-13 10:21:39 +08:00
sijchen	d809a7981b	mv range setting refactor	2014-03-13 10:18:01 +08:00
volvet	8b907c18fd	fix idr interval issue	2014-03-12 17:38:25 +08:00
ruil2	c7f2a0b7f6	modify the parameter verification for SM_AUTO_SLICE mode -- uiSliceNum is ignored	2014-03-12 10:44:13 +08:00
ruil2	7c8ce799c0	fix SM_FIXEDSLCNUM_SLICE bug, add SM_AUTO_SLICE mode	2014-03-11 10:23:46 +08:00
Martin Storsjö	c011890764	Push clobbered neon registers on the stack According to the calling convention, the registers q4-q7 should be preserved by functions. The caller (generated by the compiler) could be using those registers anywhere for any intermediate data. Functions that use more than 12 of the qX registers must push the clobbered registers on the stack in order to be able to restore them afterwards. In functions that don't use all 16 registers, but clobber some of the callee saved registers q4-q7, one or more of them are remapped to reduce the number of registers that have to be saved/restored. This incurs a very small (around 0.5%) slowdown in the decoder and encoder.	2014-03-10 22:07:36 +02:00
Martin Storsjö	811c647c0e	Remap registers to avoid clobbering the neon registers q4-q7 According to the calling convention, the registers q4-q7 should be preserved by functions. The caller (generated by the compiler) could be using those registers anywhere for any intermediate data. Functions that use 12 or less of the qX registers can avoid violating the calling convention by simply using other registers instead of the callee saved registers q4-q7. This change only remaps the registers used within functions - therefore this does not affect performance at all. E.g. in functions using registers q0-q7, we now use q0-q3 and q8-q11 instead.	2014-03-10 22:07:25 +02:00
ruil2	a922155c9a	Merge pull request #466 from sijchen/add_memalign_test Add memalign unit test	2014-03-10 17:25:41 +08:00
sijchen	385128e403	Merge pull request #465 from ruil2/encoder_trace use global trace in encoder reviewed at https://rbcommons.com/s/OpenH264/r/176/	2014-03-10 17:22:19 +08:00
sijchen	53a570556d	add memalign unit test	2014-03-10 16:28:05 +08:00
ruil2	02bafd9320	Merge pull request #445 from mstorsjo/use-thread-param Use the iMultipleThreadIdc field from SEncParamExt	2014-03-10 15:28:04 +08:00
ruil2	86f37f047c	Merge pull request #452 from mstorsjo/use-slice-mode-enum Use SliceModeEnum as data type for the slice mode fields	2014-03-10 15:27:04 +08:00
ruil2	2539d6e447	Merge pull request #462 from mstorsjo/fix-typos Fix two typos in variable and macro names	2014-03-10 15:25:20 +08:00
ruil2	ba6b2a8d62	use global trace in encoder	2014-03-10 15:22:40 +08:00
Martin Storsjö	cc7b81f3c3	Fix a typo in arm assembly, LORD -> LOAD	2014-03-09 19:19:38 +02:00
Martin Storsjö	7c435ad295	Remove a stray inline keyword in a function signature comment in x86 assembly Assembly functions written in external assembly files are obviously not inlined.	2014-03-09 19:18:03 +02:00
Martin Storsjö	8d6b368a1c	Remove unnecessary stray __cdecl annotations in function signature comments in x86 assembly	2014-03-09 19:18:02 +02:00
Martin Storsjö	5df2e2a996	Use SliceModeEnum as data type for the slice mode fields This makes the use of the field clearer and safer by allowing the compiler check that users actually assign proper enum values.	2014-03-08 00:23:58 +02:00
Martin Storsjö	ce7b00ea72	Get rid of an unnecessary cast by declaring the right pointer type	2014-03-08 00:17:30 +02:00
Ethan Hugg	fb4f677f77	Merge pull request #446 from mstorsjo/remove-unnecessary-public-param Move the iCountThreadsNum field to SWelsSvcCodingParam	2014-03-07 09:18:52 -08:00
Ethan Hugg	7632510209	Merge pull request #450 from mstorsjo/publish-slice-mode-enum Move the slice mode enum to the public API	2014-03-07 09:17:03 -08:00
Martin Storsjö	5f1c207845	Move the slice mode enum to the public header This simplifies setting the slice mode in the public API.	2014-03-07 14:53:29 +02:00
Martin Storsjö	495a4a392e	Make ParamValidationExt use the actual type instead of a void pointer	2014-03-07 14:51:34 +02:00
Martin Storsjö	656e4c5c35	Move the iCountThreadsNum field to SWelsSvcCodingParam There is no point in the user setting this field, it's only used as an internal field within the encoder.	2014-03-07 14:48:38 +02:00
Martin Storsjö	dbc324d5bb	Use the iMultipleThreadIdc field from SEncParamExt	2014-03-07 14:47:43 +02:00
Martin Storsjö	5b8ee37162	Merge WelsThreadDestroy into WelsThreadJoin Now calling WelsThreadJoin is enough to finish and clean up the thread on all platforms. This unifies the thread cleanup code between windows and unix. Now all of the threading code should use the exact same codepaths between windows and unix.	2014-03-07 10:51:28 +02:00
Martin Storsjö	b4aa9be7de	Use WelsThreadJoin on windows as well This avoids using a separate event just for signalling that a thread has finished running.	2014-03-07 10:51:28 +02:00
Martin Storsjö	baaa38737e	Use pExitEncodeEvent instead of thread cancellation on unix as well This works now that we've got a suitably working implementation of WelsMultipleEventsWaitSingleBlocking.	2014-03-07 10:49:39 +02:00
volvet	38a3fada24	Merge pull request #435 from mstorsjo/threadlib-wait-single-unix Make WelsMultipleEventsWaitSingleBlocking usable on unix as well	2014-03-07 16:47:38 +08:00
Licai Guo	1b9aae8434	Merge pull request #439 from zhilwang/mc-arm-asm mv mc_neon.S to common,add MC arm code to encoder	2014-03-07 16:36:48 +08:00
ruil2	b3c45946ff	modify typing format	2014-03-07 16:29:12 +08:00
Licai Guo	e5f36822a9	Update targets.mk files	2014-03-07 16:22:59 +08:00
Licai Guo	d986c27b9d	remove mc_neon.S from encoder	2014-03-07 16:11:36 +08:00
ruil2	f0c6c2b318	Merge branch 'master' of https://github.com/cisco/openh264 into encoder_update	2014-03-07 15:59:23 +08:00
Licai Guo	71467f948a	mv mc_neon.S to common,add MC arm code to encoder	2014-03-07 12:18:58 +08:00
Licai Guo	a4cecd8004	Merge pull request #426 from volvet/simplify-layer-process simplify-layer-process	2014-03-07 10:58:28 +08:00
volvet	14f5518e6a	Merge pull request #437 from mstorsjo/fix-arm-encoder-android Fix building arm encoder assembly for android	2014-03-07 10:41:34 +08:00
ruil2	594fc4fe7b	dump file refactor	2014-03-07 10:23:25 +08:00
Martin Storsjö	c0043f7053	Use the three-operand form of add/sub with shift When using unified syntax, the two operand form with a shift isn't allowed.	2014-03-06 16:21:54 +02:00
Martin Storsjö	f1502c26e3	Don't use WELS_ASM_FUNC_END in the middle of a function WELS_ASM_FUNC_END declares the end of the function, and needs to be paired with WELS_ASM_FUNC_BEGIN.	2014-03-06 16:21:54 +02:00
Martin Storsjö	4e4bfcc1bc	Regenerate makefiles to include the encoder arm assembly	2014-03-06 16:11:54 +02:00
Martin Storsjö	ce4fa9e272	Correct the endif comment The code block is about HAVE_NEON, not X86_ASM.	2014-03-06 15:43:04 +02:00
Martin Storsjö	636df2bebb	Use WelsMultipleEventsWaitSingleBlocking within the worker thread on unix as well This avoids using a separate thread for handling pUpdateMbListEvent events, and later allowing using the encode exit event on unix instead of pthread cancellation.	2014-03-06 15:34:35 +02:00
Martin Storsjö	801da26d1d	Use WelsMultipleEventsWaitSingleBlocking with a master event for waiting on finished threads This allows using the same codepath for both unix and windows for distributing new slices to code to threads. This also improves the performance on unix - instead of waiting for all the current threads to finish their current slice before handing out a new slice to each of them (where the threads that finish first will just wait instead of immediately getting a new slice to work on), we now use the same logic as on windows. In one setup, it improves the performance of encoding from ~920 fps to ~950 fps, and in another setup it goes from ~390 fps to ~660 fps. (These tests were done with the SM_ROWMB_SLICE mode, which heavily exercises the code for distributing new slices to the worker threads.) The extra WelsEventSignal call on windows where it isn't strictly necessary doesn't incur any measurable slowdown, so it is kept without any extra ifdefs to keep the code more readable and unified.	2014-03-06 15:33:37 +02:00
Martin Storsjö	de32455d87	Remove the timeout parameter from WelsMultipleEventsWaitSingleBlocking All users of the function passed the value corresponding to "infinite", and the (currently unused) unix implementation of it only supported infinite wait as well.	2014-03-06 15:03:59 +02:00
volvet	8cc332dea1	Merge pull request #432 from zhilwang/arm-asm Arm asm	2014-03-06 16:50:56 +08:00
volvet	73452e0993	Merge pull request #429 from mstorsjo/simplify-ifdef-with-macro Use a macro for conditionally logging based on ENABLE_TRACE_MT	2014-03-06 16:01:41 +08:00
Licai Guo	7bfe801874	Remove trailing space	2014-03-06 14:55:36 +08:00
Licai Guo	67534b0fc0	arm asm code refine.	2014-03-06 14:30:16 +08:00

316 Commits