openh264

Author	SHA1	Message	Date
sijchen	ffb85046b4	Refactoring: Wrap all the operations related to eSpsPpsIdStrategy to class, to improve code readability	2016-05-04 15:06:02 -07:00
HaiboZhu	c30cc41261	Merge pull request #2448 from saamas/encoder-getnonzerocount-sse42 [Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount	2016-05-04 09:49:47 +08:00
ruil2	e9dc97803d	Merge pull request #2447 from saamas/encoder-cavlcparamcal-sse42 [Encoder] Add an SSE4.2 implementation of CavlcParamCal	2016-04-28 09:08:44 +08:00
Sindre Aamås	4645bd26aa	[Encoder] Add an SSE4.2 implementation of WelsGetNonZeroCount Avoid touching some cache lines by using popcnt instead of table lookups. Also gives a speedup of ~1.4x on Haswell as compared with SSE2.	2016-04-20 19:10:24 +02:00
Sindre Aamås	d906dda224	[UT] Improve GetNonZeroCount tests Reduce duplication. Test more combinations. Always test boundary cases.	2016-04-20 19:10:24 +02:00
Sindre Aamås	3f31aff4dc	[Encoder] Add an SSE4.2 implementation of CavlcParamCal Use a combination of table lookups and pshufb to convert coefficients to zero run/level format. Two 16-entry lookup tables are used for a total of 192 bytes worth of tables. (The existing SSE2 version uses a table of size 2048 bytes.) Speedup is ~1.5x-3x as compared with the SSE2 version on Haswell (the speedup is greater for input with many trailing zeros). The use of popcnt makes it require SSE4.2. This can be replaced with a small LUT and accumulation which would reduce the requirement to SSSE3.	2016-04-20 18:37:08 +02:00
Sindre Aamås	502b16925e	[UT] Add tests for CavlcParamCal_c and CavlcParamCal_sse2	2016-04-20 18:37:08 +02:00
Sindre Aamås	bb49e23719	[Encoder] Add AVX2 4x4 quantization routines WelsQuantFour4x4Max_avx2 (~2.06x speedup over SSE2) WelsQuantFour4x4_avx2 (~2.32x speedup over SSE2) WelsQuant4x4Dc_avx2 (~1.49x speedup over SSE2) WelsQuant4x4_avx2 (~1.42x speedup over SSE2)	2016-04-13 11:56:47 +02:00
Sindre Aamås	1e83bec860	[UT] Add some missing quantization tests	2016-04-13 11:56:44 +02:00
Sindre Aamås	abaf3a4104	[UT] Reduce duplication in quantization tests	2016-04-13 08:59:16 +02:00
Sindre Aamås	48a520915a	[Encoder/x86] Add AVX2 SATD routines WelsSampleSatd16x16_avx2 (~2.31x speedup over SSE4.1 on Haswell). WelsSampleSatd16x8_avx2 (~2.19x speedup over SSE4.1 on Haswell). WelsSampleSatd8x16_avx2 (~1.68x speedup over SSE4.1 on Haswell). WelsSampleSatd8x8_avx2 (~1.53x speedup over SSE4.1 on Haswell).	2016-03-08 11:31:17 +01:00
Gregory J. Wolfe	03890fe86f	Added support for "video signal type present" information. The "Video signal type present" information is written to the output video file when it is created, and later is used by the decoder to properly decode the compressed video data. The saved attributes are: - format type (PAL, NTSC, etc.) - color primaries (BT709, SMPTE170M, etc.) - transfer characteristics (BT709, SMPTE170M, etc.) - color matrix (BT709, SMPTE170M, etc.) These modifications allow the client to specify these attributes and, if specified, make sure they are written to the output file.	2016-02-24 10:33:18 -05:00
Gregory J. Wolfe	c7fcba06c7	Added support for "video signal type present" information. The "Video signal type present" information is written to the output video file when it is created, and later is used by the decoder to properly decode the compressed video data. The saved attributes are: - format type (PAL, NTSC, etc.) - color primaries (BT709, SMPTE170M, etc.) - transfer characteristics (BT709, SMPTE170M, etc.) - color matrix (BT709, SMPTE170M, etc.) These modifications allow the client to specify these attributes and, if specified, make sure they are written to the output file.	2016-02-23 13:21:06 -05:00
sijchen	aaa25160ec	Merge pull request #2353 from saamas/encoder-x86-dct-opt2 [Encoder] x86 DCT optimizations	2016-02-08 15:00:12 -08:00
Sindre Aamås	c8c74903f8	[Encoder] Add single-block AVX2 4x4 DCT/IDCT routines We do four blocks at a time when possible, but need to handle single blocks at a time for intra prediction. ~3.15x speedup over MMX for the DCT on Haswell. ~2.94x speedup over MMX for the IDCT on Haswell. Returns diminish with increasing vector length because a larger proportion of the time is spent on load/store/shuffling.	2016-02-02 17:22:49 +01:00
Sindre Aamås	f90960983c	[Encoder] Add single-block SSE2 4x4 DCT/IDCT routines We do four blocks at a time when possible, but need to handle single blocks at a time for intra prediction. ~2.31x speedup over MMX for the DCT on Haswell. ~1.92x speedup over MMX for the IDCT on Haswell.	2016-02-02 17:22:48 +01:00
unknown	3873addc3d	fix frame size constraints for width and height	2016-02-01 15:55:53 +08:00
Sindre Aamås	cc8d541432	[UT] Utilize DCT function pointer typedefs	2016-01-19 22:00:24 +01:00
Sindre Aamås	a45c10cf91	[UT] Only run AVX2 tests if host supports AVX2	2016-01-19 14:27:46 +01:00
Sindre Aamås	3088d96978	[Encoder] Add an AVX2 4x4 IDCT implementation ~2.03x faster on Haswell as compared to the SSE2 version.	2016-01-19 13:12:28 +01:00
Sindre Aamås	b267163f10	[Encoder] Add an AVX2 4x4 DCT implementation ~2.52x faster on Haswell as compared to the SSE2 version.	2016-01-19 13:12:28 +01:00
Sindre Aamås	b9adbcf37c	[UT] Add missing SSE2 4x4 IDCT test IDCT input is defined in such a way that the intermediate values cannot legally overflow an int16_t. The use of random values as input causes such overflows. This results in implementation- dependent output depending on which type is used to hold intermediate results. Use a template for the test reference implementation to test implementations with different intermediate representation.	2016-01-19 13:12:28 +01:00
Sindre Aamås	8764231784	[UT] Improve DCT tests Initialize input arrays with different random values. Otherwise, the input to the DCT routines is effectively all zero values after taking the difference. Reduce duplication.	2016-01-19 13:12:28 +01:00
sijchen	aeb5ab4b99	[Encoder] put the logic related to multiple D layer into a class for better structure	2015-11-11 22:55:16 -08:00
sijchen	33c378f7b7	change API for slicing part for easier usage (the UseLoadBalancing flag is still a work in progress)	2015-11-10 09:50:06 -08:00
Sijia Chen	819f6f5d93	[Encoder] add encoder tasks and task-management class https://rbcommons.com/s/OpenH264/r/1334/	2015-10-19 22:48:28 -07:00
karina li	2c830e64d7	exception case where width or height is less than 16	2015-09-08 17:21:56 +08:00
Guangwei Wang	e42ce60cc9	add UT for sub8x8 modes assembly functions	2015-07-30 10:02:32 +08:00
Martin Storsjö	78e0ec6130	Convert tabs to spaces before comments	2015-06-10 10:22:29 +03:00
Martin Storsjö	764793d74b	Remove tabs in struct and class definitions	2015-06-10 10:22:01 +03:00
Martin Storsjö	ca51ee0f44	Remove tabs where a simple space is just enough	2015-06-10 10:21:52 +03:00
Martin Storsjö	51efa57a3d	Convert tabs to spaces in vertically aligned code	2015-06-10 10:21:29 +03:00
Martin Storsjö	723044837a	Convert tabs to spaces in defines	2015-06-10 10:21:25 +03:00
Martin Storsjö	ebbcb67fb7	Convert tabs to spaces in assignment of SIMD function pointers	2015-06-03 15:39:30 +03:00
Martin Storsjö	0298b3f580	Initialize enough samples in the new 4x8 tests This fixes valgrind warnings about tests using uninitialized data.	2015-06-03 09:45:06 +03:00
huili2	f76325edc7	Merge pull request #1973 from huili2/sub8 modify some functions extending to sub8x8 usage, especially in ME part	2015-06-02 14:44:06 +08:00
huili2	c3cfce5223	modify some functions extending to sub8x8 usage, especially in ME part	2015-06-02 13:39:38 +08:00
sijchen	5588e82fce	Merge pull request #1961 from mstorsjo/fix-warnings Remove a redundant check of this!=NULL	2015-06-01 10:42:56 +08:00
Martin Storsjö	1239bb24ba	Remove a redundant check of this!=NULL 'this' can't be NULL in well-defined C++ code. This fixes a warning with clang 3.6 from Xcode 6.3.	2015-05-27 11:46:53 +03:00
Sijia Chen	9442a7a0b5	add parameter checking on resolution and related UT	2015-05-26 15:41:47 +08:00
Martin Storsjö	b90eca78cd	Avoid endian assumptions in FillQpelLocationByFeatureValue_c These values are read as two separate 16 bit integers from an array in the FeatureSearchOne function, therefore we should also store them in a well-defined order. This fixes encoding of screen content on big endian; now the full testsuite passes on big endian.	2015-05-15 13:11:23 +03:00
Martin Storsjö	7a80c21526	Reformat tables without tabs	2015-05-13 22:06:58 +03:00
Haibo Zhu	61b82d28c4	Add framerate & spatialbitrate comparison for encoder UT	2015-05-05 18:53:50 -07:00
Martin Storsjö	8d34c68ad6	Add a missing newline at the end of a file Some tools (like git) complain if a file lacks a newline at the end of a file, and some editors will automatically readd it when editing such files.	2015-05-04 12:46:48 +03:00
Sijia Chen	1922b533f6	change the range of frame rate from 30 to 60	2015-04-16 12:45:43 +08:00
ruil2	cce966fbba	update bGapsInFrameNumValueAllowedFlag according to parameters setting	2015-03-18 13:44:03 +08:00
ruil2	7d055cae94	Merge pull request #1786 from sijchen/fix_over improve error logging in UT	2015-02-06 12:17:56 +08:00
sijchen	5fdd01ec0c	Merge pull request #1787 from mstorsjo/remove-stray-semicolon Remove accidental double semicolons	2015-02-02 18:15:02 +08:00
sijchen	e7a7a35611	Merge pull request #1779 from mstorsjo/share-memalign Move the memory allocation/deallocation routines to the common library	2015-02-02 18:14:55 +08:00
Martin Storsjö	a3063531c4	Remove accidental double semicolons	2015-02-02 09:20:35 +02:00

203 Commits