generic-library/vpx

Author	SHA1	Message	Date
Marco	eb7d431cb5	Compound prediction mode for nonrd pickmode. Allow for compound prediction mode in nonrd_pickmode for ZEROMV. For real-time encoding, 1 pass with non-zero lag-in-frames. Added speed feature to control the feature. Enabled for speed >=6 for now, under VBR mode. avgPSNR/SSIM metrics positive on ytlive set, for speed 6: some clips up by ~3-5%, some clips neutral gain, average gain across clips is ~1%. Small/negligible decrease in speed. Change-Id: I7a60c7596e69b9a928410c5ee2f9141eecd8613d	2017-11-03 10:13:05 -07:00
Johann	5fe82459ec	fail early on oversize frames Even though frame_size is calculated in uint64_t, it winds up in an int size value. This was exposed with the msan test because the memset is called with (int)frame_size, leading to a segfault. Change-Id: I7fd930360dca274adb8f3e43e5e6785204808861	2017-11-03 09:49:13 -07:00
Jerome Jiang	3ba9a2c8b2	Merge "vp9: Move allocation of vt2 after early exits."	2017-11-01 16:58:01 +00:00
Jerome Jiang	34805d6d0d	vp9: Move allocation of vt2 after early exits. Remove the memory deallocation on the early exits. Change-Id: I00b4a814ae6705105ecab89644d055ca3311d9f4	2017-10-31 17:04:04 -07:00
Jerome Jiang	0c84b9b703	Merge "vp9: Reduce stack usage of choose_partitioning."	2017-10-31 21:42:18 +00:00
Jerome Jiang	18b470f486	vp9: Reduce stack usage of choose_partitioning. Move vt2 to heap. Reduce the stack usage from ~87K to ~44K. BUG=b/68362457 Change-Id: I8f5f93712934d59a8cc4564378172d409a736a2e	2017-10-31 13:10:27 -07:00
Jerome Jiang	c77822615e	Merge "vp9: Reduce stack usage of choose_partitioning."	2017-10-30 23:39:41 +00:00
Jerome Jiang	cc47231187	vp9: Reduce stack usage of choose_partitioning. Change type of sum_square_error from int64_t to uint32_t. Change type of sum_error from int64_t to int32_t. This reduces the stack usage from ~131K to ~87K. BUG=b/68362457 Change-Id: I147d7c7b226bceb4f0817bb86848e1fa9d9ac149	2017-10-30 13:53:20 -07:00
James Zern	acb9460929	vp8: correct if/else '{' placement swap '{' and c-style comments removing a few redundant ones along the way; covers most leftovers from the clang-tidy run against an x86_64-linux config. Change-Id: I67a45596f80a12389faca49c5be440875092a7df	2017-10-27 12:27:10 -07:00
Scott LaVarnway	3bf02ad74a	vpx: hadamard: use ptrdiff_t instead of int for stride Eliminates the following instruction for the x86 (64 bit) intrinsic code: movslq %esi,%rax Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae	2017-10-26 11:41:48 -07:00
Kyle Siefring	037e596f04	Merge "Optimize convolve8 SSSE3 and AVX2 intrinsics"	2017-10-24 19:22:36 +00:00
Kyle Siefring	ae35425ae6	Optimize convolve8 SSSE3 and AVX2 intrinsics Changed the intrinsics to perform summation similar to the way the assembly does. The new code diverges from the assembly by preferring unsaturated additions. Results for haswell SSSE3 Horiz/Vert Size Speedup Horiz x4 ~32% Horiz x8 ~6% Vert x8 ~4% AVX2 Horiz/Vert Size Speedup Horiz x16 ~16% Vert x16 ~14% BUG=webm:1471 Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668	2017-10-24 10:39:48 -04:00
Scott LaVarnway	e0aa6b24aa	Merge "vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix"	2017-10-23 22:02:59 +00:00
Marco	0738d90169	vp9-svc: Allow for adapt_rd_thresh with row-mt. Set adaptive_row_thresh_mt = 1 at speed >= 7, for svc when multi-threading is used with row-mt. This allows the adaptive_rd_thresh feature to be used in the nonrd-pickmode. ~1-2% speedup for SVC encoding with small quality loss (< 0.6%) on RTC set. Change-Id: Iab9878dff117bccdaef3e4d0645165db9808cdfc	2017-10-23 11:47:18 -07:00
Scott LaVarnway	512bf4e029	vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix Use an intermediate buffer before storing to coeffs when highbitdepth is enabled. Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b	2017-10-23 08:49:32 -07:00
Scott LaVarnway	4906cea027	vpx: [x86] vpx_hadamard_16x16_avx2() improvements ~10% performance gain. Fixed the cosmetics noted in the previous commit. Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13	2017-10-20 08:55:06 -07:00
Scott LaVarnway	b58259ab55	Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()"	2017-10-19 23:32:10 +00:00
Paul Wilkins	199971d606	Merge "Corpus VBR tweak for undershoot."	2017-10-19 10:07:45 +00:00
Paul Wilkins	0c493cbe2b	Merge "Increase precision of some debug stats output for corpus VBR."	2017-10-19 10:07:30 +00:00
Paul Wilkins	d8c34a2552	Merge "Prevent double application of min rate in two pass."	2017-10-19 10:06:33 +00:00
Scott LaVarnway	55c126a5d7	vpx: [x86] add vpx_hadamard_16x16_avx2() This version is ~1.91x faster than the sse2 version. When highbitdepth is enabled, it is ~1.74x. Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd	2017-10-18 18:00:00 -07:00
Jerome Jiang	401e6d48bf	Merge "Add datarate test for vp8 ROI."	2017-10-18 19:39:26 +00:00
Jerome Jiang	bd6d82e881	Add datarate test for vp8 ROI. BUG=webm:1470 Change-Id: Icbc848837e64eacc49491dcc26b4c5802af2ee13	2017-10-18 11:19:59 -07:00
Jerome Jiang	ec2fced451	Merge "vp8: Enable use of ROI map."	2017-10-18 18:16:44 +00:00
Kyle Siefring	b3a36f7946	Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c"	2017-10-18 16:19:52 +00:00
Shiyou Yin	df15220a89	Merge "vp8: [loongson] optimize idct with mmi"	2017-10-18 00:55:36 +00:00
Jerome Jiang	dbb8926b86	vp8: Enable use of ROI map. Disable cyclic refresh if ROI is used and add flag to properly handle the static_thresh deltas. Remove the ROI test for cyclic refresh (it's allowed but disabled if ROI is used). Add an example in vpx_temporal_svc_encoder.c. Turned off by default. BUG=webm:1470 Change-Id: Ief9ba1d7f967bc00511b412b491c3f70943bfbda	2017-10-17 15:23:03 -07:00
Linfeng Zhang	9336e01621	Merge changes I17fff122,Ic149e3cb * changes: Add 4 to 3 scaling SSSE3 optimization Test extreme inputs in frame scale functions	2017-10-17 16:03:29 +00:00
Linfeng Zhang	0d2e95193b	Merge "Generalize CheckScalingFiltering in ConvolveTest"	2017-10-17 16:03:07 +00:00
Kyle Siefring	55805e2786	Refactor x86/vpx_subpixel_8t_intrin_avx2.c Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305	2017-10-17 11:57:40 -04:00
Shiyou Yin	577d4fa792	vp8: [loongson] optimize idct with mmi 1. vp8_dequant_idct_add_y_block_mmi 2. vp8_dequant_idct_add_uv_block_mmi Change-Id: I9987147be2685ac79d4b045d1d56f6709ee1223c	2017-10-17 03:27:31 +00:00
Linfeng Zhang	580d32240f	Add 4 to 3 scaling SSSE3 optimization Note this change will trigger the different C version on SSSE3 and generate different scaled output. Its speed is 2x compared with the version calling vpx_scaled_2d_ssse3(). Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194	2017-10-16 15:42:42 -07:00
Marco	a9248457b1	Adjust threshold in gf_boost for 1 pass vbr Small increase to sad_thresh1, which avoids some false detection of possible scene changes within lag. Small improvement in a few clips on ytlive, otherwise neutral change. Change-Id: Ia79b53bb657bbce65a7aac7d20666b6373d5af8b	2017-10-13 15:33:51 -07:00
Paul Wilkins	12df840777	Merge "Further Corpus VBR change."	2017-10-13 15:59:58 +00:00
Paul Wilkins	eaa593d293	Merge "Corpus Wide VBR test implementation."	2017-10-13 15:59:45 +00:00
paulwilkins	8842ee0b0d	Corpus VBR tweak for undershoot. In cases of strong undershoot adjust Q range down faster. Change-Id: I84982beceb3c9b6dc50e52e4a6e891c7dd395d03	2017-10-13 10:27:15 +01:00
Shiyou Yin	3e2770de4f	Merge "vp8: [loongson] optimize dct with mmi"	2017-10-13 00:37:57 +00:00
Marco Paniconi	28d1c0535d	Merge "Adjust to scene detection for 1 pass vbr."	2017-10-12 19:36:33 +00:00
Marco	a673b4f4af	Adjust to scene detection for 1 pass vbr. Expose the threshold for setting key frame on cut, and increase it for speed 5. Also small adjustment to min_thresh. No change in overall metrics or fps. Small quality improvement and lower encode time on scene cuts. Change-Id: I36e06ff3b26b6c29aede39c23fce454525fc9026	2017-10-12 10:59:23 -07:00
Jerome Jiang	175b36cb6d	Merge "vp9: use nonrd pick_intra for small blocks on keyframes."	2017-10-12 17:29:27 +00:00
Kyle Siefring	caa116c9be	Merge changes I38783d97,If5160c0c * changes: Extend 16 wide AVX2 convolve8 code to support averaging. Add AVX2 version of vpx_convolve8_avg.	2017-10-12 16:12:38 +00:00
paulwilkins	2b247ae91c	Increase precision of some debug stats output for corpus VBR. Change-Id: I75841797cc0c215781b5b36e3a3e9f4b0e35ba63	2017-10-12 10:07:21 +01:00
Jerome Jiang	288890cd43	vp9: use nonrd pick_intra for small blocks on keyframes. Keyframe encoding is more than 2x faster. Disabled on Speed 8. Change-Id: I2157318b6ac8253fa5398322c72d98cd7fa9b2b6	2017-10-11 21:38:01 -07:00
Shiyou Yin	f70de09f2a	vp8: [loongson] optimize dct with mmi 1. vp8_short_fdct4x4_mmi 2. vp8_short_fdct8x4_mmi 3. vp8_short_walsh4x4_mmi Change-Id: I89a7df25cfd09fae309fac257ad8b6a3dc1c8acb	2017-10-12 08:50:04 +08:00
Shiyou Yin	bc4098a8e9	Merge "vp8: [loongson] optimize quantize with mmi"	2017-10-12 00:33:17 +00:00
Marco	72c69e14ad	Adjust threshold in datarate tests for 1 pass VBR Small increase in threshold for the 1 pass VBR datarate tests. Needed due to commit: <017257a Adjustment to scene detection and key frame> Change-Id: I28b3bd7db2192a8cc2bccc3cb0e3b8dbb910ca16	2017-10-11 11:48:36 -07:00
Linfeng Zhang	1fa3ec3023	Test extreme inputs in frame scale functions Change-Id: Ic149e3cb59be2ee0f98a3fcfd83226ad5ea30c99	2017-10-11 11:35:19 -07:00
paulwilkins	416b7051d7	Prevent double application of min rate in two pass. The initial allocation of bits in the two pass code to each frame should be within the min/max limits on the command line. However, when forming an ARF group the cost of the ARF is shared by frames in that group such that the residual bits for a frame could drop below the min value. This change prevents the minimum being re-applied after the cost of the ARF has been deducted, as this may otherwise cause low rate sections to overshoot their target. Test runs comparing to a baseline run with min and max section pct 0-2000% vs one closer to the YT use case (50-150%) suggest that this fix not only results in better rate control but also gives a better rd outcome. For example the HD set vs 0-2000% baseline (opsnr, ssim). Old code (50-150): +0.751, +1.099 New code (50-150): +0.241, -0.009 Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef	2017-10-11 18:00:44 +01:00
Shiyou Yin	e8ed2bb762	vp8: [loongson] optimize quantize with mmi 1. vp8_fast_quantize_b_mmi 2. vp8_regular_quantize_b_mmi Change-Id: Ic6e21593075f92c1004acd67184602d2aa5d5646	2017-10-11 16:45:58 +08:00
Linfeng Zhang	16166bfdaa	Add 4 to 1 scaling x86 optimization Change-Id: I51c190f0a88685867df36912522e67bdae58a673	2017-10-10 16:24:06 -07:00
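
Commit 5fe82459ec above ("fail early on oversize frames") describes a frame size computed in uint64_t that later ends up in an int. The following is a minimal sketch of that kind of guard, not the actual libvpx change: the helper names (frame_size_fits_int, yv12_frame_size, clear_frame) and the 8-bit 4:2:0 layout are assumptions made for illustration only.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a 64-bit frame size can be stored in an
 * int without truncation, 0 otherwise. */
static int frame_size_fits_int(uint64_t frame_size) {
  return frame_size <= (uint64_t)INT_MAX;
}

/* Hypothetical helper: size in bytes of an 8-bit 4:2:0 frame, computed
 * entirely in 64-bit arithmetic so the intermediate products cannot wrap. */
static uint64_t yv12_frame_size(uint32_t width, uint32_t height) {
  const uint64_t luma = (uint64_t)width * height;
  const uint64_t chroma_w = ((uint64_t)width + 1) / 2;
  const uint64_t chroma_h = ((uint64_t)height + 1) / 2;
  return luma + 2 * chroma_w * chroma_h;
}

/* Hypothetical caller: zeroes a frame buffer, but rejects oversize frames
 * before the size is ever narrowed. Returns 0 on success, -1 if oversize. */
static int clear_frame(uint8_t *buf, uint32_t width, uint32_t height) {
  const uint64_t frame_size = yv12_frame_size(width, height);
  if (!frame_size_fits_int(frame_size)) return -1; /* fail early */
  memset(buf, 0, (size_t)frame_size);
  return 0;
}
```

In this sketch a 65536x65536 request is rejected before memset is reached, because its size no longer fits in an int. A 4x4 frame (24 bytes) is cleared normally.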


18218 Commits