generic-library/vpx

Author	SHA1	Message	Date
James Zern	10cb17aec0	Merge "runtime error fix: bitdepth_conversion_avx2.h"	2017-11-10 00:15:03 +00:00
Jerome Jiang	6246d8aa76	vp9: Fix mem rel for non-ref for external buffer. Release frame buffers for non-ref when the decoder is destroyed. Enable the non ref test. BUG=b/68819248 Change-Id: Id87ef3b0a62318f9812e927cd957c05c859047fa	2017-11-09 15:47:21 -08:00
Jerome Jiang	0665b09661	Merge "vp9: SVC feature to use partition from lower resolution."	2017-11-09 23:28:44 +00:00
Jerome Jiang	fdb054a05d	vp9: SVC feature to use partition from lower resolution. For SVC with 3 spatial layers: Add feature to copy/upscale partition from middle spatial layer to the upper/highest resolution, when superblock sad is not high. Enabled for speed >= 7 and only for non-reference frames. Speedup ~3-4%, small loss in avgPSNR/SSIM of ~1%. Change-Id: I7f0a2716c0fde28bade0f86159d11b7e31d6ab8d	2017-11-09 14:16:50 -08:00
Scott LaVarnway	2387024f41	runtime error fix: bitdepth_conversion_avx2.h Change-Id: I7364a157de39eb7137b599808474b8d46d19d376	2017-11-09 12:26:43 -08:00
Johann Koenig	bdb8b3ad86	Merge "fail early on oversize frames"	2017-11-09 19:50:04 +00:00
Scott LaVarnway	62ab5e99c1	vpx: [x86] add vp9_block_error_fp_avx2() SSE2 asm vs AVX2 intrinsics speed gains: blocksize 16: ~1.00 blocksize 64: ~1.17 blocksize 256: ~1.67 blocksize 1024: ~1.81 Change-Id: I2a86db239cf57e3ff617890ccb2d236aba83ad5e	2017-11-09 05:02:31 -08:00
paulwilkins	d6e29868ac	Fix to frames considered in arf boost calculation. For a chosen interval "i" the existing arf boost calculation examined frames +/- (i-1) frames from the current location in the second pass. This change checks to make sure that the forward search does not extend beyond the next key frame in the event that the distance to the next key frame is < (i - 1). Small metrics gains on all our test sets but these are localized to a few clips (e.g. midres set psnr-hvs sintel -2.59% but overall average was only -0.185%) Change-Id: I26fc9ce582b6d58fa1113a238395e12ad3123cf6	2017-11-09 10:46:10 +00:00
Jerome Jiang	adbb4c4d32	Merge "vp9: Add nonref frame buffer test."	2017-11-09 04:41:10 +00:00
Jerome Jiang	a68bbcff29	vp9: Add nonref frame buffer test. The new test will run a SVC bitstream which has non ref frames. It checks the number of buffers acquired and released to make sure all external frame buffers are released. Add a new test bitstream: vp90-2-22-svc_1280x720_1.webm which has 400 frames in total, with 1 spatial layer and 2 temporal layers. There is one non ref frame every other frame. Disabled for now. Will be enabled with the fix. BUG=b/68819248 Change-Id: I0515336fd9809a9e1fceba90e4dce53dabaf53a5	2017-11-08 18:41:33 -08:00
Johann Koenig	cf8039c25f	Merge "Support building AVX-512 and implement sadx4 for AVX-512"	2017-11-08 16:28:40 +00:00
paulwilkins	93e83fd7cf	CVBR command line option. Added command line control of Corpus VBR. The new corpus vbr mode is a variant of standard VBR (end-usage=0) where the complexity distribution mid point is passed in rather than calculated for a specific clip or chunk. The new variant is enabled by setting a new command line parameter --corpus-complexity to a non-zero value. Omitting this parameter or setting it to 0 will cause the codec to use standard vbr mode. The correct value for a given corpus needs to be derived experimentally using a training set such that the average rate for the corpus is close to the target value. For example, using our low res test set with upper and lower vbr limits of 50%-150% and a corpus complexity value of 650 gives a similar average data rate across the set to using standard vbr. However, with the corpus mode easier clips will be allocated fewer bits and harder clips more bits rather than having the same rate target for all. Change-Id: I03f0fc8c6fb0ee32dc03720fea6a3f1949118589	2017-11-08 10:41:04 +00:00
Marco	6fbc354c97	Nonrd_pickmode: avoid computing UV cost when early_term is set. For nonrd_pickmode: if early_term is set there should be no need to include UV in rdcost (when color_sensitivity is set). Neutral change on RTC and RTC_derf metrics, for speed >= 5. No change for ytlive metrics. Very small speed gain (~0.5%) on some clips with strong color content. Change-Id: Ifc00928ecd935fc71e94935ceef0ae7481249f07	2017-11-06 10:22:14 -08:00
Kyle Siefring	b383a17fa4	Support building AVX-512 and implement sadx4 for AVX-512 The added AVX-512 support requires the subset of AVX-512 added in Skylake-X. Change-Id: I39666b00d10bf96d06c709823663eb09b89265b7	2017-11-03 13:37:23 -04:00
Marco	eb7d431cb5	Compound prediction mode for nonrd pickmode. Allow for compound prediction mode in nonrd_pickmode for ZEROMV. For real-time encoding, 1 pass with non-zero lag-in-frames. Added speed feature to control the feature. Enabled for speed >=6 for now, under VBR mode. avgPSNR/SSIM metrics positive on ytlive set, for speed 6: some clips up by ~3-5%, some clips neutral gain, average gain across clips is ~1%. Small/negligible decrease in speed. Change-Id: I7a60c7596e69b9a928410c5ee2f9141eecd8613d	2017-11-03 10:13:05 -07:00
Johann	5fe82459ec	fail early on oversize frames Even though frame_size is calculated in uint64_t, it winds up in an int size value. This was exposed with the msan test because the memset is called with (int)frame_size, leading to a segfault. Change-Id: I7fd930360dca274adb8f3e43e5e6785204808861	2017-11-03 09:49:13 -07:00
Jerome Jiang	3ba9a2c8b2	Merge "vp9: Move allocation of vt2 after early exits."	2017-11-01 16:58:01 +00:00
Jerome Jiang	34805d6d0d	vp9: Move allocation of vt2 after early exits. Remove the memory deallocation on the early exits. Change-Id: I00b4a814ae6705105ecab89644d055ca3311d9f4	2017-10-31 17:04:04 -07:00
Jerome Jiang	0c84b9b703	Merge "vp9: Reduce stack usage of choose_partitioning."	2017-10-31 21:42:18 +00:00
Jerome Jiang	18b470f486	vp9: Reduce stack usage of choose_partitioning. Move vt2 to heap. Reduce the stack usage from ~87K to ~44K. BUG=b/68362457 Change-Id: I8f5f93712934d59a8cc4564378172d409a736a2e	2017-10-31 13:10:27 -07:00
Jerome Jiang	c77822615e	Merge "vp9: Reduce stack usage of choose_partitioning."	2017-10-30 23:39:41 +00:00
Jerome Jiang	cc47231187	vp9: Reduce stack usage of choose_partitioning. Change type of sum_square_error from int64_t to uint32_t. Change type of sum_error from int64_t to int32_t. This reduces the stack usage from ~131K to ~87K. BUG=b/68362457 Change-Id: I147d7c7b226bceb4f0817bb86848e1fa9d9ac149	2017-10-30 13:53:20 -07:00
James Zern	acb9460929	vp8: correct if/else '{' placement swap '{' and c-style comments removing a few redundant ones along the way; covers most leftovers from the clang-tidy run against an x86_64-linux config. Change-Id: I67a45596f80a12389faca49c5be440875092a7df	2017-10-27 12:27:10 -07:00
Scott LaVarnway	3bf02ad74a	vpx: hadamard: use ptrdiff_t instead of int for stride Eliminates the following instruction for the x86 (64 bit) intrinsic code: movslq %esi,%rax Change-Id: I8f5ebd40726f998708a668b0f52ea7a0576befae	2017-10-26 11:41:48 -07:00
Kyle Siefring	037e596f04	Merge "Optimize convolve8 SSSE3 and AVX2 intrinsics"	2017-10-24 19:22:36 +00:00
Kyle Siefring	ae35425ae6	Optimize convolve8 SSSE3 and AVX2 intrinsics Changed the intrinsics to perform summation similar to the way the assembly does. The new code diverges from the assembly by preferring unsaturated additions. Results for Haswell (Horiz/Vert size: speedup). SSSE3: Horiz x4 ~32%, Horiz x8 ~6%, Vert x8 ~4%. AVX2: Horiz x16 ~16%, Vert x16 ~14%. BUG=webm:1471 Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668	2017-10-24 10:39:48 -04:00
Scott LaVarnway	e0aa6b24aa	Merge "vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix"	2017-10-23 22:02:59 +00:00
Marco	0738d90169	vp9-svc: Allow for adapt_rd_thresh with row-mt. Set adaptive_row_thresh_mt = 1 at speed >= 7, for svc when multi-threading is used with row-mt. This allows the adaptive_rd_thresh feature to be used in the nonrd-pickmode. ~1-2% speedup for SVC encoding with small quality loss (< 0.6%) on RTC set. Change-Id: Iab9878dff117bccdaef3e4d0645165db9808cdfc	2017-10-23 11:47:18 -07:00
Scott LaVarnway	512bf4e029	vpx: [x86] vpx_hadamard_16x16_avx2() highbitdepth fix Use an intermediate buffer before storing to coeffs when highbitdepth is enabled. Change-Id: I101981a1995f1108ad107c55c37d6e09eadb404b	2017-10-23 08:49:32 -07:00
Scott LaVarnway	4906cea027	vpx: [x86] vpx_hadamard_16x16_avx2() improvements ~10% performance gain. Fixed the cosmetics noted in the previous commit. Change-Id: Iddf475f34d0d0a3e356b2143682aeabac459ed13	2017-10-20 08:55:06 -07:00
Scott LaVarnway	b58259ab55	Merge "vpx: [x86] add vpx_hadamard_16x16_avx2()"	2017-10-19 23:32:10 +00:00
Paul Wilkins	199971d606	Merge "Corpus VBR tweak for undershoot."	2017-10-19 10:07:45 +00:00
Paul Wilkins	0c493cbe2b	Merge "Increase precision of some debug stats output for corpus VBR."	2017-10-19 10:07:30 +00:00
Paul Wilkins	d8c34a2552	Merge "Prevent double application of min rate in two pass."	2017-10-19 10:06:33 +00:00
Scott LaVarnway	55c126a5d7	vpx: [x86] add vpx_hadamard_16x16_avx2() This version is ~1.91x faster than the sse2 version. When highbitdepth is enabled, it is ~1.74x. Change-Id: I2b0e92ede9f55c6259ca07bf1f8c8a5d0d0955bd	2017-10-18 18:00:00 -07:00
Jerome Jiang	401e6d48bf	Merge "Add datarate test for vp8 ROI."	2017-10-18 19:39:26 +00:00
Jerome Jiang	bd6d82e881	Add datarate test for vp8 ROI. BUG=webm:1470 Change-Id: Icbc848837e64eacc49491dcc26b4c5802af2ee13	2017-10-18 11:19:59 -07:00
Jerome Jiang	ec2fced451	Merge "vp8: Enable use of ROI map."	2017-10-18 18:16:44 +00:00
Kyle Siefring	b3a36f7946	Merge "Refactor x86/vpx_subpixel_8t_intrin_avx2.c"	2017-10-18 16:19:52 +00:00
Shiyou Yin	df15220a89	Merge "vp8: [loongson] optimize idct with mmi"	2017-10-18 00:55:36 +00:00
Jerome Jiang	dbb8926b86	vp8: Enable use of ROI map. Disable cyclic refresh if ROI is used and add flag to properly handle the static_thresh deltas. Remove the ROI test for cyclic refresh (it's allowed but disabled if ROI is used). Add an example in vpx_temporal_svc_encoder.c. Turned off by default. BUG=webm:1470 Change-Id: Ief9ba1d7f967bc00511b412b491c3f70943bfbda	2017-10-17 15:23:03 -07:00
Linfeng Zhang	9336e01621	Merge changes I17fff122,Ic149e3cb. Changes: (1) Add 4 to 3 scaling SSSE3 optimization; (2) Test extreme inputs in frame scale functions	2017-10-17 16:03:29 +00:00
Linfeng Zhang	0d2e95193b	Merge "Generalize CheckScalingFiltering in ConvolveTest"	2017-10-17 16:03:07 +00:00
Kyle Siefring	55805e2786	Refactor x86/vpx_subpixel_8t_intrin_avx2.c Change-Id: I6539111dfb35a43028e9755785b2e9ea31854305	2017-10-17 11:57:40 -04:00
Shiyou Yin	577d4fa792	vp8: [loongson] optimize idct with mmi 1. vp8_dequant_idct_add_y_block_mmi 2. vp8_dequant_idct_add_uv_block_mmi Change-Id: I9987147be2685ac79d4b045d1d56f6709ee1223c	2017-10-17 03:27:31 +00:00
Linfeng Zhang	580d32240f	Add 4 to 3 scaling SSSE3 optimization Note this change will trigger the different C version on SSSE3 and generate different scaled output. Its speed is 2x compared with the version calling vpx_scaled_2d_ssse3(). Change-Id: I17fff122cd0a5ac8aa451d84daa606582da8e194	2017-10-16 15:42:42 -07:00
Marco	a9248457b1	Adjust threshold in gf_boost for 1 pass vbr Small increase to sad_thresh1, which avoids some false detection of possible scene changes within lag. Small improvement in a few clips on ytlive, otherwise neutral change. Change-Id: Ia79b53bb657bbce65a7aac7d20666b6373d5af8b	2017-10-13 15:33:51 -07:00
Paul Wilkins	12df840777	Merge "Further Corpus VBR change."	2017-10-13 15:59:58 +00:00
Paul Wilkins	eaa593d293	Merge "Corpus Wide VBR test implementation."	2017-10-13 15:59:45 +00:00
paulwilkins	8842ee0b0d	Corpus VBR tweak for undershoot. In cases of strong undershoot adjust Q range down faster. Change-Id: I84982beceb3c9b6dc50e52e4a6e891c7dd395d03	2017-10-13 10:27:15 +01:00
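
The "fail early on oversize frames" entry (5fe82459ec) describes a frame size that is computed in uint64_t but later stored in an int. A minimal sketch of that kind of guard follows. The helper name checked_frame_size and its parameters are hypothetical and chosen for illustration only; this is not the libvpx code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): compute a frame's byte size
 * in 64-bit arithmetic and fail early if it would not fit in an int.
 * Returns 0 and writes *out on success; returns -1 and leaves *out
 * untouched when the frame is oversize. */
static int checked_frame_size(uint32_t width, uint32_t height,
                              uint32_t bytes_per_pixel, int *out) {
  uint64_t frame_size;
  assert(out != NULL);

  /* Two 32-bit values multiplied as uint64_t cannot overflow. */
  frame_size = (uint64_t)width * height;
  if (frame_size > (uint64_t)INT_MAX) return -1;

  /* frame_size is now at most INT_MAX, so this product also fits in 64 bits. */
  frame_size *= bytes_per_pixel;
  if (frame_size > (uint64_t)INT_MAX) return -1;

  *out = (int)frame_size; /* Safe: the value was range-checked above. */
  return 0;
}
```

Rejecting the frame before the cast means a later call such as memset never receives a truncated or negative length.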


18232 Commits