generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	7e89c102c4	vp9-highbitdepth -> vpx-highbitdepth Change-Id: I1e90cf7ab4bb02c0ef119b0bd1596771edefedff	2016-08-05 15:41:33 -07:00
Sarah Parker	166c3250a3	Add buf0, width, height fields to buf_2d These are needed for the warping function in the global motion experiment. Change-Id: Iaab176d0c0b90f6b938e2bac48b24c07e87e3cd9	2016-07-18 11:04:56 -07:00
Geza Lore	4c4f04ac11	Optimize and cleanup obmc predictor and rd search. Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up computing the obmc predictor. Clean up calc_target_weighted_pred. Encoder speedup: 1.3% Decoder speedup: 6.5% Change-Id: I0c774fe53d22399e92a10d1daf3af0010d88d2c5	2016-07-13 16:54:20 +00:00
Geza Lore	cd489264e1	Optimize and cleanup supertx predictor. Use vpx_blend_a64_hmask and vpx_blend_a64_vmask to speed up computing the supertx predictor. Decoder speedup of up to 4% has been observed. Change-Id: I255a5ba4cc24f78dc905d25b6e2f7fbafac13253	2016-07-11 18:14:21 +00:00
Geza Lore	bfa59b4a5f	Improve vpx_blend_* functions. - Made source buffers pointers to const. - Renamed vpx_blend_mask6b to vpx_blend_a64_mask. This is more indicative that the function does alpha blending. The 6, or 6b suffix was misleading, as the max mask value (64) does not fit into 6 bits. - Added VPX_BLEND_* macros to use when needing to blend scalars. - Use VPX_BLEND_A256 in combine_interintra to be more explicit about the operation being done. - Added versions of vpx_blend_a64_* which take 1D horizontal/vertical masks directly and apply them to all rows/columns (vpx_blend_a64_hmask and vpx_blend_a64_vmask). The SSE4.1 optimized horizontal version now falls back on the 2D version. This can be improved upon if it shows up high enough in a profile. - All vpx_blend_a64_* functions now support block sizes down to 1x1 (i.e. a single pixel). This is for usage convenience. The SSE4.1 optimized versions fall back on the C implementation if w <= 2 or h <= 2. This can again be improved if it becomes hot code. Change-Id: I13ab3835146ffafe3e1d74d8e9cf64a5abe4144d	2016-07-11 19:05:17 +01:00
Debargha Mukherjee	72ef6d7704	Refactor and clean up on blend_mask6 Change-Id: Ie9188471e7dc07ab9c95b22f258b1662e895c533	2016-07-08 15:02:57 -07:00
Geza Lore	fc28be3b23	Clean up build_wedge_inter_predictor_from_buf Change-Id: I715f8ffa3e81056a74ca8ac94793009afb781221	2016-07-07 13:12:57 +01:00
Geza Lore	135d663159	Reinstate "Optimize wedge partition selection." without tests. This reinstates commit efda2831e5f758b4f350679b5c55c0b9282449b0 without the tests and with fixes for 32 bit x86 builds. Change-Id: I34be4fe1e8a67686d26ba256fd7efe0eb6a569e8	2016-06-21 20:31:50 +01:00
Jingning Han	48f5125749	Merge "Fix enc/dec mismatch in non-420 settings" into nextgenv2	2016-06-14 21:54:08 +00:00
Jingning Han	a4ea8fd8b8	Fix enc/dec mismatch in non-420 settings This commit makes the dual filter experiment work with non-420 settings. It fixes unit test failure in EndToEndTestLarge. Change-Id: I04f7afdee78f91389d9ff72947efa152098af930	2016-06-14 00:21:48 +00:00
Debargha Mukherjee	902ee5060c	A crash fix for supertx / ext-inter combination. Change-Id: I9860376c98aa3b25f5bf86ed13d4a7631fa6b153	2016-06-13 13:57:30 -07:00
Debargha Mukherjee	81f8b3f31c	Merge "Some refactoring to support warped motion mode" into nextgenv2	2016-06-10 23:18:39 +00:00
Zoe Liu	1d1286bfb4	Fix one typo in the comment Change-Id: Ie98fd60426b18980ec85572f3cfc9ce0b97a5361	2016-06-10 15:58:30 -07:00
Debargha Mukherjee	03be30ba3e	Some refactoring to support warped motion mode Change-Id: I15d54a3ae48b2b33082668116792c6595bdb3ddb	2016-06-10 12:04:18 -07:00
Angie Chiang	95340fccb3	Revert "Optimize wedge partition selection." This reverts commit efda2831e5f758b4f350679b5c55c0b9282449b0. This commit causes segmentation fault at SSE2/SumSquares2DTest.RandomValues/0 Change-Id: I171937e4daf6f15323e8206418773deb03bd8c53	2016-06-09 19:17:37 -07:00
Debargha Mukherjee	d3180b8b97	Merge "Fix build failure happened in reconinter.c" into nextgenv2	2016-06-07 14:22:25 +00:00
Debargha Mukherjee	13155e7725	Merge "Optimize wedge partition selection." into nextgenv2	2016-06-07 09:50:13 +00:00
Angie Chiang	2250c6b07b	Fix build failure happened in reconinter.c Change-Id: Ifd5ed91e4e91238fb53a202c8d76c11fbb9ccf7c	2016-06-06 14:41:14 -07:00
Geza Lore	efda2831e5	Optimize wedge partition selection. We can optimize wedge partition selection by pre-computing the residuals of the 2 underlying predictors, and then blend these to compute the sse of the compound predictor, without actually having to compute and subtract the compound predictor. Similarly we can pre-compute a proxy array which we can use to cheaply check which mask sign would have lower sse. Details are in wedge_utils.c. Mathematically these are equivalence transformations, but due to the finite precision the encoder output will be perturbed, though on average this should make 0% difference. ext-inter gains about ~4.5% speedup. Change-Id: Ib2657c3209ae161b4090b58b4b6c392641bf2792	2016-06-06 14:43:10 +01:00
Debargha Mukherjee	cbf51c5ba0	Merge "Pre-compute and use contiguous wedge masks." into nextgenv2	2016-06-03 13:27:02 +00:00
Geza Lore	ab29978e9f	Pre-compute and use contiguous wedge masks. This is purely a refactoring patch and has no functional effect. Uses of these masks can be arranged such that all input blocks are contiguous in memory (stride == block width). In this case 1D versions of operations can be used. 1D vector operations have superior performance over 2D block equivalents as they are more processor cache friendly and they can do away with a second loop overhead. Change-Id: I2b76c9888aea2c857cc497e8a4b2841fd3dad54e	2016-06-03 00:16:22 -07:00
Geza Lore	888e90e823	Use standard rounding in combine_interintra. Use the same rounding method that is used throughout the codebase, where the halfway value is rounded up rather than down. Change-Id: I04e92850bc69a7d7a07b06e3d2ce97f6f2ada321	2016-06-02 16:26:05 +01:00
Geza Lore	2935b4db0e	Remove redundant memcpy from wedge predictor. Removing redundant calls to memcpy from build_wedge_inter_predictor_from_buf yields a net 4% encoder speedup with ext-inter only. The output is identical. Change-Id: If97d4e323a5c8aca90c84a25a72085e006b05446	2016-05-24 11:31:18 +01:00
Geza Lore	62b6331753	Pick up bit-depth from the right place Change-Id: Icbdb036d7927b77b84bd78e8348ec8b5be88df08	2016-05-24 11:08:23 +01:00
Debargha Mukherjee	fb65f9b54b	Merge "Add optimized vpx_blend_mask6" into nextgenv2	2016-05-23 23:43:52 +00:00
Geza Lore	a661bc87c4	Add optimized vpx_blend_mask6 This is to replace vp10/common/reconinter.c:build_masked_compound. Functionality is equivalent, but the interface is slightly more generic. Total encoder speedup with ext-inter: ~7.5% Change-Id: Iee18b83ae324ffc9c7f7dc16d4b2b06adb4d4305	2016-05-23 16:28:58 +01:00
Debargha Mukherjee	fa5022978d	Merge "Wedge refactoring to handle signs better" into nextgenv2	2016-05-20 23:19:39 +00:00
Debargha Mukherjee	e5de2ad632	Wedge refactoring to handle signs better Mostly refactoring. Handles signs better though results are more or less neutral. Change-Id: If499537c8f8da4f34d104ebfda072eb4c85fb12f	2016-05-20 14:12:52 -07:00
Jingning Han	0f513752a0	Rework sub8x8 chroma component inter predictor This commit makes the sub8x8 chroma component inter predictor operate at 2x2 block level. This allows one to use the actual motion vector associated with each individual pixel block. It improves compression performance: lowres 0.40%, midres 0.25%, hdres 0.15% Change-Id: Ia40e07cc7fde463dbf660018850e024932136c4f	2016-05-19 09:03:57 -07:00
Debargha Mukherjee	fb8ea1736b	Various wedge enhancements Increases number of wedges for smaller block and removes wedge coding mode for blocks larger than 32x32. Also adds various other enhancements for subsequent experimentation, including adding provision for multiple smoothing functions (though one is used currently), adds a speed feature that decides the sign for interinter wedges using a fast mechanism, and refactors wedge representations. lowres: -2.651% BDRATE Most of the gain is due to increase in codebook size for 8x8 - 16x16. Change-Id: I50669f558c8d0d45e5a6f70aca4385a185b58b5b	2016-05-16 12:41:47 -07:00
Debargha Mukherjee	81abbc203e	Adjust smoothing function for wedge to be sharper Improves performance by 0.2% lowres: -2.052% BDRATE Also increases precision of the shift parameters (for further investigation into different wedge shifts). Change-Id: I59fcab9baa002e52a6487ed8d617185840a678ed	2016-05-11 09:35:43 -07:00
Debargha Mukherjee	3fbe6e5e49	Merge "Wedge rd improvements" into nextgenv2	2016-05-10 20:34:00 +00:00
Debargha Mukherjee	447032eb32	Wedge rd improvements Improves speed by about 10-15% by combining y-only rd with modeling function in a better way. Also, coding efficiency improves by about 0.1% lowres: -1.805% BDRATE with ext-inter Change-Id: I6ef1f8942ec6806252f3fcf749ae4f30dffe42b1	2016-05-10 11:47:48 -07:00
Yaowu Xu	bf692e853d	Merge "Fix build without dual-filter" into nextgenv2	2016-05-10 18:10:44 +00:00
Geza Lore	559e8d8e50	Fix build without dual-filter Change-Id: I91946940c1540c9f935161da89155ed304055fda	2016-05-10 13:12:07 +01:00
Geza Lore	9ab9438fbb	Break tile row dependencies. When not using ext-tile, there were still dependencies between tile rows due to various tools (eg intra predictors) relying on the above row or above mode info, which can be in the above tile. This is now broken (the same way as it was when ext-tile is enabled) by fixing the appropriate predicates. Change-Id: I107dd0d8481775a792f14e05cfbbd761f16cdc1e	2016-05-10 13:09:47 +01:00
Geza Lore	e9d2e36264	Fix interintra predictor buffer overflow. When constructing the intra predictor for rectangular interintra blocks, the last row/column of the first square is copied back into the source image (which is the current reconstructed image buffer) before predicting the second square. The code used to use the height instead of width for vertical rectangles, and vice versa for horizontal rectangles, leading to overwriting the block on the right/below. This leads to an encode/decode mismatch if the right/below block is in a different tile and is encoded before the current block, which did happen with multi-threaded encoding tests. This is now fixed. Change-Id: I073a2a447a98b842b1394d72cc774a78cb296921	2016-05-10 09:53:29 +01:00
Jingning Han	0a91b2da26	Merge "Fix unit test failure due to ext-inter and dual filter" into nextgenv2	2016-05-09 23:54:07 +00:00
Jingning Han	1215793007	Fix unit test failure due to ext-inter and dual filter Make the inter predictor use the right filter type to avoid enc/dec mismatch. Change-Id: I2aa416d50450188ec2057dca3338fa258314e562	2016-05-09 16:41:57 +00:00
Jingning Han	9de916eb20	Fix dual filter type for high bit-depth This commit fixes the compiler error in high bit-depth inter predictor when dual filter type experiment is turned on. Change-Id: I404a76a246477f2fcffc38a3275007d5dfe229cd	2016-05-09 02:14:48 +00:00
Jingning Han	bd33326372	Dual prediction filter type for motion compensated reference Make the bit-stream level support per direction filter type coding for motion compensated reference. Change-Id: I61a2360b301075f6734cfd9711b7ae68f214174d	2016-05-07 03:03:04 +00:00
Yaowu Xu	f0c7e76717	Change to call build_masked_compound_highbd() from combine_interintra_highbd(). This fixes a crash in encoder in highbitdepth build. Change-Id: I0aa4cc30200703ff21e9990163bb26ace41aabbc	2016-05-04 15:58:15 -07:00
Yaowu Xu	357c5387d7	Remove the use of non-declared "plane" The variable is not defined, it is not needed by the called function either. Change-Id: Ia601c03231afc0ae68a10ae1f35e8fc4121c3d28	2016-05-04 12:39:37 -07:00
Debargha Mukherjee	4f5045299e	Merge "Refactoring and uv fix for wedge" into nextgenv2	2016-05-03 22:36:24 +00:00
Debargha Mukherjee	3407785536	Refactoring and uv fix for wedge lowres: -1.72% Change-Id: I4c883097caac72fab8e01945454579891617145e	2016-05-03 08:02:08 -07:00
Yue Chen	c1d473849e	Bug fixes for obmc/ext-inter/ext-tile experiment Fix 1: in ext-inter + obmc config, properly identify if the left predictor used for obmc is a compound one in the case that the neighbor uses wedgeinterinter pred and we will dump the ALTREF part. This will fix the seg fault in unit test: VP10/AltRefForcedKeyTestLarge.Frame1IsKey/0 Fix 2: in ext-tile + obmc experiment, handle the case that the above block does not fit in the same row tile with the current one, so as to prevent potential crashes. Change-Id: I1c177d4f4ad15e10d11d8756e146496437753eea	2016-04-29 19:03:39 -07:00
Debargha Mukherjee	88fe7871be	Refactor wedge generation Change-Id: I2ec4f562e28a4673477e20186f9d6167b24b76b8	2016-04-28 17:51:21 -07:00
Yue Chen	3ac12aecc5	Optimization for EXT_INTER + OBMC Remove the restriction that the neighboring predictor cannot be used in obmc prediction if it is an interintra or wedgeinterinter block. The inter predictor of the interintra block, or the first inter predictor(using LAST or GOLDEN frame) of the wedgeinterinter block will be exploited in obmc prediction. Coding gain: 0.248% (2.833%->3.081%) lowres Change-Id: I4ac0368b9d2f2956f266b30c1ac97db8bafa0742	2016-04-26 16:50:10 -07:00
Yue Chen	6daf1a460e	Fix EXT_INTER unit test failure in 32-bit builds Align new buffers that are used in interintra and wedgeinterinter prediction. BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1196 Change-Id: I1ef49fdf13c79a22cf8a1737e3d3052da0a92dfe	2016-04-22 22:37:13 -07:00
Yue Chen	c0fd271932	Remove an unsuccessful adaption of overlap sizes in obmc experiment We removed this adaption, which intended to reduce the size of overlapped region if the neighboring block is a non-skip one. Thus, now the width/height of the overlapping region is fixed as a half of the current block. Performance improvement (lowres/midres): 0.111%/0.102% Change-Id: Ife75dad9d4eb355c78a05178b50cc015c442884f	2016-04-18 15:27:59 -07:00
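
Commit bfa59b4a5f describes alpha blending with a mask whose maximum value is 64, plus variants that take a 1D horizontal mask and apply it to every row; commit 888e90e823 notes that halfway values are rounded up. The following is a minimal sketch of that arithmetic, not the library's code: every `sketch_`/`SKETCH_` name and the exact signatures are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: the maximum mask value is 64, which needs 7 bits. */
#define SKETCH_A64_MAX 64
#define SKETCH_A64_BITS 6 /* log2(SKETCH_A64_MAX) */

/* Blend two pixel values with weight a in [0, 64].
 * Adding half the divisor before the shift rounds halfway values up. */
static uint8_t sketch_blend_a64(int a, uint8_t v0, uint8_t v1) {
  assert(a >= 0 && a <= SKETCH_A64_MAX);
  return (uint8_t)((a * v0 + (SKETCH_A64_MAX - a) * v1 +
                    (1 << (SKETCH_A64_BITS - 1))) >>
                   SKETCH_A64_BITS);
}

/* Apply a 1D horizontal mask (one weight per column) to every row.
 * Works for any w >= 1 and h >= 1, including a single pixel. */
static void sketch_blend_a64_hmask(uint8_t *dst, size_t dst_stride,
                                   const uint8_t *src0, size_t src0_stride,
                                   const uint8_t *src1, size_t src1_stride,
                                   const uint8_t *mask, int w, int h) {
  for (int r = 0; r < h; ++r) {
    for (int c = 0; c < w; ++c) {
      dst[r * dst_stride + c] = sketch_blend_a64(
          mask[c], src0[r * src0_stride + c], src1[r * src1_stride + c]);
    }
  }
}
```

A weight of 64 selects the first source unchanged and a weight of 0 selects the second, which is why the mask range is 0 to 64 inclusive rather than 0 to 63.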

72 Commits
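
Commit efda2831e5 states that the SSE of a compound predictor can be obtained by blending the pre-computed residuals of the two underlying predictors, without forming the compound predictor itself. The sketch below illustrates why that holds when no intermediate rounding is applied: with mask weight m in [0, 64], 64*s - (m*p0 + (64-m)*p1) equals m*(s-p0) + (64-m)*(s-p1). It is not the code in wedge_utils.c; the function names, the 64^2 scaling of the result, and the omission of rounding are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SSE computed from pre-computed residuals r0 = src - p0 and r1 = src - p1.
 * The result is scaled by 64^2 because no rounding or shift is applied. */
static uint64_t sketch_sse_from_residuals(const int16_t *r0, const int16_t *r1,
                                          const uint8_t *mask, size_t n) {
  uint64_t acc = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(mask[i] <= 64);
    int64_t t = (int64_t)mask[i] * r0[i] + (int64_t)(64 - mask[i]) * r1[i];
    acc += (uint64_t)(t * t);
  }
  return acc;
}

/* Reference path: form the (64x scaled, unrounded) compound predictor,
 * subtract it from the scaled source, and accumulate the squares. */
static uint64_t sketch_sse_direct(const uint8_t *src, const uint8_t *p0,
                                  const uint8_t *p1, const uint8_t *mask,
                                  size_t n) {
  uint64_t acc = 0;
  for (size_t i = 0; i < n; ++i) {
    assert(mask[i] <= 64);
    int64_t pred64 = (int64_t)mask[i] * p0[i] + (int64_t)(64 - mask[i]) * p1[i];
    int64_t t = 64 * (int64_t)src[i] - pred64;
    acc += (uint64_t)(t * t);
  }
  return acc;
}
```

Because the residuals depend only on the source and the two predictors, they can be computed once and reused for every candidate mask, which is the saving the commit message describes. In a real codec the blend is rounded to integer pixels, so the two paths can differ slightly, consistent with the commit's remark about finite precision.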