generic-library/vpx

Author	SHA1	Message	Date
Yunqing Wang	bca4564683	Make allow_exhaustive_searches feature no longer adaptive A previous patch turned on allow_exhaustive_searches feature only for FC_GRAPHICS_ANIMATION content. This patch further modified the feature by removing the exhaustive search limit, and made it no longer adaptive. As a result, the 2 counts that recorded the number of motion searches were removed, which helped achieve the determinism in the row based multi-threading encoding. Tests showed that this patch didn't make the encoder much slower. Used exhaustive_searches_thresh for this speed feature, and removed allow_exhaustive_searches. Also, refactored the speed feature code to follow the general speed feature setting style. Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa	2017-04-21 11:14:02 -07:00
Yunqing Wang	f22b828d68	Fix an integer overflow in vp9_mcomp.c The MV unit test revealed an integer overflow issue in vp9_mcomp.c. This was caused if the MV was very large. In mv_err_cost(), when mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363 and error_per_bit = 132412, causing the overflow. BUG=webm:1406 Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e	2017-04-10 18:09:50 -07:00
Yunqing Wang	1aa46abbdf	VP9 motion vector unit test To prevent the motion vector out of range bug, added a motion vector unit test in VP9. In the 4k video encoding, always forced to use extreme motion vectors and also encouraged to use INTER modes. In the decoding, checked if the motion vector was valid, and also checked the encoder/decoder mismatch. The tests showed that this unit test could reveal the issue we saw before. Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4	2017-04-06 00:50:56 +00:00
Ranjit Kumar Tulabandu	bf15ca1091	Fix for out of range motion vector bug in sub-pel motion estimation BUG=webm:1397 (yunqingwang) To verify that this patch wouldn't cause much performance change, the Borg tests were run. Here was the result: avg_psnr overall_psnr ssim hdres: -0.002 0.006 0.013 midres: 0 0 0 lowres: 0 0 0 Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d	2017-04-03 16:16:49 +00:00
Yunqing Wang	f2c1aea118	Merge "Row based multi-threading of encoding stage"	2017-02-15 00:54:10 +00:00
Ranjit Kumar Tulabandu	71061e9332	Row based multi-threading of encoding stage (Yunqing Wang) This patch implements the row-based multi-threading within tiles in the encoding pass, and substantially speeds up the multi-threaded encoder in VP9. Speed tests at speed 1 on STDHD(using 4 tiles) set show that the average speedups of the encoding pass(second pass in the 2-pass encoding) is 7% while using 2 threads, 16% while using 4 threads, 85% while using 8 threads, and 116% while using 16 threads. Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de	2017-02-15 00:49:34 +00:00
clang-format	4b402746ca	apply clang-format Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce	2017-02-14 12:45:52 -08:00
Ranjit Kumar Tulabandu	91f01a2060	Row based multi-threading of ARNR filtering stage Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f	2017-02-07 14:03:19 +05:30
Yunqing Wang	770c6663d6	Merge "Changes to facilitate row based multi-threading of ARNR filtering"	2017-02-01 22:04:15 +00:00
Ranjit Kumar Tulabandu	359a6796da	Changes to facilitate row based multi-threading of ARNR filtering Change-Id: I2fd72af00afbbeb903e4fe364611abcc148f2fbb	2017-02-01 13:03:52 -08:00
Jingning Han	969957f9f2	Fix real-time compression regression in hbd mode This commit resolves the compression performance regression in real-time encoding setting when high bit-depth mode is enabled. The current solution temporarily disables the SIMD implementations of vpx_satd, hadamard8x8, and hadamard16x16 in high bit-depth mode. The commit makes the coding results bit-wise identical between regular coding pipeline and high bit-depth at profile 0. BUG=webm:1365 Change-Id: Icfb900821733749685370460a1a5a7e07f76f4bf	2017-01-31 23:17:09 -08:00
Yunqing Wang	b987bc36af	Remove macro MVC in mcomp.c Removed MVC so that mv_err_cost() is always called while calculating the mv cost. Change-Id: I28123e05fbfc2352128e266c985d2ab093940071	2017-01-23 17:03:12 -08:00
Yunqing Wang	99c573f018	Merge "Fix for out of range motion vector bug in joint motion search"	2017-01-03 17:46:15 +00:00
Ranjit Kumar Tulabandu	b67e1f701f	Fix for out of range motion vector bug in joint motion search Clamped the initial mv in vp9_refining_search_8p_c. BUG=webm:1354 Change-Id: I47d302b350937e3e6e52e95c983b5fb0b4c64fba	2017-01-03 09:12:32 -08:00
Yunqing Wang	1d12559b09	Make sub-pixel mv search's return value consistent with the return type For out-of-range cases, returned UINT_MAX instead of INT_MAX in the sub-pixel mv search to be consistent with the "uint32_t" return type. Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a	2016-12-27 12:08:38 -08:00
clang-format	5f6d143b41	apply clang-format Change-Id: I501597b7c1e0f0c7ae2aea3ee8073f0a641b3487	2016-09-15 15:07:53 -07:00
Alex Converse	6554333b59	Refactor mv limits. Change-Id: Ifebdc9ef37850508eb4b8e572fd0f6026ab04987	2016-08-08 11:54:00 -07:00
clang-format	e0cc52db3f	vp9/encoder: apply clang-format Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2	2016-08-02 16:47:11 -07:00
James Zern	ca88d22f39	s/UINT32_MAX/UINT_MAX/ provides better toolchain compatibility Change-Id: I8561a6de668a68ff54fe3886a4ee6300f0ae9c04	2016-06-25 12:15:51 -07:00
Yaowu Xu	7738bcb350	Rationalize type to avoid integer out of range BUG=webm:1250 Change-Id: Id5bb2762ca1bf996ba4f9a60eec977a7994c1d94	2016-06-24 13:58:02 +00:00
Yaowu Xu	87bf1a149c	Fix ubsan warnings: vp9/encoder/vp9_mcomp.c This commit fixes a number of ubsan warnings in HBD build. BUG=webm:1219 Change-Id: I05f0fd0ef50e93db4ba34205005c54af1ed32acc	2016-06-21 15:37:59 -07:00
Alex Converse	6dd5ec7efb	mcomp: Remove an obsolete undef. The macro was removed in `6724676`. Change-Id: I412c24aac49bd1ff60a331a30933e0d8ae3f2dd5	2016-05-10 18:04:24 -07:00
Alex Converse	7764f8af3e	mcomp: Remove an obsolete comment. This was copied over from VP8. VP9 doesn't seem to do this buffer copy. Change-Id: I28a8bbf0503a7f99b2cb60620ab3674adde863bb	2016-05-10 18:04:24 -07:00
Alex Converse	55859e8428	Use whole pixel only at speed 8 screen content. +5.857% BD-RATE on SCREEN_CONTENT Leaving this off for non-screen content because: +25.300% on TWITCH120 +37.833% BD-RATE on RTC Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe	2016-03-15 15:04:48 -07:00
Alex Converse	fac947df77	Restore previous motion search bit-error scale. The bit to error transformation got doubled as a result of going from 8-bit to 9-bit costs (change `d13385c`). Use defines to derive the scale numbers and comment some of the fields. derf: -0.023 BDRATE hevcmr: +0.067 BDRATE stdhd: +0.098 BDRATE (These are substantially smaller than the original gains from 8 to 9 bit costing.) Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b	2016-02-09 13:20:25 -08:00
hui su	1c9b0918b3	Fix some integer overflow errors Change-Id: I7e44bd952f28ce9925e8bdf6ee8ca2bb13de1b49	2016-02-02 17:32:15 -08:00
Alex Converse	ad43a73883	Fix a signed overflow in vp9 motion cost. Change-Id: I5975e3aede62202d8ee6ced33889350c0a56554a	2016-02-01 14:27:32 -08:00
Scott LaVarnway	5232326716	VP9: Eliminate MB_MODE_INFO Change-Id: Ifa607dd2bb366ce09fa16dfcad3cc45a2440c185	2016-01-19 16:40:20 -08:00
Scott LaVarnway	de993a847f	VP9: inline vp9_use_mv_hp() Change-Id: Ib275bfc4c29c572d6c70e5ec6dbfc241590d3e3e	2016-01-13 08:02:05 -08:00
James Zern	d36659cec7	move vp9_avg to vpx_dsp Change-Id: I7bc991abea383db1f86c1bb0f2e849837b54d90f	2015-12-14 14:42:12 -08:00
paulwilkins	0149fb3d6b	Changes to exhaustive motion search. This change alters the nature and use of exhaustive motion search. Firstly any exhaustive search is preceded by a normal step search. The exhaustive search is only carried out if the distortion resulting from the step search is above a threshold value. Secondly the simple +/- 64 exhaustive search is replaced by a multi stage mesh based search where each stage has a range and step/interval size. Subsequent stages use the best position from the previous stage as the center of the search but use a reduced range and interval size. For example: stage 1: Range +/- 64 interval 4 stage 2: Range +/- 32 interval 2 stage 3: Range +/- 15 interval 1 This process, especially when it follows on from a normal step search, has shown itself to be almost as effective as a full range exhaustive search with step 1 but greatly lowers the computational complexity such that it can be used in some cases for speeds 0-2. This patch also removes a double exhaustive search for sub 8x8 blocks which also contained a bug (the two searches used different distortion metrics). For best quality in my test animation sequence this patch has almost no impact on quality but improves encode speed by more than 5X. Restricted use in good quality speeds 0-2 yields significant quality gains on the animation test of 0.2 - 0.5 db with only a small impact on encode speed. On most clips though the quality gain and speed impact are small. Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa	2015-11-13 10:16:31 +00:00
Geza Lore	5eefd3ebfd	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc	2015-11-11 14:03:47 +00:00
James Zern	30466f26b4	Revert "Add AVX vectorized vp9_diamond_search_sad" This reverts commit `f1342a7b07`. This breaks 32-bit builds: runtime error: load of misaligned address 0xf72fdd48 for type 'const __m128i' (vector of 2 'long long' values), which requires 16 byte alignment + _mm_set1_epi64x is incompatible with some versions of visual studio Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673	2015-11-06 13:15:01 -08:00
Geza Lore	f1342a7b07	Add AVX vectorized vp9_diamond_search_sad This function now has an AVX intrinsics version which is about 80% faster compared to the C implementation. This provides a 2-4% total speed-up for encode, depending on encoding parameters. The function utilizes 3 properties of the cost function lookup table, constructed in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'. For the joint cost: - mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3] For the component costs: - For all i: mvsadcost[0][i] == mvsadcost[1][i] (equal per component cost) - For all i: mvsadcost[0][i] == mvsadcost[0][-i] (Cost function is even) These must hold, otherwise the AVX version of the function cannot be used. Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6	2015-11-05 10:02:17 +00:00
Geza Lore	965a8dea0b	Convert motion search config from AoS to SoA This is a prerequisite for vectorizing vp9_diamond_search_sad_c. Change-Id: I49cd9148782410ca8b16e8a468ca9e7c6d088410	2015-10-28 15:30:43 +00:00
Johann	c5f11912ae	Include vpx_dsp_common.h when using VPXMIN/MAX Change-Id: I2e387a06484a06301f3cd6600c4ba2f4335b61ee	2015-08-31 14:36:35 -07:00
James Zern	ff03d5448a	vp9_mcomp: make search functions private vp9_full_pixel_search() can be used as a replacement as it dispatches to all search methods Change-Id: I57fcb79c1362b569dc95237bdcc8390f54efd440	2015-08-28 18:54:10 -07:00
James Zern	5e16d397bd	vpx_dsp_common: add VPX prefix to MIN/MAX prevents redeclaration warnings; vp8 has its own define which will be resolved in a future commit Change-Id: Ic941fef3dd4262fcdce48b73075fe6b375f11c9c	2015-08-26 20:11:32 -07:00
Yunqing Wang	4bc6ae4342	Merge "Improve the second-level sub-pixel motion search"	2015-08-07 16:05:59 +00:00
Yunqing Wang	7418b176ce	Improve the second-level sub-pixel motion search Re-investigated the second-level sub-pixel motion search. Improved the way of choosing search points. Rewrote the second-level search code. At speed 0, the borg tests showed: 1. for stdhd set, Avg PSNR gain: 0.216%; Overall PSNR gain: 0.196%; SSIM gain: 0.206%. Only 1 out of 15 clips showed PSNR loss. 2. for derf set, Avg PSNR gain: 0.171%; Overall PSNR gain: 0.192%; SSIM gain: 0.207%. Only 3 out of 30 clips showed PSNR losses. Added the condition for third-point checking, namely, less points were checked. Speed tests showed no speed loss(Avg 0.3% speedup at speed 0). Change-Id: I6284ebb3fa7ba63be8528184c49e06757211a7f1	2015-08-06 16:28:32 -07:00
Jingning Han	b4f2c567c8	Cosmetic - align format in vp9 Change-Id: I83ed3422f1f4009675ad2f5c4b7236bc7b83b30e	2015-08-06 15:56:11 -07:00
Yunqing Wang	726d1b841b	Minor adjustment in diagonal sub-pixel point checking Choose a different diagonal point to check when the two costs are the same, making it consistent with the way we choose the best mv. This slightly changes the encoding result, and the derflr set borg test at speed 0 shows 0.027% Overall PSNR gain, 0.024% Avg PSNR gain, and 0.043% SSIM gain. Change-Id: Ic8ee3a6767394866d159e4f9e1c777604dd73c17	2015-08-04 12:16:47 -07:00
Yunqing Wang	a3d22aa2a4	Small improvement in sub-pixel motion search If the current best mv (namely, the search center) is still the best mv after the first level search, the second level check is skipped. This patch doesn't change the bitstream. At speed 0, it speeds up the encoder by 1% - 2%. Change-Id: I054c91b884d3f7aef157436c061744562bd6506d	2015-08-04 12:06:21 -07:00
James Zern	aaa49f0485	vp9_mcomp: make search_step_table static Change-Id: I2552d8101cf49ed951782ab69adce407579700fc	2015-06-12 18:11:54 -07:00
James Zern	7ea431df98	vp9_mcomp: don't mark setup_center_error() inline this function is a bit too involved for the hint; avoids a -Winline warning Change-Id: Ib82e424764aa78b37ddb94116e2b009a6de31d35	2015-06-12 17:56:33 -07:00
Johann	eb88b172fe	Make vp9 subpixel match vp8 The only difference between the two was that the vp9 function allowed for every step in the bilinear filter (16 steps) while vp8 only allowed for half of those. Since all the call sites in vp9 (<< 1) the input, it only ever used the same steps as vp8. This will allow moving the subpel variance to vpx_dsp with the rest of the variance functions. Change-Id: I6fa2509350a2dc610c46b3e15bde98a15a084b75	2015-06-03 22:10:51 -07:00
Johann	dee70d355f	Merge "Move variance functions to vpx_dsp"	2015-05-26 23:02:11 +00:00
Johann	c3bdffb0a5	Move variance functions to vpx_dsp subpel functions will be moved in another patch. Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce	2015-05-26 12:01:52 -07:00
Jingning Han	96dba4902c	Fix integral projection motion search for frame resize This commit fixes the integral projection motion search crash when frame resize is used. It fixes issue 994. Change-Id: Ieeb52619121d7444f7d6b3d0cf09415f990d1506	2015-05-22 15:40:45 -07:00
Johann	1d7ccd5325	Relocate memory operations for common code With the sad functions, and hopefully the variance functions soon, moving to the vpx_dsp location, place the defines used in the reference C code in a common location. Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca	2015-05-13 11:41:15 -07:00
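
The overflow reported in commit f22b828d68 (mv_cost = 34363, error_per_bit = 132412) can be reproduced in isolation. The following is a minimal sketch, not the code from vp9_mcomp.c: the helper names `sketch_product_overflows_int32` and `sketch_scaled_cost` and the caller-supplied shift are assumptions made for illustration. It only shows why a 32-bit multiply of those two values cannot hold the result, and how widening to 64 bits before the multiply avoids the problem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libvpx function): returns 1 when a * b does not
 * fit in a 32-bit signed int. The product is formed in 64 bits, so the check
 * itself cannot overflow. */
static int sketch_product_overflows_int32(int a, int b) {
  const int64_t wide = (int64_t)a * (int64_t)b;
  return wide > INT32_MAX || wide < INT32_MIN;
}

/* Hypothetical scaled cost: multiply in 64 bits, then round and shift right.
 * The shift amount is a caller-supplied assumption, not the encoder's value.
 * Intended for non-negative inputs, as in the commit's example. */
static int64_t sketch_scaled_cost(int mv_cost, int error_per_bit, int shift) {
  assert(shift > 0 && shift < 32);
  const int64_t product = (int64_t)mv_cost * (int64_t)error_per_bit;
  return (product + ((int64_t)1 << (shift - 1))) >> shift;
}
```

With the values quoted in the commit message, the product is 4,550,073,556, which exceeds the 32-bit signed maximum of 2,147,483,647, so `sketch_product_overflows_int32(34363, 132412)` reports an overflow while `sketch_scaled_cost` still computes a well-defined result.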

219 Commits
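
The multi-stage mesh search described in commit 0149fb3d6b (each stage has a range and an interval, and each later stage is centered on the best position found so far) can be sketched roughly as follows. This is an illustration under stated assumptions, not the libvpx implementation: the types `SketchMv` and `SketchMeshStage`, the callback `SketchCostFn`, and the toy cost `sketch_l1_cost` are all hypothetical, and the sketch omits the preceding step search, the distortion threshold, and motion vector limit clamping that the commit message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int row, col; } SketchMv;                /* hypothetical */
typedef struct { int range, interval; } SketchMeshStage;  /* hypothetical */
typedef unsigned int (*SketchCostFn)(int row, int col, const void *ctx);

/* Toy cost for demonstration only: L1 distance to a target position. */
static unsigned int sketch_l1_cost(int row, int col, const void *ctx) {
  const SketchMv *target = (const SketchMv *)ctx;
  return (unsigned int)(abs(row - target->row) + abs(col - target->col));
}

/* Runs each stage on a square grid of +/- range around the best position
 * found so far, sampling every `interval` pixels. */
static SketchMv sketch_mesh_search(SketchMv center,
                                   const SketchMeshStage *stages,
                                   size_t num_stages, SketchCostFn cost,
                                   const void *ctx) {
  SketchMv best = center;
  unsigned int best_cost = cost(center.row, center.col, ctx);
  for (size_t s = 0; s < num_stages; ++s) {
    const SketchMv stage_center = best; /* fixed for the whole stage */
    const int range = stages[s].range;
    const int step = stages[s].interval;
    assert(step > 0); /* a zero interval would never advance the loops */
    for (int r = -range; r <= range; r += step) {
      for (int c = -range; c <= range; c += step) {
        const int row = stage_center.row + r;
        const int col = stage_center.col + c;
        const unsigned int this_cost = cost(row, col, ctx);
        if (this_cost < best_cost) {
          best_cost = this_cost;
          best.row = row;
          best.col = col;
        }
      }
    }
  }
  return best;
}
```

Using the stage table from the commit message (range 64 at interval 4, range 32 at interval 2, range 15 at interval 1), the first stage evaluates 33 x 33 positions instead of the 129 x 129 that a full +/- 64 search at interval 1 would visit, which is where the reduction in complexity comes from.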