generic-library/vpx

Author	SHA1	Message	Date
Alex Converse	bc1b3d8412	Allow DC/H/V/TM on screen content. 6.3% better compression less than 1% compression time increase Change-Id: Ie83c059436e54c09de9e7c87e06e0a6d40dc38fe	2014-11-20 18:04:57 -08:00
Alex Converse	722e9d611b	Drop special inter mode selection for screen content. Better mode selection was implemented for all content. Change-Id: I479778ed21d3968892f4dce396c83733583f4f23	2014-11-20 18:04:57 -08:00
Yunqing Wang	54ba65a63e	Merge "vp9_ethread: move max/min partition size to mb struct"	2014-11-20 14:00:37 -08:00
Yunqing Wang	ad7586a9e1	vp9_ethread: move max/min partition size to mb struct The max_partition_size and max_partition_size are set at the beginning while setting speed features, and then adjusted at SB level. Moving them to mb struct ensures there is a local copy for each thread. Change-Id: I7dd08dc918d9f772fcd718bbd6533e0787720ad4	2014-11-20 09:24:50 -08:00
Yunqing Wang	70c9d2983b	Revert "vp9_ethread: include a pointer to mb in VP9_COMP" This reverts commit 6906d218ddd1af97228a797f4558e402231d94f1. Another way will be used to handle mb struct. Change-Id: Ic1111a46b2b1ee00f8f9e3fcd4cf3eb6030b2dc4	2014-11-20 08:31:12 -08:00
Yaowu Xu	1687c47bfd	change to call vp9_refining_search_sad() directly The function pointer in compressor instance does not change, so this commit changes to call the function directly. Change-Id: I9c9c460e3475711c384b74c9842f0b4f3d037cc5	2014-11-17 11:30:17 -08:00
Yunqing Wang	6906d218dd	vp9_ethread: include a pointer to mb in VP9_COMP Modified VP9_COMP struct to include MACROBLOCK *mb. This change makes it feasible in multi-thread case to allocate a mb for each thread. Change-Id: I624d6d1aa9c132362200753e5d90b581b1738d6e	2014-11-14 12:31:06 -08:00
Adrian Grange	35de9db312	Merge "Prepare for dynamic frame resizing in the recode loop"	2014-11-13 15:01:49 -08:00
Adrian Grange	0d085ebc0a	Prepare for dynamic frame resizing in the recode loop Prepare for the introduction of frame-size change logic into the recode loop. Separated the speed dependent features into separate static and dynamic parts, the latter being those features that are dependent on the frame size. Change-Id: Ia693e28c5cf069a1a7bf12e49ecf83e440e1d313	2014-11-13 11:41:20 -08:00
Jingning Han	e717d22b63	Use reconstructed pixels for intra prediction This commit makes the speed -6 and above use the reconstructed boundary pixels for precise intra prediction. This allows more intra prediction modes to be tested in the non-RD coding process. Enabling horizontal and vertical intra prediction modes can improve the speed -6 compression performance for rtc set by 0.331%. Change-Id: I3a99f9d12c6af54de2bdbf28c76eab8e0905f744	2014-11-11 10:04:43 -08:00
Yaowu Xu	0271ff7775	Fix speed 7 and speed 12 for rt A recent change has introduced big quality drops for speed 7 and 12 for --rt mode. The change reverted the big drop and improved quality by 9.5% for speed 7 and 13.4% for speed 12. Change-Id: I07b82e3bb6002a73af486a083458c88877bdad01	2014-10-31 17:29:02 -07:00
Yunqing Wang	aed48c786a	Remove unused speed feature Partition_check was unused and removed. Change-Id: I15ec9162d86dc61f04c09229c498629878ed7155	2014-10-29 17:05:04 -07:00
Jingning Han	9349a28e80	Enable mode search threshold update in non-RD coding mode Adaptively adjust the mode thresholds after each mode search round to skip checking less likely selected modes. Local tests indicate 5% - 10% speed-up in speed -5 and -6. Average coding performance loss is -1.055%. speed -5 vidyo1 720p 1000 kbps 16533 b/f, 40.851 dB, 12607 ms -> 16556 b/f, 40.796 dB, 11831 ms nik 720p 1000 kbps 33229 b/f, 39.127 dB, 11468 ms -> 33235 b/f, 39.131 dB, 10919 ms speed -6 vidyo1 720p 1000 kbps 16549 b/f, 40.268 dB, 10138 ms -> 16538 b/f, 40.212 dB, 8456 ms nik 720p 1000 kbps 33271 b/f, 38.433 dB, 7886 ms -> 33279 b/f, 38.416 dB, 7843 ms Change-Id: I2c2963f1ce4ed9c1cf233b5b2c880b682e1c1e8b	2014-10-29 10:55:34 -07:00
Yaowu Xu	87665f16f4	Merge "Change speed features for good quality(cpu-used=5)"	2014-10-22 08:40:15 -07:00
Yaowu Xu	c30f7e6cc5	Change speed features for good quality(cpu-used=5) The existing speed features produce horrible encoding results, almost 30% worse than cpu-used=4, this commit adjust the speed features to produce relatively resonable results to be within 3%-5% of cpu-used=4. Change-Id: I0ca6ebafb33024d4a0cbcf04c78a4a00b8dd1ecf	2014-10-21 11:59:12 -07:00
Jingning Han	abb2fbb10e	Remove deprecated constrain_copy_partitioning function Its functionality has been replaced with choose_partitioning and threshold based control on split mode check. Change-Id: Ic9bb321df06b524f5c38ea5874dc6f6a8f93c5e3	2014-10-20 17:08:21 -07:00
Jingning Han	e62ce79e1a	Remove deprecated use_lastframe_partitioning feature This speed feature has been deprecated in both yt and rtc coding modes. This commit removes the related operations. Change-Id: I079c79c6adafe45581af2ebf8b98faebcface1ce	2014-10-20 17:03:38 -07:00
Jingning Han	9f128b3ed9	Hybrid partition search for rtc coding mode This commit re-designs the recursive partition search scheme in rtc speed -5. It first checks if the current block is under cyclic refresh mode. If so, apply recursive partition search. Otherwise, perform sub-sampled pixel based partition selection. When the pre-selection finds the partition size should be 32x32 or above, use the partition size directly. Otherwise, apply partition search at nearby levels around the preset partition size. It is enabled in speed -5. The compression performance of rtc speed -5 is improved by 9.4%. Speed wise, the run-time goes slower from 1% to 10%. nik_720p, 1000 kbps 33220 b/f, 38.977 dB, 10109 ms -> 33200 b/f, 39.119 dB, 10210 ms vidyo1_720p, 1000 kbps 16536 b/f, 40.495 dB, 10119 ms -> 16536 b/f, 40.827 dB, 11287 ms Change-Id: I65adba352e3adc03bae50854ddaea1b421653c6c	2014-10-20 13:02:12 -07:00
Jingning Han	5e766ccee0	Use rate/distortion thresholds to control non-RD partition search Compare the estimated rate and distortion to the thresholds scaled according to the operating block size and determine if further split partition search will be run. The compression performance of speed -5 is changed by -0.074%. The encoding speed is 10% - 15% faster. vidyo1 720p 16545 b/f, 40.492 dB, 11475 ms -> 16535 b/f, 40.486 dB, 10100 ms nik720p 16624 b/f, 36.310 dB, 10071 ms -> 16617 b/f, 36.313 dB, 8346 ms Change-Id: Ic9197ab5761279ae55d2fb7813b2af0e0db497b8	2014-10-15 13:40:33 -07:00
Jingning Han	89b8c7a513	Replace copy_partitioning use case with choose_partitioning This commit replaces the use of copy_partitioning with choose_partitioning based on the sse of subsamped pixels, which provides significantly better coding performance and runs at similar speed, as compared to copy_partitioning. It improves rtc speed 5 coding performance by 3%. Change-Id: I52d3682a12dce0147f5e52383a594fc242ca3228	2014-10-15 11:37:20 -07:00
Deb Mukherjee	3117830af3	Merge "Subpel search cleanups and enhancements"	2014-10-09 11:14:51 -07:00
Deb Mukherjee	d78dbff09a	Subpel search cleanups and enhancements - Some fixes to surface fit. - Returns variance function as cost rather than sad in the pattern search and diamond search functions. Only vp9_pattern_search_sad function used in bigdia search uses sad as integer 1-away costs. - Deploys SUBPEL_TREE_PRUNED_MORE for speed 4+. Results: derf [Speed 3]: About +0.036% in coding efficiency without any discernible speed loss. derf [Speed 4]: About 2-3% faster at -0.199% loss in coding efficiency. derf [Speed 5]: About 3-4% faster at -0.149% loss in coding efficiency. Change-Id: I8462f94f6adb46966ca964f2bd0400977357fd63	2014-10-08 23:59:43 -07:00
Yunqing Wang	e18edd5eb6	Allow mode search breakout at very low prediction errors In model_rd_for_sb function, the spatial domain SSE and variance are checked to see if transform coefficients are quantized to 0. Besides that, this patch adds another set of thresholds that are much more strict. These thresholds are used to conduct a partition block level check to measure if all its TX blocks are skippable for YUV planes. If it is true, x->skip is set for this partition block, and thus its mode search is terminated. This speeds up the encoding at very low prediction error case, such as screen sharing application. This patch covers what rd_encode_breakout_test() does, so that function is removed. Borg test at speed 3 shows: For stdhd set, psnr: +0.008%, ssim: +0.014%; For derf set, psnr: +0.018%, ssim: +0.025%. No noticeable speed change. Change-Id: I4e5f15cf10016a282a68e35175ff854b28195944	2014-10-08 17:46:22 -07:00
Jim Bankoski	0ce51d823f	experimental : partition using 1/8 x 1/8 image The concept: There's too much noise in source pixels for variance and at low bitrate the reconstructed looks nothing like the source so we have problems getting good partitionings with either. This skirts the issue by using a box blur scaled down version for variance calculations. To compare against source_var_ moved keyframe to be rd based like source_var. Change-Id: Ie3babdbfadae324b7b5a76bea192893af27f0624	2014-10-07 16:36:14 -07:00
Jingning Han	bb260d9076	Rework partition search skip scheme This commit enables the encoder to skip split partition search if the bigger block size has all non-zero quantized coefficients in low frequency area and the total rate cost is below a certain threshold. It logarithmatically scales the rate threshold according to the current block size. For speed 3, the compression performance loss: derf -0.093% stdhd -0.066% Local experiments show 4% - 20% encoding speed-up for speed 3. blue_sky_1080p, 1500 kbps 51051 b/f, 35.891 dB, 67236 ms -> 50554 b/f, 35.857 dB, 59270 ms (12% speed-up) old_town_cross_720p, 1500 kbps 14431 b/f, 36.249 dB, 57687 ms -> 14108 b/f, 36.172 dB, 46586 ms (19% speed-up) pedestrian_area_1080p, 1500 kbps 50812 b/f, 40.124 dB, 100439 ms -> 50755 b/f, 40.118 dB, 96549 ms (4% speed-up) mobile_calendar_720p, 1000 kbps 10352 b/f, 35.055 dB, 51837 ms -> 10172 b/f, 35.003 dB, 44076 ms (15% speed-up) Change-Id: I412e34db49060775b3b89ba1738522317c3239c8	2014-10-03 11:54:30 -07:00
Yunqing Wang	b1b6fd85db	Merge "Skip the partition search for still frames"	2014-09-30 11:59:05 -07:00
Deb Mukherjee	4e9c0d2ad4	Adds two new subpel search methods One is a more aggressive version of the pruned subpel tree search where only a single halfpel candidate is searched. The search candidate is based on a surface fit result. The other is a method to obtain the subpel position at one shot based on the same surface fit. The methods have not been deployed in any speed setting yet. Change-Id: I34fef3f2e34f11396c9d1ba97f4be8c4ffca62d3	2014-09-29 12:51:20 -07:00
Yunqing Wang	1fcbf6ed56	Skip the partition search for still frames This patch re-enabled the feature in Pengchong's patch (commit 12861260732a4fd5f6b667ce9d5105dc9b606eda). Originally, it was turned on while use_lastframe_partitioning > 0(not used anymore). Now it was added as a feature, and turned on while speed >= 2. As described in the original patch, this feature helps speed up the slideshows in YouTube. Change-Id: I1b0f18d65da1ee1c8d1e117dabba910c5207c471	2014-09-26 09:03:52 -07:00
Yaowu Xu	8751e49a6f	Merge "Adapt mode based rd_threshold for similar block size"	2014-09-23 22:28:08 -07:00
Deb Mukherjee	6c6213d960	Merge "Pruned subpel search for speed 3."	2014-09-23 17:12:03 -07:00
Yaowu Xu	4a101310e8	Adapt mode based rd_threshold for similar block size The rd_thresholds are adaptively changed based on best mode tested. It was only changed for the same block size, this commit makes the adaptation for similar block sizes too. The commit also made minor adjustment and code cleanups. The impact on encoding time for _ped: 118089 ms -> 111927 ms The impact on compression: derf: -0.339% stdhd: -0.303% Change-Id: I8817fed1102350497f2ec631849e43f753878e5d	2014-09-23 16:10:59 -07:00
Deb Mukherjee	c94b17f4b2	Pruned subpel search for speed 3. Adds code to return an integer cost list for NSTEP search. Then uses it for pruned subpel search in speed 3. derf: -0.06% Speed on mobcal 720p increaes from 10.28 fps to 10.65 fps. [Subject to further testing]. Change-Id: Ib591382d25b2c11bcaba9d3a27a93a9d1ab27a96	2014-09-23 11:27:58 -07:00
Jingning Han	eee904c9b9	Adaptive mode search scheduling This commit enables an adaptive mode search order scheduling scheme in the rate-distortion optimization. It changes the compression performance by -0.433% and -0.420% for derf and stdhd respectively. It provides speed improvement for speed 3: bus CIF 1000 kbps 24590 b/f, 35.513 dB, 7864 ms -> 24696 b/f, 35.491 dB, 7408 ms (6% speed-up) stockholm 720p 1000 kbps 8983 b/f, 35.078 dB, 65698 ms -> 8962 b/f, 35.054 dB, 60298 ms (8%) old_town_cross 720p 1000 kbps 11804 b/f, 35.666 dB, 62492 ms -> 11778 b/f, 35.609 dB, 56040 ms (10%) blue_sky 1080p 1500 kbps 57173 b/f, 36.179 dB, 77879 ms -> 57199 b/f, 36.131 dB, 69821 ms (10%) pedestrian_area 1080p 2000 kbps 74241 b/f, 41.105 dB, 144031 ms -> 74271 b/f, 41.091 dB, 133614 ms (8%) Change-Id: Iaad28cbc99399030fc5f9951eb5aa7fa633f320e	2014-09-22 09:28:16 -07:00
Jingning Han	f02e0b6cf6	Merge "Remove unused speed feature"	2014-09-13 10:43:03 -07:00
Deb Mukherjee	c0dfecfb89	Merge "Use bigdia search with pruned subpel search"	2014-09-12 16:42:18 -07:00
Deb Mukherjee	83c76118eb	Use bigdia search with pruned subpel search Improves function to return sad of integer pels by reusing integer pels already visited in the smallest scale. Turns on BIGDIA search for speed 4. Also, turns on the first version of the pruned subpel search at this speed. derf: -0.32% (speed 4) Speed seems to improve by at least 5% but subject to verification. Change-Id: Iaec8eaffd61d6237ac029e6a2a1b0a88b2a35271	2014-09-12 10:25:12 -07:00
Jingning Han	00fe92c22f	Remove unused speed feature The speed feature that skips compound inter prediction modes was subsumed by other speed features and effectively was not in use. This commit removes it. Change-Id: I22b0c71a8ddd15d93b25d86fa63a1dce2ba6a1a9	2014-09-11 15:54:53 -07:00
Jingning Han	82757250d6	Merge "Refactor to remove speed feature dependency on mode search order"	2014-09-11 11:14:26 -07:00
Jingning Han	f9f0879756	Refactor to remove speed feature dependency on mode search order This commit refactor the rate-distortion optimization search for regular block sizes to remove the speed feature dependency on mode search order. Change-Id: Ied033ee484c2957e17baa7b6450b720fe7dd0e7d	2014-09-10 17:09:14 -07:00
Yunqing Wang	f10d7eeda2	Remove the use of use_lastframe_partitioning at speed 4 The use of use_lastframe_partitioning is totally removed in good- quality encoding. Its usage in real-time encoding needs to be evaluated to see if it can be removed too. The Borg tests at speed 4 showed: stdhd set: 0.220% psnr gain, 0.166% ssim gain; derf set: 0.329% psnr gain, 0.476% ssim gain. Speed test on selected clips showed 1.54% speedup.(Worst case: pedestrian_area_1080p25.y4m, speed loss: 1.5%) Change-Id: I1c844d329b0b5678558439b887297c1be7ddab00	2014-09-09 10:54:07 -07:00
Yunqing Wang	1092140379	No longer use use_lastframe_partitioning speed feature The speedup in rd_pick_partition() function makes it possible to drop use_lastframe_partitioning feature. By doing that, we achieve good PSNR gain with small speed loss. Also, this makes encoding loop less complicated. The code cleanup patch will follow. Borg tests showed: 1. At speed 2, stdhd set: 0.201% PSNR gain, 0.133% SSIM gain; derf set: 0.262% PSNR gain, 0.276% SSIM gain. 2. At speed 3, stdhd set: 0.139% PSNR gain, 0.109% SSIM gain; derf set: 0.447% PSNR gain, 0.442% SSIM gain. The average speed loss over selected test clips is within 1% with the worst case of 4%. Change-Id: Icfd2ded7869372b585a6972855d933b3d0280d90	2014-09-05 16:24:41 -07:00
Yaowu Xu	7a33712475	Change last_partition_redo_frequency for speed 3 From 3 to 2, which seems to be slightly positive on compression for all test sets, also reduces encoding time by 2%-5%, varying on the test clips. Change-Id: If045417bd27311700c919b4a335eff0dc1130ae0	2014-09-03 09:34:10 -07:00
Yaowu Xu	cdda17ed77	Remove redundant code Change-Id: I453b167f03811a3cd3592089593b3f2823f62ab3	2014-09-03 09:34:10 -07:00
Jingning Han	4282955ee1	Skip intra mode tests depending on inter residuals This commit allows encoder to skip intra coding mode test, when the known inter residual is less than the source variance. It reduces the runtime of speed 3 for test clips: bus cif 1000 kbps: 8587 ms -> 8260 ms, 3.8% speed-up pedestrian 1080p 2000 kbps: 161381 ms -> 155241 ms, 3.7% speed-up. The compression performance is down by derf -0.36% stdhd -0.25% Change-Id: I75ce1e035b4da2153cb1ac14111d1a07c05a735d	2014-08-29 08:37:35 -07:00
Yunqing Wang	4d2c376923	Early termination in encoding partition search In the partition search, the encoder checks all possible partitionings in the superblock's partition search tree. This patch proposed a set of criteria for partition search early termination, which effectively decided whether or not to terminate the search in current branch based on the "skippable" result of the quantized transform coefficients. The "skippable" information was gathered during the partition mode search, and no overhead calculations were introduced. This patch gives significant encoding speed gains without sacrificing the quality. Borg test results: 1. At speed 1, stdhd set: psnr: +0.074%, ssim: +0.093%; derf set: psnr: -0.024%, ssim: +0.011%; 2. At speed 2, stdhd set: psnr: +0.033%, ssim: +0.100%; derf set: psnr: -0.062%, ssim: +0.003%; 3. At speed 3, stdhd set: psnr: +0.060%, ssim: +0.190%; derf set: psnr: -0.064%, ssim: -0.002%; 4. At speed 4, stdhd set: psnr: +0.070%, ssim: +0.143%; derf set: psnr: -0.104%, ssim: +0.039%; The speedup ranges from several percent to 60+%. speed1 speed2 speed3 speed4 (1080p, 100f): old_town_cross: 48.2% 23.9% 20.8% 16.5% park_joy: 11.4% 17.8% 29.4% 18.2% pedestrian_area: 10.7% 4.0% 4.2% 2.4% (720p, 200f): mobcal: 68.1% 36.3% 34.4% 17.7% parkrun: 15.8% 24.2% 37.1% 16.8% shields: 45.1% 32.8% 30.1% 9.6% (cif, 300f) bus: 3.7% 10.4% 14.0% 7.9% deadline: 13.6% 14.8% 12.6% 10.9% mobile: 5.3% 11.5% 14.7% 10.7% Change-Id: I246c38fb952ad762ce5e365711235b605f470a66	2014-08-28 11:27:28 -07:00
Deb Mukherjee	bb2a9abb1e	Merge "Updates vp9_pattern search to return integer sads"	2014-08-28 09:38:56 -07:00
Deb Mukherjee	04b100b23e	Updates vp9_pattern search to return integer sads Updates the vp9_pattern_search function to return integer one-away neighbors' sad values, for subsequent use in speeding up the sub-pel search. Also, removes code for the do_refine option which is not being used currently. Updates the integer and subpel functions to pass in a 5-element sad list for output or input. A new pruned sub-pel search algorithm is implemented that uses the sad returned from the integer pel search. But it is not deployed yet. Change-Id: Ifa9f5ad024b5b660570366d2bd900343e1891520	2014-08-28 06:49:58 -07:00
Yaowu Xu	bcfb1ffb9d	Merge "add a new interp filter search strategy."	2014-08-26 17:30:42 -07:00
Yaowu Xu	1144fee3d5	add a new interp filter search strategy. This commit addes a new strategy to reduce the search for optimal interpolation filter type. The encoder counts and store how many each filter type is selected and used for each of the reference frames. A filter type that is rarely used for all three reference frames is masked out to avoid computation. The impact on compression is neglectible: -0.02% on derf +0.02% on stdhd Encoding time is seen to reduce by 2~3%. Change-Id: Ibafa92291b51185de40da513716222db4b230383	2014-08-26 09:05:04 -07:00
Dmitry Kovalev	0082727cb7	Merge "Adding is_keyframe temp var."	2014-08-25 18:36:59 -07:00

1 2 3 4

189 Commits