Commit Graph

232 Commits

Author SHA1 Message Date
Jerome Jiang
02463273c9 vp9: Using source sad for speedup for dynamic resizing.
Only for speed >= 7.

Change-Id: I3ac85fbb4023cf7e6f8333806b345b0174382a09
2017-03-14 15:47:19 -07:00
Marco
f0a22b23fe vp9: Fix to source_sad feature for SVC.
Allow speed feature sf->use_source_sad to be used
on highest spatial layer for SVC.

Change-Id: I260eb0478902764f49f83e43b17024fe86ff3b22
2017-03-13 11:00:40 -07:00
Marco
ea3c817ac2 vp9: Enable two speed features for SVC real-time mode.
Enable short_circuit_low_temp_var and limit_newmv_early_exit
for SVC, 1 pass CBR mode.

Change-Id: I77df2b2c6cc40657bb8ea76e19dfc2fdaad6389e
2017-03-08 16:13:59 -08:00
Yunqing Wang
099e9bf1ff Make the partition search early termination feature to be frame size dependent
The 2 thresholds(i.e. partition_search_breakout_dist_thr and
partition_search_breakout_rate_thr) are used as the partition search
early termination speed feature. This refactoring patch made this
feature to be frame size dependent consistently throughout the code.

Change-Id: Idaa0bd8400badaa0f8e2091e3f41ed2544e71be9
2017-03-08 12:56:41 -08:00
Vignesh Venkatasubramanian
9e7140b451 Merge "vp9,realtime: Enable row multithreading for non-rd" 2017-03-03 19:05:52 +00:00
Marco
b60617f5ff vp9: Speed 8: reduce the adaptive_rd_thresh level.
Reduce the level from 4 to 2.
This gives ~1-2% quality gain on RTC set, with small decreaee in speed (~1-2% on mac).

Change-Id: I7d959731badcee3d45b2f4a08efe378765016a13
2017-03-02 13:34:10 -08:00
Vignesh Venkatasubramanian
453f18040f vp9,realtime: Enable row multithreading for non-rd
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).

Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
2017-03-02 11:03:56 -08:00
Vignesh Venkatasubramanian
5881601488 vp9: Rename new_mt to row_mt
new_mt is a very generic name that will get obsolete soon enough.
Since this is exposed as a codec control, renaming it to row_mt to
signify row level paralellism. Also renaming the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.

Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-27 09:43:26 -08:00
Jerome Jiang
a6b6258284 Merge "vp9: Non-rd pickmode: use simple block_yrd under some conditons." 2017-02-22 23:19:29 +00:00
Marco
7e7d820d5b vp9: Non-rd pickmode: use simple block_yrd under some conditons.
For speed 8 only.
3% speed up for QVGA and 6.3% for VGA on Nexus 6.
~3% avgPSNR decrease on rtc_derf and 2.9% on rtc.

Disabled for now.

Change-Id: I70133f1f6c804d663d594df437bfe7fdb0030d6a
2017-02-22 13:22:53 -08:00
Marco
7f2daa74a0 vp9: Incorporate source sum_diff into non-rd partition thresholds.
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.

Small change on metrics for rtc set: less than ~0.1 avgPNSR decrease
on RTC set, for both speed 7 and 8.

Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
2017-02-21 17:22:11 -08:00
Yunqing Wang
f2c1aea118 Merge "Row based multi-threading of encoding stage" 2017-02-15 00:54:10 +00:00
Ranjit Kumar Tulabandu
71061e9332 Row based multi-threading of encoding stage
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.

Speed tests at speed 1 on STDHD(using 4 tiles) set show that the
average speedups of the encoding pass(second pass in the 2-pass
encoding) is 7% while using 2 threads, 16% while using 4 threads,
85% while using 8 threads, and 116% while using 16 threads.

Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-15 00:49:34 +00:00
clang-format
4b402746ca apply clang-format
Change-Id: I75e4a9e0b37bd4586f26c8d6c1fa27f3f6ff1bce
2017-02-14 12:45:52 -08:00
Marco
219cdab676 vp9: Add feature to use block source_sad for realtime mode.
Only for speed >= 7, and affects skipping of intra modes.
Threshold is set low for now, needs to be tuned.
Small/no difference in metrics on rtc clips.

Change-Id: If9bdbd43f08d1f80407cdd2e9e5e96780dcd2424
2017-01-20 11:57:02 -08:00
Jerome Jiang
ee5b29ae30 vp9: Stop copying partition every a fixed number of frames.
Avoid quality loss when copying partition of superblock with large motions.
Maximum consecutively copied frames can be set (currently 5).

Change-Id: I11c30575514f02194c0f001444cf4021609e5049
2017-01-18 11:23:59 -08:00
Jerome Jiang
9152d434dc vp9: Disable partition copy when resizing is enabled.
Change-Id: I4fa3262e0f1c4018604c954b020ec5d1e3d1465c
2017-01-17 18:21:31 -08:00
Jerome Jiang
255866419d Merge "vp9: Set low variance flag when partition is copied." 2017-01-17 21:02:52 +00:00
Jerome Jiang
0c65aed099 vp9: Set low variance flag when partition is copied.
Also set the flag to 1 when exit early choosing 64x64 block
such that skipping new mv for golden works in these scenerios.

Change the size of prev_segment_id to the number of superblocks
to save memory.

Borg test shows quality regression of 0.012% on average PSNR
and 0.035% on SSIM.

Change-Id: I5014224c8617d439d35c66ece3fed9ae30b31d23
2017-01-17 11:14:50 -08:00
Marco
159cc3b33c vp9: Add speed feature flag for computing average source sad.
If enabled will compute source_sad for every superblock on every frame,
prior to encoding. Off by default, only on for speed=8 when
copy_partition is set.

Change-Id: Iab7903180a23dad369135e8234b7f896f20e1231
2017-01-13 11:52:12 -08:00
Jerome Jiang
f129e09529 vp9: Turn on the partition copy for speed 8. Tune threshold.
For speed 8, it speeds up the encoding on android by 6% for QVGA and
7.4% for VGA with the new threshold. Overall PSNR is improved by 0.667
for rtc.

Change-Id: I4a644560b32c0b5b4e9f49ffb953d000413a3732
2017-01-11 10:48:16 -08:00
Jerome Jiang
198b834c97 vp9: Set less aggresive short_circuit_low_temp_var for HD at speed 8.
Quality improved by 1.866 and 0.386 for two noisy clips (dark720p and
marcooffice720p), respectively.

Change-Id: Ib33a7672ae9ca53da156208f7cd13f07b5543e44
2017-01-09 16:44:07 -08:00
Jerome Jiang
267e73446c vp9: Enable more aggresive short circuit for speed 8.
Set short_circuit_low_temp_var to 3 for speed 8 for all res.
No strong visual difference on all clips.

Change-Id: Ia6d9a314291ab1c14d5421bbdd769974083aeb2a
2017-01-06 10:23:34 -08:00
Jerome Jiang
72746c079d vp9: Set short circuit to level 3 for VGA for speed 8.
vp9: Set short circuit to level 3 for VGA for speed 8. Also change the
threshold_32x32 to 5/8*thresholds[1] to improve quality regression
caused to VGA clips.

Change-Id: Ia1590e91e7cb22be78d5b85013387bb1be4272e3
2017-01-04 11:28:31 -08:00
Jerome Jiang
1d5ca84df6 vp9: Add feature to copy partition from the last frame.
Add feature to copy partition from the last frame.
The copy is only done under certain conditions that SAD is below threshold.
Feature is currently disabled, until threshold is tuned.
Feature will be initially used for Speed 8 (ARM).

Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.

Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
2016-12-19 16:24:03 -08:00
Jingning Han
f473e892f7 Merge "Enable asymptotic closed-loop encoding decision" 2016-11-19 04:12:55 +00:00
Jerome Jiang
360217a233 vp9: Speed 8: More aggresive golden skip for low res.
Add a new, more aggresive short circuit: short_circuit_low_temp_var = 3 to skip
golden of any mode when variance is lower than threshold for low res.
This change only affects speed = 8, low resolution.

Metrics for avgPSNR/SSIM on rtc_derf (low resolution) show loss of
0.27/0.31%.
On Nexus 6, the encoding time is reduced by ~2.3% on average across all
low-res clips.

Visually little change on rtc_derf clips.

Change-Id: Ia8f7366fc2d49181a96733a380b4dbd7390246ec
2016-11-15 13:56:27 -08:00
Jingning Han
44f8ee7258 Enable asymptotic closed-loop encoding decision
This commit enables asymptotic closed-loop encoding decision for
the key frame and alternate reference frame. It follows the regular
rate control scheme, but leaves out additional iteration on the
updated frame level probability model. It is enabled for speed 0.

The compression performance is improved:

lowres 0.2%
midres 0.35%
hdres  0.4%

Change-Id: I905ffa057c9a1ef2e90ef87c9723a6cf7dbe67cb
2016-11-14 09:22:55 -08:00
Marco
a7d116aa67 vp9: Speed=8 real-time: Keep the bias_golden feature on.
Small/no change in metrics on RTC set, speed increase by 2-3%.

Change-Id: Iee997bd7433e8e508216e9267b1c31c5a9aa5121
2016-10-20 17:03:51 -07:00
Marco
57c6bf291e 1 pass vbr: Allow for lookahead alt-ref in real-time mode.
For 1 pass vbr real-time mode:
Allow for the usage of alt-ref frame when non-zero lag-in-frames is used.
Use non-filtered alt-ref, and select usage based on fast scene/content
analysis/detection within the lag of frames.

Positive gains on ytlive set: overall avgPSNR ~3-4%.
Several clips are up between 5-14%, a few clips are neutral/small change.

Current speed decrease is about ~5-10%.

Use the flag USE_ALTREF_FOR_ONE_PASS to enable this feature
(off by default for now).

Change-Id: I802d2bf3d44f9cf01f6d15c76be9c90192314769
2016-10-11 10:13:17 -07:00
Marco
d017548be6 vp9 real-time mode: Change loopfilter speed feature at speed 8.
For real-time mode at speed 8: turn off MINIMAL_LF at speed 8,
for non-screen content mode.

Visually better, avgPSNR/SSIM on rtc set go up by ~4-5%.
Speed decrease of about ~3%.

Change-Id: I8eb69330f02e0ceece1507d43cfc8a049a1d8291
2016-09-29 12:59:01 -07:00
paulwilkins
6fc07a217d Modified resize loop constraints.
Using a tighter resize constraint on undershoot seems to help
results (especially SSIM) as significant undershoot on a frame
seems to have more of a damaging impact than overshoot.

This patch has been tuned so that in local testing using the
derf set it is encode speed neutral for speed  setting 2.

Average quality result for speed 2 (psnr,ssim) were  as follows:-

 lowres  0.039,  0.453
 midres  0.249, 0.853
 hdres  0.159, 0.659
 NetFlix -0.241, 0.360

Change-Id: Ie8d3a0d7d6f7ea89d9965d1821be17f8bda85062
2016-08-31 12:45:49 +01:00
Paul Wilkins
129814fcb4 Merge "Adjust coefficient optimization and tx_domain rd speed features." 2016-08-30 16:54:40 +00:00
Paul Wilkins
badd32d914 Merge "Add ALLOW_RECODE_FIRST speed mode." 2016-08-26 15:46:45 +00:00
paulwilkins
dc42f343ae Add ALLOW_RECODE_FIRST speed mode.
This patch is to address concerns that changes to allow
recodes on the first frame in each ARF group do not give a
good enough speed quality trade off for speed 2. Though the
average impact  on encode speed is 1-2%, for some hard clips
it is > 5% rise.  For speed 1 this is less an issue and for Speed 0
the previous patch actually  improves speed.

Change-Id: Ie1bcefdbfdf846d3f4428590173f621465dffe3a
2016-08-26 11:43:47 +01:00
paulwilkins
635ae8bdc1 Adjust coefficient optimization and tx_domain rd speed features.
Previously Tx domain rd was used in all cases above speed 0.
Coefficient optimization was only enabled for best and speed 0.

This patch selectively sets these features at other speed settings
based on block complexity.

For the Netflix and HD sets in particular the quality gains are
large compared to the speed hit. At speed 1 the average psnr
gain in the NF set  is > 2.5% with one clip coming in at 18%
and some points almost 30%.  Average gains for the lower
resolution test sets are around 1%.

The gains are biggest at low Q so some further optimization
may be possible.

Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
2016-08-25 15:36:16 +01:00
Yunqing Wang
ef98f49cb0 Disable split mode in 4k video encoding
Disabled the split mode while encoding 4k video to speed
up the encoder.

Borg test result on 4k set:
Overall PSNR: +0.029%; SSIM: +0.009%.
Average encoder speedup at speed 2 is 2.5%.

Change-Id: I1519c658f07c3ac838affbe5aff0ed9b94f3f8f4
2016-08-22 19:46:44 -07:00
Yunqing Wang
37169c0bd4 Merge "Adjust speed features for 4k video encoding" 2016-08-19 23:11:05 +00:00
Yunqing Wang
fe488cceff Adjust speed features for 4k video encoding
Adjusted speed 2 features to speed up 4k video encoding.
BDBR results from borg test:
PSNR: +0.313%; SSIM: +0.268%.
Average speedup: 8.5%

Change-Id: I1e2695a01fb3f3817c1df4480e184c2aed8f2eba
2016-08-19 09:30:32 -07:00
JackyChen
8be7e572a7 vp9 svc: SVC encoder speed up.
Bias towards base_mv and skip 1/4 pixel motion search when using base mv.
2~3% speed up for 2 spatial layers, 3~5% speed up for 3 spatial layers.
PSNR loss:
(2 layers) 0.07dB for gips_stationary, 0.04dB for gips_motion;
(3 layers) 0.07dB for gips_stationary, 0.06dB for gips_motion.

Change-Id: I773acbda080c301cabe8cd259f842bcc5b8bc999
2016-08-18 11:25:45 -07:00
Marco
7eb7d6b227 vp9 non-rd pickmode: Add limit on newmv-last and golden bias.
Add option, for newmv-last, to limit the rd-threshold update for early exit,
under a source varianace condition.
This can improve visual quality in low texture moving areas,
like forehead/faces.

Also add bias against golden to improve the speed/fps,
will little/negligible loss in quality.

Only affects CBR mode, non-svc, non-screen-content.

Change-Id: I3a5229eee860c71499a6fd464c450b167b07534d
2016-08-17 14:33:44 -07:00
paulwilkins
5d881770e5 Change default recode rule for good speed 0 and best.
Changes the default recode rule for Speed 0 and best quality
from ALLOW_RECODE to ALLOW_RECODE_KFARFGF.

Tested on the NF, hdres, midres and lowres test sets, this setting
when combined with patch I40cb559... now performs "as well" in
metrics terms (in fact it came out a tiny amount better overall)
but encode time is 9.6%  faster (measured as the average
from 27 mid rate local encodes on clips in the derf/lowres set.

Change-Id: I8c781c0cdfa3a9929cd9406d15582fce47d6ae3b
2016-08-15 10:52:54 +01:00
clang-format
e0cc52db3f vp9/encoder: apply clang-format
Change-Id: I45d9fb4013f50766b24363a86365e8063e8954c2
2016-08-02 16:47:11 -07:00
Jingning Han
efccbc9fb5 Disable trellis optimization when lossless is on
Disable trellis coefficient optimization when the lossless mode
is turned on.

Change-Id: I9001bf626e86dc3c8c32331ede04fd39036e5f7c
2016-07-12 09:00:16 -07:00
Jingning Han
62aa642d71 Enable uniform quantization with trellis optimization in speed 0
This commit allows the inter prediction residual to use uniform
quantization followed by trellis coefficient optimization in
speed 0. It improves the coding performance by

lowres 0.79%
midres 1.07%
hdres  1.44%

Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
2016-07-07 12:25:33 -07:00
Jingning Han
e357b9efe0 Support measure distortion in the pixel domain
Use pixel domain distortion metric in speed 0. This improves the
compression performance by 0.3% for both low and high resolution
test sets.

Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d
2016-07-06 18:25:17 -07:00
JackyChen
f9c0587200 vp9: Encoding cycle reduction for speed 8.
1. Skip golden non-zeromv and newmv-last for bsize >= 16x16 if the
temporal variance obtained from choose_partitioning is very low.
2. Skip horz and vert INTRA mode for speed 8.

This change works best on the clips with little noise and with some
motion (e.g. gips_motion which has > 5% speed up). PSNR drop is 1.78%
on rtc test set, no obvious visual quality regression found.

Change-Id: Ib43b5b20e67809d03c5a6890818ddff59e1fc94a
2016-06-13 09:33:22 -07:00
JackyChen
891dbe1e52 vp9: Fix valgrind failure for short circuit on low temporal vaiance block.
Add check for actual split before using the variance of the split.

Change-Id: If0f93248be0b16d17738675d16c90516054dad2b
2016-06-02 15:56:58 -07:00
JackyChen
a32f341539 Disable short circuit feature for low temporal variance.
The featrue fails in libvpx_unit_tests-valgrind. Will re-enable it after
fixing the issue.

Change-Id: I8ba132f04e98f4615b31fbff2097eda83c5e42bc
2016-06-02 09:45:00 -07:00
jackychen
bacc67f4a8 vp9: Skip some modes when variance is low for big blocks, for 1 pass real-time.
Skip intra-mode and some inter-modes (newmv, nearmv, nearestmv) for
golden frame if the variance got from choose_partitioning is very low.
Only for 1 pass real-time CBR mode and bsize >= 32x32, it has ~2.5%
speed up with less than 0.1% PSNR drop for rtc test set. Don't see
visual regression.

Change-Id: I70efbc95a1007231ae36f02c5b2fbf6cd35077ad
2016-06-01 13:54:18 -07:00