Commit Graph

511 Commits

Author SHA1 Message Date
Marco
fc83fcb7c4 vp9: SVC: fix to allow output of denoised result.
Change-Id: Iaf55cfb5e9621d074eb33d6a32f184e4777968f8
2017-03-29 14:02:54 -07:00
Marco
66c6b4d6fc vp9: 1 pass: Move source sad computation into encodeframe loop.
Refactor to split the 1 passs source sad computation into scene
detection (currently used for VBR and screen-content mode), and
superblock based source sad computation (used in non-rd CBR mode).

This allows the source sad computation for CBR mode to be
multi-threaded.

No change in compression.

Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
2017-03-27 11:11:05 -07:00
Marco
07ad5a15c2 vp9: Fix to condition on using source_sad for 1 pass real-time.
Make the source_sad feature work properly for cases of VBR or
screen_content with SVC.

Added unittest for SVC with screen-content on.

Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
2017-03-24 10:21:47 -07:00
Marco
06c8713e89 vp9: Use sb content measure to bias against golden.
For each superblock, keep track of how far from current frame
was the last significant content change, and use that (along
with GF distance), to turnoff GF search in non-rd pickmode.

Only enabled for speed >= 8.

avgPNSR on RTC/RTC_derf down by ~0.9/1.2.
Speedup on mac: ~3-5%.
Speedup on arm: 3.6% for VGA and 4.4% for HD.

Change-Id: Ic3f3d6a2af650aca6ba0064d2b1db8d48c035ac7
2017-03-20 12:42:26 -07:00
Jerome Jiang
bf40776aa4 Merge "Refactor: Change cpi->resize_state to enum values." 2017-03-16 16:43:42 +00:00
Marco
a340c64a79 vp9: Fix some issues with denoiser and SVC.
Fix the update of the denoiser buffer when the base
spatial layer is a key frame. And allow for better/lower
QP on high spatial layers when their base layer is key frame.

Change-Id: I96b2426f1eaa43b8b8d4c31a68b0c6d68c3024a2
2017-03-15 17:19:17 -07:00
Jerome Jiang
b5f7f7737a Refactor: Change cpi->resize_state to enum values.
Change-Id: Iab1409b0fc1175bc5a14afc4749a08c536c98c41
2017-03-15 17:16:17 -07:00
Jerome Jiang
27d5a57072 Merge "vp9: Using source sad for speedup for dynamic resizing." 2017-03-15 00:03:52 +00:00
Jerome Jiang
2fa7092808 Merge "vp9: Enable row multithreading for SVC in real-time mode." 2017-03-14 23:29:46 +00:00
Jerome Jiang
02463273c9 vp9: Using source sad for speedup for dynamic resizing.
Only for speed >= 7.

Change-Id: I3ac85fbb4023cf7e6f8333806b345b0174382a09
2017-03-14 15:47:19 -07:00
Marco
f0a22b23fe vp9: Fix to source_sad feature for SVC.
Allow speed feature sf->use_source_sad to be used
on highest spatial layer for SVC.

Change-Id: I260eb0478902764f49f83e43b17024fe86ff3b22
2017-03-13 11:00:40 -07:00
Marco
ffb3c50da1 vp9: Enable row multithreading for SVC in real-time mode.
Enable row-mt for SVC for real-time mode, speed >=5.

Add the controls to the sample encoders, but keep it off for now.
Add the control and enable it for the 1 pass CBR unittests.

For speed 7, 3 layer SVC, 2 threads, row-mt enabled gives about ~5% speedup.

Change-Id: Ie8e77323c17263e3e7a7b9858aec12a3a93ec0c1
2017-03-10 01:01:07 +00:00
James Zern
2f31a16445 move vp9_scale_and_extend_frame_c to vp9_frame_scale.c
this is similar to the x86 configuration and helps mitigate an issue
with a circular dependency between this function and the ssse3 variant
causing an outsized increase in binary size (~300K for chrome)
chrome.dll:
.text 255B000 -> 252B000
.data 7B000 -> 75000
-221184 bytes

BUG=chromium:697956

Change-Id: Ic95b142ecd62dd4f1795788aa27dd8fab59b708c
2017-03-08 21:13:50 -08:00
Marco
45de35fc58 vp9: Fix for denoising with SVC.
Fix the conditon for getting last_source when denoising is on.
This avoids unneeded scaling in the case of SVC.

No change in quality.

Change-Id: I32c1c2c9085104da51af8535716bcc4d55fb0f42
2017-03-08 09:45:58 -08:00
Vignesh Venkatasubramanian
453f18040f vp9,realtime: Enable row multithreading for non-rd
Enable row level multithreading for realtime encodes where non-rd
path is used (speed >= 5).

Change-Id: I5439cb49a02171166d8e1de06c7d5e6f8e819a41
2017-03-02 11:03:56 -08:00
Vignesh Venkatasubramanian
5881601488 vp9: Rename new_mt to row_mt
new_mt is a very generic name that will get obsolete soon enough.
Since this is exposed as a codec control, renaming it to row_mt to
signify row level paralellism. Also renaming the ETHREAD_BIT_MATCH
codec control to ROW_MT_BIT_EXACT.

Change-Id: Ic7872d78bb3b12fb4cf92ba028ec8e08eb3a9558
2017-02-27 09:43:26 -08:00
Jerome Jiang
0998a146d4 Make vp9_scale_and_extend_frame_ssse3 work for hbd when bitdepth = 8.
Only works for bitdepth = 8 when compiled with high bitdepth flag.
4x speed ups for handling 1:2 down/upsampling.

Validated manually for:
1) Dynamic resize for a single layer encoding
2) SVC encoding with 3 spatial layers
Results are bitexact with the patch and the speed gain (~4x) in the
scaling was verified.

BUG=webm:1371

Change-Id: I1bdb5f4d4bd0df67763fc271b6aa355e60f34712
2017-02-23 20:40:28 -08:00
Yunqing Wang
66f36f4735 Merge "Refactored the row based multi-threading code" 2017-02-22 16:55:04 +00:00
Jerome Jiang
b1dcaf7f1e Merge "Fix segmentation fault caused by denoiser working with spatial SVC." 2017-02-22 04:44:55 +00:00
Marco
7f2daa74a0 vp9: Incorporate source sum_diff into non-rd partition thresholds.
Increase the variance partition thresholds for superblocks that
have low sum-diff (from source analysis prior to encoding frame).
Use it for now only for speed >= 7 or for denoising on.

Small change on metrics for rtc set: less than ~0.1 avgPNSR decrease
on RTC set, for both speed 7 and 8.

Change-Id: I38325046ebd5f371f51d6e91233d68ff73561af1
2017-02-21 17:22:11 -08:00
Jerome Jiang
0d1e5a21c4 Fix segmentation fault caused by denoiser working with spatial SVC.
Re-enable the affected test.
BUG=webm:1374

Change-Id: I98cd49403927123546d1d0056660b98c9cb8babb
2017-02-21 09:38:28 -08:00
Ranjit Kumar Tulabandu
97d6a4cbd1 Refactored the row based multi-threading code
Modified the code to facilitate bit-match tests in first pass
Added unit-tests to test the row based multi-threading behavior for bit-exactness

Change-Id: Ieaf6a8f935bb1075597e0a3b52d9989c8546d7df
2017-02-20 16:13:45 +05:30
paulwilkins
d218b0914e cosmetics: Fix spelling mistake in compile flag name.
agressive -> aggressive

after:
ce7b38459 Aggressive VBR method.

Change-Id: Ie0f30b1bbc77ed9f32bec047b4a9b3d0cf4853f5
2017-02-16 14:51:31 -08:00
paulwilkins
945ccfee59 Additional first pass stats.
Added counts that split the intra coded blocks into low and high variance.

Change-Id: Ic540144b34d5141659081bb22f7ee16fd6861f14
2017-02-15 10:44:37 +00:00
Paul Wilkins
7635ee0f37 Merge "Aggressive VBR method." 2017-02-15 10:37:02 +00:00
Ranjit Kumar Tulabandu
71061e9332 Row based multi-threading of encoding stage
(Yunqing Wang)
This patch implements the row-based multi-threading within tiles in
the encoding pass, and substantially speeds up the multi-threaded
encoder in VP9.

Speed tests at speed 1 on STDHD(using 4 tiles) set show that the
average speedups of the encoding pass(second pass in the 2-pass
encoding) is 7% while using 2 threads, 16% while using 4 threads,
85% while using 8 threads, and 116% while using 16 threads.

Change-Id: I12e41dbc171951958af9e6d098efd6e2c82827de
2017-02-15 00:49:34 +00:00
paulwilkins
ce7b38459a Aggressive VBR method.
VBR method that allows a wider Q range for the first normal frame
in each ARF group and then centers the min - max range for the rest of
the arf group on the chosen Q value for that first frame.

This allows for quite rapid adjustment of the active Q range even if the
initial estimate is poor.

In some cases where the ARF frames themselves are tending to
undershoot but the normal frames are overshooting this can still give
net undershoot. This can be corrected by allowing a larger Q delta for
arf frames but is usually is a sign that the allocation to the arfs was to
high.

Change-Id: Icec87758925d8f7aeb2dca29aac0ff9496237469
2017-02-13 15:42:11 +00:00
Paul Wilkins
c3f095c8b3 Merge "Fix to avoid abrupt relaxation of max qindex in recode path" 2017-02-09 17:17:55 +00:00
Yunqing Wang
b106abe570 Merge "Row based multi-threading of ARNR filtering stage" 2017-02-07 19:55:41 +00:00
Ranjit Kumar Tulabandu
91f01a2060 Row based multi-threading of ARNR filtering stage
Change-Id: Ic238d32c7e10b730342224ab56712a89a6026a8f
2017-02-07 14:03:19 +05:30
Ranjit Kumar Tulabandu
12ec948490 Changes to facilitate multi-threading of encoding stage
Modified the encoding stage to have row level entry points with relevant
initializations and to access the token information at row level

Change-Id: Ife10e55a7c1a420ee906d711caf75002688d9e39
2017-02-02 14:47:13 +05:30
Ranjit Kumar Tulabandu
6985a0f516 Disable multi-threading in first pass for SVC encoding
BUG=webm:1366

Change-Id: I204ef8496884ba7c4debe64f23f50d298b4090c3
2017-01-27 09:41:53 -08:00
Ranjit Kumar Tulabandu
8b0c11c358 Multi-threading of first pass stats collection
(yunqingwang)
1. Rebased the patch. Incorporated recent first pass changes.
2. Turned on the first pass unit test.

Change-Id: Ia2f7ba8152d0b6dd6bf8efb9dfaf505ba7d8edee
2017-01-24 15:48:02 -08:00
Marco
219cdab676 vp9: Add feature to use block source_sad for realtime mode.
Only for speed >= 7, and affects skipping of intra modes.
Threshold is set low for now, needs to be tuned.
Small/no difference in metrics on rtc clips.

Change-Id: If9bdbd43f08d1f80407cdd2e9e5e96780dcd2424
2017-01-20 11:57:02 -08:00
Jerome Jiang
ee5b29ae30 vp9: Stop copying partition every a fixed number of frames.
Avoid quality loss when copying partition of superblock with large motions.
Maximum consecutively copied frames can be set (currently 5).

Change-Id: I11c30575514f02194c0f001444cf4021609e5049
2017-01-18 11:23:59 -08:00
Jerome Jiang
255866419d Merge "vp9: Set low variance flag when partition is copied." 2017-01-17 21:02:52 +00:00
Jerome Jiang
0c65aed099 vp9: Set low variance flag when partition is copied.
Also set the flag to 1 when exit early choosing 64x64 block
such that skipping new mv for golden works in these scenerios.

Change the size of prev_segment_id to the number of superblocks
to save memory.

Borg test shows quality regression of 0.012% on average PSNR
and 0.035% on SSIM.

Change-Id: I5014224c8617d439d35c66ece3fed9ae30b31d23
2017-01-17 11:14:50 -08:00
Ranjit Kumar Tulabandu
5f21aba4b0 Fix to avoid abrupt relaxation of max qindex in recode path
The fix relaxes the max qindex based on the data from previous loop of
coding if output frame size is greater than maximum frame size allowed

Change-Id: Iac1f63ec67559d68766e090a7cbb80b812b2560f
2017-01-16 18:03:27 +05:30
Marco
159cc3b33c vp9: Add speed feature flag for computing average source sad.
If enabled will compute source_sad for every superblock on every frame,
prior to encoding. Off by default, only on for speed=8 when
copy_partition is set.

Change-Id: Iab7903180a23dad369135e8234b7f896f20e1231
2017-01-13 11:52:12 -08:00
Marco
7e3a82c384 vp9: Make the denoiser work with spatial SVC.
If enabled denoiser will only denoise the top spatial layer for now.

Added unittest for SVC with denoising.

Change-Id: Ifa373771c4ecfa208615eb163cc38f1c22c6664b
2017-01-10 17:23:58 -08:00
Marco
91fc730d83 vp9: 1 pass cbr: Adjustments to usage of gf_cbr_boost and aq=3 mode.
When aq=3 mode is on and the gf_cbr_boost is set: make sure golden frame
is always refreshed, and don't incorporate segement cost in qp setting
on the boosted golden frame.

Better performance on RTC set with gf_cbr_boost on,
for example with gf_cbr_boost=50, gains from ~0.5-3%.

Change-Id: Ie811f5e4d444ff3320bd6e2c1745b2c4c09a8460
2017-01-10 09:42:06 -08:00
Hui Su
c7e2bd6298 Merge "Add support for VP9 level targeting" 2017-01-07 00:55:41 +00:00
hui su
337ad83e58 Add support for VP9 level targeting
Constraints on encoder config:
-target_bandwidth is no larger than 80% of level bitrate limit
-target_bandwidth * (1 + max_over_shoot_pct) is no larger than
88% of level bitrate limit
-min_gf_interval is no smaller than level limit
-tile_columns is no larger than level limit

Constraints on rate control:
-current frame size plus previous three frames' size is no larger
than the CPB level limit
-current frame size is no larger than 50%/40%/20% of the CPB
level limit if it's a key/alt-ref/other frame.

Change-Id: I84d1a2d6d6e3c82bfd533b3309ce999cfaba2c8b
2017-01-06 10:07:31 -08:00
Jerome Jiang
afc8c4836f vp9: Compute source sad for every superblock when partition copy is on.
The source sad could be used to copy the partition without going into
choose_partitioning function to speed up vp9 encoding. Computing source
sad takes little time. Speed test on Android and Linux shows little
encoding time gain (less than 1.4%).

Turned off for now since partition copy is turned off.

Change-Id: I61c9d5b8f22329760cb29a4ee30a7f9c232ce8d3
2017-01-06 17:59:02 +00:00
Jerome Jiang
1d5ca84df6 vp9: Add feature to copy partition from the last frame.
Add feature to copy partition from the last frame.
The copy is only done under certain conditions that SAD is below threshold.
Feature is currently disabled, until threshold is tuned.
Feature will be initially used for Speed 8 (ARM).

Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.

Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
2016-12-19 16:24:03 -08:00
Jingning Han
44f8ee7258 Enable asymptotic closed-loop encoding decision
This commit enables asymptotic closed-loop encoding decision for
the key frame and alternate reference frame. It follows the regular
rate control scheme, but leaves out additional iteration on the
updated frame level probability model. It is enabled for speed 0.

The compression performance is improved:

lowres 0.2%
midres 0.35%
hdres  0.4%

Change-Id: I905ffa057c9a1ef2e90ef87c9723a6cf7dbe67cb
2016-11-14 09:22:55 -08:00
Debargha Mukherjee
f93305aa07 Merge "Speed-up recode loop for extreme bitrate diffs" 2016-11-03 17:04:17 +00:00
Paul Wilkins
295cd3b493 Merge "Fixed bug in formatting of debug stats." 2016-11-02 17:10:07 +00:00
paulwilkins
de76d2e315 Fixed bug in formatting of debug stats.
Fixed formatting bug introduced by the fix to BUG=webm:1322
( Iedc4477aef1746aa0a4f84d88a1156296fd3ba87)

Change-Id: I715ee446c0e8584967ab87ba4e355759dd394187
2016-11-02 09:38:18 +00:00
Debargha Mukherjee
1cd987d922 Speed-up recode loop for extreme bitrate diffs
Adjusts the q adjustement step depending on how far the
projected and target rates differ.

Change-Id: I498d03523ca233a270512ca3972c372daa4ca2a8
2016-10-27 11:08:44 -07:00