Set short_circuit_low_temp_var to 3 for speed 8 for all res.
No strong visual difference on all clips.
Change-Id: Ia6d9a314291ab1c14d5421bbdd769974083aeb2a
The source sad could be used to copy the partition without going into
choose_partitioning function to speed up vp9 encoding. Computing source
sad takes little time. Speed test on Android and Linux shows little
encoding time gain (less than 1.4%).
Turned off for now since partition copy is turned off.
Change-Id: I61c9d5b8f22329760cb29a4ee30a7f9c232ce8d3
vp9: Set short circuit to level 3 for VGA for speed 8. Also change the
threshold_32x32 to 5/8*thresholds[1] to improve quality regression
caused to VGA clips.
Change-Id: Ia1590e91e7cb22be78d5b85013387bb1be4272e3
For out-of-range cases, returned UINT_MAX instead of INT_MAX in the
sub-pixel mv search to be consistent with the "uint32_t" return type.
Change-Id: I8e206d771228c13d89bafbbe9f14722c8ecc6a7a
The function 'vp9_find_best_sub_pixel_tree_pruned_more' is modified
to return INT_MAX for handling invalid MV cases from UINT32_MAX.
yunqingwang:
patch 3: rebased on top of the tree.
patch 4: The return type of vp9_find_best_sub_pixel_tree* was changed
to uint32_t to fix ubsan warnings. Changing UINT_MAX back to INT_MAX
was not quite right. Patch 4 modified vp9_temporal_filter.c to accept
uint32_t.
(Note: Inconsistency exists in vp9_find_best_sub_pixel_tree*, which
will be fixed in a separate CL.)
Change-Id: Ib1a79dc2aa41ea6335c21669c76883cdbb7e0535
When source frame is altref, we only do zero-mv mode, so we can skip
the find_predictors(). No change in compression.
Small speed gain, ~1%.
Only affects 1 pass vbr with lookhead altref, for ytlive with
the macro flag USE_ALTREF_FOR_ONE_PASS on.
Change-Id: I9318c5da8521f017bf54919cd652438b3a6313d1
Remove superfluous test. Produces a small improvement in instruction scheduling.
Measured a 1% to 1.5% reduction in execution time for routine vp9_optimize_b
with different compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I2bf248d4c25fc0256147d7a8766ff9108ae9cba3
Add feature to copy partition from the last frame.
The copy is only done under certain conditions that SAD is below threshold.
Feature is currently disabled, until threshold is tuned.
Feature will be initially used for Speed 8 (ARM).
Under extreme case of always copying partition for speed 8:
Encode time is reduced by 5.4% on rtc_derf and 7.8% on rtc.
Overall PSNR reduced by 2.1 on rtc_derf and 0.968 on rtc.
Change-Id: I1bcab515af3088e4d60675758f72613c2d3dc7a5
Simplify address arithmetic on token_costs to reduce the number of generated
instructions that are used for address arithmetic inside routine
vp9_optimize_b. It also helps improve instruction scheduling depending on
compiler and optimization level.
Measured a 9.3% reduction in retired instructions and 5.3% reduction in
execution time for this routine with GCC v4.8.4 and optimization flags -O3,
and a reduction of up to 11.6% in execution time with other compilers.
No change in behavior.
TEST=Verified that encoded files match bit for bit, with and without this
change.
BUG=b/33678225
Change-Id: I6098650fb5cd2aa04e014fe6e68ca20761f3a21f
Correctly set interp_filter to SWITCHABLE for INTRA mode.
Also reduce threshold on noise level for re-evaluating zeromv.
Change-Id: Id32c01e193209fb380aa07204f0be3babf29f70a
For when denoising enabled: change condition to enable
the recheck_zeromv_after_denoising for only very high noise level.
This is causing an issue, so enabling it for very high noise
to effectively shut it off.
Change-Id: Ic40d6025f3f398338cedd270d17c0ccd9a3daa84
The flag USE_ALTREF_FOR_ONE_PASS allows for alt-ref lookahead
in 1 pass vbr (from https://chromium-review.googlesource.com/#/c/365498).
This change is to make sure this macro flag only has effect if
the config flag cpi->oxcf.enable_auto_altef is also on.
No change in ytlive encoding, as USE_ALTREF_FOR_ONE_PASS is not
yet enabled.
Change-Id: I1a69681e4a15c5244581a3dab4587fca08f02e0f
use_base_mv assumes 2x2 scaling, so fix is to shutoff
this feature unless spatial scale factors are 2.
Added svc unittest for 2 spatial layers with 5x5 scaling,
which generates the issue without this fix.
Also fix some settings in svc unittest:
let the speed setting vary (from 5 to 8), and enable static threshold.
BUG=webm:1344
Change-Id: Idfd0a6c633c21b49a0479601506302cfe974e30e
One of the first pass stats "new_mv_count" is no longer used in VP9,
and is removed. This also makes it easy to implement a multi-threaded
first pass. This change doesn't affect the coding performance, which
has been verified by borg tests.
Change-Id: I4c7c7bf9465fda838eb230814ef0c631c068c903
Use the segment weight factor based on the target (cr->percent_refresh)
if it less than the current estimate (avergae of past usage and target).
Small improvement at low bitrates.
Change-Id: Iba8fd909e203f94458901366d3a991f7ea854d49
Increase the motion threshold and qp-delta for segment#2 boost.
This can increase the frame-drop at low bitrates, but generally
better spatial quality.
Only affects real-time mode with aq-mode=3, at very low bitrates.
Change-Id: I5ccb784667f70d0c27d369806b93b1f93d5605d1
Use the same feature as https://chromium-review.googlesource.com/#/c/411327/,
but allow it to be used for speed = 6 and 7, where
short_circuit_low_temp_var = 1.
Speed up of ~2-3% for speed 7, with little/no loss in compression.
Change-Id: I263a0f261ad9929034392d68f0153dc6376fdb5f
Add a new, more aggresive short circuit: short_circuit_low_temp_var = 3 to skip
golden of any mode when variance is lower than threshold for low res.
This change only affects speed = 8, low resolution.
Metrics for avgPSNR/SSIM on rtc_derf (low resolution) show loss of
0.27/0.31%.
On Nexus 6, the encoding time is reduced by ~2.3% on average across all
low-res clips.
Visually little change on rtc_derf clips.
Change-Id: Ia8f7366fc2d49181a96733a380b4dbd7390246ec
Changes only affects speed = 8 for low resolutions.
Metrics for avgPSNR/SSIM on rtc_derf (low resolutions) show loss of
0.5/0.6%.
On Nexus 6, the encoding time is reduced by ~5.9% on average across all
low-res clips.
Visually little/no change on rtc_derf clips.
Change-Id: I68dd50e558d72dcc1af8317d224bfae5e3bd872d
This commit enables asymptotic closed-loop encoding decision for
the key frame and alternate reference frame. It follows the regular
rate control scheme, but leaves out additional iteration on the
updated frame level probability model. It is enabled for speed 0.
The compression performance is improved:
lowres 0.2%
midres 0.35%
hdres 0.4%
Change-Id: I905ffa057c9a1ef2e90ef87c9723a6cf7dbe67cb
For noisy content, be more aggressive in skippping some blocks for
delta-qp to reduce noise pulsing artifact. Also treat frame boundary
case when dimension is not multiple of superblock size/64.
Only affects non-screen content case, and when source noise
is measured to be high (at least level kMedium).
Change-Id: Ib13a2a20ed1ce37ff3c44d95c3ef2635fd695222
Add condition that usable_ref_frame > LAST.
This is to avoid potentially skipping all last-nonzero mv modes,
if golden is used as a reference but skipped completely for the
current block.
This has no effect currenty, as we always consider testing golden
mode for each block.
Change-Id: I3182cf44664081935a90ed43aa7b32e710e60e22
Fixed formatting bug introduced by the fix to BUG=webm:1322
( Iedc4477aef1746aa0a4f84d88a1156296fd3ba87)
Change-Id: I715ee446c0e8584967ab87ba4e355759dd394187
This change is a step in a larger change to the way boost and interval are
determined for ARF and Key frames.
This patch contains some pluming for the general case but focuses on the
key frame boost calculation. This now relies more heavily on the rate at
which the error score increases between the primary and secondary reference
frame. This seems to be less fragile when dealing with different frame sizes.
For example larger image formats tend in the first pass to see a higher
% of intra coded blocks and the use of this number in calculating the frame
decay factor was leading to much lower boost numbers for 4K, for example,
than the same clip coded at 2K.
This change does give overall gains but they are MUCH larger for the 4K Netflix
set. For the 4K Netflix set the average gain is around 3% with some clips > 20%
whereas for the same set at 2K the average gain is 0.5-1%.
In general for small image formats the boost is most often reduced a little whereas
4K clips the boost is increased. There are some -ve cases such as Akiyo at 352x288
where the reduced boost hurts the metrics, especially for SSIM, even while
the set as a whole improves. This is most notable at very low Q and may be the
subject of a future patch.
Some common code for KF and ARF was separated in this patch for the purposes of
tuning but may later be re-merged if appropriate.
Change-Id: Iaa15ac5a58d2be89181100d95cef6a8dc4b12d0d
Fixes a case where recode is not triggered based on the value
of maxq passed into the recode loop test function.
BUG=b/32375284
Change-Id: I15ad985d0525c68e0443cfaf842440d2754b2266
Removed a couple of adjustments that no longer move the needle
much but complicate the process of tuning.
Change-Id: Ie320f5cf155e6aac14a4757ea9ada2cd59f27590
This patch modified the motion search counts used in:
https://chromium-review.googlesource.com/#/c/305640/
These 2 counts were originally added as thread data, and used to
make decisions in motion search. The tile encoding order can be
inconsistent while using different number of threads, which can
cause bitstream mismatch. Here moved them to tile data to solve
the issue.
BUG=webm:1322
Change-Id: Iedc4477aef1746aa0a4f84d88a1156296fd3ba87