Add condition that usable_ref_frame > LAST.
This is to avoid potentially skipping all last-nonzero mv modes,
if golden is used as a reference but skipped completely for the
current block.
This has no effect currenty, as we always consider testing golden
mode for each block.
Change-Id: I3182cf44664081935a90ed43aa7b32e710e60e22
Fixed formatting bug introduced by the fix to BUG=webm:1322
( Iedc4477aef1746aa0a4f84d88a1156296fd3ba87)
Change-Id: I715ee446c0e8584967ab87ba4e355759dd394187
vp9_init_macroblockd() resets the error_info to cm's global copy; this
needs to be set to the thread-level target to avoid jumping to the
incorrect stack, resulting in hang or crash.
broken since:
1f4a6c8 vp9/tile_worker_hook: add multiple tile decoding
includes v1.5.0, v1.6.0
BUG=629481
Change-Id: Icbf1696b25ba8c479e845fbf227b3c3ca73542f5
This change is a step in a larger change to the way boost and interval are
determined for ARF and Key frames.
This patch contains some pluming for the general case but focuses on the
key frame boost calculation. This now relies more heavily on the rate at
which the error score increases between the primary and secondary reference
frame. This seems to be less fragile when dealing with different frame sizes.
For example larger image formats tend in the first pass to see a higher
% of intra coded blocks and the use of this number in calculating the frame
decay factor was leading to much lower boost numbers for 4K, for example,
than the same clip coded at 2K.
This change does give overall gains but they are MUCH larger for the 4K Netflix
set. For the 4K Netflix set the average gain is around 3% with some clips > 20%
whereas for the same set at 2K the average gain is 0.5-1%.
In general for small image formats the boost is most often reduced a little whereas
4K clips the boost is increased. There are some -ve cases such as Akiyo at 352x288
where the reduced boost hurts the metrics, especially for SSIM, even while
the set as a whole improves. This is most notable at very low Q and may be the
subject of a future patch.
Some common code for KF and ARF was separated in this patch for the purposes of
tuning but may later be re-merged if appropriate.
Change-Id: Iaa15ac5a58d2be89181100d95cef6a8dc4b12d0d
Fixes a case where recode is not triggered based on the value
of maxq passed into the recode loop test function.
BUG=b/32375284
Change-Id: I15ad985d0525c68e0443cfaf842440d2754b2266
Removed a couple of adjustments that no longer move the needle
much but complicate the process of tuning.
Change-Id: Ie320f5cf155e6aac14a4757ea9ada2cd59f27590
This patch modified the motion search counts used in:
https://chromium-review.googlesource.com/#/c/305640/
These 2 counts were originally added as thread data, and used to
make decisions in motion search. The tile encoding order can be
inconsistent while using different number of threads, which can
cause bitstream mismatch. Here moved them to tile data to solve
the issue.
BUG=webm:1322
Change-Id: Iedc4477aef1746aa0a4f84d88a1156296fd3ba87
Re-use the tile worker threads to pack the bitstream in parallel
on a per-tile basis. Restricting this to real-time only for now
(further testing is needed to ensure this does not make 2-pass
worse in any case).
BUG=webm:1309
Change-Id: I8a80da7c5089b837d0df79a5c49d5e3022dfc8ec
In variance partition low resolutions may use varianace based on
4x4 average for better partitioning.
Increase the threshold for doing this at speed = 8.
Improves speed by ~5%, with little loss, < 1%, on RTC_derf set.
Change-Id: Ib5ec420832ccff887a06cb5e1d2c73199b093941
This reverts commit 9e8efa5b18.
this change causes ubsan warnings, failures in
vpxenc_vp9_webm_rt_multithread_tiled
BUG=webm:1309
Change-Id: I020c7be985c771bfff4b3de1afe51cc8edb980da
Add stronger condition for splitting 64x64, for low noise content.
This reduces dragging artifact near moving head.
Little/no change in metrics on RTC set.
Change-Id: I39b38cfd20f2ece53ff49c2aaf76ba9f82761be1
Re-use the tile worker threads to pack the bitstream in parallel
on a per-tile basis. Restricting this to real-time only for now
(further testing is needed to ensure this does not make 2-pass
worse in any case).
BUG=webm:1309
Change-Id: Ia2c982da56697756e12f02643f589189b3271d98
For 1 pass vbr real-time mode:
Allow for the usage of alt-ref frame when non-zero lag-in-frames is used.
Use non-filtered alt-ref, and select usage based on fast scene/content
analysis/detection within the lag of frames.
Positive gains on ytlive set: overall avgPSNR ~3-4%.
Several clips are up between 5-14%, a few clips are neutral/small change.
Current speed decrease is about ~5-10%.
Use the flag USE_ALTREF_FOR_ONE_PASS to enable this feature
(off by default for now).
Change-Id: I802d2bf3d44f9cf01f6d15c76be9c90192314769
Put limit on gf interval based on lag, and allow
for the adjustment on next gf group also on key frame.
Small/neutral change on ytlive metrics.
Change only affects 1 pass vbr real-time mode.
Change-Id: I339c8f4398848698b6e10fe9482c52ca661b94a5
It only handles the realloc constraint (preserving low elements) by
serendipity, and we don't actually rely on that behavior anyway.
Meanwhile the calls may do extra copying that gets immediately clobbered
by the callers.
Change-Id: I8dfa89e4a81084b084889c27bd272fdf85184e8d
This change will make the highbd txfm input range check more comprehensive
The 25-bit highbd input range is composed by
12 signal input bits + 7 bits for 2D forward transform amplification + 5 bits for
1D inverse transform amplification + 1 bit for contingency in rounding and quantizing
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1286
BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=651625
Change-Id: I04c0796edd7653f8d463fba5dc418132986131e7
These caused the following warning with GCC 5:
warning: logical not is only applied to the left hand side of
comparison [-Wlogical-not-parentheses]
assert(!is_compound == (cm->reference_mode == SINGLE_REFERENCE));
Change-Id: If296aabb2311ceb7d903b395c1549ef81c2cbf9b
(cherry picked from commit c6cf7a6111)
Rename vpx_lpf_horizontal_edge_8() to vpx_lpf_horizontal_16().
Rename vpx_lpf_horizontal_edge_16() to vpx_lpf_horizontal_16_dual().
Change-Id: I798ca8fbbd657d06d3db2bfb0fb3321168f49e52
This patch sets the 16x16 src_diff to zero and ensures correct calculation
of this_error for block sizes smaller than 16x16.
Change-Id: I7b7c02d267433c9f22c8ac9b8d5df2f499175172
change_config() may be called often in real-time application,
to update bitrate/framerate or qp-max/min.
No need to do update_frame_size() unless frame size has changed.
Change-Id: I23a51deade1e03adc91c468f9ffde3235298770c
For real-time mode at speed 8: turn off MINIMAL_LF at speed 8,
for non-screen content mode.
Visually better, avgPSNR/SSIM on rtc set go up by ~4-5%.
Speed decrease of about ~3%.
Change-Id: I8eb69330f02e0ceece1507d43cfc8a049a1d8291
Inline function called from test/dct16x16_test.cc wouldn't build due to:
invalid operands of types ‘__gnu_cxx::__enable_if<true, double>::__type
{aka double}’ and ‘int’ to binary ‘operator>>’
return (abs(ref->row) >> 3) < COMPANDED_MVREF_THRESH &&
this converts the test to abs() < COMPANDED_MVREF_THRESH << 3 which
hides the promotion issue.
Regression from commit de993a847f
BUG=webm:1291
Change-Id: I73b5943d07d5b61b709d299114216a2371a8fd62
Current version does not build with options:
--enable-vp9-highbitdepth --enable-coefficient-range-checking
Change-Id: Ic3285f1a3e0d6be88da7f2cd8fa5a631368dd03b
Reduce the filt_guess for 1 pass cbr on inter-frames.
This reduces visual artifact seen in rtc clip (jimred.vga),
and improves metrics on rtc set.
Metrics on rtc set for cbr mode overall positive, most clips are up:
Speed 7 rtc: avgPSNR/SSIM up by: ~2.6/3.9%
Speed 8 rtc: avgPSNR/SSIM up by: ~1.3/2.5%
Change-Id: Ia4eccea1c19d65b583516df28823cd756c49464f
Added a cap on the maximum boost for an arf based on interval length.
Fixed bug where by the image size was not accounted for in determining
two of the motion breakout thresholds.
Overall small gains of 0.2-0.4% psnr but on large image format clips with
slow zooms the gain may be as much as 20% or more (e.g. in_to_tree
at 1080P)
Change-Id: Id0a47391203026742daa9c97afac5705fd8c4dfb
Do nothing in vp9_highbd_iht#x#_##_add_c when input magnitude is beyond
20 bits. Note that, sign bit is not included here.
In the 20 bits, we use 12 bits for input signal, 7 bits for forward
transform amplification, and 1 bit for contingency in rounding and
quantizing
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1286
Change-Id: I332c6f68df4614fc2e7d2dc4c5bb0d0cff8a245c
the --enable-postproc-visualizer configure option remains as a no-op as
do the control names and values for compatibility
+ remove the corresponding debug flags from vpxdec: --pp-*
Change-Id: I4a001cd9962b59560d7d6bda6272d4ff32b8d37c
The vp9_mv_class0_tree is a balanced tree with two leafs and can
simply be coded as a boolean with probability class0[0].
Change-Id: If294dac825a5f945371092c74aa8e3f84cd962b6
(cherry picked from commit be8a8ab62ebdd111c6f2e9a33b15630570671eba)
Remove the experiment LIMIT_QP_ONEPASS_VBR_LAG, as its
not currently used and no plan to use in near future.
Change-Id: Ib069f8d7225195be04b765d0ab477510dfba6a3b
Added casts to remove warnings:
BUG=webm:1274
In regards to the safety of these casts they are of two types:-
- Normalized bits per (16x16) MB stored in a 32 bit int (This is safe as bits
per MB even with << 9 normalization cant overflow 32 bits. Even raw 12
bits hdr source even would only be 29 bits :- (4+4+12+9) and the encoder
imposes much stricter limits than this on max bit rate.
- Cast as part of variance calculations. There is an internal cast up to 64 bit
for the Sum X Sum calculation, but after normalization dividing by the number
of points the result will always be <= the SSE value.
Change-Id: I4e700236ed83d6b2b1955e92e84c3b1978b9eaa0
Using a tighter resize constraint on undershoot seems to help
results (especially SSIM) as significant undershoot on a frame
seems to have more of a damaging impact than overshoot.
This patch has been tuned so that in local testing using the
derf set it is encode speed neutral for speed setting 2.
Average quality result for speed 2 (psnr,ssim) were as follows:-
lowres 0.039, 0.453
midres 0.249, 0.853
hdres 0.159, 0.659
NetFlix -0.241, 0.360
Change-Id: Ie8d3a0d7d6f7ea89d9965d1821be17f8bda85062
This patch is to address concerns that changes to allow
recodes on the first frame in each ARF group do not give a
good enough speed quality trade off for speed 2. Though the
average impact on encode speed is 1-2%, for some hard clips
it is > 5% rise. For speed 1 this is less an issue and for Speed 0
the previous patch actually improves speed.
Change-Id: Ie1bcefdbfdf846d3f4428590173f621465dffe3a
This corrects a formatting error introduced in:
I1e9d548ce445d29002f0c59ebfd3957a6f15e702
where spaces were used as delimiters instead of tabs.
The corresponding fix for vp10 is in
Ica3d625d6672b3c47e0e208b45eede29b9004030.
Change-Id: Ibc4eb8fd82e6b926ba259a679dc98557cadba9b1
Current commit is just an API template for the rest of the code, and
I will add inner logic later.
Altref frames generate a lot of bitrate and at the same time
other frames refer to them a lot, so it makes sense to apply
special compensation-based adaptive quantization scheme for altref
frames. E.g., for blocks that are good predictors for the future
apply rate-control chosen quantizer while for bad predictors apply
worse one.
Change-Id: Iba3f8ec349470673b7249f6a125f6859336a47c8
Previously Tx domain rd was used in all cases above speed 0.
Coefficient optimization was only enabled for best and speed 0.
This patch selectively sets these features at other speed settings
based on block complexity.
For the Netflix and HD sets in particular the quality gains are
large compared to the speed hit. At speed 1 the average psnr
gain in the NF set is > 2.5% with one clip coming in at 18%
and some points almost 30%. Average gains for the lower
resolution test sets are around 1%.
The gains are biggest at low Q so some further optimization
may be possible.
Change-Id: I340376c7b2a78e5389a34b7ebdc41072808d0576
In the future this option will activate adaptive quantization special
for altref frames. Encoder will create the adaptive quantization map
on the basis of lookahead buffers similarity which is the estimate of
the future motion compensation performance.
Change-Id: Ia0088b3babb0f9a4899c79d8d819947ba5a03df2
Disabled the split mode while encoding 4k video to speed
up the encoder.
Borg test result on 4k set:
Overall PSNR: +0.029%; SSIM: +0.009%.
Average encoder speedup at speed 2 is 2.5%.
Change-Id: I1519c658f07c3ac838affbe5aff0ed9b94f3f8f4
Adjusted speed 2 features to speed up 4k video encoding.
BDBR results from borg test:
PSNR: +0.313%; SSIM: +0.268%.
Average speedup: 8.5%
Change-Id: I1e2695a01fb3f3817c1df4480e184c2aed8f2eba
this fixes a crash in vp9_dec_setup_mi() via
vp9_init_context_buffers() should decoding continue and the decoder
resyncs on a smaller frame
BUG=b/30593752
Change-Id: I9ce8d94abe89bcd058697e8bd8599690e61bd380
Bias towards base_mv and skip 1/4 pixel motion search when using base mv.
2~3% speed up for 2 spatial layers, 3~5% speed up for 3 spatial layers.
PSNR loss:
(2 layers) 0.07dB for gips_stationary, 0.04dB for gips_motion;
(3 layers) 0.07dB for gips_stationary, 0.06dB for gips_motion.
Change-Id: I773acbda080c301cabe8cd259f842bcc5b8bc999
Add option, for newmv-last, to limit the rd-threshold update for early exit,
under a source varianace condition.
This can improve visual quality in low texture moving areas,
like forehead/faces.
Also add bias against golden to improve the speed/fps,
will little/negligible loss in quality.
Only affects CBR mode, non-svc, non-screen-content.
Change-Id: I3a5229eee860c71499a6fd464c450b167b07534d
Changes the default recode rule for Speed 0 and best quality
from ALLOW_RECODE to ALLOW_RECODE_KFARFGF.
Tested on the NF, hdres, midres and lowres test sets, this setting
when combined with patch I40cb559... now performs "as well" in
metrics terms (in fact it came out a tiny amount better overall)
but encode time is 9.6% faster (measured as the average
from 27 mid rate local encodes on clips in the derf/lowres set.
Change-Id: I8c781c0cdfa3a9929cd9406d15582fce47d6ae3b
Allow recodes for the first inter frame in each arf group
even when the recode rule is set to ALLOW_RECODE_KFARFGF.
Small gains of 0.05%.
Change-Id: I40cb559d36a2bf0ebf5cf758c3f92e452b480577
This patch fixed a motion vector out of range bug:
vpxenc: ../libvpx/vp9/encoder/vp9_mcomp.c:69:
mv_cost: Assertion `mv->col >= -((1 << (11 + 1 + 2)) - 1) &&
mv->col < ((1 << (11 + 1 + 2)) - 1)' failed.
For blocks that returned without having full-pixel search, the original
MV limits were not restored, which caused the failure. Moved the set
MV limit function down to fix the bug.
Change-Id: Id7d798fc7214e95c6e4846c588f0233fcf1a4223
This patch fixed a motion vector(MV) out of range bug, which was caused
by not restoring the original values of the MV min/max thresholds after
the sub8x8 full pixel motion search. It occurred rarely and only was seen
while encoding a 4k clip for 200 frames.
BUG=webm:1271
Change-Id: Ibc4e0de80846f297431923cef8a0c80fe8dcc6a5
* changes:
Use common transpose for vpx_idct32x32_1024_add_neon
Use common transpose for vpx_idct8x8_[12|64]_add_neon
Use common transpose for vp9_iht8x8_add_neon
Use common transpose for vpx_idct16x16_[10|256]_add_neon
Increase the minimum distance.
Reduces the overshoot somewhat on some clips,
small gain in avgPSNR (~0.1%) on ytlive set.
Change-Id: Id5ddde20c2907dbdb536e79542eff775019c142b
Move best index into the token state. Shrink it down to one byte. This
is more cache friendly (access are group together) and uses less total
memory.
Results in 4% fewer cycles in optimize_b().
Change-Id: I75db484fb3dc82f59928d54b659d79c80ee40452
Shutdown all threads before reclaiming any memory. The frame-level
parallel decoder may access data from another worker.
BUG=webm:1259
Change-Id: I26856ebd1f77cc4a4545331baa19bbf3e01c4ea4
This commit changes the call in vp9 encoder from vp9_deblock() to
vp9_post_proc_frame() to ensure the data structures used in the call
are properly allocated. This fixes an encoder crash when configured
with --enable-internal-stats.
Change-Id: I2393b336c0f566665336df4f1ba91c405eb56764
Allow usage of lookahead for VBR in real-time mode, for 1 pass vbr.
Current usage is for fast checking of future scene cuts/changes,
and adjusting rate control (gf interval and active_worst/target size).
Added unittests (datarate) for 1 pass vbr mode, with non-zero lag.
Added an experimental option to limit QP based on lookahead.
Overall positive gain in metrics on ytlive set:
avgPNSR/SSIM up on average ~1-3%; several clips up by 5, 7%.
Change-Id: I960d57dfc89de121c4824b9a9bf88d2814e74b56
This change eliminates redundant computation in the two stage
downscaling, which saves ~1% encoding time in 3-layer svc encoding.
Change-Id: Ib4b218811b68499a740af1f9b7b5a5445e28d671
Modify the gfu_boost and af_ratio setting based on the
average frame motion level.
Change only affects 1 pass vbr.
Metrics overall positive on ytlive set.
On average up by ~1%, several clips up by 2-4%.
Change-Id: Ic18c49eb2df74cb4986b63cdb11be36d86ab5e8d
Use the precise context to estimate the zero token cost in trellis
optimization process. This improves the speed 0 coding performance
by 0.15% for lowres and 0.1% for midres. It improves the speed 1
coding performance by 0.2% for midres and hdres.
Change-Id: I59c7c08702fc79dc4f8534b64ca594da909e2c91
This commit allows the inter prediction residual to use uniform
quantization followed by trellis coefficient optimization in
speed 0. It improves the coding performance by
lowres 0.79%
midres 1.07%
hdres 1.44%
Change-Id: I46ef8cfe042a4ccc7a0055515012cd6cbf5c9619
Move the operations that update the context buffers outside this
function. The coeff_cost() takes all input as const value and returns
the coefficient cost.
This makes preparation for the next coefficient optimization CLs.
Change-Id: I850eec6e5470b91ea84646ff26b9231b09f70a0c
Replace the existing mv bias with a bias only for
NEWMV, and based on the motion vector difference of
its top/left neighbors.
For cbr non-screen-content mode.
Change-Id: I8a8cf56347cfa23e9ffd8ead69eec8746c8f9e09
Use a measure of noise energy to adjust Q estimate and
arf filter strength.
Gains 0.3-0.5% on Lowres and |Netflix sets.
Hdres and Midres neutral.
Change-Id: Ic0de552e7b6763e70eeeaa3651619831b423e151
Use pixel domain distortion metric in speed 0. This improves the
compression performance by 0.3% for both low and high resolution
test sets.
Change-Id: I5b5b7115960de73f0b5e5d0c69db305e490e6f1d
Safer to have the decoder operate normally and have
better-hw-compatibility only implement encoding changes.
Fixes some test failures.
Change-Id: I0dd70d002e4e893992f0cd59774b9363e6f7fe76
For real time CBR mode, use model_rd_for_sb_y for 32x32 if the sb is
a skin sb to avoid visual regression on the slowly moving face.
Refer to the cl: https://chromium-review.googlesource.com/#/c/356020/
Change-Id: I42c36666b2b474ce5ee274239d52ae8ab400fd46