Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.
Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.
Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
For 8-bit the subtrahend is small enough to fit into uint32_t.
For 10/12-bit apply:
63a37d16f Prevent negative variance
previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.
Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
0.007% regression on rtc and 0.004% gain on rtc_derf.
1 thread on QVGA,VGA and HD has ~0.2% speed regression while 2 threads has
~0.2% speed gain on Google Pixel.
Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc
To fix valgrind issueis with SVC tests.
SVC encoding uses prune_evenmore which is causing uinit value.
Will re-enable later when issue is resolved.
Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44
When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.
Small/negligible change in RTC metrics and speed.
Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
When int_pro_motion_estimation is done for superblock in
choose_partitioning, use it to avoid the full_pixel_search
for NEWMV mode, if bsize is >= 32X32.
For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.
Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.
Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501
Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
This reintroduces the fix:
https://chromium-review.googlesource.com/c/422807/
and later reverted here:
https://chromium-review.googlesource.com/c/447843/
BUG=webm:1355
This time behind a compile time flag :
configure --disable-always_adjust_bpm
configure --enable-always_adjust_bpm
This should make side by side testing easier and let users of the
lib pick which way they want to go.
Change-Id: I7d7b37b83015dc001810af84c132cbc1e71ba8d6
For fixed pattern SVC: keep track of denoised last_frame buffer
for base temporal layer, and if alt_ref is updated on middle/upper
temporal layers, force an update to denoised last_frame buffer.
This allows for improved denoising on top temporal layers.
Change-Id: Icbd08566027d4d2eabc024d3b7a0d959d2f8c18b
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.
Remove the remaining sizes and all the highbitdepth versions.
BUG=webm:1425
Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
Denoiser is used in real-time mode which does not use alt-ref.
Reduce memory usage when denoiser is enabled.
Change-Id: I54ba3bcaeeb1818bbdf718ef90e97d4897ff793d
In the content_state for a superblock is set to HighSad,
use that to bias some decisions in variance partition and
nonrd pickmde: use int_pro_motion for sad computation in
choose_partitioning, and set large_block in pickmode based
on the content_state_sb.
Only affects speed >= 7.
Immprovement for high motion content.
Small gain (~1%) in RTC metrics.
Speedup of ~5 for high motion clip on android (speed 8, 1 thread).
Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
This patch attempts to address a bug reported for 4K video.
https://b.corp.google.com/issues/62215394
In this instance a perfect storm of a moderate complexity section
followed by a much easier section where a CGI overlay helped to
suppress film grain noise, followed by a much harder and very grainy
section at the end, cause a massive local rate spike that pushed a chunk
over the upper allowed rate limit.
This patch detects cases where the rate for a frame is much higher than
expected and allows, in this special case, for rapid adjustment of the active
Q range.
For the example chunk in the bug report the target rate was 18Mb/s and the
observed rate was over 37 Mb/s with a surge for the last few frames to over
100Mb/s. This patch brings the overall chunk rate right back down to ~18.2 Mbit/s
and almost completely eliminates the rate spike at the end. (See graphs appended
to bug report)
Also see I108da7ca42f3bc95c5825dd33c9d84583227dac1 which fixes a bug
unearthed during testing of this patch and also has a bearing on high rate
encodes such as 4K.
This patch does have a negative impact on some metrics. Most notably there are
clips in our standard test set where it hurts global psnr (though in many cases it
conversely helps SSIM, FAST SSIM and PSNR-HVS). It is also worth noting that
the clips (and data rates) where there is a big metric impact, are almost all cases
where there is currently a significant overshoot vs the target rate and overall rate
accuracy is greatly improved.
Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
For nonrd_pickmode: add condition for checking
intra mode if the sb content state is VeryHighSad.
Reduces artifacts when sudden change in content.
Metrics on RTC/RTC_derf neutral (small gain).
No speed loss observed.
Change-Id: I07006d28fd2dc06c1d06b07630102b0fece50c40
Existing logic was only affecting resolutions above 720p.
Needs more testing for reducing subpel for speed >= 8.
No change on RTC metrics.
Change-Id: I2f4bf9f25891614aafa9a86aa5a5063a3ccfce4d
This could save some cycles since skin detection is used in multiple
places in vp9.
1~2% speed up on ARM.
Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
Use the scene detection for CBR mode, and use it to reset the
rate control if large source sad is detected and rate
correctioni fact/QP is at minimum state.
Avoids large frame sizes after big content change following
low content period.
Only affects CBR mode for 1 pass at speeds 5, 6, 7.
Change-Id: I56dd853478cd5849b32db776e9221e258998d874
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.
In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k clip the encode time for speeds 0 and 1 was
reduced by >25%
Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
Use it to limit NEWMV early exit in nonrd pickmode
Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c
Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.
No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.
Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.
Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137
Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.
This is both faster and better in terms of compression.
Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres: -0.405
Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster
Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).
Change only effects SVC encoding at speed >= 7.
Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.
Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.
BUG=webm:1440
Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.
Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
For SVC 1 pass non-rd mode:
Force subpel seach off for SVC for non-reference frames
under motion threshold.
Add flag to svc context to indicate if the frame is not used
as a reference.
Little/no quaity loss, ~2% speedup.
Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.
Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.
Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.
BUG=webm:1435
Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
Fix the condition on usage of source_sad for temporal layers.
FIx allows it to be used for the case of 1 temporal layer.
Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3
Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partiton for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).
~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.
Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780
For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.
No/little change in RTC metrics, only affects high motion case.
Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b
When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.
Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
Only affects speed 8.
Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.
Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
The modified error was a derivative of the "coded_error"
that was used to allocate bits between different frames on the
assumption that the allocation should be linear in terms of this
modified error. I.e. a frame with double the modified error score
should all things being equal get double the number of bits. The
code also included upper and lower caps derived from input
VBR parameters.
This patch improves the initial calculation of the clip mean error
(now called "mean_mod_score" as it is no longer a prediction error)
used as the midpoint for the rate distribution function and normalizes
the output "modified scores" scores such that 1.0 indicates a frame
in the middle of the distribution. The VBR upper and lower caps are
then applied directly to a frame's normalized score.
This refactoring is intended to make it easier to drop in alternative
distribution functions or to base the rate allocation on a corpus wide
midpoint (rather than the clip mean).
Change-Id: I4fb09de637e93566bfc4e022b2e7d04660817195
Most existing first pass stats are stored in a form normalized to a
macro-block scale. However the error scores for intra / inter etc were
stored as frame level values but mainly used as MB level values.
This change fixes that. Normalized per MB values make comparisons
between different formats easier and in any case this is usually what is
wanted.
An change in results should be limited to slight differences in rounding.
*** Change after patch 8 +2 requiring new approval.
Final pre-submit testing showed one 4K clip with above expected change.
Investigation showed this was due to a value used to test for ultra low intra
complexity in key frame detection. This was a per frame not per MB value but
also did not scale with frame size. Replacement with a small per MB value
(based on original per frame value and cif frame size) resolved the KF detection
problem.
Also converted kf_group_error_left to a double in line with other error values
to reduce rounding problems in KF group bit allocation
All clips and sets now show nominal (or 0) change as expected.
Change-Id: Ic2d57980398c99ade2b7380e3e6ca6b32186901f
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.
Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3