Change only affects 1 pass, vbr, speed = 5 (real-time mode).
Some improvement for high motion content.
AvgPSNR/SSIM metrics for ytlive set all up, on average ~2%,
some clips (high motion ones) up 4/5%.
Encoder speed down: on mynintendo_x1.1280_720.y4m: 47fps -> 44fps.
Change-Id: I9e3eaa6392dcb6b5b44ee6f43004f97ba859bc11
The mv is clamped in dec_find_mv_refs() to a smaller region
than the clamp in dec_find_best_ref_mvs(). See clamp_mv_ref
and clamp_mv2.
Change-Id: I47dd5f7fa8b42f2cc593559b4d7c782fe7bcb1db
In multi-thread case, the encoder may crash if using encoder option
tile-rows > 0. To prevent that, force tile-rows=0 in this situation.
This is a workaround for WebM issue 1095:
https://bugs.chromium.org/p/webm/issues/detail?id=1095
The further fix can be done by adding synchronizations after a tile
row is encoded. But this will hurt multi-threaded encoder performance.
So, it is recommended to use tile-rows=0 while encoding with threads
> 1.
Change-Id: I656cbcc200f8d0410d09530e7981ad8f32fe7bc9
Allow the encode loop to select from a wider range of Q values
when encoding normal (non arf or kf) frames.
This change is targeted at improving psycho-visual quality in some
easy sections that are currently not getting enough bits.
This is likely to be a little worse from a metrics perspective and may also
have a small impact on encode speed in cases where extra recode
iterations are triggered.
Change-Id: I667eebf33c753bcbcf8b93596467369e5708b889
Adds a second threshold for recodes even on frames where
recode is normally disabled if there is a big rate miss.
Change-Id: Ifd4a34707da55ec15eb7cfb87de4644b8d76deb2
Fix the threshold for forcing refresh of golden frame based
on high motion. The current comparison was incorrect and
prevented this (force update of gf on high motion) from being used.
For now keep this logic under a flag (and off for now) so as to
not change behavior, until further testing.
Change-Id: Ib5f0082159a428b0603b9534e4bcb6f83e4ccb25
+5.857% BD-RATE on SCREEN_CONTENT
Leaving this off for non-screen content because:
+25.300% on TWITCH120
+37.833% BD-RATE on RTC
Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe
For 1 pass vbr mode: Increase the period of gf update on scene
cut (keep it same as orginal/default setting for now).
Change-Id: I679c3bd21152f6c4e486c8098d931c00e1d26b5f
This is the identical change submitted for vp8 here:
https://chromium-review.googlesource.com/#/c/274107/
Tested this change on Mac OSX (10.10) and Linux
(Linux Mint 17 / Ubuntu 14.04) and in both cases:
- downloaded and compiled latest source for libvpx and ffmpeg
- confirmed ffmpeg would build sub-second frame rate webm files
via the previous patch
- confirmed ffmpeg would *not* build fps < 1 for vp9
- made this change, recompiled libvpn and ffmpeg
- confirmed ffmpeg would now create the same webm with
fps < 1
- confirmed the resulting file would play and was vp9 (e.g.
would not play in Firefox (Linux version complained it was
VP9 but mostly could play it) or older vlc, etc., but does
play just fine in Google Chrome and a newer version of vlc.
Sorry I didn't catch this last time - but this seems a solid
change and it's handy to be able to create frame rates
less than one second.
-jk
Change-Id: I38fa32148de8c4c359f228cf08b9a4b83b5a52fb
The change https://chromium-review.googlesource.com/#/c/329181/
also changed behavior for cbr mode, which causes some regression
in screenshare test in webrtc.
Resetting the specific change to leave the cbr behavior
unchanged for now.
Change-Id: I52df158806422f86398e1d2f522e92067d8325eb
Some adjustments to inter-mode selection for vbr mode.
Condition some of the bias to low/zero motion on cbr mode, and
don't use int_pro_motion_estimation for golden ref
(treat it same as last ref).
Change only affect 1 pass vbr mode, speed >=5 (non-rd pickmode).
Encoding time increase within ~5%.
Avg PSNR/SSIM on RTC set increase by ~2%, all clips up,
ranging from 0.5 to 4%.
Change-Id: I0048d0104a8816773d91a2b1484d601169d9bad7
Don't advance the svc frame counters on dropped frame,
since this can break the referencing scheme and lead
to a crash/assert.
Updated svc-datarate unittest to add a lower bitrate test.
Change only affects 1 pass cbr svc, with frame dropper enabled.
Change-Id: Ibb7530b7a587a9344d46898d9286fd9e2ef0779c
Use the superframe counter to set the key frame, and force
it to the key frame on base spatial layer only.
Also, update svc frame counters under frame dropping.
Update unittest: add specific tests with short key frame period.
https://bugs.chromium.org/p/webm/issues/detail?id=1150
Change-Id: I5b1c9a09253e6e5fbfce51b4cf603ae22d422b01
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of souce.
Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.
Change-Id: I9d73f92a4a96927d0f1d6bf75315c1e60513226a
Use sharp filter to generate motion compensated reference for
temporal filtering. It improves the average coding performance of
VP9 speed 0:
derf 0.34%
hevcmr 0.38%
stdhd 0.58%
Change-Id: I1772a051be545de8c343055274e5ca0929d19cda
This commit back ports the fix from
https://chromium-review.googlesource.com/#/c/326940
It corrects the block partition context fetching in rate-distortion
optimization. It improves the average coding performance of speed 0:
derf 0.098%
hevcmr 0.102%
stdhd 0.282%
Change-Id: I8bcc6fe40ba5c6b50a6136daac116dcc738937ec
The double pointer in xd->mi handles this for us.
Cuts encode_suberblock()'s self time in half at rt speed 8.
Change-Id: I820dae24efdbf9a140bbeae82e4e2a5850317766
This reverts commit f51f0998e1.
This causes datarate tests to fail. Some are due to the new default
keyframe distance, another causes an assert even forcing 9999:
[ RUN ] VP9/DatarateOnePassCbrSvc.OnePassCbrSvc3SpatialLayers/0
test_libvpx:
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c:853: scaledconvolve2d:
Assertion `y_step_q4 <= 32' failed.
Change-Id: I4ee4fea97f47e4f1a23b82a62e6afc6280961e38
Reset the scale factors before build_inter_predictors.
Add datarate tests for 3 spatial layers, which exposed this issue.
Change-Id: I7f81efbe44345ecea9fdd5f639a4cca76aed3874
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of souce.
Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.
Change-Id: Ifcf526ec2aaf3e5fa7924588d9dd8660bf02fb46
the same as vp8, with the same reasoning from:
2a0d7b1 Reduce the default kf_max_dist to 128.
see also:
https://trac.ffmpeg.org/ticket/4904https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=815673
+ restore vpxenc behavior of taking the library default rather than
forcing 5s
This change also exposes an issue with one-pass svc in cbr mode, keep
the old default in datarate_test.cc for now.
Change-Id: Id6d1244f42490b06fefc1a7b4e12a423a1f83e88
Use the existing scene/content change detection to better
update/adjust golden frame refresh.
Change only affects 1 pass real-time vbr mode, speed >=5.
Change-Id: I2963a5bb7ca4a19f8cf8511b0a925e502f60e014
Don't initialize first pass costs for a number of symbols where first
pass probabilities aren't initialized.
This brings a 1.22x first pass speedup.
https://bugs.chromium.org/p/webm/issues/detail?id=1089
Change-Id: I97438c357bd88f52f5a15c697031cf0c3cc8f510
replace with vpx_highbd_lpf_horizontal_edge_16 and
vpx_highbd_lpf_horizontal_edge_8 to avoid passing a count parameter
Change-Id: I551f8cec0fce57032cb2652584bb802e2248644d
replace with vpx_lpf_horizontal_edge_16 and vpx_lpf_horizontal_edge_8 to
avoid passing a count parameter
Change-Id: I848c95c02a3c6ebaa6c2bdf0983dce05cd645271
move to encoder_encode() as vp9_get_compressed_data() allocates data and
would require some modification to make its error return meaningful.
Change-Id: I8ddc390a1441afd0ff937842fa4ad1053c956133
Add frame-level condition for reference masking: under external or
internal dynamic resize, allow for reference masking if none of
the references have been scaled.
Peviously, reference masking was turned off for the stream if dynamic
resize feature was enabled or an external resize event occurred.
reference_masking gives speed up with little/no loss in compression.
For speed 7 on rtc set: encoding time decreases by about 5-7%,
avgPSNR/SSIM goes down ~0.2%.
Change-Id: Ie4444577451ef954414d8fb4b2c99d65cadf1746
This commit fixes issue 1141. The issue was triggered in multi-tile
encoding. The change properly saves and restores the block context
information in the real-time mode selection process. It removes
several redundant memcpy operations in sub8x8 intra block mode search.
Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010
External dynamic resize with swapping width and height was
not handled properly.
Fix is to re-init loop-filter under certain condtions.
Modify unittest to test this case.
Without this change test will fail.
Relates to: https://bugs.chromium.org/p/webm/issues/detail?id=1140
Change-Id: I7d81ca7fe0783b3bc103a52a7b7cf073a96be26e
allocations done within this function are protected with
vpx_internal_error; adding the setjmp fixes a crash in
vp9_lookahead_push() under low memory conditions.
Change-Id: I4b79dca37cc7fadc4b7633f0db44c0e406799bc6
An issue exists with reference_masking in non-rd pickmode for spatial
scaling. It was kept off for internal dynamic resizing and svc, this
change is to keep it off also for external dynamic resizing.
Update to external resize test, and update TODO to re-enable this
at frame level when references have same scale as source.
Change-Id: If880a643572127def703ee5b2d16fd41bdbf256c
to vp9_setup_pre_planes(), preventing the function
unscaled_value() from being called. unscaled_value()
returns the same value that was passed in. See
scaled_buffer_offset() in vp9_reconinter.h.
Change-Id: I2a6fbaf07972c2f212834929d29a2cbe72e399c3
The bit to error transformation got doubled as a result of going from
8-bit to 9-bit costs (change d13385c).
Use defines to derive the scale numbers and comment some of the fields.
derf: -0.023 BDRATE
hevcmr: +0.067 BDRATE
stdhd: +0.098 BDRATE
(These are substantially smaller than than the original gains from 8 to
9 bit costing.)
Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
When the codec frame size is the same as the reference frame size,
release the scaled reference before assigning it a new buf_idx.
Only affects 1 pass non-svc mode, where the scaled references are
release only under certain conditions (to prevent un-needed scaling
of the references every frame).
Modified a unittest that can trigger this bug without this change.
https://code.google.com/p/chromium/issues/detail?id=582598
Change-Id: I9a884e36ebd7608b1641ec2a469e20a4f829cf43
If the application changes frame size (external size changes),
and aq-mode=3 is on, reset the cyclic refresh.
Modify the TestExternalResize unittest (longer run with more resize
actions). Without this change an assert would be triggered on this
longer test.
Change-Id: I0eefd2cd7ffa0c557cca96ae30d607034a2599ce
Make this consistent with regular block size rate-distortion
optimization. It improves the compression performance:
derf 0.055%
hevcmr 0.129%
Change-Id: I112fe734f592c21bc7aa6efb7e3f269c4214ee7b
For 1 pass real-time mode. No change in behavior as only last
and golden are used as references in 1 pass real-time mode.
Change-Id: Ie4655014eee1a8b271542f29d74b2c6f7fed54c9
delete apply_cyclic_refresh_bitrate(). unused since:
3472cbb vp9 aq-mode=3: Keep it on even at low bitrates.
Change-Id: I0fac9a31b59504e31000ac3a8f0b68e8d4320113
The definition is for the number of frames to check to determine the
recent decay rate, further to determine the next key frame in the
first pass of the encoder.
Change-Id: Ic696d6eb518a86fa296842273cf8767ef0b0e27a
-use larger threshold on y (as in vp8).
-add distance threshold for each cluster
-use larger skin distance threshold for first cluster
-add some early exist checks.
Keep default setting to model=0.
Change-Id: I1044b99ade4bb1f215a860a019a4d84cee2f7715
It improves the compression performance of VP9 by 0.1% across all
test sets. No speed change is observed.
Change-Id: I59338c5c9e67bae22188f35fc3afbfe2a6bba6b0
The postproc vp9_denoise() is a spatial denoise/blur function.
It was not intended to be used if temporal denoising is enabled.
Change-Id: I97d2dcb941e7cc49bbafce99d9286beb2693249d
Put check to avoid possible out of bounds when looping
over the blocks to estimate noise level.
No change in behavior.
Change-Id: I4b7b19b7edee0ae1c35b9dc0700b1bf9b304d7f5
the lookahead buffer allocation is deferred to receipt of the first
frame to allow profile changes. if the encoder was flushed before
supplying any frames the encoder would crash trying to dereference the
NULL buffer. vp8 is unaffected.
fixes mozilla bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1237848
Change-Id: Icee4b64de760476eee0d33b568f0a1010335ff13
Use multiple clusters instead of one and decrease
the distance thresholds.
Add a define to switch between models.
Default is set to existing (1 cluster) model.
Change-Id: I802cd9bb565437ae8983ef39453939f5d5073bb1
If a superblock contains alot of "skin" then force split
of 64x64 partition, and make some adjustments in mode selection.
This helps to reduce artifacts on moving face/skin areas at low bitrates.
Little/no change in metrics: avgPSNR/SSIM down by ~0.12%.
Small encoding time increase < 1%.
Change-Id: Ic57f52148c3716f391419fab0530d916e4c1d186
For aqmode=3, golden period update is set based on period of cyclic refresh.
Put a limit on max golden period update, for now set to 40.
And fix comment.
Change-Id: Icb61dd87c796cce2a5f5f7331c6a129540994696
Limit oscilation detection in the case where overshoot is very very
large.
This keeps the 9-bit cost patch from breaking the DownUp reisze test.
The patch pushed us to an 11% undershoot right before a scene cut
causing a 1200% overshoot. (Whereas before we were undershooting by
only 6% before overshooting by 1200%).
Change-Id: Id90ccfab8aba872ccadc45b73b3bb097b895677f
In inter mode search skip all modes except NEARESTMV and DC_PRED.
10% less encode latency for large frames using the chromium remoting_perftests.
+0.313% BDRATE on the screencast set at speed -6.
Change-Id: Ib97a39dd8bcdeab545509e0e02d78ce7033f8c63
Changes to mode selection for 1 pass SVC mode:
use base layer motion vector, changes to intra-prediction.
Change-Id: I3e883aa04db521cfa026a0b12c9478ea35a344c9
This patch fixes a bug that causes the loop filter search to reset to
a low value or zero after each arf overlay frame. We expect the overlay
frames to need little or no loop filtering but this should not propagate.
Change-Id: I895b28474cf200f20d82793f3de40b60b19579fd
This is a pure-refactor in preparation to potentially raise the bit-cost
resolution.
Verified at good speed 0 and rt speed -6.
Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
More aggresive on avoiding denoising on skin.
May supplement this later by adding condtion onn consec_zeromv.
Change-Id: Ied92b332f9b24e821d2009f81d1565758588d9a5
Different quality levels are used for different regions in
the frame depending on how far they are vertically from the
center. Specifically, three segments are used based on the
mi_row index with respect number to the number of mi_rows in
the frame.
Change-Id: Ifc8b777bc58ea8521dffc4640360c67d99f8d381
Prior to this patch, read_inter_block_mode_info() would
find the nearmv and nearestmv for all modes. Now it does not
search for ZEROMV modes and breaks out early for NEARMV and
NEWMV modes.
Change-Id: Ifa7b1eaf58bb03b9c7792ea5012fef477527d0fd
This commit enables encoder to avoid 8x4 and 4x8 partitions for
scaled reference frames when libvpx is configured and built with
--enable-better-hw-compatibility
Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413
Move the logic for forcing zero_mode after the
(ref_frame & flag_list) check.
This was causing an memory leak under msan:
https://bugs.chromium.org/p/webrtc/issues/detail?id=5402
Change-Id: Ie9d243369f8ed7c332f46178275945331da4fd85
Under --enable-better-hw-compabibility, this commit adds the asserts
that no mv clamping is applied for scaled references, so when built
with this configure option, decoder will assert if an input bitstream
triggger mv clamping for scaled reference frames.
Change-Id: I786e86a2bbbfb5bc2d2b706a31b0ffa8fe2eb0cb
This commit adds a new configure option:
--enable-better-hw-compatibility
The purpose of the configure option is to provide information on known
hardware decoder implementation bugs, so encoder implementers may
choose to implement their encoders in a way to avoid triggering these
decoder bugs.
The WebM team were made aware of that a number of hardware decoders
have trouble in handling the combination of scaled frame reference
frame and 8x4 or 4x8 partitions. This commit added asserts to vp9
decoder, so when built with above configure option, the decoder can
assert if an input bitstream triggers such decoder bug.
Change-Id: I386204cfa80ed16b50ebde57f886121ed76200bf
Add function to compute skin map for a given block, as its
used in several places (cyclic refresh, noise estimation, and denoising).
Change-Id: Ied622908df43b6927f7fafc6c019d1867f2a24eb
Set initial values for these parameters in the vp9_init_layer_context().
This also fixes an issue in the svc-bypass mode when frame flags are
passed via the vpx_codec_encode().
Change-Id: I0968f04672f8d3d2fe2cea6b8a23f79f80d7a8b1
For coding block sizes <=16X16, if the block is determined to be skin,
then always allow for that block to be candidate for refresh. So if that
block happens to be on the boost segment(s), segment won't get reset to 0
and delta-q will be applied.
PSNR/SSIM metrics neutral (little/no change) on RTC clips.
Speed increase small/negligible (< 1%).
Some visual improvement on faces in a few RTC clips.
Change-Id: I6bf0fce6f39d820b491ce05d7c017ad168fce7d6
H/V intra mode was only enabled for bsize < 16x16,
enable it also for bsize=16x16.
Metrics are neutral with this change:
Overall very small gain (0.1%), small visual gain on some RTC clips.
Change-Id: Ib2d7a44382433bfc11cf324aa3cc5c382ea9e088
For testing implemented a fixed pattern and delta, 1 pass,
fixed Q, low delay mode.
This has not in any way been tuned or optimized.
Change-Id: Idf5ee179b277fa15d07a97f14f2ce5bbaae80a04
The one pass VBR mode selects a Q range based on a
moving average of recent Q values. This calculation
should have been excluding arf overlay frames as these
are usually coded at the highest allowed value. Their
inclusion skews the average and can cause it to drift
upwards even when the clip as a whole is undershooting.
As such it can undermine correct adaptation of the allowed
Q range especially for easy content.
Change-Id: I7d10fe4227262376aa2dc2a7aec0f1fd82bf11f9
Keep track of frame indexes for the references, and
constrain inter mode search for reference with same
temporal alignment.
Improves speed by about ~15%, no noticeable loss in
compression performance.
Change-Id: I5c407a8acca921234060c4fcef4afd7d734201c8
Lower the threshold for splitting 32x32->16x16 based on average variance,
and add lower bound condition for this split to occur. This prevents
unneccassry splitting for areas with very low variance.
Change-Id: Ibeb33b3d993632c2019f296eb87ef3b7e3568189
For non-rd variannce partition, speed >= 5:
Adjustments to reduce dragging artifcat of background area near
slow moving boundary.
-Decrease base threshold under low source noise conditions.
-Add condition to split 64x64/32x32 based on average variances
of lower level blocks.
PSNR/SSIM metrics go down ~0.7/0.9% on average on RTC set.
Visually helps to reduce dragging artifact on some rtc clips.
Change-Id: If1f0a1aef1ddacd67464520ca070e167abf82fac
This commit makes the sub8x8 block rate-distortion optimization
scheme use precise motion compensated prediction to compute the rd
cost. It fixes a potential buffer overflow issue related to sub8x8
motion search on scaled reference frame.
Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de
This commit enables the new temporal filter system for VP9. For
speed 1, it improves the compression performance:
derf 0.54%
stdhd 1.62%
Change-Id: I041760044def943e464345223790d4efad70b91e