+5.857% BD-RATE on SCREEN_CONTENT
Leaving this off for non-screen content because:
+25.300% on TWITCH120
+37.833% BD-RATE on RTC
Change-Id: Ie0a312182d6cc859fb04298e4cd81d02b39e23fe
For 1 pass vbr mode: Increase the period of gf update on scene
cut (keep it the same as the original/default setting for now).
Change-Id: I679c3bd21152f6c4e486c8098d931c00e1d26b5f
The change https://chromium-review.googlesource.com/#/c/329181/
also changed behavior for cbr mode, which causes some regression
in screenshare test in webrtc.
Resetting the specific change to leave the cbr behavior
unchanged for now.
Change-Id: I52df158806422f86398e1d2f522e92067d8325eb
Some adjustments to inter-mode selection for vbr mode.
Condition some of the bias to low/zero motion on cbr mode, and
don't use int_pro_motion_estimation for golden ref
(treat it same as last ref).
Change only affects 1 pass vbr mode, speed >=5 (non-rd pickmode).
Encoding time increase within ~5%.
Avg PSNR/SSIM on RTC set increase by ~2%, all clips up,
ranging from 0.5 to 4%.
Change-Id: I0048d0104a8816773d91a2b1484d601169d9bad7
Don't advance the svc frame counters on dropped frame,
since this can break the referencing scheme and lead
to a crash/assert.
Updated svc-datarate unittest to add a lower bitrate test.
Change only affects 1 pass cbr svc, with frame dropper enabled.
Change-Id: Ibb7530b7a587a9344d46898d9286fd9e2ef0779c
Use the superframe counter to set the key frame, and force
it to the key frame on base spatial layer only.
Also, update svc frame counters under frame dropping.
Update unittest: add specific tests with short key frame period.
https://bugs.chromium.org/p/webm/issues/detail?id=1150
Change-Id: I5b1c9a09253e6e5fbfce51b4cf603ae22d422b01
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of source.
Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.
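The decision described above can be sketched roughly as follows (Python, for brevity). The function name and the return convention are hypothetical and not taken from libvpx; the sketch only shows how a 1/4x1/4 layer could be reached through two 1:2 steps, with any other ratio left to a generic scaler.

```python
def scaling_plan(src_w, src_h, dst_w, dst_h):
    """Return the list of intermediate/final sizes reachable with 1:2 steps.

    Returns [] when no scaling is needed and None when the ratio is not a
    pure 1:2 or 1:4 reduction (the caller would then use a generic scaler).
    """
    if (src_w, src_h) == (dst_w, dst_h):
        return []
    if src_w == 2 * dst_w and src_h == 2 * dst_h:
        # Single 1:2 step.
        return [(dst_w, dst_h)]
    if src_w == 4 * dst_w and src_h == 4 * dst_h:
        # Two 1:2 steps: source -> half size -> quarter size.
        return [(src_w // 2, src_h // 2), (dst_w, dst_h)]
    return None
```

For a 1280x720 source and a 320x180 base layer this yields an intermediate 640x360 frame followed by the 320x180 target.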
Change-Id: I9d73f92a4a96927d0f1d6bf75315c1e60513226a
Use sharp filter to generate motion compensated reference for
temporal filtering. It improves the average coding performance of
VP9 speed 0:
derf 0.34%
hevcmr 0.38%
stdhd 0.58%
Change-Id: I1772a051be545de8c343055274e5ca0929d19cda
This commit back ports the fix from
https://chromium-review.googlesource.com/#/c/326940
It corrects the block partition context fetching in rate-distortion
optimization. It improves the average coding performance of speed 0:
derf 0.098%
hevcmr 0.102%
stdhd 0.282%
Change-Id: I8bcc6fe40ba5c6b50a6136daac116dcc738937ec
The double pointer in xd->mi handles this for us.
Cuts encode_superblock()'s self time in half at rt speed 8.
Change-Id: I820dae24efdbf9a140bbeae82e4e2a5850317766
This reverts commit f51f0998e1.
This causes datarate tests to fail. Some are due to the new default
keyframe distance, another causes an assert even forcing 9999:
[ RUN ] VP9/DatarateOnePassCbrSvc.OnePassCbrSvc3SpatialLayers/0
test_libvpx:
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c:853: scaledconvolve2d:
Assertion `y_step_q4 <= 32' failed.
Change-Id: I4ee4fea97f47e4f1a23b82a62e6afc6280961e38
Reset the scale factors before build_inter_predictors.
Add datarate tests for 3 spatial layers, which exposed this issue.
Change-Id: I7f81efbe44345ecea9fdd5f639a4cca76aed3874
For 1 pass cbr mode: allow for two-stage 1:2 scaling
(which will use the 1:2 optimized scaler) if the spatial
layer is 1/4x1/4 of source.
Without this change, the base layer for 3 spatial layers would
be using the non-normative scaler which is un-optimized/C code.
Change-Id: Ifcf526ec2aaf3e5fa7924588d9dd8660bf02fb46
Use the existing scene/content change detection to better
update/adjust golden frame refresh.
Change only affects 1 pass real-time vbr mode, speed >=5.
Change-Id: I2963a5bb7ca4a19f8cf8511b0a925e502f60e014
Don't initialize first pass costs for a number of symbols where first
pass probabilities aren't initialized.
This brings a 1.22x first pass speedup.
https://bugs.chromium.org/p/webm/issues/detail?id=1089
Change-Id: I97438c357bd88f52f5a15c697031cf0c3cc8f510
move to encoder_encode() as vp9_get_compressed_data() allocates data and
would require some modification to make its error return meaningful.
Change-Id: I8ddc390a1441afd0ff937842fa4ad1053c956133
Add frame-level condition for reference masking: under external or
internal dynamic resize, allow for reference masking if none of
the references have been scaled.
Previously, reference masking was turned off for the stream if dynamic
resize feature was enabled or an external resize event occurred.
reference_masking gives speed up with little/no loss in compression.
For speed 7 on rtc set: encoding time decreases by about 5-7%,
avgPSNR/SSIM goes down ~0.2%.
Change-Id: Ie4444577451ef954414d8fb4b2c99d65cadf1746
This commit fixes issue 1141. The issue was triggered in multi-tile
encoding. The change properly saves and restores the block context
information in the real-time mode selection process. It removes
several redundant memcpy operations in sub8x8 intra block mode search.
Change-Id: I35c9ad197f4bd500ec39b5fc833f052f19eee010
External dynamic resize with swapping width and height was
not handled properly.
Fix is to re-init loop-filter under certain conditions.
Modify unittest to test this case.
Without this change test will fail.
Relates to: https://bugs.chromium.org/p/webm/issues/detail?id=1140
Change-Id: I7d81ca7fe0783b3bc103a52a7b7cf073a96be26e
allocations done within this function are protected with
vpx_internal_error; adding the setjmp fixes a crash in
vp9_lookahead_push() under low memory conditions.
Change-Id: I4b79dca37cc7fadc4b7633f0db44c0e406799bc6
An issue exists with reference_masking in non-rd pickmode for spatial
scaling. It was kept off for internal dynamic resizing and svc, this
change is to keep it off also for external dynamic resizing.
Update to external resize test, and update TODO to re-enable this
at frame level when references have same scale as source.
Change-Id: If880a643572127def703ee5b2d16fd41bdbf256c
The bit to error transformation got doubled as a result of going from
8-bit to 9-bit costs (change d13385c).
Use defines to derive the scale numbers and comment some of the fields.
derf: -0.023 BDRATE
hevcmr: +0.067 BDRATE
stdhd: +0.098 BDRATE
(These are substantially smaller than the original gains from 8 to
9 bit costing.)
Change-Id: I6a2b3b029b2f1415e4f90a05709b2333ec0eea9b
When the codec frame size is the same as the reference frame size,
release the scaled reference before assigning it a new buf_idx.
Only affects 1 pass non-svc mode, where the scaled references are
released only under certain conditions (to prevent un-needed scaling
of the references every frame).
Modified a unittest that can trigger this bug without this change.
https://code.google.com/p/chromium/issues/detail?id=582598
Change-Id: I9a884e36ebd7608b1641ec2a469e20a4f829cf43
If the application changes frame size (external size changes),
and aq-mode=3 is on, reset the cyclic refresh.
Modify the TestExternalResize unittest (longer run with more resize
actions). Without this change an assert would be triggered on this
longer test.
Change-Id: I0eefd2cd7ffa0c557cca96ae30d607034a2599ce
Make this consistent with regular block size rate-distortion
optimization. It improves the compression performance:
derf 0.055%
hevcmr 0.129%
Change-Id: I112fe734f592c21bc7aa6efb7e3f269c4214ee7b
For 1 pass real-time mode. No change in behavior as only last
and golden are used as references in 1 pass real-time mode.
Change-Id: Ie4655014eee1a8b271542f29d74b2c6f7fed54c9
delete apply_cyclic_refresh_bitrate(). unused since:
3472cbb vp9 aq-mode=3: Keep it on even at low bitrates.
Change-Id: I0fac9a31b59504e31000ac3a8f0b68e8d4320113
The definition is for the number of frames to check to determine the
recent decay rate, further to determine the next key frame in the
first pass of the encoder.
Change-Id: Ic696d6eb518a86fa296842273cf8767ef0b0e27a
-use larger threshold on y (as in vp8).
-add distance threshold for each cluster
-use larger skin distance threshold for first cluster
-add some early exit checks.
Keep default setting to model=0.
Change-Id: I1044b99ade4bb1f215a860a019a4d84cee2f7715
It improves the compression performance of VP9 by 0.1% across all
test sets. No speed change is observed.
Change-Id: I59338c5c9e67bae22188f35fc3afbfe2a6bba6b0
The postproc vp9_denoise() is a spatial denoise/blur function.
It was not intended to be used if temporal denoising is enabled.
Change-Id: I97d2dcb941e7cc49bbafce99d9286beb2693249d
Put check to avoid possible out of bounds when looping
over the blocks to estimate noise level.
No change in behavior.
Change-Id: I4b7b19b7edee0ae1c35b9dc0700b1bf9b304d7f5
the lookahead buffer allocation is deferred to receipt of the first
frame to allow profile changes. if the encoder was flushed before
supplying any frames the encoder would crash trying to dereference the
NULL buffer. vp8 is unaffected.
fixes mozilla bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=1237848
Change-Id: Icee4b64de760476eee0d33b568f0a1010335ff13
Use multiple clusters instead of one and decrease
the distance thresholds.
Add a define to switch between models.
Default is set to existing (1 cluster) model.
Change-Id: I802cd9bb565437ae8983ef39453939f5d5073bb1
If a superblock contains a lot of "skin" then force split
of 64x64 partition, and make some adjustments in mode selection.
This helps to reduce artifacts on moving face/skin areas at low bitrates.
Little/no change in metrics: avgPSNR/SSIM down by ~0.12%.
Small encoding time increase < 1%.
Change-Id: Ic57f52148c3716f391419fab0530d916e4c1d186
For aqmode=3, golden period update is set based on period of cyclic refresh.
Put a limit on max golden period update, for now set to 40.
And fix comment.
Change-Id: Icb61dd87c796cce2a5f5f7331c6a129540994696
Limit oscillation detection in the case where overshoot is very very
large.
This keeps the 9-bit cost patch from breaking the DownUp resize test.
The patch pushed us to an 11% undershoot right before a scene cut
causing a 1200% overshoot. (Whereas before we were undershooting by
only 6% before overshooting by 1200%).
Change-Id: Id90ccfab8aba872ccadc45b73b3bb097b895677f
In inter mode search skip all modes except NEARESTMV and DC_PRED.
10% less encode latency for large frames using the chromium remoting_perftests.
+0.313% BDRATE on the screencast set at speed -6.
Change-Id: Ib97a39dd8bcdeab545509e0e02d78ce7033f8c63
Changes to mode selection for 1 pass SVC mode:
use base layer motion vector, changes to intra-prediction.
Change-Id: I3e883aa04db521cfa026a0b12c9478ea35a344c9
This patch fixes a bug that causes the loop filter search to reset to
a low value or zero after each arf overlay frame. We expect the overlay
frames to need little or no loop filtering but this should not propagate.
Change-Id: I895b28474cf200f20d82793f3de40b60b19579fd
This is a pure-refactor in preparation to potentially raise the bit-cost
resolution.
Verified at good speed 0 and rt speed -6.
Change-Id: I5347e6e8c28a9ad9dd0aae1d76a3d0f3c2335bb9
More aggressive on avoiding denoising on skin.
May supplement this later by adding condition on consec_zeromv.
Change-Id: Ied92b332f9b24e821d2009f81d1565758588d9a5
Different quality levels are used for different regions in
the frame depending on how far they are vertically from the
center. Specifically, three segments are used based on the
mi_row index with respect to the number of mi_rows in
the frame.
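As a rough illustration of the row-based segmentation described above, here is a small Python sketch. The band boundaries (one third and two thirds of the half-height) and the function name are assumptions chosen for the example; the actual thresholds used by the encoder are not stated in this message.

```python
# Hypothetical band boundaries, as fractions of the distance from the
# vertical center (0.0) to the top/bottom edge (1.0).
INNER_BAND = 1.0 / 3.0
OUTER_BAND = 2.0 / 3.0


def segment_for_row(mi_row, mi_rows):
    """Map a mode-info row to one of three segments by vertical distance.

    0 = central band, 1 = intermediate band, 2 = band nearest the edges.
    """
    if mi_rows <= 0:
        raise ValueError("mi_rows must be positive")
    center = (mi_rows - 1) / 2.0
    half = max(center, 1.0)  # avoid dividing by zero for tiny frames
    distance = abs(mi_row - center) / half
    if distance < INNER_BAND:
        return 0
    if distance < OUTER_BAND:
        return 1
    return 2
```

Each segment can then be given its own quality level, with the central band typically being the one that matters most to the viewer.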
Change-Id: Ifc8b777bc58ea8521dffc4640360c67d99f8d381
This commit enables encoder to avoid 8x4 and 4x8 partitions for
scaled reference frames when libvpx is configured and built with
--enable-better-hw-compatibility
Change-Id: I02ad65c386f5855f4325d72570c49164ed52f413
Move the logic for forcing zero_mode after the
(ref_frame & flag_list) check.
This was causing a memory leak under msan:
https://bugs.chromium.org/p/webrtc/issues/detail?id=5402
Change-Id: Ie9d243369f8ed7c332f46178275945331da4fd85
Add function to compute skin map for a given block, as it's
used in several places (cyclic refresh, noise estimation, and denoising).
Change-Id: Ied622908df43b6927f7fafc6c019d1867f2a24eb
Set initial values for these parameters in the vp9_init_layer_context().
This also fixes an issue in the svc-bypass mode when frame flags are
passed via the vpx_codec_encode().
Change-Id: I0968f04672f8d3d2fe2cea6b8a23f79f80d7a8b1
For coding block sizes <=16X16, if the block is determined to be skin,
then always allow for that block to be candidate for refresh. So if that
block happens to be on the boost segment(s), segment won't get reset to 0
and delta-q will be applied.
PSNR/SSIM metrics neutral (little/no change) on RTC clips.
Speed increase small/negligible (< 1%).
Some visual improvement on faces in a few RTC clips.
Change-Id: I6bf0fce6f39d820b491ce05d7c017ad168fce7d6
H/V intra mode was only enabled for bsize < 16x16,
enable it also for bsize=16x16.
Metrics are neutral with this change:
Overall very small gain (0.1%), small visual gain on some RTC clips.
Change-Id: Ib2d7a44382433bfc11cf324aa3cc5c382ea9e088
For testing implemented a fixed pattern and delta, 1 pass,
fixed Q, low delay mode.
This has not in any way been tuned or optimized.
Change-Id: Idf5ee179b277fa15d07a97f14f2ce5bbaae80a04
The one pass VBR mode selects a Q range based on a
moving average of recent Q values. This calculation
should have been excluding arf overlay frames as these
are usually coded at the highest allowed value. Their
inclusion skews the average and can cause it to drift
upwards even when the clip as a whole is undershooting.
As such it can undermine correct adaptation of the allowed
Q range especially for easy content.
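The effect of excluding overlay frames can be illustrated with a small Python sketch. The class name, the window length and the plain windowed mean are assumptions made for the example; they stand in for whatever averaging the encoder actually uses.

```python
from collections import deque


class RecentQAverage:
    """Windowed mean of recent frame Q values, ignoring arf overlay frames."""

    def __init__(self, window=8):
        self._values = deque(maxlen=window)

    def update(self, q, is_arf_overlay):
        # Overlay frames are usually coded at the highest allowed Q, so
        # including them would pull the average upwards.
        if not is_arf_overlay:
            self._values.append(q)

    def average(self):
        if not self._values:
            return None
        return sum(self._values) / len(self._values)
```

With frames coded at Q 40, 42 and 44 plus one overlay frame at 255, the filtered average stays at 42, whereas a naive mean over all four frames would be 95.25.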
Change-Id: I7d10fe4227262376aa2dc2a7aec0f1fd82bf11f9
Keep track of frame indexes for the references, and
constrain inter mode search for reference with same
temporal alignment.
Improves speed by about 15%, with no noticeable loss in
compression performance.
Change-Id: I5c407a8acca921234060c4fcef4afd7d734201c8
Lower the threshold for splitting 32x32->16x16 based on average variance,
and add lower bound condition for this split to occur. This prevents
unnecessary splitting for areas with very low variance.
Change-Id: Ibeb33b3d993632c2019f296eb87ef3b7e3568189
For non-rd variance partition, speed >= 5:
Adjustments to reduce dragging artifact of background area near
slow moving boundary.
-Decrease base threshold under low source noise conditions.
-Add condition to split 64x64/32x32 based on average variances
of lower level blocks.
PSNR/SSIM metrics go down ~0.7/0.9% on average on RTC set.
Visually helps to reduce dragging artifact on some rtc clips.
Change-Id: If1f0a1aef1ddacd67464520ca070e167abf82fac
This commit makes the sub8x8 block rate-distortion optimization
scheme use precise motion compensated prediction to compute the rd
cost. It fixes a potential buffer overflow issue related to sub8x8
motion search on scaled reference frame.
Change-Id: I4274992ef4f54eaacfde60db045e269c13aaa2de
This commit enables the new temporal filter system for VP9. For
speed 1, it improves the compression performance:
derf 0.54%
stdhd 1.62%
Change-Id: I041760044def943e464345223790d4efad70b91e
This change has been imported from VP9 and
alters the nature and use of exhaustive motion search.
Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.
Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.
For example:
stage 1: Range +/- 64 interval 4
stage 2: Range +/- 32 interval 2
stage 3: Range +/- 15 interval 1
This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.
This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained a bug (the two searches used different distortion
metrics).
For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.
Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most natural video clips, however, where the step search
is performing well, the quality gain and speed impact are small.
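The gated, multi-stage mesh search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `cost` is a hypothetical callable returning the distortion at a motion vector, the stage table copies the example ranges and intervals from this message, and the function names are not those used in the codec.

```python
# (range, interval) per stage, taken from the example in the text.
DEFAULT_STAGES = ((64, 4), (32, 2), (15, 1))


def mesh_search(cost, start, stages=DEFAULT_STAGES):
    """Refine `start` with successively finer grids centered on the best point."""
    best = start
    best_cost = cost(*start)
    for search_range, step in stages:
        center_x, center_y = best  # each stage is centered on the previous best
        for dy in range(-search_range, search_range + 1, step):
            for dx in range(-search_range, search_range + 1, step):
                candidate = (center_x + dx, center_y + dy)
                candidate_cost = cost(*candidate)
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
    return best, best_cost


def refine_after_step_search(cost, step_best, threshold):
    """Run the mesh search only if the step search result is poor enough."""
    step_cost = cost(*step_best)
    if step_cost <= threshold:
        return step_best, step_cost
    return mesh_search(cost, step_best)
```

With the default stages this evaluates roughly 3,100 positions, compared with about 16,600 for a full +/- 64 search at interval 1, which is where the complexity saving comes from.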
Change-Id: Iac24152ae239f42a246f39ee5f00fe62d193cb98
For non-rd variance partition: Adjust variance threshold based
on noise level estimate. This change allows the adjustment to be
updated more frequently.
Change-Id: Ie2abf63bf3f1ee54d0bc4ff497298801fdb92b0d
For low resolutions, when 4x4downsample is used for variance,
use the same force split (that is used for 8x8downsample) for 16x16 blocks.
No change in metrics. Small improvement visually.
Change-Id: I915b9895902d0b9a41e75d37fee1bf3714d2366d
This is so we may update level at any time (e.g., to be used
for setting thresholds in variance-based partition).
Change-Id: I32caad2271b8e03017a531f9ea456a6dbb9d49c7
Under certain denoising conditions, check for re-evaluation of
zero_last mode if best mode was golden reference.
Change-Id: Ic6cdfd175eef2f7d68606300c7173ab6654b3f6e
For non-rd variance partition: only allow minmax computation
(which currently has no arm-neon optimization) for speeds < 8.
Performance loss is small: On RTC set with speed 8, few clips lose ~2/3%,
average loss is < 1%.
Change-Id: Ia9414f4d0b77dc83c3e73ca8de5d903f64b425ce
Change initial state of noise level, and only update
denoiser with noise level when estimate is done.
Change-Id: If44090d29949d3e4927e855d88241634cdb395dc
For denoising, and for noise level above threshold, re-evaluate
ZEROMV for mode selection after denoising.
Current change only does this check if selected best mode (before denoising)
was intra.
Change-Id: I4b1435b68d26c78f7597b995ee7bff0ddd5f9511
This change makes sure last reference with zero mv
is always checked for mode selection.
No change in metrics.
Change-Id: Iaf01877bf34272b966c78bfe18daad882a0a419e
the final sum may use up to 26 bits
+ add a unit test
+ disable the sse2 as the result will rollover; this will be fixed in a
future commit
Change-Id: I2a49811dfaa06abfd9fa1e1e65ed7cd68e4c97ce
Change only affects 1 pass CBR.
On key frame, temporal layer_id is reset to 0 for 1 pass CBR,
but since "layer" is reset, the svc.layer_context[layer].is_key_frame
was not correspondingly set properly.
Change-Id: I08f6da0a55ac7429ccfbaddfb7be14479e43543b
Small changes to the best quality default speed trade off.
Some speedup settings are worthwhile even for best quality as they
have only a very small impact on quality but a significant impact on
encode time.
These changes give as much as a further 50-60% increase in encode
speed for my test animation clip with minimal impact on quality.
For this sequence these changes improve the best quality encode speed
to about the same level as good quality speed 0 in Q3 2015 whilst
retaining the large quality gain of over 1 db
For many natural videos though the quality difference from good 0
to best is much smaller.
Change-Id: I28b3840009d77e129817a78a7c41e29cb03e1132
This change alters the nature and use of exhaustive motion search.
Firstly any exhaustive search is preceded by a normal step search.
The exhaustive search is only carried out if the distortion resulting
from the step search is above a threshold value.
Secondly the simple +/- 64 exhaustive search is replaced by a
multi stage mesh based search where each stage has a range
and step/interval size. Subsequent stages use the best position from
the previous stage as the center of the search but use a reduced range
and interval size.
For example:
stage 1: Range +/- 64 interval 4
stage 2: Range +/- 32 interval 2
stage 3: Range +/- 15 interval 1
This process, especially when it follows on from a normal step
search, has shown itself to be almost as effective as a full range
exhaustive search with step 1 but greatly lowers the computational
complexity such that it can be used in some cases for speeds 0-2.
This patch also removes a double exhaustive search for sub 8x8 blocks
which also contained a bug (the two searches used different distortion
metrics).
For best quality in my test animation sequence this patch has almost
no impact on quality but improves encode speed by more than 5X.
Restricted use in good quality speeds 0-2 yields significant quality gains
on the animation test of 0.2 - 0.5 db with only a small impact on encode
speed. On most clips though the quality gain and speed impact are small.
Change-Id: Id22967a840e996e1db273f6ac4ff03f4f52d49aa
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
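The three table properties listed above could be checked along the following lines. This is a Python sketch, not the codec's own check: the function name and the table layout (each component row stored as a list of length 2 * max_mv + 1, with the cost for value i at index max_mv + i) are assumptions made so that negative motion vector values can be indexed.

```python
def tables_allow_fast_path(joint_cost, comp_cost, max_mv):
    """Return True if the cost tables satisfy all three required properties."""
    # 1. All non-zero joint types share one cost.
    if not (joint_cost[1] == joint_cost[2] == joint_cost[3]):
        return False
    # 2. Row and column components use identical cost tables.
    if comp_cost[0] != comp_cost[1]:
        return False
    # 3. The component cost is an even function: cost(i) == cost(-i).
    row = comp_cost[0]
    return all(row[max_mv + i] == row[max_mv - i] for i in range(1, max_mv + 1))
```

A caller would fall back to the C implementation whenever this returns False.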
Change-Id: I6c2791d43022822a9e6ab43cd124a773946d0bdc
Change is only for real-time mode, speed >= 5, and non-screen content mode.
Add bias to zero/low motion for big blocks, if noise estimation
is enabled and noise level is above threshold.
Change-Id: I3a0a4608ede6aa535bda6eca528d20f8aba738e7
For 1 pass CBR mode: increase waiting time after key frame
before we start sampling rate control behavior for determining
resize. This change needs to disable one internal resize test (DownUp)
temporarily, since that test requires a longer clip.
Change-Id: If21beda1be23f169ee541ab4dd642f718347887a
Use same setting for speed 5 (as it is for speed > 5).
Change is only for real-time (non-rd) mode.
Change-Id: I830250eac654328373cb318baa89d4f0e63942e1
Reduces Linux perf estimated cycle count for pack_mb_tokens on a
lossless encode on my desktop from 61858501855 to 48154040219 or from
26% of the overall profile to 21%.
Change-Id: I9ca3426d7e3272bc7f7030abda4f0d0cec87fb4a
This reverts commit f1342a7b07.
This breaks 32-bit builds:
runtime error: load of misaligned address 0xf72fdd48 for type 'const
__m128i' (vector of 2 'long long' values), which requires 16 byte
alignment
+ _mm_set1_epi64x is incompatible with some versions of visual studio
Change-Id: I6f6fc3c11403344cef78d1c432cdc9147e5c1673
Add threshold/condition on spatial_variance and brightness level.
Modification to normalization of block variance.
Change resolution limit below which we disable noise estimation.
Change-Id: If5be08a26ceda351242d8a58d2f0bc88c0a918f0
This function now has an AVX intrinsics version which is about 80%
faster compared to the C implementation. This provides a 2-4% total
speed-up for encode, depending on encoding parameters. The function
utilizes 3 properties of the cost function lookup table, constructed
in 'cal_nmvjointsadcost' and 'cal_nmvsadcosts'.
For the joint cost:
- mvjointsadcost[1] == mvjointsadcost[2] == mvjointsadcost[3]
For the component costs:
- For all i: mvsadcost[0][i] == mvsadcost[1][i]
(equal per component cost)
- For all i: mvsadcost[0][i] == mvsadcost[0][-i]
(Cost function is even)
These must hold, otherwise the AVX version of the function cannot be used.
Change-Id: I184055b864c5a2dc37b2d8c5c9012eb801e9daf6
Change is only for real-time mode, speed > 5, and non-screen content mode.
Bias is based on block size and motion vector level (motion above some threshold).
Helps to improve stability in the background under lighting changes.
PSNR/SSIM metrics on RTC set almost no change/neutral (within +/- 0.1).
Change-Id: I7eac13c1ae10be4ab1f40acc7f9f1df5653ece9d
Only use non-zero threshold(s) for breakout if
the motion level of the current tested mode is low.
Change-Id: I22aae961cc42371b49d3f648560181cc54708502
Source noise level estimate is also useful for
setting variance encoder parameters (variance thresholds,
qp-delta, mode selection, etc), so allow it to be used also
if denoising is not on.
Change-Id: I4fe23d47607b4e17a35287057f489c29114beed1
Width and height of downscaling resolution should not be lower
than min_width and min_height which can be set as needed, both
are 180 for now.
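One way to express this constraint is sketched below in Python. The function name and the integer rounding are assumptions; only the 180 limits come from this message.

```python
def downscale_allowed(width, height, num, den, min_width=180, min_height=180):
    """Return True if scaling by num/den keeps both dimensions above the minimum."""
    scaled_width = width * num // den
    scaled_height = height * num // den
    return scaled_width >= min_width and scaled_height >= min_height
```

For example, 640x360 may go down by 1/2 to 320x180, but a further 1/2 step to 160x90 would be rejected.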
Change-Id: I34d06704ea51affbdd814246e22ee8d41d991f00
Adjust variance threshold, delta-qp, and intra penalty cost,
based on estimated noise level in source.
Replace denoising_on with a level value=L/M/H.
Change-Id: I0c017dae75a5d897367d2c42dec26f2f37e447c1
Bug relating to issue:- http://b/25090786
base_frame_target is supposed to track the idealized bit
allocation based on error score and not the actual bits
allocated to each frame.
The clamping of this value based on the VBR min and max pct values
was causing a bug where in some cases the loop that adjusts the
active max quantizer for each GF group was running out of bits at
the end of a KF group. This caused a spike in Q and some ugly artifacts.
A second change makes sure that the calculation of the active
Q range for a group DOES, however, take account of clamping.
Change-Id: I31035e97d18853530b0874b433c1da7703f607d1
Periodically estimate noise level in source, and only denoise
if estimated noise level is above threshold.
Change-Id: I54f967b3003b0c14d0b1d3dc83cb82ce8cc2d381
A new version of vp9_highbd_error_8bit is now available which is
optimized with AVX assembly. AVX itself does not buy us too much, but
the non-destructive 3 operand format encoding of the 128bit SSEn integer
instructions helps to eliminate move instructions. The Sandy Bridge
micro-architecture cannot eliminate move instructions in the processor
front end, so AVX will help on these machines.
Further 2 optimizations are applied:
1. The common case of computing block error on 4x4 blocks is optimized
as a special case.
2. All arithmetic is speculatively done on 32 bits only. At the end of
the loop, the code detects if overflow might have happened and if so,
the whole computation is re-executed using higher precision arithmetic.
This case however is extremely rare in real use, so we can achieve a
large net gain here.
The optimizations rely on the fact that the coefficients are in the
range [-(2^15-1), 2^15-1], and that the quantized coefficients always
have the same sign as the input coefficients (in the worst case they are
0). These are the same assumptions that the old SSE2 assembly code for
the non high bitdepth configuration relied on. The unit tests have been
updated to take this constraint into consideration when generating test
input data.
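The "compute in 32 bits, redo in higher precision if overflow is possible" idea from point 2 can be sketched in Python. This is an illustration only: the function name is hypothetical, the 32-bit accumulator is emulated by masking, and the conservative overflow test used here (count times the largest squared difference) is an assumption, not necessarily the check the assembly performs.

```python
UINT32_MAX = 0xFFFFFFFF


def block_error(coeff, dqcoeff):
    """Sum of squared differences with a speculative 32-bit fast path."""
    acc32 = 0
    max_abs_diff = 0
    for c, d in zip(coeff, dqcoeff):
        diff = c - d
        max_abs_diff = max(max_abs_diff, abs(diff))
        # Emulate a 32-bit accumulator that silently wraps on overflow.
        acc32 = (acc32 + diff * diff) & UINT32_MAX
    # If even the worst case fits in 32 bits, no wrap can have occurred.
    if len(coeff) * max_abs_diff * max_abs_diff <= UINT32_MAX:
        return acc32
    # Rare slow path: repeat the whole computation in full precision.
    return sum((c - d) ** 2 for c, d in zip(coeff, dqcoeff))
```

Because the quantized coefficient has the same sign as the input (or is 0), each difference is bounded in magnitude by the input coefficient, which keeps the individual squares well inside 32 bits and makes the slow path uncommon.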
Change-Id: I57d9888a74715e7145a5d9987d67891ef68f39b7
Added optimization of the 8 bit assembly quantizer routines. This makes
these functions up to 100% faster, depending on encoding parameters.
This patch makes the encoder faster in both the high bitdepth and 8bit
configurations. In the high bitdepth configuration, it affects profile 0
only.
Based on my profiling using 1080p input the net gain is between 1-3% for
the 8 bit config, and around 2.5-4.5% for the high bitdepth config,
depending on target bitrate. The difference between the 8 bit and high
bitdepth configurations for the same encoder run is reduced by 1% in all
cases I have profiled.
Change-Id: I86714a6b7364da20cd468cd784247009663a5140
Adjust the qp threshold and consec_zeromv threshold for
limiting cyclic refresh. Also increase the refresh period
when the limit amount is significant, and some code-cleanup.
Small gain in PSNR/SSIM metrics: ~0.25/0.3 gain on RTC set, speed 7.
Change only affects non-screen content.
Change-Id: I1ced87a89a132684c071e722616e445b2d18236a
Adjust the qp threshold based on the denoising setting; do not allow
scaling directly from original resolution to one half and vice versa.
Change-Id: I032a9b22f8e1c88de6bb81cf8351367223a3e40d
For the re-encoding (at max-qp) on the detected high-content change:
update rate correction factor, reset rate over/under-shoot flags,
and update/reset the rate control for layered coding.
Change-Id: I5dc72bb235427344dc87b5235f2b0f31704a034a
Changes to the breakout behavior for partition selection.
The biggest impact is on speed 0 where encode speed in
some cases more than doubles with typically less than 1%
impact on quality.
Speed 0 encode speed impact examples
Animation test clip: +128%
Park Joy: +59%
Old town Cross: +109%
Change-Id: I222720657e56cede1b2a5539096f788ffb2df3a1
If high bit depth configuration is enabled, but encoding in profile 0,
the code now falls back on optimized SSE2 assembler to compute the
block errors, similar to when high bit depth is not enabled.
Change-Id: I471d1494e541de61a4008f852dbc0d548856484f
The artifact occurs periodically when the VP9 denoiser is on and
refresh_golden_frame happens. When refresh_golden_frame happens,
we should copy the frame buffer instead of swapping the pointers.
Change-Id: Ib3204c4b04db28ecf439c6d9e61f3d146f04196d
Small code cleanup. consec_zeromv refresh threshold
does not need to be computed for every super-block.
No change in behavior.
Change-Id: I8c4b1b28072f42b01d917fff6d1f62722f1e1554
Use the existing VP9_SET_SVC control to set the
first spatial layer to encode.
Since we loop over all spatial layers inside the encoder, the
setting of spatial_layer_id via VP9_SET_SVC has no relevance.
Use it instead to set the first_spatial_layer_to_encode,
which allows an application to skip encoding lower layer(s).
Change only affects the 1 pass CBR SVC.
Change-Id: I5d63ab713c3e250fdf42c637f38d5ec8f60cd1fb
The resolution check fixes the issue where resize_pending was reset
unnecessarily, making the output not bit-exact with the previous one-step version.
Change-Id: I4e7660b3c8f34f59781e2e61ca30d61080c322de
Temporary fix to denoiser when dynamic resizing is on.
-Reallocate denoiser buffers on resized frame.
-Force golden update on resized frame.
-Don't denoise resized frame, and copy source into denoised buffers.
Change-Id: Ife7638173b76a1c49eac7da4f2a30c9c1f4e2000
For screen-content mode, with frame dropper off, put a limit
on how low encoder buffer can go.
Under hard slide changes, the buffer level can go too low and then
take a long time to come back up (in particular when frame-dropping
is not used), which will affect the active_worst and target frame size.
Change-Id: Ie9fca097e05cd71141f978ec687f852daf9de332
Dynamic resizing now supports two-step scaling: first go down to
3/4 and then 1/2. This feature is under a flag which controls the
switch between two-step scaling and one-step scaling (1/2 only).
Change-Id: I3a6c1d3d5668cf8e016a0a02aeca737565604a0f
The loopfilter masks are now built in the decode loop.
This is done so we can eventually reduce the number of
MODE_INFO structs required by the decoder.
The encoder builds the masks for the entire frame prior
to calling the loopfilter.
Change-Id: Ia2146b07e0acb8c50203e586dfae0c4c5b316f11
In the decoder, map this to the output variable vpx_image_t.r_w/h.
This is intended as an improved version of VP9D_GET_DISPLAY_SIZE,
which doesn't work with parallel frame decoding. In the encoder,
map this to a codec control func (VP9E_SET_RENDER_SIZE) that takes
a w/h pair argument in an int[2] (identical to VP9D_GET_DISPLAY_SIZE).
Also add render_size to the encoder_param_get_to_decoder unit test.
See issue 1030.
Change-Id: I12124c13602d832bf4c44090db08c1009c94c7e8
The name "display_*" (or "d_*") is used for non-compatible information
(that is, the cropped frame dimensions in pixels, as opposed to the
intended screen rendering surface size). Therefore, continuing to use
display_* would be confusing to end users. Instead, rename the field
to render_*, so that struct vpx_image can include it.
Change-Id: Iab8d2eae96492b71c4ea60c4bce8121cb2a1fe2d
Use the existing QP condition on limiting cyclic refresh, and add
additional condition that block has been encoded with zero/small motion
x frames in a row (where x is at least several times the refresh period).
Additional condition only affects non-screen content mode.
This helps to improve visual stability for noisy input, where on steady
background areas the application of delta_qp may lead to encoding the noise.
Also added a change to use the true skip (after encoding) to update the
last QP.
Change-Id: I234a1128d017d284cf767fdb58ef6c59d809f679
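A rough sketch of the combined condition from the commit above: a block stays a refresh candidate unless it was last coded at a low QP and has also been static for x frames. The function name, the multiplier of 4 for "several times", and the refresh period taken as 100 / percent_refresh are all assumptions.

```c
#include <assert.h>

/* Hypothetical check: returns 1 if the block may still be picked for
 * cyclic refresh, 0 if refresh is limited for it. */
static int is_refresh_candidate(int last_coded_q, int q_thresh,
                                int consec_zero_mv, int percent_refresh,
                                int is_screen_content) {
  int zero_mv_thresh = 0; /* 0 disables the motion condition */
  assert(percent_refresh > 0 && percent_refresh <= 100);
  if (!is_screen_content) {
    /* x = several refresh periods; the factor 4 is an assumed value. */
    zero_mv_thresh = 4 * (100 / percent_refresh);
  }
  /* Refresh if the last QP was high, or the block moved recently. */
  return last_coded_q > q_thresh || consec_zero_mv < zero_mv_thresh;
}
```

For screen content the threshold stays at 0, so only the existing QP condition applies.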
Limit transform size for intra to 16x16, for non-screen content mode.
Little/no change in speed or metrics.
32x32 intra block is rarely selected in RTC (non-screen content) case,
but some visual improvement can be seen in some examples,
e.g., captured_video_dark_whd.yuv.
Change-Id: I68e2db87875343b3fb9bb407a7709f0088f84072
Reallocation of the mi buffer fails if the size is changed on the first
frame and the config is changed in subsequent frames. Add a resolution
check to avoid the assertion failure.
BUG=1074
Change-Id: Ie26ed816a57fa871ba27a72db9805baaaeaba9f3
Reference frame masking logic may skip checking zeromv-last mode.
Fix to avoid this and make sure zero-last is always checked.
No noticeable change in speed, and PSNR/SSIM metrics on RTC set overall
neutral (very small gain ~0.02).
Small visual improvement on few RTC clips.
Change-Id: I26eacdc449126424001a4a64e5ac31949f064417
Add SVC codec control to set the frame flags and buffer indices
for each spatial layer of the current (super)frame to be encoded.
This allows the application to set (and change on the fly) the
reference frame configuration for spatial layers.
Added an example layer pattern (spatial and temporal layers)
in vp9_spatial_svc_encoder for the bypass_mode using new control.
Change-Id: I05f941897cae13fb9275b939d11f93941cb73bee
In decoder, export (eventually) into vpx_image_t.range field. In
encoder, use oxcf->color_range to set it (same way as for
color_space).
See issue 1059.
Change-Id: Ieabbb2a785fa58cc4044bd54eee66f328f3906ce
For 1 pass CBR spatial-SVC:
Add cyclic refresh parameters to the svc-layer context.
This allows cyclic refresh (aq-mode=3) to be applied to
the whole super-frame (all spatial layers).
This gives a performance improvement for spatial layer encoding.
Add the aq_mode on/off setting as a command line option.
Change-Id: Ib9c3b5ba3cb7851bfb8c37d4f911664bef38e165
Fixes temporal scalability. Updates were inadvertently turned
off for two pass svc causing crashes due to gf_group.index
growing unchecked.
Change-Id: Iff759946bf61bbde70630347cc8fa4d51a8c2d2f
The normative (convolve8) filter is optimized/faster than
the nonnormative one. Pass usage of scaler (normative/nonnormative)
to vp9_scale_if_required(), and always use normative one for 1 pass.
Change-Id: I2b71d9ff18b3c7499b058d1325a9554de993dd52
Access scaled reference frame in the sub8x8 rate-distortion
optimization loop only when the current test mode is an inter mode.
This prevents an ioc warning triggered by sending intra_frame index
to fetch scaled reference frame.
Change-Id: I6177ecc946651dd86c7ce362e3f65c4074444604
This commit allows the encoder to include sub8x8 inter mode with
scaled reference frame in the rate-distortion optimization scheme.
Change-Id: Ibbe9678801592826ef22566566dcdeeb008350d5
If the encoder dynamic resize is triggered and change config()
is then called, it will reset the current (resized) codec width/height
back to the config (unresized) width/height (which will then
prevent the resizing action from occurring in encoder_loop).
Avoid this by checking for a change in the config width/height
before resetting the cm->width/height.
Change-Id: Id9d50c0ee8a943abe4b6c72bbaa02d9696f93177
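The guard described in the commit above can be sketched as follows: the codec width/height is only reset when the config dimensions themselves change, so a pending dynamic resize survives an unrelated config update. Function and parameter names are hypothetical.

```c
#include <assert.h>

/* Hypothetical check: 1 if the new config carries the same dimensions
 * as the previous config, so the current (possibly resized) codec
 * dimensions should be kept. */
static int keep_resized_dims(int prev_cfg_w, int prev_cfg_h,
                             int new_cfg_w, int new_cfg_h) {
  return new_cfg_w == prev_cfg_w && new_cfg_h == prev_cfg_h;
}

/* Width the codec should use after a config change. */
static int width_after_change_config(int cur_w, int prev_cfg_w, int prev_cfg_h,
                                     int new_cfg_w, int new_cfg_h) {
  assert(cur_w > 0 && new_cfg_w > 0 && new_cfg_h > 0);
  return keep_resized_dims(prev_cfg_w, prev_cfg_h, new_cfg_w, new_cfg_h)
             ? cur_w
             : new_cfg_w;
}
```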
For one pass CBR: only check for updating refresh_golden
if ext_refresh_frame_flags_pending is not set (i.e., == 0).
And move the resetting of ext_refresh_frame_flags_pending = 0
down to after the encode_loop (and account for dropped frames).
This is to prevent changing refresh_golden flag when the user
supplies the reference/update flags.
Change-Id: I4d87b3e705ba43f243667e367503b585c61e2a54
In high bitdepth setting, the rate multiplier may be set as 0. In
lossless mode, the RD cost would always be 0, resulting in bad
partition and prediction mode choices.
Change-Id: I297014dd8bfa8a07ff0ab480119f75678300ff68
This patch just fixes the test for the time being, but does not
actually solve the underlying issue, which still needs investigation.
Change-Id: I54a35de839723f5b499b57e38dd2bdd400adc427
Switch to use the normative (convolve8) filter for source scaling,
only for 1/2x1/2 scaling for now. This is faster and has better
quality than either the vpx_scale_frame or the nonnormative scaler.
Remove the vp9_scale_if_required_fast, which is now not used.
Change-Id: I2f7d73950589d19baafb1fa650eac987d531bcc8
For 1 pass CBR mode under screen content mode:
if pre-analysis (source temporal-sad) indicates significant
change in content, then check the projected frame size after
encode_frame(), and if size is above threshold, force re-encode
of that frame at max QP.
Change-Id: I91e66d9f3167aff2ffcc6f16f47f19f1c21dc688
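A minimal sketch of the re-encode decision in the commit above. The function name and parameters are hypothetical, and the size threshold is left to the caller.

```c
#include <assert.h>

/* Hypothetical decision: QP to use after the first encode of a frame.
 * If pre-analysis flagged a significant content change and the projected
 * size exceeds the threshold, re-encode at max QP; otherwise keep cur_q. */
static int pick_reencode_q(int content_changed, int projected_size_bits,
                           int size_thresh_bits, int cur_q, int max_q) {
  assert(cur_q <= max_q);
  if (content_changed && projected_size_bits > size_thresh_bits) return max_q;
  return cur_q;
}
```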
Only test for using golden as reference for variance partition
selection if it is used as a reference for that frame.
For temporal layers, golden may not be a reference on a given frame,
even though it was for some previous frame. If it is not a reference
for current frame, don't check/use it for partition selection.
Change-Id: I6b0f2bd36aebbb5903077c9a0a66d80f1de9a7b1
For speed 7, real-time mode: Base layer frames are further apart
(for #temporal layers = 3, this is every 4 frames) so worth keeping
same motion search parameters (as in speed 6) on the base layer frames.
Change-Id: Idebf49dda6ef4f3d9a55aee55129a68253f692fb
* changes:
Only use .text sections for aout
Use newer x86inc.asm
Use .text instead of .rodata on macho
Copy PIC handling code from x86_abi_support
Set 'private_extern' visibility for macho targets
Avoid 'amdnop' when building with nasm
Catch all elf formats
Expand PIC default to macho64 and respect CONFIG_PIC from libvpx
Use libvpx defines to set name mangling rules
Customize x86inc.asm for libvpx
Rename updated version of x86inc.asm
Use "private_prefix" instead of "program_name" and make vpx the default
prefix.
Change-Id: I4883a99b2aee8e5dc9f2c16a2e6f4b5d6e4de458
Use the correct period (in terms of cr->percent_refresh) for the condition
of larger delta-qp following key frame.
And account for larger interval for temporal layers.
Change-Id: Ibb43f5200f9b1eeb8bbb8211327b08ecda3c3b8a
Re-investigated the second-level sub-pixel motion search. Improved the
way of choosing search points. Rewrote the second-level search code.
At speed 0, the borg tests showed:
1. for stdhd set, Avg PSNR gain: 0.216%; Overall PSNR gain: 0.196%;
SSIM gain: 0.206%. Only 1 out of 15 clips showed PSNR loss.
2. for derf set, Avg PSNR gain: 0.171%; Overall PSNR gain: 0.192%;
SSIM gain: 0.207%. Only 3 out of 30 clips showed PSNR losses.
Added the condition for third-point checking, namely, fewer points
are checked. Speed tests showed no speed loss (Avg 0.3% speedup at
speed 0).
Change-Id: I6284ebb3fa7ba63be8528184c49e06757211a7f1
-For ambient qp in active_worst setting: increase the initial
averaging time (from very first frame) to account for avg_qp of key_frame.
-In postencode on key frame: update the last_q/avg_q[key_frame] for
all temporal layers.
Change-Id: I5313153d350b1045b4835ce948dfffb7d2039b52
Condition usage of rc.frames_since_golden to non-svc mode.
rc.frames_since_golden, which is used in non-svc mode to add second reference,
was causing, under certain conditions, the turning off of golden reference
for svc case.
Change-Id: Icec644d235d0471e56d8ff73d6c37278bd6ecd3b
and FUN_CONV_2D macros. The predict lut now handles
this case. The encoder now calls vpx_scaled_2d() instead
of vpx_convolve8() for scaling.
Change-Id: Ia1c8af8a31e4cb4887a587143108cb45835f7df7
This commit clears all the vp9_ prefix use case in vpx_dsp. It gets
the vp9 folder ready to branch out vp10.
Change-Id: I2906eec179ee792b4af8c9b4161313653050e931
Choose a different diagonal point to check when the two costs are
the same, making it consistent with the way we choose the best mv.
This slightly changes the encoding result, and the derflr set borg
test at speed 0 shows 0.027% Overall PSNR gain, 0.024% Avg PSNR
gain, and 0.043% SSIM gain.
Change-Id: Ic8ee3a6767394866d159e4f9e1c777604dd73c17
If the current best mv (namely, the search center) is still the best mv
after the first level search, the second level check is skipped. This
patch doesn't change the bitstream. At speed 0, it speeds up the encoder
by 1% - 2%.
Change-Id: I054c91b884d3f7aef157436c061744562bd6506d
Ssim_vars is used to accumulate stats based on 4x4 pixel blocks. This
commit changes the allocation size to be based on mi_rows and mi_cols
to avoid out-of-bound memory access for larger size videos. The hard-coded
720x480 can only work for image sizes up to 2880x1920.
Change-Id: Id9d07f3f777385b448ac88a6034b7472e4cf3c79
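The arithmetic behind the commit above, as a sketch: 720x480 entries of one 4x4 block each cover exactly 2880x1920 pixels. Assuming one mi unit spans 8x8 pixels (so 2x2 blocks of 4x4 per mi unit), the entry count scales as 4 * mi_rows * mi_cols. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Pixels to mi units, assuming an 8x8-pixel mi unit, rounded up. */
static int pixels_to_mi(int pixels) {
  assert(pixels > 0);
  return (pixels + 7) >> 3;
}

/* Number of 4x4-block entries needed: 2x2 such blocks per mi unit. */
static size_t ssim_var_count(int mi_rows, int mi_cols) {
  assert(mi_rows > 0 && mi_cols > 0);
  return (size_t)4 * (size_t)mi_rows * (size_t)mi_cols;
}
```

For 2880x1920 this gives 4 * 240 * 360 = 345600 entries, the same as 720 * 480; any larger frame needs more, which is why the fixed size could be overrun.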
This commit moves the module inverse transform functions from vp9
to vpx_dsp folder. The hybrid transform wrapper functions stay in
the vp9 folder, since it involves codec-specific data structures.
Change-Id: Ib066367c953d3d024c73ba65157bbd70a95c9ef8
This got erroneously changed during the refactor. This fixes
SvcTest.TwoPassEncode2TemporalLayersWithMultipleFrameContextsAndTiles.
Change-Id: Ifa5ab0e098396c5e2d10478db87df256eadfa4c7
It in essence refactors the code for both the interpolation
filtering and the convolution. This change includes the moving
of all the files as well as the changing of the code from vp9_
prefix to vpx_ prefix accordingly, for underneath architectures:
(1) x86;
(2) arm/neon; and
(3) mips/msa.
The work on mips/dspr2 will be done in a separate change list.
Change-Id: Ic3ce7fb7f81210db7628b373c73553db68793c46
Don't run rate_block (cost_coeffs) if distortion alone is enough to
surpass best_rd.
This decreases 2nd pass runtime on HD at speed 2 by about 2%. There is
zero effect on output if tx_cache is removed.
Change-Id: Ia3b1cc77bfbe6ee988c395fde06c0eb92940b784
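The early exit in the commit above can be sketched like this. Since rate is non-negative, distortion alone is a lower bound on the full RD cost. The cost form (dist + lambda * rate), the stub rate function, and all names are assumptions and do not mirror the encoder's RDCOST macro.

```c
#include <assert.h>
#include <stdint.h>

static int g_rate_calls = 0; /* counts how often the costly rate step runs */

/* Stand-in for an expensive coefficient-cost routine (hypothetical). */
static int stub_rate_cost(void) {
  ++g_rate_calls;
  return 120;
}

/* Simplified RD cost with early exit. Returns INT64_MAX when the block
 * cannot beat best_rd, without computing the rate at all. */
static int64_t rd_with_early_exit(int64_t dist, int64_t best_rd, int lambda) {
  assert(dist >= 0 && lambda >= 0);
  if (dist > best_rd) return INT64_MAX; /* distortion alone is too high */
  return dist + (int64_t)lambda * stub_rate_cost();
}
```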
1. The RD scores obtained during the tx size selection were stored in the
tx cache, and used to help make the tx decision for the following frames.
This wasn't used anymore in VP9 encoder. Recovered the related decision
making code from 1.5+ years ago, and borg tests didn't show any quality
gain. This patch removed it to lower the complexity.
2. An optimization was done after the above refactoring. If the tx_mode
is not TX_MODE_SELECT, we only need to test the chosen tx size instead
of all possible tx sizes. This gave a 1.5% average speed gain at speed 2,
and a 1% average speed gain at speed 3.
Change-Id: Id8cd650e066a8cef33829d8c15388a8138adc78c
The forward 32x32 2D-DCT functions are aligned in vpx_dsp folder.
The vp9_dct.h file is not effectively used now.
Change-Id: Ie7946b6fdd784b8e91496242337bc9002c75c281
This commit replaces vp9_idct.h with txfm_common.h in many SIMD
implementation files for precise file dependency.
Change-Id: If73dd726bb16537e7494f28538b0a169810f9756
Separate the common coefficient constant into vpx_dsp/txfm_common.h.
Move the SSE2 macro definitions to vpx_dsp/x86/txfm_common_sse2.h.
This clears the use case of vp9_idct.h in vpx_dsp folder.
Change-Id: I319735a2abf42888e5080ac14cfbcde34be7b121
Avoid scaling the references if they have already been scaled.
Change only affects 1 pass non-svc mode for now.
Change-Id: I204f4079c026cba7adce7a7f855d072f6139ccec
The RD and load save/code grabs it as groups of four. In practice there
is no change to physical allocations because this is backed by a 16-byte
memalign.
Change-Id: I01e89769872300e23227e03dd24a6e229f482025
Add vpx_dsp_rtcd.h to the header file list. The od_bin_fdct8x8()
here depends on forward 8x8 2D-DCT.
Change-Id: I1d71edc71f07069808823d2445c1cafd285e1b94
This commit factors out common macro definitions from the forward
and inverse transform implementations into vpx_dsp. It removes
the duplicate macro definitions from encoder and decoder folders.
Change-Id: I92301acbd3317075e9c5f03328a25abb123bca78
This commit factors the 4x4, 8x8, and 16x16 2D-DCT forward
transform operations into vpx_dsp folder.
Change-Id: I084b117b79c0925edcbcabb93f62b9f4bf8dbe7d
This commit limits the scope of 1-D DCT and ADST functions within
vp9_dct.c and makes them static. This largely clears out the cross
referencing issue between vp9_dct.c and the SIMD optimizations.
Change-Id: If7cac478b11bb32328ccf70a9f60b709dad43d7f
The SSE2 version high bit-depth forward hybrid transforms are
essentially using the C functions via cross referencing to 1-D
functions in vp9_dct.c. This commit unifies the two versions and
removes the unnecessary dependency.
Change-Id: Ib4d0702a138f8daf7d0bd97c141ee7088f293765
Separate the hybrid transform case from 2D-DCT case. This will
allow us to clear up cross dependency between c and SIMD
implementations later.
Change-Id: Iaa499e8b096850a1c5a0c50a3b6e63e15d0184bf
The following quantization functions were moved:
vp9_quantize_b
vp9_quantize_b_32x32
vp9_highbd_quantize_b
vp9_highbd_quantize_b_32x32
vp9_quantize_dc
vp9_quantize_dc_32x32
vp9_highbd_quantize_dc
vp9_highbd_quantize_dc_32x32
The purpose of doing that was to allow these functions to be shared
by multiple codecs.
Change-Id: Id8ab939f283353cdd07bd930d47db3d932a5d87f
Remove the use of drop_frames_water_mark, as this is used for
frame dropping control. Use fixed threshold for now on buffer underflow.
Change-Id: If0ddda9f7f6fa96067cdcb0eccb42e17bda37c32
In aq-mode=3 under a resizing action (i.e., resize_pending != 0),
force an update of the golden reference frame.
Change-Id: I14806f6db71b5f8c827678cc5e1fc913c138a9a4
Fix bug in setting this flag for animated content.
The bug did cause quality to increase because far
more frames are not boosted than boosted.
However, the speed trade off to gain is a lot less
favorable and the behavior was not as intended.
Change-Id: I89fb70419c88b26f40b3534de0481730a1b3fcfa
Use drop_frames_water_mark for threshold on buffer underflow,
and change threshold for resize down.
Change-Id: I2de19adce50abe9bcdc0b107528cec8cc1857fcc
The fast scaling for 1 pass mode was being used only on the
first frame after resizing event (because resize_scale_num/den
is set to 1 and only changed for first frame following resize event).
Change-Id: I723b63e21823eb858f25f5662d2bbe4f1842e61f
Proper use/update of resize_state and resize_pending to constrain
the total amount of downsizing to be at most one scale down, for now.
Change-Id: Id18fc32499f2fbdbec16728dcdc9e4eac09098f0
From Change Ibf0c30b72074b3f71918ab278ccccc02a95a70a0
There is still an issue relating to one animated test clip with repeat
patterns where this change effectively increases the default maximum
arf interval by +1. This can be examined separately.
Change-Id: Idd01d5480fc45202d8a059a0c3afc0997cc5bdd1
This commit simplifies the intra block boundary condition logic.
It removes the block index from the argument set.
Change-Id: If00142512eb88992613d6609356dfd73ba390138
The clamp calls with INT32_MIN and INT32_MAX have no effect at all on
int values passed in, therefore this commit removes those effectless
clamps and also adds more const intermediate results to make the code
more readable.
Change-Id: I66d8811f58bb74ec31cbec9a6c441983a662352e
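A small illustration of why the clamps removed in the commit above were no-ops: every 32-bit value already lies within [INT32_MIN, INT32_MAX], so the clamp returns its input unchanged. The helper below is a generic clamp written for this example, not the library's own.

```c
#include <assert.h>
#include <stdint.h>

/* Generic clamp on 32-bit values (illustrative helper). */
static int32_t clamp_i32(int32_t value, int32_t low, int32_t high) {
  assert(low <= high);
  return value < low ? low : (value > high ? high : value);
}
```

A clamp with a narrower range, such as clamp_i32(300, 0, 255), still changes the value; only the full-range bounds are redundant.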
The encoder gets its dqcoeff from the context tree. In the decoder move
it to directly after MACROBLOCKD.
Change-Id: I46c9b76f26956a360d17de0b26ecb994dae34ecb
Even if the recode loop is not enabled for the current frame type
trap the case where the projected size of a frame is above the
maximum allowed in recode_loop_test()
Change-Id: I453004694b8f8699e3c2a83252e9f83adccdda4e
Changes to allow more use of rectangular partitions at
speeds 1 and 2 for content classed by the first pass as
animation and for blocks near the active image edge.
This has quite a big impact in quality for the animated
test sequence but also hurts encode speed for speed 2.
For other content types the impact on both speed and
quality is small.
Added some plumbing for detection of internal vertical
image edges.
Change-Id: I3fc48de2349f8cb87946caaf0b06dbb0ea261a9a
Change speed features / behavior for split mode when there
is an internal active edge (e.g. formatting bars).
Remove some threshold constraints in rd code near the active
edge of the image.
Add some plumbing for left and right active edge detection.
Patch set 5. Limit rd pass through for sub 8x8 to internal active edges.
This takes away any speed penalty for most clips but keeps the enhanced
edge coding for the more critical case of internal image edges.
Change-Id: If644e4762874de4fe9cbb0a66211953fa74c13a5
If the pre-selected partition size (from variance partition) is
32x32, also apply nonrd partition search for 32x32 and 16x16 size.
Overall small positive gain in metrics, average ~1%.
Some visual improvement, for lower resolutions.
Change-Id: I69cb425bda94f7d13d34c451ab30e9276335a30e
Adds two new vp9 parameters --min-gf-interval and --max-gf-interval
to enable testing based on frequency of alt-ref frames.
Also adds a unit-test to test enforcement of min-gf-interval.
For both these parameters the default value is 0, which indicates
they are picked by the encoder, based on resolution and framerate
considerations. If they are greater than zero, the specified
parameter is honored.
(Additional note by paulwilkins)
Note that there is a slight oddity in that key frames are also GFs and
considered part of GF only group. However they are treated as not
being part of an arf group because for arf groups the previous GF is
assumed to be the terminal or overlay frame for the previous group.
(end note)
Change-Id: Ibf0c30b72074b3f71918ab278ccccc02a95a70a0
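The default handling for --min-gf-interval and --max-gf-interval described in the commit above amounts to a simple rule, sketched here. The function name and the encoder_default argument are hypothetical; how the encoder derives its own value from resolution and framerate is not shown.

```c
#include <assert.h>

/* Hypothetical resolver: 0 means "let the encoder pick", any value
 * greater than zero is honored as given. */
static int resolve_gf_interval(int user_value, int encoder_default) {
  assert(user_value >= 0 && encoder_default > 0);
  return user_value > 0 ? user_value : encoder_default;
}
```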
This reverts commit a42df86c03.
This change causes MSA/VP9SubpelVarianceTest.Ref and
MSA/VP9SubpelVarianceTest.ExtremeRef failures under
mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu
Change-Id: I40b71a0b774eaeb31f66f795733f95cf360909f7
This reverts commit 61774ad1c4.
This change causes MSA/VP9SubpelAvgVarianceTest.Ref failures under
mips32r5el-msa-linux-gnu and mips64r6el-msa-linux-gnu
Change-Id: I7fb520c12b2a3b212d5e84b7619a380a48e49bb0
Added code to reduce the minimum partition size searched
for super blocks at or straddling the edge of the image.
If the first pass has detected formatting bars the "active" edge
may not be the real edge.
Change-Id: I9c4bdd1477e60f162a75fac95ba6be7c3521e05c
Correct the ARF boost calculations to partly discount
inactive or very low energy regions of the image.
Examples (formatting bars and 0 energy areas of animated clips).
Change-Id: I241af058d10aba8c67a4deca36deb913047d4561
This commit moves the primitive multi-threading files from vp9
folder to vpx_thread, which will be accessible by all vpx codec.
Change-Id: Ib51e66e9c69801c10631fab56d35a0c0aaed5883
to MB_MODE_INFO_EXT. This saves 36 bytes per 8x8 area for
both the decoder and encoder. (encoder has two MODE_INFO
buffers)
Change-Id: If006abb2224acaf326df3c2be09e77e967662107
Only do the check for resizing if the feature is selected
(i.e., resize_mode = RESIZE_DYNAMIC).
And modify condition for checking to be resize_count >= window,
(since framerate can change).
Change-Id: Idceb4e50956bb965a1492b4993b0dcb393c9be4d
Reduce boost for segment#2 for low bitrates and low-res.
This change is to reduce the rate overshoot at low bitrates.
No change in behavior, except at the very low bitrates.
Change-Id: I0dbd9d3b6356da5804de94adf10fca6a7a8f8948
Keep the same transform cutoff and partition selection
for speed 5 as in speeds >=6 (non-rd speed settings).
Existing setting for key frame at speed 5 allowed transform size
up to 32x32 on key frames, and did not allow for 4x4 block partition size.
This created more visual artifacts on first few frames.
avgPSNR/overallPSNR/SSIM gains of 0.2/0.7/0.8 for rtc_derf(low-res) set,
and 0/0.7/1.1 gains for rtc set.
Change-Id: I8c139ec6c9bb74e14b4ffbad5f12e94f18a59c0b
For speed 5 real-time mode, the selection of the partition size for
superblocks on the segment (aq-mode=3) uses the non-rd recursive
pick partition search, and can sometimes select 64x64.
For low resolutions, visually better to limit this to 32x32.
Change-Id: I69657a7ed8899f8b3cf8c9c318a2509c5c72c565
For screen content don't refresh a block at a quantizer higher than
it was last coded at. Previously at realtime speeds the encoder had a
tendency to recode a block from GOLDEN with a higher Q than it was last
coded at.
Change-Id: Iacd561806c769dcce1a81b9827ffc70090f5ba18
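The rule in the commit above, as a sketch: for screen content, a block is only refreshed if the quantizer it would be coded at is no higher than the one it was last coded at. Names are hypothetical, and leaving other content types unrestricted is an assumption.

```c
#include <assert.h>

/* Hypothetical check: 1 if refreshing the block is allowed. */
static int allow_block_refresh(int is_screen_content, int candidate_q,
                               int last_coded_q) {
  assert(candidate_q >= 0 && last_coded_q >= 0);
  if (!is_screen_content) return 1; /* rule applies to screen content only */
  return candidate_q <= last_coded_q;
}
```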
Decision to scale down/up is based on buffer state and average QP
over previous time window. Limit the total amount of down-scaling
to be at most one scale down for now.
Reset certain quantities after resize (buffer level, cyclic refresh,
rate correction factor).
Feature is enabled via the setting rc_resize_allowed = 1.
Change-Id: I9b1a53024e1e1e953fb8a1e1f75d21d160280dc7
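A rough sketch of the decision described in the commit above: scale down when the buffer underflowed often and average QP was high over the window, scale back up when average QP is low again, and never go past one scale-down. The thresholds (a quarter of the window, 70% and 50% of the worst QP) and all names are assumptions for illustration.

```c
#include <assert.h>

typedef enum { NO_RESIZE = 0, DOWN_SCALE = 1, UP_SCALE = 2 } resize_action;

/* Hypothetical decision evaluated once per time window. */
static resize_action resize_decision(int underflow_frames, int window,
                                     int avg_qp, int worst_qp,
                                     int is_scaled_down) {
  assert(window > 0 && worst_qp > 0);
  if (!is_scaled_down) {
    /* Only one scale-down is allowed, so check it from full size only. */
    if (underflow_frames > window / 4 && avg_qp > (70 * worst_qp) / 100)
      return DOWN_SCALE;
  } else if (avg_qp < (50 * worst_qp) / 100) {
    return UP_SCALE;
  }
  return NO_RESIZE;
}
```

After either action the commit resets the buffer level, cyclic refresh state, and rate correction factor; that step is omitted here.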
The internal behavior of block_yrd differs in high bit depth
settings from the 8-bit one. This causes the assertion condition to be
false for high bit depth.
Change-Id: I15dc02e7162d27cabe78c451941d769d488b1174