Commit Graph

6835 Commits

Author SHA1 Message Date
Jerome Jiang
682135fa60 vp9: Compute skin only for blocks eligible for noise estimation.
Change-Id: Iddcb83a5968db57cfd312c5bc44b2a226a2a3264
2017-07-14 15:14:30 -07:00
Marco
666e394d41 vp9: Adjust minmax threshold for variance partitioning.
Only affects speed 7. Improvement on high motion clips.

Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687
2017-07-13 21:19:37 -07:00
James Zern
b578d59623 Merge "remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY" 2017-07-12 23:30:04 +00:00
Marco Paniconi
f6586b8bf8 Merge "vp9: Fix to SVC and denoising for fixed pattern case." 2017-07-12 19:13:05 +00:00
Urvang Joshi
1dee320446 Merge "Remove the token state array from greedy optimize_b." 2017-07-12 00:08:56 +00:00
James Zern
df18412f32 remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY
BUG=webm:1446

Change-Id: I6e0ea9342c715d354c641109737172afa649b85b
2017-07-11 13:10:16 -07:00
Urvang Joshi
5322a31b18 Remove the token state array from greedy optimize_b.
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.

Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501

Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
2017-07-11 13:05:29 -07:00
James Bankoski
7d5afa227a Merge "Reintroduce fix for max qindex calculation of a gf interval" 2017-07-11 19:47:16 +00:00
Jerome Jiang
1a4d8f2033 Merge "vp9: Move skinmap computation into multithreading loop." 2017-07-11 19:44:22 +00:00
Jim Bankoski
689ad89e86 Reintroduce fix for max qindex calculation of a gf interval
This reintroduces the fix:
  https://chromium-review.googlesource.com/c/422807/
and later reverted here:
  https://chromium-review.googlesource.com/c/447843/

BUG=webm:1355

This time behind a compile time flag :

configure --disable-always_adjust_bpm
configure --enable-always_adjust_bpm

This should make side by side testing easier and let users of the
lib pick which way they want to go.

Change-Id: I7d7b37b83015dc001810af84c132cbc1e71ba8d6
2017-07-11 18:40:26 +00:00
Marco
3818a3723b vp9: Fix to SVC and denoising for fixed pattern case.
For fixed pattern SVC: keep track of denoised last_frame buffer
for base temporal layer, and if alt_ref is updated on middle/upper
temporal layers, force an update to denoised last_frame buffer.
This allows for improved denoising on top temporal layers.

Change-Id: Icbd08566027d4d2eabc024d3b7a0d959d2f8c18b
2017-07-11 11:27:04 -07:00
Jerome Jiang
3d6b0cb825 vp9: Move skinmap computation into multithreading loop.
Change-Id: Iebc9dd293d8b1449c0674c0295349297e9b90646
2017-07-10 17:18:15 -07:00
Johann Koenig
4b78c6e6f7 Merge "remove vp9_full_sad_search" 2017-07-10 20:42:40 +00:00
Jerome Jiang
125a532b34 Merge "vp9: Remove alt-ref from denoiser." 2017-07-10 20:03:51 +00:00
Johann
109faffe9b remove vp9_full_sad_search
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.

Remove the remaining sizes and all the highbitdepth versions.

BUG=webm:1425

Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-07-10 11:20:35 -07:00
Jerome Jiang
2ac7c549e9 vp9: Remove alt-ref from denoiser.
Denoiser is used in real-time mode which does not use alt-ref.
Reduce memory usage when denoiser is enabled.

Change-Id: I54ba3bcaeeb1818bbdf718ef90e97d4897ff793d
2017-07-10 10:56:03 -07:00
James Zern
5d6060b62f Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming" 2017-07-07 19:15:02 +00:00
James Zern
80b83c73ba cosmetics,vp9/: normalize inv/fwd_txfm naming
+ vpx_dsp/, test/

itxfm -> inv_txfm, ftxfm -> fwd_txfm

Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-07-06 18:35:44 -07:00
Marco
8c3f18efa1 vp9: Nonrd mode: use content_state_sb for high motion.
In the content_state for a superblock is set to HighSad,
use that to bias some decisions in variance partition and
nonrd pickmde: use int_pro_motion for sad computation in
choose_partitioning, and set large_block in pickmode based
on the content_state_sb.

Only affects speed >= 7.

Immprovement for high motion content.
Small gain (~1%) in RTC metrics.
Speedup of ~5 for high motion clip on android (speed 8, 1 thread).

Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
2017-07-06 15:05:19 -07:00
Yaowu Xu
f2b1dc529f Merge "Further refactoring of mod error calculation." 2017-07-05 21:43:50 +00:00
paulwilkins
5b44ef0c50 Respond more rapidly to excessive local overshoot.
This patch attempts to address a bug reported for 4K video.
https://b.corp.google.com/issues/62215394

In this instance a perfect storm of a moderate complexity section
followed by a much easier section where a CGI overlay helped to
suppress film grain noise, followed by a much harder and very grainy
section at the end, cause a massive local rate spike that pushed a chunk
over the upper allowed rate limit.

This patch detects cases where the rate for a frame is much higher than
expected and allows, in this special case, for rapid adjustment of the active
Q range.

For the example chunk in the bug report the target rate was 18Mb/s and the
observed rate was over 37 Mb/s with a surge for the last few frames to over
100Mb/s. This patch brings the overall chunk rate right back down to ~18.2 Mbit/s
and  almost completely eliminates the rate spike at the end. (See graphs appended
to bug report)

Also see  I108da7ca42f3bc95c5825dd33c9d84583227dac1 which fixes a bug
unearthed during testing of this patch and also has a bearing on high rate
encodes such as 4K.

This patch does have a negative impact on some metrics. Most notably there are
clips in our standard test set where it hurts global psnr (though in many cases it
conversely helps SSIM, FAST SSIM and PSNR-HVS). It is also worth noting that
the clips (and data rates) where there is a big metric impact, are almost all cases
where there is currently a significant overshoot vs the target rate and overall rate
accuracy is greatly improved.

Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
2017-07-05 16:51:52 +01:00
paulwilkins
a1af335f44 Further refactoring of mod error calculation.
Further refactoring to support alternative error distributions.

Change-Id: I0f7fa3fd6f3baa4b0a1e53c6aa3be63966e97b82
2017-07-05 16:49:37 +01:00
paulwilkins
b0459ec8ea Fix incorrect index test in GF group rate assignment.
Correct test for middle frame in the group.

Change-Id: I1ee49fa33968eb3c4a01d6a27a60bb1409e3e68c
2017-07-05 16:45:36 +01:00
James Zern
37e03b1d13 Merge "cosmetics,vp9/encoder: s/txm/txfm/" 2017-06-30 21:57:16 +00:00
Marco
2290898ac7 vp9: Adjust condition for checking intra mode.
For nonrd_pickmode: add condition for checking
intra mode if the sb content state is VeryHighSad.

Reduces artifacts when sudden change in content.

Metrics on RTC/RTC_derf neutral (small gain).
No speed loss observed.

Change-Id: I07006d28fd2dc06c1d06b07630102b0fece50c40
2017-06-30 14:52:00 -07:00
James Zern
8d1bda93f4 cosmetics,vp9/encoder: s/txm/txfm/
txfm is more commonly used as an abbreviation through the codebase

Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-06-29 15:08:47 -07:00
Jerome Jiang
74dc640565 vp9: Remove avg2x2 in skin detection and clean up.
Change-Id: I6510e36866138f8ac4cb82c207e58e0b9522e499
2017-06-29 03:06:23 +00:00
Jerome Jiang
8582d33a0d Merge "vp9: compute skinmap only once before encoding." 2017-06-28 18:01:46 +00:00
Marco
88d11f473c vp9: Speed >= 8: Remove logic on reducing subpel.
Existing logic was only affecting resolutions above 720p.
Needs more testing for reducing subpel for speed >= 8.

No change on RTC metrics.

Change-Id: I2f4bf9f25891614aafa9a86aa5a5063a3ccfce4d
2017-06-27 20:27:02 -07:00
James Zern
515fed8f38 Merge "highbd_quantize_fp_32x32: normalize abs_qcoeff type" 2017-06-27 23:30:17 +00:00
Jerome Jiang
a220b931f5 vp9: compute skinmap only once before encoding.
This could save some cycles since skin detection is used in multiple
places in vp9.

1~2% speed up on ARM.

Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
2017-06-27 16:16:02 -07:00
Urvang Joshi
4bb99ee27e Enable greedy version of optimize_b() in VP9 by default.
Improvements were already mentioned in the previous patch:
https://chromium-review.googlesource.com/#/c/531675/

Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
2017-06-23 17:04:58 -07:00
Marco
18805eee6c vp9: Use scene detection for CBR mode.
Use the scene detection for CBR mode, and use it to reset the
rate control if large source sad is detected and rate
correctioni fact/QP is at minimum state.

Avoids large frame sizes after big content change following
low content period.

Only affects CBR mode for 1 pass at speeds 5, 6, 7.
Change-Id: I56dd853478cd5849b32db776e9221e258998d874
2017-06-23 11:44:50 -07:00
James Zern
88a302e743 Merge changes from topic 'missing-proto'
* changes:
  onyxd_int.h: add missing prototypes
  onyxd.h: add vp8dx_references_buffer prototype
  vp[89],vpx_dsp: add missing includes
  vp8,encodeframe.h: correct prototypes
  vp8: add temporal_filter.h
  add picklpf.h
  add ethreading.h
  vp8,bitstream.h: add missing prototypes
  vp8: remove vp8_fast_quantize_b_mmx
  vp8,loopfilter_filters: make some functions static
  vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
  vp9_encodeframe: make scale_part_thresh_sumdiff static
  vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
  tiny_ssim: make some functions static
2017-06-23 05:44:24 +00:00
Marco Paniconi
4f917912b9 Merge "vp9: Add high source sad to content state." 2017-06-22 22:18:48 +00:00
Paul Wilkins
92145006c3 Merge "Fix int overflow in rate control for high bit rates." 2017-06-22 16:30:40 +00:00
paulwilkins
efe1982e63 Fix int overflow in rate control for high bit rates.
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.

In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k  clip the encode time for speeds 0 and 1 was
reduced by >25%

Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
2017-06-22 10:34:21 +01:00
Marco
d7515b1187 vp9: Add high source sad to content state.
Use it to limit NEWMV early exit in nonrd pickmode

Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
2017-06-21 20:57:17 -07:00
Marco Paniconi
33a9394eb1 Merge "vp9: Adjustments for aq-mode and pickmode for speed >= 8." 2017-06-22 03:27:47 +00:00
James Zern
44418c659f vp[89],vpx_dsp: add missing includes
quiets -Wmissing-prototypes

Change-Id: I841cfc019d592f2bc6b3fec5818051a31f4c53b5
2017-06-21 19:00:15 -07:00
James Zern
07f847873b vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
quiets -Wmissing-prototypes

Change-Id: I72d899c2d8de1ddc52d90ac081f2629374b3a6e9
2017-06-21 19:00:14 -07:00
James Zern
9a329b5285 vp9_encodeframe: make scale_part_thresh_sumdiff static
quiets -Wmissing-prototypes

Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
2017-06-21 19:00:14 -07:00
James Zern
3f296533f6 vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
quiets -Wmissing-prototypes

Change-Id: Ib2d4f294f1982739bb2ac98155e789e040d309a1
2017-06-21 19:00:04 -07:00
James Zern
9e1d2de67c highbd_quantize_fp_32x32: normalize abs_qcoeff type
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c

Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
2017-06-21 18:56:10 -07:00
Marco
21afafa31a vp9: Put skin detection usage around cpi flag.
Skin detection usage in choose_partitioning should be
around the cpi->use_skin_detection.

Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
2017-06-21 17:32:56 -07:00
Marco
8cf6f78fce vp9: Adjustments for aq-mode and pickmode for speed >= 8.
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.

No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
2017-06-21 16:01:24 -07:00
Marco Paniconi
737aa5c9e4 Merge "vp9: SVC: Rework the usage of base_mv for SVC." 2017-06-20 03:08:32 +00:00
Marco
ff7fb4b280 vp9: Speed >= 8: Adjust resolution threshold for subpel.
Get some quality gain on RTC metrics (~7%), with
~5-8% speed slowdown.

Change-Id: I0d02942a77074424ee0326b6e110ddff09f2df5e
2017-06-19 13:58:08 -07:00
Marco
112cd95507 vp9: SVC: Rework the usage of base_mv for SVC.
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.

Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.

Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
2017-06-18 22:35:46 -07:00
Urvang Joshi
a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00
Linfeng Zhang
d6eeef9ee6 Clean array_transpose_{4X8,16x16,16x16_2) in x86
Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0
2017-06-13 16:50:44 -07:00
Linfeng Zhang
9c72e85e4c Remove array_transpose_8x8() in x86
Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f
2017-06-13 16:50:44 -07:00
Jerome Jiang
a46bc0268b Merge "Remove duplication on vp8/9_write_yuv_frame." 2017-06-10 04:50:19 +00:00
Marco
e540ca7155 vp9: SVC: Use prune_evenemore only for non_reference.
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only effects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
2017-06-09 17:52:20 -07:00
Jerome Jiang
ff2d220d21 Remove duplication on vp8/9_write_yuv_frame.
Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8
2017-06-09 17:08:26 -07:00
Jerome Jiang
943f9ee25c Merge "Merge skin detection code in vp8/9." 2017-06-08 16:36:00 +00:00
Jerome Jiang
658e854252 Merge skin detection code in vp8/9.
BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14
2017-06-07 21:20:34 -07:00
Marco
14d4718043 vp9: SVC: Enable simple_block_yrd for temporal layers.
Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condiiton for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed
2017-06-07 11:41:50 -07:00
Marco
7d2f5f8e9d vp9: SVC: Adjust some speed settings for SVC speed >= 7.
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
2017-06-06 15:30:24 -07:00
Jerome Jiang
cf07d85809 Initialize cost_list all to INT_MAX.
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
2017-06-06 10:42:37 -07:00
James Zern
6df142e2ab vp9_mcomp,get_cost_surf_min: quiet conversion warning
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-05 22:52:58 -07:00
Jerome Jiang
968a5d6bc2 Merge "Fix valgrind failure on uninitialized variables." 2017-06-06 03:47:31 +00:00
Jerome Jiang
ffe0f9b7fb Fix valgrind failure on uninitialized variables.
BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-05 13:09:29 -07:00
Jerome Jiang
f3a9ae5baa Fix ubsan failure in vp9_mcomp.c.
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-06-02 21:37:13 -07:00
Marco
e30781ff80 vp9: SVC: Force subpel search off under certain conditions.
For SVC 1 pass non-rd mode:
Force subpel seach off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quaity loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
2017-06-01 20:48:52 -07:00
Marco
8c6fa5c5e3 vp9: Speed >8: Set subpel_search_method for low motion.
Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.

Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.

Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc
2017-06-01 16:16:13 -07:00
Jerome Jiang
e254969df2 Fix corruption in skin map debugging output yuv.
For both vp8 and vp9.

BUG=webm:1437

Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28
2017-06-01 16:59:43 +00:00
Jerome Jiang
a5ab38093f Merge "Fix vp8 race when build --enable-vp9-highbitdepth." 2017-05-30 05:47:44 +00:00
Jerome Jiang
0afa2dad76 Fix vp8 race when build --enable-vp9-highbitdepth.
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
2017-05-26 09:45:01 -07:00
Marco
146005a911 vp9: SVC: Fix to condiiton on using source_sad.
Fix the condition on usage of source_sad for temporal layers.
FIx allows it to be used for the case of 1 temporal layer.

Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3
2017-05-26 08:46:50 -07:00
Marco Paniconi
9ec9415fd9 Merge "vp9: Use source_sad only on top temporal enhancement layer." 2017-05-26 05:24:06 +00:00
Marco
ea914456af vp9: Use source_sad only on top temporal enhancement layer.
For 1 pass CBR SVC mode.

Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d
2017-05-25 16:32:05 -07:00
Marco
747cf7a505 vp9: SVC: Enable copy partition for SVC speed >= 7.
Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partiton for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).

~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.

Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780
2017-05-25 12:21:46 -07:00
Marco Paniconi
b3bf91bdc6 Merge "vp9: Adjustments to cyclic refresh for high motion." 2017-05-22 06:27:30 +00:00
Marco
2adc0443dd vp9: Adjustments to cyclic refresh for high motion.
For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.

No/little change in RTC metrics, only affects high motion case.

Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b
2017-05-21 22:21:44 -07:00
Marco
ff9395eb3b vp9: Speed >= 8: Modify condition for low-resoln.
No change on RTC metrics.

Change-Id: I5abc573cb56572188d900645d13ba479f55a1ea0
2017-05-21 22:14:38 -07:00
Paul Wilkins
a7977ece93 Merge "Changes to modified error." 2017-05-19 12:24:32 +00:00
Marco
1205e3207e vp9: SVC: Modify condition to allow for copy partition.
When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.

Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
2017-05-18 14:19:31 -07:00
Jerome Jiang
6b6ff9c969 Merge "vp9: Make copy partition work for SVC and dynamic resize." 2017-05-18 19:37:30 +00:00
Marco
2ba4729ef8 vp9: Make copy partition work for SVC and dynamic resize.
Only affects speed 8.

Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.

Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
2017-05-18 11:33:56 -07:00
paulwilkins
5680b4517f Changes to modified error.
The modified error was a derivative of the "coded_error"
that was used to allocate bits between different frames on the
assumption that the allocation should be linear in terms of this
modified error.  I.e. a frame with double the modified error score
should all things being equal get double the number of bits. The
code also included upper and lower caps derived from input
VBR parameters.

This patch improves the initial calculation of the clip mean error
(now called "mean_mod_score" as it is no longer a prediction error)
used as the midpoint for the rate distribution function and normalizes
the output "modified scores" scores such that 1.0 indicates a frame
in the middle of the distribution.  The VBR upper and lower caps are
then applied directly to a  frame's normalized score.

This refactoring is intended to make it easier to drop in alternative
distribution functions or to base the rate allocation on a corpus wide
midpoint (rather than the clip mean).

Change-Id: I4fb09de637e93566bfc4e022b2e7d04660817195
2017-05-18 12:56:02 +01:00
Yaowu Xu
bde2c04fb7 Merge "Experiment. Store first pass errors as per MB values." 2017-05-17 17:38:15 +00:00
paulwilkins
42e5073f94 Experiment. Store first pass errors as per MB values.
Most existing first pass stats are stored in a form normalized to a
macro-block scale. However the error scores for intra / inter etc were
stored as frame level values but mainly used as MB level values.

This change  fixes that. Normalized per MB values make comparisons
between different formats easier and in any case this is usually what is
wanted.

An change in results should be limited to slight differences in rounding.

*** Change after patch 8 +2 requiring new approval.

Final pre-submit testing showed  one 4K clip with above expected change.
Investigation showed this was due to a value used to test for ultra low intra
complexity in key frame detection. This was a per frame not per MB value but
also did not scale with frame size. Replacement with a small per MB value
(based on original per frame value and cif frame size) resolved the KF detection
problem.

Also converted kf_group_error_left to a double in line with other error values
to reduce rounding problems in KF group bit allocation

All clips and sets now show nominal (or 0) change as expected.

Change-Id: Ic2d57980398c99ade2b7380e3e6ca6b32186901f
2017-05-17 12:00:18 +01:00
Johann Koenig
8739a182c8 Merge "move neon load/stores to a new file" 2017-05-15 18:15:27 +00:00
Johann
1088b4f87c move neon load/stores to a new file
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.

Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
2017-05-15 08:29:43 -07:00
Jerome Jiang
6b9d130214 Merge "vp9: speed 8: Fix seg fault in partition copy when drop frames." 2017-05-13 03:20:49 +00:00
Cheng Chen
4c0655f26b Merge "Speed up encoding by skipping altref recode" 2017-05-13 01:29:59 +00:00
Jerome Jiang
1fcd5cca3c vp9: speed 8: Fix seg fault in partition copy when drop frames.
BUG=webm:1433

Change-Id: I4f3984ef28660d3218d48007d7c977bdbdaf8af6
2017-05-12 15:57:23 -07:00
Marco Paniconi
629279a45c Merge "vp9: Adjust speed features for speed 8 at low resoln." 2017-05-12 00:35:40 +00:00
Marco Paniconi
c64667c338 Merge "vp9: SVC: Increase the partiiton and acskip thresholds" 2017-05-11 23:37:32 +00:00
Marco Paniconi
37cdd3bfc2 Merge "vp9; Adjust noise estimation thresholds." 2017-05-11 21:58:40 +00:00
Marco
c5c31b9eb6 vp9: SVC: Increase the partiiton and acskip thresholds
Increase the partition and acskip thresholds for temporal
enhancement layers.

~1-2% speedup, with negligible loss in quality.

Change-Id: Id527398a05855298ad9ddac10ada972482415627
2017-05-11 12:28:19 -07:00
Marco
c5a4376aed vp9: SVC: allow for setting the interp_filter in non-rd pickmode.
For SVC 1 pass non-rd pickmode, the interpolation filter for the
upsampling of the golden (spatial) reference was not being explicitly
set and instead was takin gwhatever value was set in the previous
mode/block (which would be either EIGHTTAP or EIGHTAP_SMOOTH).

Fix it to the default EIGHTTAP for now, to be updated/selected
adaptively in a later change.

Minor adjustmemt to rate targeting thresholds in datarate unittests.

Change-Id: I52085048674072c6cfb7163e11e9a2658d773826
2017-05-11 11:45:09 -07:00
Paul Wilkins
3caaf21c5b Merge "Tuning of factor used to calculate Q range in two pass." 2017-05-11 18:25:45 +00:00
Jerome Jiang
d35541fe29 Merge "vp9: Fix ubsan failure in denoiser." 2017-05-11 16:38:59 +00:00
paulwilkins
9a7625652c Tuning of factor used to calculate Q range in two pass.
A more detailed explanation of the experimentation
leading to this change can be found in:-

https://docs.google.com/a/google.com/document/d/13lsYhxgPyxUHvEess6wg9nikaonIZKY9Ak_Lpafv5Mo/edit?usp=sharing

This change gives gains across all our standard test sets for
overall psnr, ssim, fast ssim and psnr-HVS.

Values expressed as % reduction in bitrate.

Low res set     -0.257, -0.192, -0.173, -0.101
Mid res set     -0.233, -0.336, -0.367, -0.139
High res set    -0.999, -1.039, -1.111, -0.567
NetFlix 2K set -0.734, -0.174, -0.389, -0.820
Netflix 4K set  -0.814, -0.485, -0.796, -0.839

Change-Id: Ie981fb3c895c9dfcfc8682640d201a86375db5c8
2017-05-11 16:19:59 +01:00
Cheng Chen
76567d84ce Speed up encoding by skipping altref recode
Speed up for speed 0.
Reduce 10+% of encoding time for hdres in speed 0,
with less than 0.1% PSNR loss.
Compute total difference of previous and current frame context probability
model. If the diff is less than the threshold, skip recoding the frame.

Borg test (positive number means performance loss):
		lowres    midres    hdres
PSNR:		0.030     0.032     0.065

Local speed test: bitrate set at 1200
		blue_sky  pedestrian  rush_hour
Encoding time:	 -10.0%     -16.5%      -16.5%

Change-Id: I4e2d200ea3115d48b2c3e890143596b31b8ef9e9
2017-05-10 22:15:01 -07:00
Marco
2f11a65c99 vp9; Adjust noise estimation thresholds.
Change-Id: Ia41a11df18e5a58d2b8bbecd11c249d357de2a8f
2017-05-10 16:48:10 -07:00
Jerome Jiang
597d1f4c03 vp9: Fix ubsan failure in denoiser.
Fix the overflow for subtraction between two unsigned integers.

BUG=webm:1432

Change-Id: I7b665e93ba5850548810eff23258782c4f5ee15a
2017-05-10 13:43:17 -07:00
Jerome Jiang
2574573fea vp9: Wrap threshold tuning for HD only when denoiser is enabled.
Fixes a speed regression.

Change-Id: I23d942e4af17fa81fe4a366c7369b3ad537e59b0
2017-05-10 12:15:41 -07:00
Marco Paniconi
db2fad7516 Merge "vp9: Adjustment to noise estimation." 2017-05-10 17:11:18 +00:00
Marco
1b59964162 vp9: Adjustment to noise estimation.
When the noise estimate is forced off due to large motion,
reset the counter and set smaller window for next estimate.

Change-Id: Ifa4ec95396134173a00d48353ad52f1b6a40c217
2017-05-10 09:39:08 -07:00
Marco
4e23998fb4 vp9: SVC: Add option to set downsampling filter type.
Add option in SVC to set the filter type and phase for
the frame level downsampling filters.

For 3 spatial layers: set downsampling filter type to bilinear
and set phase to 8, for lowest spatial layer.

Change-Id: Id81f4b1ba93db19c1cd37b6a46d1281a2c61bc43
2017-05-09 17:22:44 -07:00
Marco
9586d5e682 vp9: SVC: Modify conditon for setting downsample filter type.
Base the condition on the resolution of the spatial layer.
And remove restriction on scaling factor.

Change-Id: Iad00177ce364279d85661654bff00ce7f48a672e
2017-05-08 14:13:49 -07:00
Linfeng Zhang
2c3a2ad6f1 Merge changes I0cfe4117,I3581d80d,Ida62c941
* changes:
  Split dsp/x86/inv_txfm_sse2.c
  Update highbd idct functions arguments to use uint16_t dst
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
2017-05-08 16:15:57 +00:00
Marco Paniconi
f4653c1efc Merge "vp9: SVC: Set downsample filtertype for lowest spatial layer." 2017-05-06 02:31:00 +00:00
Marco
9b729748ac vp9: SVC: Set downsample filtertype for lowest spatial layer.
For lowest spatial layer, in 3 layer SVC, set the
downsampling filtertype to get averaging filter.
Needed for reducing aliasing on low-res layer,
small increase in overall encoder time.

Change-Id: Ia31460123bd91b72eca49b46dd924b9f226d4563
2017-05-05 19:29:09 -07:00
Jerome Jiang
3453c8d6c4 Merge "vp9: Neon optimization for denoiser. Add unit tests." 2017-05-06 01:28:32 +00:00
Jerome Jiang
069eedb3a0 vp9: Neon optimization for denoiser. Add unit tests.
Denoiser on Neon is 5x faster than C code.

BUG=webm:1420

Change-Id: I805ab64f809ff2137354116be6213e7ec29c1dcb
2017-05-05 16:40:52 -07:00
Marco
34cce144d8 vp9: Adjust some thresholds for noise estimation.
Adjust thresholds for noise estimation, for resolutions above VGA.
Tends to push cleaner/low noise clips to LowLow state.

No change in RTC metrics.

Change-Id: I739ca6b797d0a60ccd1c6c6a2775269b1f007e5e
2017-05-05 12:00:12 -07:00
Jerome Jiang
af69ed20c4 vp9: Enable noise estimation on low res.
Set noise level to kLowLow for high motion low res clips.
Change the normalization in noise metric for low res.
Reduce the initial time-window for all resolutions.

Change-Id: Iaed39dbb50b205cd9c735dc5b84822304fb01987
2017-05-04 15:38:23 -07:00
Linfeng Zhang
d5de63d2be Update highbd idct functions arguments to use uint16_t dst
BUG=webm:1388

Change-Id: I3581d80d0389b99166e70987d38aba2db6c469d5
2017-05-03 13:59:16 -07:00
Linfeng Zhang
081b39f2b7 Clean CONVERT_TO_BYTEPTR/SHORTPTR in idct
BUG=webm:1388

Change-Id: Ida62c941f2b836d6c9e27b427a7d5008ab6dc112
2017-05-03 13:58:31 -07:00
Hui Su
5048d6e7ee Merge "vp9 level: add tentative max cpb values for high levels" 2017-05-03 20:51:03 +00:00
Hui Su
f701a44305 Merge "Adjust alt-ref selection in define_gf_group()" 2017-05-03 20:50:29 +00:00
Johann Koenig
240a5a15ef Merge "block error sse2: sum in 32 bits when possible" 2017-05-02 14:16:47 +00:00
Johann
cd94d5f68e block error avx2: rename variables
Change-Id: I2b8a9253f2c3d1fd85304c2970ebe70213870fe9
2017-05-01 17:54:29 -07:00
Johann Koenig
b1a31f8066 Merge "block error avx2: sum in 32 bits when possible" 2017-05-02 00:52:59 +00:00
Marco Paniconi
1e112bce37 Merge "vp9: SVC: Early exit on golden ref in non-rd pickmode." 2017-05-01 21:04:52 +00:00
Linfeng Zhang
e8655d49f5 Merge "Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()" 2017-05-01 19:54:40 +00:00
Kyle Siefring
760c214519 block error avx2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

AVX2 code provides a very minor performance improvement.

BUG=webm:1210

Change-Id: I4c82308eaf65741dca2f5c6db9be9c85f905073a
2017-05-01 12:51:33 -07:00
Marco
ae0215f945 vp9: SVC: Early exit on golden ref in non-rd pickmode.
For SVC 1 pass real-time: add condition to skip the
golden (spatial) reference mode in non-rd pickmode.
Condition is to skip golden if the sse of zeromv-last mode
is below threshold. And change order in ref_mode_set_svc
to make sure golden zeromv is tested after last-nearest.

Speedup ~3-4% with little/negligible quality loss.

Change-Id: I6cbe314a93210454ba2997945f714015f1b2fca3
2017-05-01 10:36:54 -07:00
Kyle Siefring
8394990b27 block error sse2: sum in 32 bits when possible
Add 31bit pairs before unpacking in x86 block error code

BUG=webm:1210

Change-Id: I5ca8c7f7775585a17fe09d6bbfc25e1f2955eb0a
2017-05-01 09:59:18 -07:00
Johann
2ff01aa1e4 move vp9_error_intrin_avx2.c
There is only one avx2 implementation. Drop '_intrin'

Change-Id: I887a0d27d58567eaad49f749f127eca61313f312
2017-05-01 09:13:01 -07:00
Johann Koenig
ef5918098d Merge "Use uint32_t for accumulator" 2017-04-28 18:32:09 +00:00
Jerome Jiang
ce2e278059 Merge "vp9: Fix condition for disabling adaptive_rd_thresh." 2017-04-28 18:10:36 +00:00
Jerome Jiang
04de501229 vp9: Fix condition for disabling adaptive_rd_thresh.
Add speed constrains for disabling adaptive_rd_thresh when
row_mt_bit_exact is set.

Change-Id: I2445115c2f9a2e46b8a0966031a0fea488d4964e
2017-04-28 10:26:20 -07:00
Johann
657f3e9f14 Use uint32_t for accumulator
Be specific about the data type size.

Use convenience macro vp9_zero_array.

Change-Id: I5fadf7dbd408befb73820d85db0be4832e8cfcbd
2017-04-28 06:36:59 -07:00
Johann Koenig
94ebdba71d Merge "vp9 temporal filter: sse4 implementation" 2017-04-28 13:22:41 +00:00
Yaowu Xu
0e8fea6c13 Merge "VP9: enable trellis for high bitdepth intra" 2017-04-28 00:16:56 +00:00
Johann
6dfeea6592 vp9 temporal filter: sse4 implementation
Approximates division using multiply and shift.

Speeds up both sizes (8x8 and 16x16) by 30 times.

Fix the call sites to use the RTCD function.

Delete sse2 and mips implementation. They were based on a previous
implementation of the filter. It was changed in Dec 2015:
ece4fd5d22

BUG=webm:1378

Change-Id: I0818e767a802966520b5c6e7999584ad13159276
2017-04-26 22:03:05 -07:00
Jerome Jiang
43e0e082d1 vp9: Don't force disabling of adaptive_rd_thresh for realtime.
Don't force disabling of adaptive_rd_thresh for realtime when
row_mt_bit_exact is set.

Row based adaptive rd is made usable in CL
454882(https://chromium-review.googlesource.com/c/454882) for REALTIME.

Change-Id: Ief023414f0fd6eb86f299dd46ae58f4436875af5
2017-04-26 13:17:57 -07:00
Yunqing Wang
b68f14d0ed Merge "Make the row based multi-threaded encoder deterministic" 2017-04-26 16:12:14 +00:00
Linfeng Zhang
54c4e0f7a5 Merge "Update highbd convolve functions arguments to use uint16_t src/dst" 2017-04-26 15:50:46 +00:00
Marco Paniconi
004fab120a Merge "vp9: SVC: Adjust some speed settings for temporal layers." 2017-04-26 15:45:06 +00:00
Peter de Rivaz
66117b97c5 VP9: enable trellis for high bitdepth intra
BUG=webm:1409

Change-Id: I5236595aac1c09386c60ffe8ad621e01422ed5a7
2017-04-26 11:43:01 +01:00
hui su
d01c9febe9 vp9 level: add tentative max cpb values for high levels
Add tentative max cpb size values for levels 5.2 and up. Otherwise
encoding will fail when targeting for these levels.

Change-Id: Ib7e0ba4b9836ea1ac900b6822543812843d48463
2017-04-25 18:03:55 -07:00
hui su
8069f31076 Adjust alt-ref selection in define_gf_group()
107de19698 changes the encoder alt-ref selection behavior. Assuming
min_gf_interval = max_gf_interval = 4, the frame order would be
frm_1  arf_1  frm_2  frm_3  frm_4  frm_5  arf_2 before 107de19698;
frm_1  arf_1  frm_2  frm_3  frm_4  arf_2  frm_5 after 107de19698.

This patch reverts such alt-ref placement change.

Change-Id: I93a4a65036575151286f004d455d4fcea88a1550
2017-04-25 18:03:47 -07:00
Jerome Jiang
997e54ea43 Merge "vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large" 2017-04-26 00:09:22 +00:00
Marco
c614164cb6 vp9: SVC: Adjust some speed settings for temporal layers.
Make some speed setting changes for temporal enhancement layers,
and remove the switch in subpel_force_stop for the aggressive_base_mv
in non-rd pickmode.

Gain some 2-3% speed with little/negligible quality loss.

Change-Id: I3e2a7f80ff45f38c0a6ceb01b34dbca2f53edbf0
2017-04-25 16:27:01 -07:00
Jerome Jiang
69b0242e9a vp9: speed >= 8: Skip uv variance in model_rd_sb_y_large
For speed >= 8 and color_sensitivity not set, skip the transform
skipping test in UV planes.
Add a new condition to check noise level to skip chroma check
for speed >= 8 if y_sad is high.

1~2% speedup on ARM for speed 8.

Borg tests show neutral results in both rtc and rtc_derf.

Change-Id: Idecd3ff6e28c97757a43bb6f3a7082c85f72109c
2017-04-25 16:21:36 -07:00
Linfeng Zhang
4758d20227 Clean vp9_highbd_build_inter_predictor() and highbd_inter_predictor()
BUG=webm:1388

Change-Id: I7ee32e0c08f0fb41712a8cc640b2c5bba872421d
2017-04-25 14:32:20 -07:00
Linfeng Zhang
51dc998f3a Update highbd convolve functions arguments to use uint16_t src/dst
BUG=webm:1388

Change-Id: I6912de2639895d817ce850da8ea9f6c8fe21da42
2017-04-25 14:22:19 -07:00
Marco
92ec0674fd vp9; Reduce artifact in non-rd pickmode for lighting changes.
Add a low-variance high-sumdiff to the superblock content state
and use it to limit the mv and bias some decisions in non-rd pickmode.
Only affects speed >= 6.

Reduces artifact for lighting changes.
Small/no difference in metrics on RTC set.

Change-Id: Ic84b2379fe0ae3fa71ae826ee6bae3eaf551a25b
2017-04-24 17:08:43 -07:00
Yunqing Wang
10a497bd38 Make the row based multi-threaded encoder deterministic
This patch followed allow_exhaustive_searches feature modification and
continued to modify the encoder to achieve the determinism in the row
based multi-threaded encoding. While row-mt = 1 and using multiple
threads, the adaptive feature in encoder was disabled, which gave
BDRate gain(at speed 1, -0.6% ~ -0.7%; at speed 2, -0.46% ~ -0.59%),
but some encoder speed losses(7% ~ 10% at speed 1 and 3% ~ 6% at
speed 2). These speed losses were acceptable considering the speed
gains obtained from row-mt.

Change-Id: I60d87a25346ebc487a864b57d559f560b7e398bb
2017-04-24 16:28:27 -07:00
Yunqing Wang
c530208ae3 Merge "Make allow_exhaustive_searches feature no longer adaptive" 2017-04-24 17:41:10 +00:00
Marco Paniconi
b35f64241f Merge "vp9: SVC: fix condition for partition/skip threshold when denoising." 2017-04-21 21:28:17 +00:00
Yunqing Wang
bca4564683 Make allow_exhaustive_searches feature no longer adaptive
A previous patch turned on allow_exhaustive_searches feature only for
FC_GRAPHICS_ANIMATION content. This patch further modified the feature
by removing the exhaustive search limit, and made it no longer adaptive.
As a result, the 2 counts that recorded the number of motion searches
were removed, which helped achieve the determinism in the row based
multi-threading encoding. Tests showed that this patch didn't cause
the encoder much slower.

Used exhaustive_searches_thresh for this speed feature, and removed
allow_exhaustive_searches. Also, refactored the speed feature code
to follow the general speed feature setting style.

Change-Id: Ib96b182c4c8dfff4c1ab91d2497cc42bb9e5a4aa
2017-04-21 11:14:02 -07:00
Jerome Jiang
58fe1bde59 Merge "vp9: Non-rd pickmode: Avoid computation duplication." 2017-04-21 00:51:47 +00:00
Marco
5de0e9ed08 vp9: SVC: fix condition for partition/skip threshold when denoising.
The more aggressive settings should only be used when denoise_svc
condition is satisfied (which means top spatial layer).

Change-Id: Ia8e3515b27f31bf21b1976ca80a2fa826daece3a
2017-04-20 16:36:55 -07:00
Jerome Jiang
7ae1e321a1 vp9: Non-rd pickmode: Avoid computation duplication.
In non-rd pickmode (speed >= 5), avoid duplication of computations in
model_rd_for_sb_y when the speed feature use_simple_block_yrd is
enabled (or for high bitdepth build under certain conditions).

QVGA, VGA and HD have 1.23%, 2.68% and 1.7% speedup on ARM for speed 8,
respectively.

Encoding results are bitexact for speed >= 5.

Change-Id: I3f9130810c21439f5ad7e159e21cb2243dcd05f1
2017-04-20 16:20:59 -07:00
Marco
29938b3a5a vp9: 1 pass SVC: Fix comment and condition for up-sampling reference.
No change in behavior.

Change-Id: I218fb30289091da623acb23324027435b8510d0e
2017-04-20 14:21:05 -07:00
Yunqing Wang
30ef50b522 Merge "Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content" 2017-04-20 19:57:46 +00:00
Marco
3134a52d26 vp9: SVC: Redefine the source downsample filter choice.
Rename the source downsampling filter, and define it
per spatial layers. Used 1 pass CBR SVC.

Change-Id: I8135f2ab89c535c53429b9c58b586f746bb668c7
2017-04-20 10:17:13 -07:00
Yunqing Wang
e96e49c2f9 Only allow allow_exhaustive_searches for FC_GRAPHICS_ANIMATION content
The allow_exhaustive_searches feature improves the encoding quality
of FC_GRAPHICS_ANIMATION content a lot. For non-FC_GRAPHICS_ANIMATION
content, the quality test result is almost neutral. This patch makes
this feature to be used only for FC_GRAPHICS_ANIMATION content.

The motivation of doing that is to make this feature no longer adaptive,
which will be implemented in the following patch.

Change-Id: Ic911df6dd757402b6480789cc247801e99840369
2017-04-20 00:03:27 +00:00
Linfeng Zhang
fbbdba3b04 Merge changes I9e18a73b,Ie47c8cd4
* changes:
  Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
  Create CAST_TO_BYTEPTR/SHORTPTR
2017-04-19 23:55:58 +00:00
Linfeng Zhang
bf8a49abbd Clean CONVERT_TO_BYTEPTR/SHORTPTR in convolve
Replace by CAST_TO_BYTEPTR/SHORTPTR.
The rule is: if a short ptr is casted to a byte ptr, any offset
operation on the byte ptr must be doubled. We do this by casting to
short ptr first, adding offset, then casting back to byte ptr.

BUG=webm:1388

Change-Id: I9e18a73ba45ddae58fc9dae470c0ff34951fe248
2017-04-19 12:13:49 -07:00
Marco
348bdc0195 vp9: Add phase to get averaging filter for 1:2 downsampling.
The scaling filter with zero shift will give sub-sampling for
2x downsampling. Allow for a phase shift to get an averaging filter.

Usage is for source scaling in 1 pass SVC mode for 1:2 downscale.
Reduces aliasing in downsampled image.

Keep the phase to 0/off for now.

Change-Id: Ic547ea0748d151b675f877527e656407fcf4d51e
2017-04-18 16:56:15 -07:00
Marco
ad2e3598d2 vp9: Add key_frame condition to is_reference check for loopfilter.
This condiiton is not needed as key_frame should set the refresh
of the reference frames, but good to have for clarity in condition.

Change-Id: Icf9838e7e4f0ff5cf0a9562ae3b5d6c7e6f78702
2017-04-17 15:18:46 -07:00
Marco Paniconi
9aa429a66d Revert "Revert "vp9: Avoid encoder loopfilter for non-reference frames.""
This reverts commit e9b7f98c56.

Reason for revert:
Commit d578bdad fixes the issue (encoder/decoder mismatch
in 3TL datarate test) that causes the original revert.

Original change's description:
> Revert "vp9: Avoid encoder loopfilter for non-reference frames."
>
> This reverts commit 863f860bfc.
>
> This causes encoder / decoder mismatches in various
> VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests
>
> BUG=webm:1408
>
> Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
>

TBR=jzern@google.com,marpan@google.com,builds@webmproject.org,jianj@google.com
BUG=webm:1408

Change-Id: Ifeb81460856d1d56482d4e0477a70ee98f8bfaa6
2017-04-17 11:02:02 -07:00
James Zern
e9b7f98c56 Revert "vp9: Avoid encoder loopfilter for non-reference frames."
This reverts commit 863f860bfc.

This causes encoder / decoder mismatches in various
VP9/DatarateTestVP9Large.BasicRateTargeting3TemporalLayers tests

BUG=webm:1408

Change-Id: Ic200c39d7ed9c0b0247ef562f5d6f7b2625f7e14
2017-04-14 11:50:06 -07:00
Marco
5f39262dcc vp9: Adjust speed features for speed 8 at low resoln.
For low resolutions (<= CIF): use quarter-pixel and simple_block_yrd.

~5% gain on RTC_derf.
~6-7% slowdown on ARM.

Change-Id: I4439ebd1116b9decac04786503f978840b68a60c
2017-04-14 11:35:47 -07:00
Marco Paniconi
b937f1c839 Merge "vp9: SVC: fix to allow use_base_mv to be used for 3 layers." 2017-04-14 17:12:58 +00:00
Marco
adb9b4eddf vp9: SVC: fix to allow use_base_mv to be used for 3 layers.
Allow use_base_mv to be used for 3 spatial layers where
base is 4x4 scale from the top layer.

Change-Id: If6641baf8b8e4d0fd5dc67619d873c6d75065f43
2017-04-13 20:43:43 -07:00
Marco Paniconi
f0ccaff553 Merge "vp9: Avoid encoder loopfilter for non-reference frames." 2017-04-14 00:45:42 +00:00
Marco
6bff6cb5a9 vp9: 1 pass VBR: Fix to rate control at low min-q.
Fix to avoid getting stuck at very low Q even
though content is changing, which can happen for --min-q=0.

Fix is to more aggressively increase active_worst_quality
when detecting significant rate_deviation at very low Q.

Change will only affect 1 pass VBR for --min-q < 4, so no
change in ytlive metrics for --min-q >= 4.

Change-Id: I4dd77dd7c08a30a4390da0ff2c8bda6fccfa76d7
2017-04-13 11:44:35 -07:00
Marco
863f860bfc vp9: Avoid encoder loopfilter for non-reference frames.
Useful for SVC, where the top layer enhancement frames may
not update any reference buffers, as is the case for the
patterns in the 1 pass CBR SVC when #temporal_layers > 1.

~3% encoder speedup for SVC patterns with temporal layers
in 1 pass CBR mode.

Updated the SVC datarate tests for the mismatch frames.
Set the frame-dropper off in some tests with #temporal_layers > 1
so we can correctly set #mismatch frames. Adjusted rate target
threshold for tests where frame-dropper was turned off.

Change-Id: Ia0c142f02100be0fed61cd2049691be9c59d6793
2017-04-13 09:51:55 -07:00
Yunqing Wang
f22b828d68 Fix an integer overflow in vp9_mcomp.c
The MV unit test revealed an integer overflow issue in vp9_mcomp.c.
This was caused if the MV was very large. In mv_err_cost(), when
mv->row = 8184, mv->col = 8184 and ref_mv is 0, mv_cost = 34363
and error_per_bit = 132412, causing the overflow.

BUG=webm:1406

Change-Id: I35f8299f22f9bee39cd9153d7b00d0993838845e
2017-04-10 18:09:50 -07:00
Jerome Jiang
2420f44342 Merge "vp9: speed >= 8: Adjust speed settings on ARM." 2017-04-11 00:45:21 +00:00
Jerome Jiang
f16f08e55b vp9: speed >= 8: Adjust speed settings on ARM.
Set adaptive_rd_thresh to 2 when simple block yrd is not used.

Fix regression caused by computing y sad without
int_pro_motion_estimation on low res motion clips.

Overall 0.07% quality loss on rtc_derf.

Change only affects low res on speed 8.

Change-Id: Ic6a188a56529f1034d6431005fb4b0e24e8a7e27
2017-04-11 00:26:56 +00:00
Marco
6557baf336 vp9: 1 pass CBR: avoid nonrd_pick_partition on segment.
For speed 5, 1 pass CBR: Don't use the nonrd_pick_partition
on the segment, rather use choose_partitioning followed by
nonrd_select_partition (as is done on base segment).

Little/no quality loss on RTC and RTC_derf (< 0.3%),
speedup of at least 5%.

Change-Id: I5273d5f950e60adf5e437b4ca8c4f63964641e83
2017-04-10 15:02:49 -07:00
Marco Paniconi
ff1fef9607 Merge "vp9: Fix to noise estimation for temporal denoising." 2017-04-07 17:13:22 +00:00
Yunqing Wang
f496032686 Merge "VP9 motion vector unit test" 2017-04-07 16:46:22 +00:00
Marco
349c3118bd vp9: Fix to noise estimation for temporal denoising.
If the noise estimation is avoided due to large motion,
the last_source for denoising should still be updated.

Change-Id: I67155ea7dbe9ac2785978e64a27bdafd7d57aac0
2017-04-07 09:23:30 -07:00
Marco
18b54ef468 vp9: Adjust consec_zeromv threshold for aq-mode=3.
To reduce refresh on partial super-blocks on boundary,
for noisy input. Reduces some artifacts on noisy input.

Change-Id: I10b5808a296874e08c7f378b3df58466591d8dbe
Edit
2017-04-07 08:54:09 -07:00
James Zern
04e9456567 Merge changes from topic 'Wshorten'
* changes:
  configure: enable -Wshorten-64-to-32 for hbd
  vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
  Resolve -Wshorten-64-to-32 in highbd variance.
2017-04-07 07:32:14 +00:00
Marco
3227a9be5f vp9; Move the denoising condition for speed 5.
Move the condition for effectively disabling the denoising
for speed 5 into the vp9_denoiser_denoise().

This is cleaner, and also moving the condition into vp9_denoiser_denoise
will keep the denoiser buffer updated with the current source.
This allows for more consistent behavior if speed is changed midstream.

Change-Id: Ia001f591c56e454bf724c3ae73c024badb183ef8
2017-04-06 11:03:04 -07:00
Jerome Jiang
c9fbb1881a Merge "vp9: speed 8: Compute y sad without int_pro_motion_estimation." 2017-04-06 02:57:16 +00:00
Jerome Jiang
705fc9f107 Merge "Refactor: Clean memory allocation for copy partition." 2017-04-06 02:57:08 +00:00
Yunqing Wang
1aa46abbdf VP9 motion vector unit test
To prevent the motion vector out of range bug, added a motion vector unit
test in VP9. In the 4k video encoding, always forced to use extreme motion
vectors and also encouraged to use INTER modes. In the decoding, checked if
the motion vector was valid, and also checked the encoder/decoder mismatch.

The tests showed that this unit test could reveal the issue we saw before.

Change-Id: I0a880bd847dad8a13f7fd2012faf6868b02fa3b4
2017-04-06 00:50:56 +00:00
James Zern
b3e2eb14c5 vp9_encodeframe: resolve -Wshorten-64-to-32 in hbd
vp9_high_get_sby_perpixel_variance the variance operated on in is
already in 32-bits

Change-Id: I97006eb9c08dbd0f88ee35e1a1ca205737508296
2017-04-05 17:34:06 -07:00
Jerome Jiang
288d73c861 vp9: speed 8: Compute y sad without int_pro_motion_estimation.
Little change in overall PSNR in rtc. 2-4% speedup on VGA on ARM.

Change-Id: I3395806d7afd456deacd4077c330adca13ab0645
2017-04-05 17:25:47 -07:00
Marco
2136de9374 vp9: Temporal denoising: avoid denoising for speed <= 5.
Temporal denoiser runs in non-rd pickmode, so it is only used
for speed >= 5. Regression exists for speed 5, due to use of
reference_partition (which use non-rd pickmode for partitioning).
Avoid denoising for now at speed 5.

Change-Id: I74a74d2e1404d7cfd33dcf4ec06dd2e503256cf0
2017-04-05 16:43:39 -07:00
Jerome Jiang
58ba880b94 Refactor: Clean memory allocation for copy partition.
Move the memory allocation from setting speed features.

Change-Id: I2e89dfaeb46daee63effe5a5df62feed732aa990
2017-04-05 15:33:24 -07:00
Jerome Jiang
fb60204d4c vp9: Remove legacy comments for avg_source_sad.
Change-Id: Ia6e8614535a097f17f37fc382cef8e22e03b70f6
2017-04-04 16:28:27 -07:00
Marco
8097b49997 vp9: Adjust condition of golden update with cyclic refresh.
Base the low_content_frame metric on the motion vectors,
and adjust the logic for preventing golden update.

Small change in behavior: small positive gain (~0.2-1%) on clips
with high activity.

Change-Id: I0b861c8e9666cd82b45cde5ee57ee8a1e5ab453c
2017-04-04 09:55:24 -07:00
Marco
6b3f4bc794 vp9: 1 pass CBR: cleanup to cyclic refresh.
Code cleanup: merged two functions that were doing postencode
update for cylic refresh, remove some unused code and fix comments.

No change in behavior.

Change-Id: I9be0d7e346d34dec29bf4e5bb380a7bf81c8480a
2017-04-03 16:37:45 -07:00
Yunqing Wang
41fac44707 Merge "Fix for out of range motion vector bug in sub-pel motion estimation" 2017-04-03 18:27:57 +00:00
Marco Paniconi
9d403d6f48 Merge "vp9: SVC: Fix issue with artifact for svc-denoising." 2017-04-03 16:23:25 +00:00
Ranjit Kumar Tulabandu
bf15ca1091 Fix for out of range motion vector bug in sub-pel motion estimation
BUG=webm:1397

(yunqingwang)
To verify that this patch wouldn't cause much performance change,
the Borg tests were run. Here was the result:
       avg_psnr   overall_psnr  ssim
hdres: -0.002     0.006         0.013
midres:   0         0             0
lowres:   0         0             0

Change-Id: Iae395ae7b741e0513cf5bab9dcace110b792a67d
2017-04-03 16:16:49 +00:00
Yunqing Wang
002cf38837 Merge "Enhance the row mt sync read to accept the sync_range greater than 1" 2017-04-03 15:59:51 +00:00
Yunqing Wang
f1600db3e4 Enhance the row mt sync read to accept the sync_range greater than 1
The row mt sync read uses sync_range = 1, and wouldn't work if we want
to use a sync_range that is greater than 1. To make it work, this sync
read code is modified. Pass in col instead of col - 1 to make it
consistent with other row mt code in VP9, and then add 1 in "while"
codition.

Change-Id: I4a0e487190ac5d47b8216368da12d80fec779c1a
2017-03-31 10:48:38 -07:00
Marco
c824eda6cc vp9: SVC: Fix issue with artifact for svc-denoising.
Issue/bug happens for denoising with spatial layers, where
the golden (spatial) reference is used in pickmode, but
denoising is only done wrt to last (temporal).

Fix is to make sure set_ref_ptrs is set before build predictors
in denoiser.

Change-Id: I793cf441341edf7c4a88b8ab1e1b22b3cb0eb508
2017-03-31 10:05:32 -07:00
Marco
fc83fcb7c4 vp9: SVC: fix to allow output of denoised result.
Change-Id: Iaf55cfb5e9621d074eb33d6a32f184e4777968f8
2017-03-29 14:02:54 -07:00
Marco
32b3d2f174 vp9: 1 pass SVC: Modify condition for intra-mode search.
Temporary override to condition for disallowing intra-search in SVC,
since golden (spatial) reference is currently suppressed due to
artifact issue.

Change-Id: I28ed7fdddc9fcdbcc0a4175a247a3ecc94c11767
2017-03-29 09:24:50 -07:00
Marco
0169a985d9 vp9: Speed >= 8: avoid chrome check under some condition.
For non-rd variance partition, avoid the chrome check
unless y_sad is below some threshold.

Small decrease in avgPSNR (~0.3) on RTC set.
Small/negligible decrease on RTC_derf.

Change-Id: I7af44235af514058ccf9a4f10bb737da9d720866
2017-03-27 13:18:21 -07:00
Marco
66c6b4d6fc vp9: 1 pass: Move source sad computation into encodeframe loop.
Refactor to split the 1 passs source sad computation into scene
detection (currently used for VBR and screen-content mode), and
superblock based source sad computation (used in non-rd CBR mode).

This allows the source sad computation for CBR mode to be
multi-threaded.

No change in compression.

Change-Id: I112f2918613ccbd37c1771d852606d3af18c1388
2017-03-27 11:11:05 -07:00
Marco
07ad5a15c2 vp9: Fix to condition on using source_sad for 1 pass real-time.
Make the source_sad feature work properly for cases of VBR or
screen_content with SVC.

Added unittest for SVC with screen-content on.

Change-Id: Iba5254fd8833fb11da521e00cc1317ec81d3f89b
2017-03-24 10:21:47 -07:00
Alex Converse
d7b220b467 Merge changes Ie989e60c,Ifc110b12
* changes:
  Backport "Optimize the use case of token_cost table" to VP9
  Drop vp9_get_token_extracost
2017-03-23 18:05:13 +00:00
Marco
4863e07c01 vp9: Non-rd partition: avoid unneeded call to chrome_check
Since y_sad is not computed yet (on the early exit due to source_sad),
no need to check for setting color_sensitiviy.

Only affects speed >=8. No change in behavior.

Change-Id: I3a6f2d20fed38d8b8ec51b75bcacf9a21f2db916
2017-03-22 22:40:28 -07:00