Commit Graph

9982 Commits

Author SHA1 Message Date
paulwilkins
2b247ae91c Increase precision of some debug stats output for corpus VBR.
Change-Id: I75841797cc0c215781b5b36e3a3e9f4b0e35ba63
2017-10-12 10:07:21 +01:00
Jerome Jiang
288890cd43 vp9: use nonrd pick_intra for small blocks on keyframes.
Keyframe encoding is more than 2x faster.
Disabled on Speed 8.

Change-Id: I2157318b6ac8253fa5398322c72d98cd7fa9b2b6
2017-10-11 21:38:01 -07:00
paulwilkins
416b7051d7 Prevent double application of min rate in two pass.
The initial allocation of bits in the two pass code to each frame
should be within the min max limits on the command line. However,
when forming an ARF group the cost of the ARF is shared by frames
in that group such that the residual bits for a frame could drop below
the min value. This change prevents the minimum being re-applied
after the cost of the ARF has been deducted as this may otherwise
cause low rate sections to overshoot their target.

Test runs comparing to a baseline run with min and max section pct
0-2000% vs one closer to the YT use case (50-150%) suggest that
this fix not only results in better rate control but also gives a better
rd outcome.

For example the HD set vs 0-2000% baseline (opsnr, ssim).
Old code (50-150):  +0.751, +1.099
New code(50-150): +0.241, -0.009

Change-Id: I715da7b130bf53ba8aa609532aa9e18b84f5e2ef
2017-10-11 18:00:44 +01:00
Linfeng Zhang
16166bfdaa Add 4 to 1 scaling x86 optimization
Change-Id: I51c190f0a88685867df36912522e67bdae58a673
2017-10-10 16:24:06 -07:00
Marco
017257a317 Adjustment to scene detection and key frame.
For 1 pass vbr: use higher threshold on avg_sad
and force key frame under scene cut detection if
above the threshold. Allow it for speed >= 6 for now,
since it does not use the full nonrd_pickmode partition
(as in speed 5).

Improves quality somewhat on scene cut frames.
Neutral on overall metrics and fps for speed 6 on
ytlive set.

Change-Id: I12626f7627419ca14f9d0d249df86c7104438162
2017-10-10 11:20:05 -07:00
Linfeng Zhang
963cc22cef Merge changes I9d4c1af5,I882da3a0
* changes:
  Rename some inline functions in NEON scaling
  Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
2017-10-10 17:29:50 +00:00
paulwilkins
06d231c9fa Further Corpus VBR change.
Change to the bit allocation within a GF/ARF group.

Normal VBR and CQ mode allocate bits to a GF/ARF group based of the mean
complexity score of the frames in that group but then share bits evenly between
the "normal" frames in that group regardless of the individual frame complexity
scores (with the exception of the middle and last frames).

This patch alters the behavior for the experimental "Corpus VBR" mode such that
the allocation is always based on the individual complexity scores.

Change-Id: I5045a143eadeb452302886cc5ccffd0906b75708
2017-10-10 10:41:35 +01:00
paulwilkins
741bd6df4f Corpus Wide VBR test implementation.
This patch makes further changes to support an experimental
corpus wide VBR mode that uses a corpus complexity
number as the midpoint of the distribution used to allocate bits
within a clip, rather than some average error score derived from the
clip itself.

At the moment the midpoint number is hard wired for testing and
the mode is enabled or disabled through a #ifdef.  Ultimately this
would need to be controlled by command line parameters.

Change-Id: I9383b76ac9fc646eb35a5d2c5b7d8bc645bfa873
2017-10-10 10:40:44 +01:00
Linfeng Zhang
27d21a3d13 Rename some inline functions in NEON scaling
Change-Id: I9d4c1af53d57f72fc716bacbe3b0965719c045ac
2017-10-09 11:23:00 -07:00
Linfeng Zhang
e1ae3772da Merge "Update vp9_scale_and_extend_frame_ssse3()" 2017-10-09 16:20:00 +00:00
Marco Paniconi
5bc4c37a89 Merge "Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."" 2017-10-06 22:41:34 +00:00
Marco Paniconi
bcbc6ed82d Revert "Speed >=5 real-time: add TM intra mode for high_source_sad."
This reverts commit 9311ef18b4.

Reason for revert:
Notice small regression in some clips.
Will revisit in another change.

Original change's description:
> Speed >=5 real-time: add TM intra mode for high_source_sad.
> 
> Small/neutral change in metrics or speed for ytlive.
> Some improvement in quality on frames with big content change.
> 
> Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: I9d8ec5195bb05ddf329d325699355185affb9b13
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-10-06 22:14:56 +00:00
Marco
e405eb06b1 Adjust threshold in scene detection
For 1 pass vbr: increase min_thresh slightly, and also add
condition on golden/arf update for using full nonrd_pick_partition.

Reduces possible false detection for scene cut detection.

Neutral/small change in metrics or speed for speed 5.

Change-Id: I388f4d9a56e3cc763e0148338c1bc0381e58ad76
2017-10-06 11:08:56 -07:00
Marco
9311ef18b4 Speed >=5 real-time: add TM intra mode for high_source_sad.
Small/neutral change in metrics or speed for ytlive.
Some improvement in quality on frames with big content change.

Change-Id: Ib3b0703a5f28ea6710e90324436e27598ab7384d
2017-10-05 23:07:03 -07:00
Marco
18262a8576 Adjust threshold for adapt_partition for speed 6.
Lower SAD threshold to select non_rd pickmode partition
at superblock level more often.
Small gain in metrics, small/negligible decrease in speed.

Change-Id: I0f728236b91a604e4ca7e02039adc54d5985c4dc
2017-10-04 18:04:09 -07:00
Marco
4bc1fc58b6 Avoid nonrd_pick_partition for speed >= 6.
For 1 pass vbr speed >= 6: when REFERENCE_PARTITION is selected,
avoid doing the full nonrd_pickmode based partition.
No change in overall metrics or speed.
Reduces encode times on scene cuts by 10-20%.

Change-Id: I0310b1610cc1c83793a509e0a9059840e8f18308
2017-10-04 15:31:54 -07:00
Linfeng Zhang
127864deb3 Generalize 2:1 vp9_scale_and_extend_frame_ssse3()
Change-Id: I882da3a04884d5fabd4cd591c28682cbb2d76aa5
2017-10-04 12:35:39 -07:00
Linfeng Zhang
b809442521 Update vp9_scale_and_extend_frame_ssse3()
Change-Id: I22622faebfcc36f7a4d1f37e3800ae8ab87c8cd4
2017-10-04 12:32:30 -07:00
Marco
77e51e2035 Modify early exit for alt_ref in nonrd_pickmode.
For 1 pass vbr mode:
On no-show_frame/ARF: instead of skipping alt_ref_frame
completely in mode testing, allow for checking (0, 0) on alt_ref.

Small gain in metrics, ~0.18%, no change in speed.

Change-Id: I32a3c24faca64ab70dd5091071a0dc301db7dd1e
2017-10-04 11:53:39 -07:00
Marco
98dbf31c87 Enable arf usage for speed >= 6, 1 pass vbr.
For speed 6 on ytlive set:
On average, speed slowdown ~5%, quality gain ~2%.

Change-Id: Ia18237cc1d52c54d7e2cb3c71f571cf37ef61b44
2017-10-03 17:18:33 -07:00
Marco
ab2bd340ac vp9: 1 pass vbr: Limit qpdelta on high_source_sad.
For 1 pass vbr: when significant content/scene change is detected
(high_source_sad = 1) reduce/turnoff the additional qdelta on the
active_worst_quality. This helps somewhat to reduce the occurrence
of large frame sizes and large encode times.
Allow it only when use_altef_onepass is enabled.

Neutral/no change on metrics.

Change-Id: I1dd97dd2ab892d65f707b841b27a5de300b714ea
2017-10-03 16:27:17 -07:00
Marco
c8678fb7f3 Use adapt_partition for ARF in 1 pass.
For speed 6 real-time mode: use adapt_partition
on ARF frame instead of REFERENCE_PARTITION (which is slower).
This requires enabling compute_source_sad_onepass for no-show_frames.

Speedup of ~3-5% on some clips that heavily use ARF,
small loss (~0.2%) in quality on ytlive set.

Change-Id: Ib50acc97df06458244a6ac55d2bd882c30012536
2017-10-03 11:49:55 -07:00
Marco Paniconi
fe7b869104 Merge "ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode." 2017-10-03 03:01:14 +00:00
Marco
33e10dfa7e ARF in 1 pass vbr: modify skip ref_frame in nonrd_pickmode.
Speedup of ~2-3% on 1080p clips speed 6.
Neutral/negligible loss in metrics on ytlive.

Change-Id: I7ac47a4d8b58c566920bae29a94a0e8d59c36dee
2017-10-02 19:04:03 -07:00
Linfeng Zhang
0e55b0b0a7 Add 4 to 3 scaling NEON optimization
Speed comparing with the one calling vpx_scaled_2d_neon()
  ~1.7 x in general
  ~2.8x for BILINEAR filter

BUG=webm:1419

Change-Id: I8f0a54c2013e61ea086033010f97c19ecf47c7c6
2017-10-02 15:04:09 -07:00
Linfeng Zhang
2c560c3c22 Specialize 4 to 3 frame scaling in C
Scale 3x3 block instead of 16x16 block in each loop. Disabled by
default.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3.
   Optimization code will be smaller and faster.
2. Maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: I59a1f7496d89a1b090498c935d30cfcf1d0c282b
2017-10-02 11:56:15 -07:00
Marco
c8f6e7b99e Fix partition selection in speed features for arf overlay frame.
For real-time mode. Move the switch to fixed partition
for is_src_frame_alt_ref so all speeds may use it
if use_altref_onepass is set.

Improves metrics by ~2% for ytlive set at speed 4
(where use_altref_onepass is currently used).

Change-Id: I033240386598c9dbd0364da89ccbcca64bc663ee
2017-09-29 15:02:28 -07:00
Marco
f2c3d0a7a3 Enable use_altref_onepass for speed 4 real-time mode.
Used for VBR mode with lag-in-frames > 0.
On ytlive set at speed 4: ~3% average gain.

Change-Id: I45dad1700bf8be9d8f177815dc062774f6f2f0de
2017-09-29 10:56:14 -07:00
Marco
a2ef180dd0 Set rc->high_source_sad = 0 before scene detection.
Only has effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frame
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f
2017-09-28 10:49:45 -07:00
Marco Paniconi
3b8cc214ef Merge "vp9: Modification to adapt the ARF usage for 1 pass vbr" 2017-09-28 16:52:28 +00:00
Marco
03e8f13337 vp9: Modification to adapt the ARF usage for 1 pass vbr
Add stats for past ARF usage, and use it to disable
ARF usage based on some conditions.

Overall improvement on ytlive set, reduces the regression
on the problem clips for this feature.

Only affects when sf->use_altref_onepass is enabled
(currently off by default).

Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb
2017-09-28 09:10:30 -07:00
Marco
c493ea1a6b Add use_svc condition to the scene detection in 1 pass.
Scene detection is not currently used in SVC 1 pass code.
Speedup of ~0.4%.

Change-Id: I0ab769300919de710cd2da1402014fa3f22a1f86
2017-09-27 14:51:46 -07:00
Marco Paniconi
786b124e20 Merge "Revert "Remove the speed condition on scene detection in 1 pass code."" 2017-09-27 20:42:48 +00:00
Marco Paniconi
8d438dc313 Revert "Remove the speed condition on scene detection in 1 pass code."
This reverts commit 535b7b915a.

This is actually used in CBR to reset the rate control if high source sad is detected.

Original change's description:
> Remove the speed condition on scene detection in 1 pass code.
> 
> Scene detection is used for VBR mode and for screen_content mode.
> 
> It was also enabled for CBR mode via the speed condition,
> but currently the analysis in the scene detection is not used
> in CRB mode (similar computations are done locally at superblock level
> when the source_sad feature is enabled).
> 
> For 1 pass code.
> No change in behavior. Small speed gain, ~0.5%.
> 
> Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ib4e6b02047f75632503e7b0fc870af97fa9291c3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
2017-09-27 19:42:48 +00:00
James Zern
6175f285a9 Merge "vp9_dx_iface: Stop using iter parameter incorrectly" 2017-09-27 18:37:20 +00:00
Marco
535b7b915a Remove the speed condition on scene detection in 1 pass code.
Scene detection is used for VBR mode and for screen_content mode.

It was also enabled for CBR mode via the speed condition,
but currently the analysis in the scene detection is not used
in CRB mode (similar computations are done locally at superblock level
when the source_sad feature is enabled).

For 1 pass code.
No change in behavior. Small speed gain, ~0.5%.

Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f
2017-09-27 10:32:54 -07:00
Vignesh Venkatasubramanian
530e60143a vp9_dx_iface: Stop using iter parameter incorrectly
'iter' parameter is being checked for NULL in every call to
decoder_get_frame which is quite pointless because it is always
going to be NULL unless the application changed it. The code works
as described only because vp9_get_raw_frame returns -1 on all
subsequent calls after the first.

Change-Id: Ic736b9e8fe36fc1430fc11d6a9b292be02497248
2017-09-27 09:59:39 -07:00
Marco
819c5b365d Remove the speed condition in setting compute_source_sad.
The speed condition is not needed, feature can used for any
speed in 1 pass code.

Change-Id: I878ef3f63a075302eda48c0343fa243c80aab9ba
2017-09-26 15:48:34 -07:00
Marco
d5094cfde8 Replace flag USE_ALTREF_FOR_ONE_PASS with speed feature.
To be used for 1 pass VBR.
Off by default in speed features.

Change-Id: I5d6110d6d191990db526fe68ec9715379a4d1754
2017-09-26 11:16:50 -07:00
Linfeng Zhang
28762341ac Merge changes Ib9105462,Idfac00ed,If8d8a0e2
* changes:
  cosmetics: NEON scaling code
  Refactor convolve NEON code
  Refactor convolve code
2017-09-26 16:10:46 +00:00
James Zern
691585f6b8 Merge changes If59743aa,Ib046fe28,Ia2345752
* changes:
  Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
  Remove the unnecessary upcasts of (int)cospi_{1...31}_64
  Change cospi_{1...31}_64 from tran_high_t to tran_coef_t
2017-09-22 07:35:55 +00:00
Andrew Lewis
10bab1ec29 Merge "Comma-separate VP9 encoder tmp.stt output" 2017-09-21 08:50:53 +00:00
Marco Paniconi
0b08f8892f Merge "vp9: Modify pickmode early exit for ARF in 1pass." 2017-09-21 01:33:12 +00:00
Marco
42373b21ce vp9: Modify pickmode early exit for ARF in 1pass.
Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves quality of first frame following ARF, where
frame_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc
2017-09-20 15:02:37 -07:00
Linfeng Zhang
d586cdb4d4 Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
BUG=webm:1450

Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8
2017-09-20 14:13:26 -07:00
James Zern
f7b276c26b Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c" 2017-09-20 21:12:45 +00:00
Linfeng Zhang
24afb5d036 Bug fix: fadst4() in vp9/encoder/vp9_dct.c
A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991
2017-09-20 12:27:29 -07:00
Linfeng Zhang
7c0529728a cosmetics: NEON scaling code
Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618
2017-09-19 16:39:17 -07:00
Marco
aaa6cdcc2e vp9: Modify simple_block_yrd condition for SVC
Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speed up of about ~2% for 3 layer SVC, with little/negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344
2017-09-19 15:39:05 -07:00
Marco
cd463c7acb vp9: Fix condition for limiting ARF 1 pass vbr.
Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae
2017-09-19 11:12:37 -07:00
Marco Paniconi
310e388423 Merge "vp9: Adjustments for ARF usage in 1 pass vbr." 2017-09-19 16:29:19 +00:00
Marco
ebb015a539 vp9: Adjustments for ARF usage in 1 pass vbr.
Only when USE_ALT_REF_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_alt_ref_frame is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d
2017-09-18 18:46:41 -07:00
Linfeng Zhang
a80bdfd081 Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154
2017-09-18 09:26:03 -07:00
Linfeng Zhang
9d278465b5 Merge "cosmetics: vp9_rtcd_defs.pl" 2017-09-18 16:23:33 +00:00
Paul Wilkins
65f1c90652 Merge "Fix bug in intra mode rd penalty." 2017-09-15 15:43:29 +00:00
James Zern
c12b39626f Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"" 2017-09-15 00:27:41 +00:00
Hui Su
293734b755 Merge "VP9 level targeting: add a new AUTO mode" 2017-09-14 21:02:38 +00:00
James Zern
baf658ec4c Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
This reverts commit afee58f2c4.

This causes ~8x slowdown in 4:3 in the C-code

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e
2017-09-14 13:07:21 -07:00
Hui Su
c3a6943c16 VP9 level targeting: add a new AUTO mode
In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimum compression quality lost. However, the
bitstream is not guaranteed to be level compatible, e.g., the average bitrate
may exceed level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c
2017-09-14 16:20:29 +00:00
Linfeng Zhang
535dee0fb6 cosmetics: vp9_rtcd_defs.pl
Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086
2017-09-13 12:13:55 -07:00
Andrew Lewis
949730e2dc Comma-separate VP9 encoder tmp.stt output
Also add column headings so that the output can still be parsed if the
set of headers changes later.

Change-Id: I4beaf266521e093db4acf5f715b18fdfb7e3d1cd
2017-09-13 16:26:40 +01:00
Linfeng Zhang
afee58f2c4 Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()
Scale 3x3 block instead of 16x16 block in each loop.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3. Optimization code
   will be smaller and faster.
2. The maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: Ibb9242a629ddb03e1ff93b859bece738255e698c
2017-09-12 12:05:16 -07:00
Linfeng Zhang
a9bbe53dbb Add 4 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: If82a93935d2453e61b7647aae70983db1740bec7
2017-09-11 10:17:28 -07:00
paulwilkins
0657f4732c Fix bug in intra mode rd penalty.
The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.

The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.

This bug fix improves average metrics  on our standard test
sets by about 0.1%

Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e
2017-09-08 15:10:53 +01:00
Linfeng Zhang
71b38a144e Add 2 to 1 scaling NEON optimization
BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b
2017-09-07 12:33:50 -07:00
Linfeng Zhang
d5d2cbcc75 Add ScaleFrameTest
Move class VpxScaleBase to new file test/vpx_scale_test.h.
Add new file test/vp9_scale_test.cc with ScaleFrameTest.

BUG=webm:1419

Change-Id: Iec2098eafcef99b94047de525e5da47bcab519c1
2017-09-06 15:54:58 -07:00
Linfeng Zhang
d331e7a1c0 Remove get_filter_base() and get_filter_offset() in convolve
so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee
2017-09-05 15:22:36 -07:00
clang-format
7587a97551 apply clang-format
Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c
2017-09-01 01:24:03 -07:00
Jerome Jiang
ebf3ae1a29 vp9: Skip testing duplicate zero mv in nonrd-pickmode.
Neutral on rtc set for speed 8. Neutral on ytlive for speed 5.

Saves some computation cycles but no speed gain observed on Pixel.

Change-Id: I34c4642cd543aa89c5b9c4bff6b7113577c64c91
2017-08-31 17:13:31 +00:00
Jerome Jiang
7c10251f22 vp9: Speed 8: Enable skip_encode_sb
Neutral in borg tests.

Some clips show 3-4% speed gain on 2 threads on Pixel.

Change-Id: Ic959f34e44892a854551de6e9a3d9ec819ffed00
2017-08-28 17:05:48 -07:00
Jerome Jiang
64c55576b7 vp9: Remove resolution condition for using source_sad in speed 6.
Rev d147771 fixed the test failure. So remove the resolution condition
for using source_sad in speed 6.

BUG=webm:1452

Change-Id: I1efba97e1ef5bd4de5f886299f6fcb907187abcd
2017-08-28 12:49:54 -07:00
Marco Paniconi
255241c6d0 Merge "vp9: Speed 6 adapt_partition for live/vbr usage." 2017-08-25 22:00:08 +00:00
Marco
a0de2692fc vp9: Speed 6 adapt_partition for live/vbr usage.
Enable adapt_partition for vbr mode for speed 6.
This allows the usage of the pickmode-based partition
(used in speed 5), but only selectively for superblocks
with high source sad, otherwise the faster variance based
partition scheme is used.

For speed 6 on ytlive set: avgPSNR/SSIM metrics up by ~0.6%,
several clips up by ~1.5%. Small/negligible decrease in speed.

Change-Id: I12f3efef6b3e059391de330fdbe5a44c2587f1f8
2017-08-25 11:36:34 -07:00
Marco
a74593b30c vp9: SVC: Modify mv search condition in speed features.
For SVC at speed >= 7: only use the improved mv search
on base spatial layer, if top layer resolution is above 640x360.

~2.3% speedup
Small/negligible loss in avgPSNR metrics on rtc set.

Change-Id: Iaef75a57ebf1c248931bc1aa28d20b7fecac1851
2017-08-25 10:12:38 -07:00
Marco Paniconi
34e48d6115 Merge "vp9: Adjust 16x16 splot threshold for variance partition" 2017-08-24 22:26:43 +00:00
Marco
d14777157e vp9: Adjust 16x16 splot threshold for variance partition
For speeds < 7, increase threshold that controls the split
of 16x16->8x8 blocks, for resolutions 720p and higher.

Minor change for speed 5 (since it uses reference partition scheme
which only uses variance partition as first step).
For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlvie set.
No change in speed.

Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b
2017-08-24 10:44:05 -07:00
Marco
c9ff7b6637 vp9: SVC: Skip NEWMV for small blocks for (0, 0) base_mv.
For SVC encoding:
average speedup ~1.5%, with small ~0.57 loss in avgPSNR metrics.

Change-Id: Icebce6f6ef4e819d7dfcf8db898c583167351de4
2017-08-23 13:08:27 -07:00
Johann
e83d99d7b8 quantize fp: neon implementation
About 4x faster when values are below the dequant threshold and 10x
faster if everything needs to be calculated.

Both numbers would improve if the division for dqcoeff could be
simplified.

BUG=webm:1426

Change-Id: I8da67c1f3fcb4abed8751990c1afe00bc841f4b2
2017-08-23 08:01:30 -07:00
Marco Paniconi
0207f17144 Merge "vp9: Condition lighting change detection on CBR mode." 2017-08-22 22:52:05 +00:00
Marco
a31461c853 vp9: Condition lighting change detection on CBR mode.
This feature is used for the CBR RTC encoding mode
at speed >= 6. This change will exclude it for VBR mode.

For speed 6 live encoding (VBR):
avgPSNR/SSIM metrics on ytlive set up by ~1% (few clips up by 2/3%).
No change in speed.

Change-Id: I1a0dd94c334f7df309ab5a48d477d7e25355b798
2017-08-22 14:59:37 -07:00
Johann
7a178a5631 quantize: capture skip block early
This should probably be handled before vp9_regular_quantize_b_4x4 even
gets called.

Fixes an assert resulting from removing skip_block from the quantize
functions.

BUG=webm:1459

Change-Id: I7f52b53f959b4654b3d4517ebda31a678f4d0fde
2017-08-22 12:10:55 -07:00
Johann
b527b47312 quantize fp: ignore skip_block in arm
Change-Id: Ie8ac00efa826eead2a227726a1add816e04ff147
2017-08-21 14:34:48 -07:00
Johann
7b13d99b98 quantize fp: ignore skip_block in x86
Change-Id: I1272917c49cf6e6710e52c36535b2fc8c8dced78
2017-08-21 14:33:41 -07:00
Johann
13eed991f9 Remove skip_block from quantize
This condition is handled before this code is reached. The ssse3 version
of the function has always crashed when attempting to handle the
skip_block condition.

Add assert() and comments regarding the usage of skip_block.

Removing the parameter is a fairly involved process so leave it be for
the moment.

Change-Id: Ib299f6fc6589d7ee102262cc74a7aeb60110bc5a
2017-08-21 09:49:04 -07:00
Johann Koenig
1426f04e91 Merge "quantize: normalize intermediate types" 2017-08-18 16:00:28 +00:00
Johann
7f602d6114 quantize: normalize intermediate types
Despite abs_coeff being a positive value, all the other implementations
treat it as signed which simplifies restoring the sign.

HBD builds cast qcoeff to avoid a visual studio warning. Match
vp9_quantize.c style of casting the entire expression.

Change-Id: I62b539b8df05364df3d7644311e325288da7c5b5
2017-08-17 12:34:28 -07:00
Paul Wilkins
f64e14047d Merge "Prevent parameters that can cause invalid ARF groups." 2017-08-16 18:25:57 +00:00
Linfeng Zhang
f95686895b Merge changes I08b562b6,Ia275940a,I51106e90
* changes:
  Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
  Update highbd idct x86 optimizations.
  Update 32x32 idct sse2 and ssse3 optimizations.
2017-08-16 16:36:37 +00:00
paulwilkins
b814e2d898 Prevent parameters that can cause invalid ARF groups.
Having a very low "lag_in_frames" value could cause the encoder to create
incorrect / corrupt ARF groups including displayed frames that update the
ARF buffer and false overlay frames that are coded at low rate but are not
actually overlays of a real ARF frame.

This is linked to a reported unit test "slow down" where the chosen parameters
(lag of 3 frames) gave rise to such "broken" ARF group(s).

See also BUG=webm:1454

Change-Id: If52d0236243ed5552537d1ea9ed3fed8c867232c
2017-08-16 14:33:59 +01:00
Paul Wilkins
0472382dbe Merge "Fix for encoder slowdown (for speeds >= 3)" 2017-08-16 13:01:38 +00:00
paulwilkins
e15be3025b Fix for encoder slowdown (for speeds >= 3)
Some clips in nightly unit test exhibiting significant encoder slowdown which
appears to bisect to Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a.

The above change allowed for emergency iterations of the recode loop and
adjustment of the Q range if there is a large rate miss.

This patch disables the above adaptation for cases of cpu_speed >= 3 or more
specifically where cpi->sf.recode_loop >= ALLOW_RECODE_KFARFGF.

For speeds >= 3 the code does not currently run a dummy bit pack operation
inside the recode loop. Without this dummy pack operation there is no up to
date estimate of the current frame's size to use as a basis for assessing the
requirement for a recode. In practice it was using the previous frames size (or 0
for the first frame) which could cause odd behavior.

If we require the emergency rate correction added in  Change-Id: I6923.. for
the higher speed settings it will be necessary to enable the dummy pack
which will in turn hurt encode speed.

BUG=webm:1454

Change-Id: I4fb3c6062ca9508325a6f31582f8e80f1a9b126f
2017-08-16 10:56:52 +01:00
Jerome Jiang
6b9c691daf Merge "Clean up writing YUV files for debug purpose." 2017-08-15 18:28:54 +00:00
Jerome Jiang
a153080b55 Clean up writing YUV files for debug purpose.
Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.

To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure
--enable-vp9-temporal-denoising

For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'

Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528
2017-08-15 10:44:03 -07:00
Marco
e9ccc6fe79 vp9: Denoiser fix: use correct bsize for skin detection.
Change-Id: I9d201fa3a4b00ebd147b57ed519fab8d59b0a802
2017-08-15 10:02:19 -07:00
Scott LaVarnway
7e8357d664 Merge "vp9: strip temporal filter code" 2017-08-15 15:35:33 +00:00
Paul Wilkins
ca393c9726 Merge "Patch relating to Issue 1456." 2017-08-15 14:57:56 +00:00
Paul Wilkins
5009302bce Merge "Enable emergency fast Q adaptation for VBR test case." 2017-08-15 14:57:22 +00:00
Linfeng Zhang
3f05a70c41 Update 32x32 idct sse2 and ssse3 optimizations.
Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70
2017-08-14 16:59:31 -07:00
Scott LaVarnway
fa85cf131c vp9: strip temporal filter code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52
2017-08-14 14:27:53 -07:00
Scott LaVarnway
1ab60466ec Merge "vp9: strip mb graph code" 2017-08-14 18:01:44 +00:00
Scott LaVarnway
e702b68b6c vp9: strip mb graph code
when CONFIG_REALTIME_ONLY is enabled.

BUG=webm:1446

Change-Id: I4b1b8e9a456830ba1b1bd3a8882e038d37ee7903
2017-08-11 12:59:40 -07:00
Jerome Jiang
d48be6ad73 Merge "vp9 SVC: Fix the denoiser frame buffer management." 2017-08-11 00:54:35 +00:00
Jerome Jiang
0f8ebddec4 vp9 SVC: Fix the denoiser frame buffer management.
Change the denoiser frame buffer management for SVC to more generally
handle the layer patterns in SVC (where last is not always refreshed).

This change is only for SVC with denoising and is bitexact.

Change-Id: Ic2b146a924cdf6e7114609158afa3d4880fe3fae
2017-08-10 16:56:46 -07:00
paulwilkins
db8fa86a6c Patch relating to Issue 1456.
Testing of 4k videos encoded with a fixed arbitrary chunking interval
uncovered a bug where by if a chunk ends 1 frame before a real scene cut,
the next chunk may be encoded with two consecutive key frames at the start
with the first being assigned 0 bits.

This fix insures that where there is a key frame group of length 1 it is
at least assigned 1 frames worth of bits not 0.

See also patch Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
which by virtue of allowing fast adaptation  of Q made this bug more visible.

BUG=webm:1456

Change-Id: Ic9e016cb66d489b829412052273238975dc6f6ab
2017-08-09 16:34:43 +01:00
Marco
427de67e63 vp9: Partition logic adjustment for speed 6 feature.
When adapt_partition_source_sad is enabled (currently only at
speed 6 for resoln <= 360p): use lower subsize (8x8 instead of 16x16)
for nonrd_select_partition on 32X32 blocks.

And force avoiding rectangular partition checks in
nonrd_pick_partition for speed >= 6.

Small increase ~0.5 in metrics for speed 6 on rtc_derf,
no change in speed.

Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae
2017-08-08 11:31:27 -07:00
paulwilkins
76d77aa013 Enable emergency fast Q adaptation for VBR test case.
Enable fast adaptation of Q when there is a large overshoot
for the  #ifdef AGGRESSIVE_VBR test case.

AGGRESSIVE_VBR  is not currently enabled by default.

Change-Id: I7240bb6589795964b6b0b66df4468e4f21504e0f
2017-08-03 12:06:07 +01:00
Yunqing Wang
bfd0f41f9b Force the bit exactness in the first pass
Originally, for the purpose of keeping a fast first pass, the first-pass
stats between row_mt_mode = 0 and row_mt_mode = 1 are not bit exact, but
that difference is very small that doesn't cause a mismatch between the
final bitstreams. However, if the encoder changes, this minor difference
may cause a mismatch. Thus, this patch always forces the first pass to
be bit exact.

BUG=webm:1453

Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8
2017-08-02 15:58:39 -07:00
Paul Wilkins
3be14200fc Merge "Respond more rapidly to excessive local overshoot." 2017-08-01 08:58:36 +00:00
Marco
5d6c1c2d8f vp9: Adjust noise estimation for 360p.
Change-Id: Ib76875232491b14f7114061e8e913e87004427a0
2017-07-31 17:12:58 -07:00
Marco Paniconi
ebb023deb6 Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""" 2017-07-31 14:58:15 +00:00
Marco
999bd6ea84 vp9: Fix denoising condition when pickmode partition is used.
When the superblock partition is based on the nonrd-pickmode,
we need to avoid the denoising. Current condition was based on
the speed level. This change is to make the condition at the
superblock level, as the switch in partitioning may be done at
sb level based on source_sad (e.g., in speed 6).

Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
2017-07-30 23:16:38 -07:00
Jerome Jiang
f027908ad0 Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""
This reverts commit c9266b8547.

Disable source_sad when resolution > 1080P. The test should
pass now.

BUG=webm:1452

Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082
2017-07-30 19:49:31 -07:00
James Zern
c9266b8547 Revert "vp9: Speed feature to adapt partition based on source_sad."
This reverts commit 064fc570ff.

This causes an assertion failure in vp9_mcomp.c when running
gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
`mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2))
- 1)'

Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
2017-07-29 11:36:58 -07:00
Marco Paniconi
5d0bef4763 Merge "vp9: Adjust logic in source sad for screen content." 2017-07-29 01:46:58 +00:00
Marco Paniconi
e48dfcead1 Merge "vp9: Speed feature to adapt partition based on source_sad." 2017-07-29 01:45:19 +00:00
Jerome Jiang
ac211fe23e vp9: Adjust logic in source sad for screen content.
Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3
2017-07-28 17:25:41 -07:00
Marco
064fc570ff vp9: Speed feature to adapt partition based on source_sad.
Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.

Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.

Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
2017-07-29 00:20:26 +00:00
Urvang Joshi
7105e66d19 Remove the DP version of vp9_optimize_b().
The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/

And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/

Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97
2017-07-28 23:12:57 +00:00
James Zern
8836e46ffd set_var_thresh_from_histogram: prevent negative variance
For 8-bit the subtrahend is small enough to fit into uint32_t.

For 10/12-bit apply:
63a37d16f Prevent negative variance

previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.

Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
2017-07-22 13:27:32 -07:00
Jerome Jiang
4526644615 vp9: Removed unused skin detection function.
Change-Id: I6702b7b11aa4ac9aac5fd54deef4377cdcb29c64
2017-07-18 14:52:04 -07:00
Jerome Jiang
59e461db1f Merge "vp9: Allocate alt-ref in denoiser for SVC." 2017-07-18 21:30:04 +00:00
Jerome Jiang
babef23a5f Merge "vp9: Remove isolated skin & non-skin blocks." 2017-07-18 20:48:32 +00:00
Jerome Jiang
fd216268ad vp9: Allocate alt-ref in denoiser for SVC.
When SVC is used, allocate alt-ref in denoiser.

Change-Id: I1b17221b55b9444cd23b97d481b54ff8d296d857
2017-07-18 13:22:47 -07:00
Jerome Jiang
adbfc4308a vp9: Remove isolated skin & non-skin blocks.
0.007% regression on rtc and 0.004% gain on rtc_derf.
1 thread on QVGA,VGA and HD has ~0.2% speed regression while 2 threads has
~0.2% speed gain on Google Pixel.

Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc
2017-07-18 11:29:14 -07:00
Marco
817f68cdcf vp9: Disable usage of sb_use_mv_part for SVC.
To fix valgrind issueis with SVC tests.
SVC encoding uses prune_evenmore which is causing uinit value.

Will re-enable later when issue is resolved.

Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44
2017-07-18 09:28:56 -07:00
Marco
ad56371343 vp9: Fix to setting content_state for real-time mode.
When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.

Small/negligible change in RTC metrics and speed.

Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa
2017-07-17 16:21:25 -07:00
Marco
0c9e2f4c15 vp9: Reuse motion from choose_partitioning in NEWMV search.
When int_pro_motion_estimation is done for superblock in
choose_partitioning, use it to avoid the full_pixel_search
for NEWMV mode, if bsize is >= 32X32.

For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.

Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b
2017-07-17 13:15:48 -07:00
Jerome Jiang
682135fa60 vp9: Compute skin only for blocks eligible for noise estimation.
Change-Id: Iddcb83a5968db57cfd312c5bc44b2a226a2a3264
2017-07-14 15:14:30 -07:00
Marco
666e394d41 vp9: Adjust minmax threshold for variance partitioning.
Only affects speed 7. Improvement on high motion clips.

Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687
2017-07-13 21:19:37 -07:00
James Zern
b578d59623 Merge "remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY" 2017-07-12 23:30:04 +00:00
Marco Paniconi
f6586b8bf8 Merge "vp9: Fix to SVC and denoising for fixed pattern case." 2017-07-12 19:13:05 +00:00
Urvang Joshi
1dee320446 Merge "Remove the token state array from greedy optimize_b." 2017-07-12 00:08:56 +00:00
James Zern
df18412f32 remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY
BUG=webm:1446

Change-Id: I6e0ea9342c715d354c641109737172afa649b85b
2017-07-11 13:10:16 -07:00
Urvang Joshi
5322a31b18 Remove the token state array from greedy optimize_b.
Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.

Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501

Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
2017-07-11 13:05:29 -07:00
James Bankoski
7d5afa227a Merge "Reintroduce fix for max qindex calculation of a gf interval" 2017-07-11 19:47:16 +00:00
Jerome Jiang
1a4d8f2033 Merge "vp9: Move skinmap computation into multithreading loop." 2017-07-11 19:44:22 +00:00
Jim Bankoski
689ad89e86 Reintroduce fix for max qindex calculation of a gf interval
This reintroduces the fix:
  https://chromium-review.googlesource.com/c/422807/
and later reverted here:
  https://chromium-review.googlesource.com/c/447843/

BUG=webm:1355

This time behind a compile time flag :

configure --disable-always_adjust_bpm
configure --enable-always_adjust_bpm

This should make side by side testing easier and let users of the
lib pick which way they want to go.

Change-Id: I7d7b37b83015dc001810af84c132cbc1e71ba8d6
2017-07-11 18:40:26 +00:00
Marco
3818a3723b vp9: Fix to SVC and denoising for fixed pattern case.
For fixed pattern SVC: keep track of denoised last_frame buffer
for base temporal layer, and if alt_ref is updated on middle/upper
temporal layers, force an update to denoised last_frame buffer.
This allows for improved denoising on top temporal layers.

Change-Id: Icbd08566027d4d2eabc024d3b7a0d959d2f8c18b
2017-07-11 11:27:04 -07:00
Jerome Jiang
3d6b0cb825 vp9: Move skinmap computation into multithreading loop.
Change-Id: Iebc9dd293d8b1449c0674c0295349297e9b90646
2017-07-10 17:18:15 -07:00
Johann Koenig
4b78c6e6f7 Merge "remove vp9_full_sad_search" 2017-07-10 20:42:40 +00:00
Jerome Jiang
125a532b34 Merge "vp9: Remove alt-ref from denoiser." 2017-07-10 20:03:51 +00:00
Johann
109faffe9b remove vp9_full_sad_search
This code is unused in vp9. Only vp8 still contains references to
vpx_sad_NxMx[3|8] and only for sizes 16x16, 16x8, 8x16, 8x8 and 4x4.

Remove the remaining sizes and all the highbitdepth versions.

BUG=webm:1425

Change-Id: If6a253977c8e0c04599e25cbeb45f71a94f563e8
2017-07-10 11:20:35 -07:00
Jerome Jiang
2ac7c549e9 vp9: Remove alt-ref from denoiser.
Denoiser is used in real-time mode which does not use alt-ref.
Reduce memory usage when denoiser is enabled.

Change-Id: I54ba3bcaeeb1818bbdf718ef90e97d4897ff793d
2017-07-10 10:56:03 -07:00
James Zern
5d6060b62f Merge "cosmetics,vp9/: normalize inv/fwd_txfm naming" 2017-07-07 19:15:02 +00:00
Marco Paniconi
2075af4b16 Merge "vp9: Nonrd mode: use content_state_sb for high motion." 2017-07-07 03:00:59 +00:00
James Zern
80b83c73ba cosmetics,vp9/: normalize inv/fwd_txfm naming
+ vpx_dsp/, test/

itxfm -> inv_txfm, ftxfm -> fwd_txfm

Change-Id: I3aacdb65143576d64cfe5c9b14dd358c17c1fe7e
2017-07-06 18:35:44 -07:00
Marco
8c3f18efa1 vp9: Nonrd mode: use content_state_sb for high motion.
In the content_state for a superblock is set to HighSad,
use that to bias some decisions in variance partition and
nonrd pickmde: use int_pro_motion for sad computation in
choose_partitioning, and set large_block in pickmode based
on the content_state_sb.

Only affects speed >= 7.

Immprovement for high motion content.
Small gain (~1%) in RTC metrics.
Speedup of ~5 for high motion clip on android (speed 8, 1 thread).

Change-Id: I5774c4854f012b89c8e969f6129b60988c2ce11c
2017-07-06 15:05:19 -07:00
James Zern
5227b8200b vp9: remove FrameWorkerData & vp9_dthread.h
the file was empty after the struct removal. the only remaining use was
within vp9_dx_iface, but the wrapper became unnecessary after the
removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I515ab585d701e77d388d12b2802d844c424f9bcd
2017-07-05 22:32:00 -07:00
James Zern
48c4a038eb vp9: remove (un)lock_buffer_pool
there is no threaded access to this pool after the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: I710769b87102edc898c59eb9a2e7a91d8c49107f
2017-07-05 21:07:00 -07:00
James Zern
af3cab7b24 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_onyxc_int,RefCntBuffer: rm unused members
  remove vp9_dthread.c
  vp9: reduce FRAME_BUFFERS by 3
2017-07-06 04:06:30 +00:00
James Zern
4ffd8350be Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  VP9_COMMON: rm frame_parallel_decode
  VP9Decoder: rm frame_parallel_decode
  vp9_dx: rm worker thread creation
2017-07-05 23:53:22 +00:00
Yaowu Xu
f2b1dc529f Merge "Further refactoring of mod error calculation." 2017-07-05 21:43:50 +00:00
paulwilkins
5b44ef0c50 Respond more rapidly to excessive local overshoot.
This patch attempts to address a bug reported for 4K video.
https://b.corp.google.com/issues/62215394

In this instance a perfect storm of a moderate complexity section
followed by a much easier section where a CGI overlay helped to
suppress film grain noise, followed by a much harder and very grainy
section at the end, cause a massive local rate spike that pushed a chunk
over the upper allowed rate limit.

This patch detects cases where the rate for a frame is much higher than
expected and allows, in this special case, for rapid adjustment of the active
Q range.

For the example chunk in the bug report the target rate was 18Mb/s and the
observed rate was over 37 Mb/s with a surge for the last few frames to over
100Mb/s. This patch brings the overall chunk rate right back down to ~18.2 Mbit/s
and  almost completely eliminates the rate spike at the end. (See graphs appended
to bug report)

Also see  I108da7ca42f3bc95c5825dd33c9d84583227dac1 which fixes a bug
unearthed during testing of this patch and also has a bearing on high rate
encodes such as 4K.

This patch does have a negative impact on some metrics. Most notably there are
clips in our standard test set where it hurts global psnr (though in many cases it
conversely helps SSIM, FAST SSIM and PSNR-HVS). It is also worth noting that
the clips (and data rates) where there is a big metric impact, are almost all cases
where there is currently a significant overshoot vs the target rate and overall rate
accuracy is greatly improved.

Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
2017-07-05 16:51:52 +01:00
paulwilkins
a1af335f44 Further refactoring of mod error calculation.
Further refactoring to support alternative error distributions.

Change-Id: I0f7fa3fd6f3baa4b0a1e53c6aa3be63966e97b82
2017-07-05 16:49:37 +01:00
paulwilkins
b0459ec8ea Fix incorrect index test in GF group rate assignment.
Correct test for middle frame in the group.

Change-Id: I1ee49fa33968eb3c4a01d6a27a60bb1409e3e68c
2017-07-05 16:45:36 +01:00
James Zern
37e03b1d13 Merge "cosmetics,vp9/encoder: s/txm/txfm/" 2017-06-30 21:57:16 +00:00
Jerome Jiang
f781cc8a46 Merge "vp9: Adjust condition for checking intra mode." 2017-06-30 21:55:00 +00:00
Marco
2290898ac7 vp9: Adjust condition for checking intra mode.
For nonrd_pickmode: add condition for checking
intra mode if the sb content state is VeryHighSad.

Reduces artifacts when sudden change in content.

Metrics on RTC/RTC_derf neutral (small gain).
No speed loss observed.

Change-Id: I07006d28fd2dc06c1d06b07630102b0fece50c40
2017-06-30 14:52:00 -07:00
James Zern
303cb3106b vp9_onyxc_int,RefCntBuffer: rm unused members
the last frame_worker_owner, row and col references were removed in:
131bd06e6 remove vp9_dthread.c

BUG=webm:1395

Change-Id: Ia7fb2e8782b12a58d2a2263849d20a8abf06aef6
2017-06-30 12:03:07 -07:00
James Zern
bc837b223b VP9_COMMON: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3a562b2c6b82050064d2b2ccb18a3e77c700b2da
2017-06-30 12:03:07 -07:00
James Zern
78fe5ca360 remove vp9_dthread.c
and the related prototypes in vp9_dthread.h. the last references were
removed in:
09dabc58d VP9_COMMON: rm frame_parallel_decode

vp9_dx_iface.c still uses FrameWorkerData

BUG=webm:1395

Change-Id: Ica8e98ae776fc0105f1fbbed9e0a729808980810
2017-06-30 12:03:07 -07:00
James Zern
ba76b662af VP9Decoder: rm frame_parallel_decode
this has been 0 since the removal of frame_parallel_decode in
vp9_dx_iface.

BUG=webm:1395

Change-Id: I3f579766ecfa4777395b99686738e1c5610f86ef
2017-06-30 12:03:07 -07:00
James Zern
fb134e759a vp9: reduce FRAME_BUFFERS by 3
the additional buffers are unneeded with the removal of
frame_parallel_decode

BUG=webm:1395

Change-Id: Id9ec4cb6462af5d07a0d3cf939bd216db27d9d9e
2017-06-30 12:03:07 -07:00
James Zern
86d51dfbf5 vp9_dx: rm worker thread creation
creating a thread associated with the sole worker isn't necessary when
only execute() is being used after the removal of frame_parallel_decode.

BUG=webm:1395

Change-Id: I2255ce72607321e5708bc82a632dc6825d4eff5c
2017-06-30 12:03:07 -07:00
James Zern
469986f963 Merge changes from topic 'rm-dec-frame-parallel'
* changes:
  vp9_dx,vpx_codec_alg_priv: rm *worker_id*
  vp9_dx,vpx_codec_alg_priv: rm *cache*
  vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
2017-06-30 19:02:05 +00:00
James Zern
e3c8f2f152 vp9_dx,vpx_codec_alg_priv: rm *worker_id*
+ available_threads

these are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: I59c5075542a5a74d4a539c213682f566b005f5a6
2017-06-29 16:21:50 -07:00
James Zern
dcdb013b55 vp9_dx,vpx_codec_alg_priv: rm *cache*
these fields are unused with the removal of frame_parallel_decode

BUG=webm:1395

Change-Id: Ia3821f7fb81d17b20033b094e5265b1030ee4030
2017-06-29 16:21:50 -07:00
James Zern
ab76f1f224 vp9_dx,vpx_codec_alg_priv: rm frame_parallel_decode
this field has been 0 since:
01d23109a vp9: make VPX_CODEC_USE_FRAME_THREADING a no-op

BUG=webm:1395

Change-Id: I15448e9401e15329b54c6878dda033b17be5ec6b
2017-06-29 16:21:50 -07:00
James Zern
8d1bda93f4 cosmetics,vp9/encoder: s/txm/txfm/
txfm is more commonly used as an abbreviation through the codebase

Change-Id: I86fd90ef132468f9da270091c05daa1f5a49ece2
2017-06-29 15:08:47 -07:00
Jerome Jiang
74dc640565 vp9: Remove avg2x2 in skin detection and clean up.
Change-Id: I6510e36866138f8ac4cb82c207e58e0b9522e499
2017-06-29 03:06:23 +00:00
Jerome Jiang
8582d33a0d Merge "vp9: compute skinmap only once before encoding." 2017-06-28 18:01:46 +00:00
Marco
88d11f473c vp9: Speed >= 8: Remove logic on reducing subpel.
Existing logic was only affecting resolutions above 720p.
Needs more testing for reducing subpel for speed >= 8.

No change on RTC metrics.

Change-Id: I2f4bf9f25891614aafa9a86aa5a5063a3ccfce4d
2017-06-27 20:27:02 -07:00
James Zern
515fed8f38 Merge "highbd_quantize_fp_32x32: normalize abs_qcoeff type" 2017-06-27 23:30:17 +00:00
Jerome Jiang
a220b931f5 vp9: compute skinmap only once before encoding.
This could save some cycles since skin detection is used in multiple
places in vp9.

1~2% speed up on ARM.

Change-Id: I86b731945f85215bbb0976021cd0f2040ff2687c
2017-06-27 16:16:02 -07:00
Linfeng Zhang
a76b6b232c Update load_input_data() in x86
Split to load_input_data4() and load_input_data8().
Use pack with signed saturation instruction for high bitdepth.

Change-Id: Icda3e0129a6fdb4a51d1cafbdc652ae3a65f4e06
2017-06-26 13:38:33 -07:00
Urvang Joshi
4bb99ee27e Enable greedy version of optimize_b() in VP9 by default.
Improvements were already mentioned in the previous patch:
https://chromium-review.googlesource.com/#/c/531675/

Change-Id: I4906ab1c61c25a815bdeb986016fad6dcb69eb71
2017-06-23 17:04:58 -07:00
Marco
18805eee6c vp9: Use scene detection for CBR mode.
Use the scene detection for CBR mode, and use it to reset the
rate control if large source sad is detected and rate
correctioni fact/QP is at minimum state.

Avoids large frame sizes after big content change following
low content period.

Only affects CBR mode for 1 pass at speeds 5, 6, 7.
Change-Id: I56dd853478cd5849b32db776e9221e258998d874
2017-06-23 11:44:50 -07:00
James Zern
88a302e743 Merge changes from topic 'missing-proto'
* changes:
  onyxd_int.h: add missing prototypes
  onyxd.h: add vp8dx_references_buffer prototype
  vp[89],vpx_dsp: add missing includes
  vp8,encodeframe.h: correct prototypes
  vp8: add temporal_filter.h
  add picklpf.h
  add ethreading.h
  vp8,bitstream.h: add missing prototypes
  vp8: remove vp8_fast_quantize_b_mmx
  vp8,loopfilter_filters: make some functions static
  vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
  vp9_encodeframe: make scale_part_thresh_sumdiff static
  vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
  tiny_ssim: make some functions static
2017-06-23 05:44:24 +00:00
Marco Paniconi
4f917912b9 Merge "vp9: Add high source sad to content state." 2017-06-22 22:18:48 +00:00
Paul Wilkins
92145006c3 Merge "Fix int overflow in rate control for high bit rates." 2017-06-22 16:30:40 +00:00
paulwilkins
efe1982e63 Fix int overflow in rate control for high bit rates.
Fix misplaced cast that caused an overflow and incorrect rate adaptation
behavior for high data rates. This in particular will have affected 4k encodes
but could also have come into play for some higher rate 1080p cases.

In our standard test sets the quality impact is small though several high rate
clips show improved rate accuracy. This can also impact the number of recode
loop hits and on one problem 4k  clip the encode time for speeds 0 and 1 was
reduced by >25%

Change-Id: I108da7ca42f3bc95c5825dd33c9d84583227dac1
2017-06-22 10:34:21 +01:00
Marco
d7515b1187 vp9: Add high source sad to content state.
Use it to limit NEWMV early exit in nonrd pickmode

Small change in RTC metrics, has some improvement
for high motion clips.
Change-Id: I1d89fd955e1b3486d5fb07f4472eeeecd553f67f
2017-06-21 20:57:17 -07:00
Marco Paniconi
33a9394eb1 Merge "vp9: Adjustments for aq-mode and pickmode for speed >= 8." 2017-06-22 03:27:47 +00:00
James Zern
44418c659f vp[89],vpx_dsp: add missing includes
quiets -Wmissing-prototypes

Change-Id: I841cfc019d592f2bc6b3fec5818051a31f4c53b5
2017-06-21 19:00:15 -07:00
James Zern
07f847873b vp9_ratectrl: make adjust_gf_boost_lag_one_pass_vbr static
quiets -Wmissing-prototypes

Change-Id: I72d899c2d8de1ddc52d90ac081f2629374b3a6e9
2017-06-21 19:00:14 -07:00
James Zern
9a329b5285 vp9_encodeframe: make scale_part_thresh_sumdiff static
quiets -Wmissing-prototypes

Change-Id: I696223d75860edba13c6b6f38c1f8db353a6f812
2017-06-21 19:00:14 -07:00
James Zern
3f296533f6 vp9_alt_ref_aq: correct vp9_alt_ref_aq_create proto
quiets -Wmissing-prototypes

Change-Id: Ib2d4f294f1982739bb2ac98155e789e040d309a1
2017-06-21 19:00:04 -07:00
James Zern
9e1d2de67c highbd_quantize_fp_32x32: normalize abs_qcoeff type
use an int to quiet an unsigned rollover warning similar to:
25110f283 Fix an ubsan warning: vp9_quantizer.c

Change-Id: Iedecb79a17249bc18f10c0920f88cf704920f12b
2017-06-21 18:56:10 -07:00
Marco
21afafa31a vp9: Put skin detection usage around cpi flag.
Skin detection usage in choose_partitioning should be
around the cpi->use_skin_detection.

Change-Id: I6986179af9ce94c60c0974d66c311fc07cc04cfe
2017-06-21 17:32:56 -07:00
Marco
8cf6f78fce vp9: Adjustments for aq-mode and pickmode for speed >= 8.
Adjust the threshold for turning off cyclic refresh for high motion,
and avoid testing golden in nonrd pickmode for speed >= 8 if
golden refresh was long ago.

No change/neutral on RTC metrics.
Change-Id: I40959b8d9637f3553e7458bbabd8c6024c2c09c0
2017-06-21 16:01:24 -07:00
hui su
d96ed96c0f VP9 level targeting: properly handle max_gf_interval
Don't overide max_gf_interval if it's not specified. It will
be assigned with a default value in vp9_rc_set_gf_interval_range().

BUG=b/62803416

Change-Id: Ide46ce00279ed076865fc54ce98c55a994f0c798
2017-06-20 16:29:04 -07:00
Marco Paniconi
737aa5c9e4 Merge "vp9: SVC: Rework the usage of base_mv for SVC." 2017-06-20 03:08:32 +00:00
Marco
ff7fb4b280 vp9: Speed >= 8: Adjust resolution threshold for subpel.
Get some quality gain on RTC metrics (~7%), with
~5-8% speed slowdown.

Change-Id: I0d02942a77074424ee0326b6e110ddff09f2df5e
2017-06-19 13:58:08 -07:00
Marco
112cd95507 vp9: SVC: Rework the usage of base_mv for SVC.
Set the base_mv_aggressive for temporal enhancement layers (TL > 0).
Under the aggressive mode, skip the NEWMV depending on the
SSE of the base_mv. Also reduce the subpel motion to 1/2 under
aggressive mode if base_mv is good.

Speedup ~3% with small/negligible loss in quality on RTC.
Affects speed >= 6.

Change-Id: I89341b279cad6da2a04b76d5e726016191dacdb8
2017-06-18 22:35:46 -07:00
Urvang Joshi
a4ea7e131b VP9: Add greedy version of av1_optimize_b().
This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635
2017-06-15 11:19:08 -07:00
Linfeng Zhang
d6eeef9ee6 Clean array_transpose_{4X8,16x16,16x16_2) in x86
Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0
2017-06-13 16:50:44 -07:00
Linfeng Zhang
9c72e85e4c Remove array_transpose_8x8() in x86
Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f
2017-06-13 16:50:44 -07:00
Linfeng Zhang
cbb991b6b8 Convert 8x8 idct x86 macros to inline functions
Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce
2017-06-13 16:50:43 -07:00
Hui Su
21e1661b54 Merge "vp9 level targeting: more strict constraint on min_gf_interval" 2017-06-12 16:38:02 +00:00
Jerome Jiang
a46bc0268b Merge "Remove duplication on vp8/9_write_yuv_frame." 2017-06-10 04:50:19 +00:00
Marco
e540ca7155 vp9: SVC: Use prune_evenemore only for non_reference.
Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only effects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9
2017-06-09 17:52:20 -07:00
Jerome Jiang
ff2d220d21 Remove duplication on vp8/9_write_yuv_frame.
Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8
2017-06-09 17:08:26 -07:00
Jerome Jiang
943f9ee25c Merge "Merge skin detection code in vp8/9." 2017-06-08 16:36:00 +00:00
Jerome Jiang
658e854252 Merge skin detection code in vp8/9.
BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14
2017-06-07 21:20:34 -07:00
hui su
21d2273efa vp9 level targeting: more strict constraint on min_gf_interval
min_gf_interval should be no less than min_altref_distance + 1,
as the encoder may produce bitstream with alt-ref distance being
min_gf_interval - 1.

BUG=b/38450599

Change-Id: Ifb733daa643ebc668d1b23e1ce92db94b66dabe8
2017-06-07 17:40:25 -07:00
Marco
14d4718043 vp9: SVC: Enable simple_block_yrd for temporal layers.
Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condiiton for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed
2017-06-07 11:41:50 -07:00
Marco
7d2f5f8e9d vp9: SVC: Adjust some speed settings for SVC speed >= 7.
Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a
2017-06-06 15:30:24 -07:00
Jerome Jiang
cf07d85809 Initialize cost_list all to INT_MAX.
It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
2017-06-06 10:42:37 -07:00
James Zern
6df142e2ab vp9_mcomp,get_cost_surf_min: quiet conversion warning
visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
2017-06-05 22:52:58 -07:00
Jerome Jiang
968a5d6bc2 Merge "Fix valgrind failure on uninitialized variables." 2017-06-06 03:47:31 +00:00
Jerome Jiang
ffe0f9b7fb Fix valgrind failure on uninitialized variables.
BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4
2017-06-05 13:09:29 -07:00
Jerome Jiang
f3a9ae5baa Fix ubsan failure in vp9_mcomp.c.
Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d
2017-06-02 21:37:13 -07:00
Marco
e30781ff80 vp9: SVC: Force subpel search off under certain conditions.
For SVC 1 pass non-rd mode:
Force subpel seach off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quaity loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9
2017-06-01 20:48:52 -07:00
Marco
8c6fa5c5e3 vp9: Speed >8: Set subpel_search_method for low motion.
Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.

Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.

Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc
2017-06-01 16:16:13 -07:00
Jerome Jiang
e254969df2 Fix corruption in skin map debugging output yuv.
For both vp8 and vp9.

BUG=webm:1437

Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28
2017-06-01 16:59:43 +00:00
Jerome Jiang
a5ab38093f Merge "Fix vp8 race when build --enable-vp9-highbitdepth." 2017-05-30 05:47:44 +00:00
Jerome Jiang
0afa2dad76 Fix vp8 race when build --enable-vp9-highbitdepth.
Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d
2017-05-26 09:45:01 -07:00
Marco
146005a911 vp9: SVC: Fix to condiiton on using source_sad.
Fix the condition on usage of source_sad for temporal layers.
FIx allows it to be used for the case of 1 temporal layer.

Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3
2017-05-26 08:46:50 -07:00
Marco Paniconi
9ec9415fd9 Merge "vp9: Use source_sad only on top temporal enhancement layer." 2017-05-26 05:24:06 +00:00
Marco
ea914456af vp9: Use source_sad only on top temporal enhancement layer.
For 1 pass CBR SVC mode.

Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d
2017-05-25 16:32:05 -07:00
Marco
747cf7a505 vp9: SVC: Enable copy partition for SVC speed >= 7.
Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partiton for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).

~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.

Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780
2017-05-25 12:21:46 -07:00
Marco Paniconi
b3bf91bdc6 Merge "vp9: Adjustments to cyclic refresh for high motion." 2017-05-22 06:27:30 +00:00
Marco
2adc0443dd vp9: Adjustments to cyclic refresh for high motion.
For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.

No/little change in RTC metrics, only affects high motion case.

Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b
2017-05-21 22:21:44 -07:00
Marco
ff9395eb3b vp9: Speed >= 8: Modify condition for low-resoln.
No change on RTC metrics.

Change-Id: I5abc573cb56572188d900645d13ba479f55a1ea0
2017-05-21 22:14:38 -07:00
Paul Wilkins
a7977ece93 Merge "Changes to modified error." 2017-05-19 12:24:32 +00:00
Marco
1205e3207e vp9: SVC: Modify condition to allow for copy partition.
When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.

Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8
2017-05-18 14:19:31 -07:00
Jerome Jiang
6b6ff9c969 Merge "vp9: Make copy partition work for SVC and dynamic resize." 2017-05-18 19:37:30 +00:00
Marco
2ba4729ef8 vp9: Make copy partition work for SVC and dynamic resize.
Only affects speed 8.

Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.

Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3
2017-05-18 11:33:56 -07:00
paulwilkins
5680b4517f Changes to modified error.
The modified error was a derivative of the "coded_error"
that was used to allocate bits between different frames on the
assumption that the allocation should be linear in terms of this
modified error.  I.e. a frame with double the modified error score
should all things being equal get double the number of bits. The
code also included upper and lower caps derived from input
VBR parameters.

This patch improves the initial calculation of the clip mean error
(now called "mean_mod_score" as it is no longer a prediction error)
used as the midpoint for the rate distribution function and normalizes
the output "modified scores" scores such that 1.0 indicates a frame
in the middle of the distribution.  The VBR upper and lower caps are
then applied directly to a  frame's normalized score.

This refactoring is intended to make it easier to drop in alternative
distribution functions or to base the rate allocation on a corpus wide
midpoint (rather than the clip mean).

Change-Id: I4fb09de637e93566bfc4e022b2e7d04660817195
2017-05-18 12:56:02 +01:00
Yaowu Xu
bde2c04fb7 Merge "Experiment. Store first pass errors as per MB values." 2017-05-17 17:38:15 +00:00
paulwilkins
42e5073f94 Experiment. Store first pass errors as per MB values.
Most existing first pass stats are stored in a form normalized to a
macro-block scale. However the error scores for intra / inter etc were
stored as frame level values but mainly used as MB level values.

This change  fixes that. Normalized per MB values make comparisons
between different formats easier and in any case this is usually what is
wanted.

An change in results should be limited to slight differences in rounding.

*** Change after patch 8 +2 requiring new approval.

Final pre-submit testing showed  one 4K clip with above expected change.
Investigation showed this was due to a value used to test for ultra low intra
complexity in key frame detection. This was a per frame not per MB value but
also did not scale with frame size. Replacement with a small per MB value
(based on original per frame value and cif frame size) resolved the KF detection
problem.

Also converted kf_group_error_left to a double in line with other error values
to reduce rounding problems in KF group bit allocation

All clips and sets now show nominal (or 0) change as expected.

Change-Id: Ic2d57980398c99ade2b7380e3e6ca6b32186901f
2017-05-17 12:00:18 +01:00
Johann Koenig
8739a182c8 Merge "move neon load/stores to a new file" 2017-05-15 18:15:27 +00:00
Johann
1088b4f87c move neon load/stores to a new file
Move the tran_low_t helper functions to a new file. Additional
load/store functions will be added here.

Change-Id: I52bf652c344c585ea2f3e1230886be93f5caefc3
2017-05-15 08:29:43 -07:00
Jerome Jiang
6b9d130214 Merge "vp9: speed 8: Fix seg fault in partition copy when drop frames." 2017-05-13 03:20:49 +00:00
Cheng Chen
4c0655f26b Merge "Speed up encoding by skipping altref recode" 2017-05-13 01:29:59 +00:00
Jerome Jiang
1fcd5cca3c vp9: speed 8: Fix seg fault in partition copy when drop frames.
BUG=webm:1433

Change-Id: I4f3984ef28660d3218d48007d7c977bdbdaf8af6
2017-05-12 15:57:23 -07:00
Marco Paniconi
9a66582604 Merge "vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl" 2017-05-12 17:02:50 +00:00
Marco Paniconi
629279a45c Merge "vp9: Adjust speed features for speed 8 at low resoln." 2017-05-12 00:35:40 +00:00
Marco Paniconi
c64667c338 Merge "vp9: SVC: Increase the partiiton and acskip thresholds" 2017-05-11 23:37:32 +00:00
Marco Paniconi
37cdd3bfc2 Merge "vp9; Adjust noise estimation thresholds." 2017-05-11 21:58:40 +00:00
Marco
c5c31b9eb6 vp9: SVC: Increase the partiiton and acskip thresholds
Increase the partition and acskip thresholds for temporal
enhancement layers.

~1-2% speedup, with negligible loss in quality.

Change-Id: Id527398a05855298ad9ddac10ada972482415627
2017-05-11 12:28:19 -07:00
Marco
c5a4376aed vp9: SVC: allow for setting the interp_filter in non-rd pickmode.
For SVC 1 pass non-rd pickmode, the interpolation filter for the
upsampling of the golden (spatial) reference was not being explicitly
set and instead was takin gwhatever value was set in the previous
mode/block (which would be either EIGHTTAP or EIGHTAP_SMOOTH).

Fix it to the default EIGHTTAP for now, to be updated/selected
adaptively in a later change.

Minor adjustmemt to rate targeting thresholds in datarate unittests.

Change-Id: I52085048674072c6cfb7163e11e9a2658d773826
2017-05-11 11:45:09 -07:00
Paul Wilkins
3caaf21c5b Merge "Tuning of factor used to calculate Q range in two pass." 2017-05-11 18:25:45 +00:00
Jerome Jiang
d35541fe29 Merge "vp9: Fix ubsan failure in denoiser." 2017-05-11 16:38:59 +00:00
paulwilkins
9a7625652c Tuning of factor used to calculate Q range in two pass.
A more detailed explanation of the experimentation
leading to this change can be found in:-

https://docs.google.com/a/google.com/document/d/13lsYhxgPyxUHvEess6wg9nikaonIZKY9Ak_Lpafv5Mo/edit?usp=sharing

This change gives gains across all our standard test sets for
overall psnr, ssim, fast ssim and psnr-HVS.

Values expressed as % reduction in bitrate.

Low res set     -0.257, -0.192, -0.173, -0.101
Mid res set     -0.233, -0.336, -0.367, -0.139
High res set    -0.999, -1.039, -1.111, -0.567
NetFlix 2K set -0.734, -0.174, -0.389, -0.820
Netflix 4K set  -0.814, -0.485, -0.796, -0.839

Change-Id: Ie981fb3c895c9dfcfc8682640d201a86375db5c8
2017-05-11 16:19:59 +01:00
Cheng Chen
76567d84ce Speed up encoding by skipping altref recode
Speed up for speed 0.
Reduce 10+% of encoding time for hdres in speed 0,
with less than 0.1% PSNR loss.
Compute total difference of previous and current frame context probability
model. If the diff is less than the threshold, skip recoding the frame.

Borg test (positive number means performance loss):
		lowres    midres    hdres
PSNR:		0.030     0.032     0.065

Local speed test: bitrate set at 1200
		blue_sky  pedestrian  rush_hour
Encoding time:	 -10.0%     -16.5%      -16.5%

Change-Id: I4e2d200ea3115d48b2c3e890143596b31b8ef9e9
2017-05-10 22:15:01 -07:00
Marco
2f11a65c99 vp9; Adjust noise estimation thresholds.
Change-Id: Ia41a11df18e5a58d2b8bbecd11c249d357de2a8f
2017-05-10 16:48:10 -07:00
Jerome Jiang
597d1f4c03 vp9: Fix ubsan failure in denoiser.
Fix the overflow for subtraction between two unsigned integers.

BUG=webm:1432

Change-Id: I7b665e93ba5850548810eff23258782c4f5ee15a
2017-05-10 13:43:17 -07:00
Jerome Jiang
2574573fea vp9: Wrap threshold tuning for HD only when denoiser is enabled.
Fixes a speed regression.

Change-Id: I23d942e4af17fa81fe4a366c7369b3ad537e59b0
2017-05-10 12:15:41 -07:00
Marco
d3aebeee4e vp9: Use INTERP_FILTER for filter_type in vp9_rtcd_defs.pl
Change-Id: I259d152c62864b365490368051f3c3b7d7f2f1c5
2017-05-10 12:06:44 -07:00