17183 Commits

Author SHA1 Message Date
Jingning Han
9a1a8f1d8e Speed up dynamic motion vector referencing system
Skip transform type search in modes with ref_mv_idx > 0. This
brings down the additional encoding time cost due to the DMR system
from 32% to 17%, at minimal coding performance regression.

Change-Id: Ie82e1d2831a313c6f1e47f7da221b51345023eb3
2016-04-13 15:51:36 -07:00
Marco Paniconi
e6657f32c5 Merge "vp9: Adjustment to scene-cut detection." 2016-04-13 22:36:29 +00:00
Jingning Han
f33a0a8215 Fix a few mis-use cases of MAX_MV_REF_CANDIDATES
Fix several use cases where MAX_MV_REF_CANDIDATES is mixed up with
the is_compound flag to avoid potential coding interruption.

Change-Id: Ifdee1ef8a81ef6d1c155315c6c6a3074aa7a8b5e
2016-04-13 15:16:55 -07:00
Alex Converse
5d2b0f93b9 Use an exponential growth approach for the ANS reversal buffer.
Memory constrained hardware can window the data via our standard windowing
mechanism, tiles.

Change-Id: Ib1cfd157604a8c9d9f9a9f2b0ba3bc2fd0643082
2016-04-13 15:16:29 -07:00
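
The exponential growth approach named in 5d2b0f93b9 can be sketched roughly as follows. This is a minimal illustration, not the ANS writer itself: the GrowBuf type, the growbuf_push function, and the initial capacity of 16 are assumptions made for this example. Doubling the capacity whenever the buffer fills keeps the number of reallocations logarithmic in the final size.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer; not the actual ANS reversal buffer. */
typedef struct {
  uint8_t *data;
  size_t size;     /* bytes in use */
  size_t capacity; /* bytes allocated */
} GrowBuf;

/* Appends one byte, doubling the capacity when the buffer is full.
 * Returns 0 on success and -1 if the reallocation fails. */
static int growbuf_push(GrowBuf *b, uint8_t v) {
  if (b->size == b->capacity) {
    /* Assumed starting capacity of 16 bytes, then double each time. */
    size_t new_cap = b->capacity ? b->capacity * 2 : 16;
    uint8_t *p = (uint8_t *)realloc(b->data, new_cap);
    if (!p) return -1; /* the old buffer is still valid on failure */
    b->data = p;
    b->capacity = new_cap;
  }
  b->data[b->size++] = v;
  return 0;
}
```

A caller would start from a zeroed GrowBuf, push bytes as they are produced, and free(data) when done.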
Alex Converse
c3688e398c Disable the TestSuperframeIndexIsOptional test with ANS.
Change-Id: Id55a741e2015c4e01d156d3fe5319498b016b9cf
2016-04-13 14:58:40 -07:00
Marco
24db57f0e1 vp9: Adjustment to scene-cut detection.
Change recursive weight for average_source_sad and
put some constraint on spacing between detected scene-cuts.

Change only affects 1 pass real-time mode.

Change-Id: I1917e748d845e244812d11aec2a9d755372ec182
2016-04-13 14:40:08 -07:00
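
To illustrate the two ideas in 24db57f0e1 (a recursively weighted average of source SAD and a minimum spacing between detected scene cuts), here is a rough sketch. The 3:1 averaging weight, the 3x threshold, the spacing of 10 frames, and all names are hypothetical; the commit message does not state the actual values.

```c
#include <stdint.h>

/* Hypothetical detector state; field names are illustrative only. */
typedef struct {
  uint64_t avg_source_sad;    /* running average of per-frame source SAD */
  int frames_since_scene_cut; /* frames elapsed since the last detected cut */
} SceneCutState;

/* Updates the state with the SAD of a new frame.
 * Returns 1 if the frame is flagged as a scene cut, 0 otherwise. */
static int update_scene_cut(SceneCutState *s, uint64_t sad) {
  const int kMinSpacing = 10; /* assumed minimum gap between cuts */
  int is_cut = 0;
  /* Flag a cut only if the SAD jumps well above the running average
   * and enough frames have passed since the previous cut. */
  if (s->frames_since_scene_cut >= kMinSpacing &&
      sad > 3 * s->avg_source_sad) {
    is_cut = 1;
  }
  /* Recursive average: weight 3 on history, 1 on the new sample. */
  s->avg_source_sad = (3 * s->avg_source_sad + sad) >> 2;
  s->frames_since_scene_cut = is_cut ? 0 : s->frames_since_scene_cut + 1;
  return is_cut;
}
```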
Jingning Han
e07dbaa2f5 Enable mode conversion in sub8x8 block
Convert the newmv mode into reference motion vector modes.

Change-Id: I51bd2543dafb70345c1340fba700b44f67f20853
2016-04-13 14:35:54 -07:00
Zoe Liu
1d043d56da Merge "Make ext-refs respect encoding flags." into nextgenv2 2016-04-13 19:31:30 +00:00
Debargha Mukherjee
200e50568b Merge "Fix 2 warning when building with GCC 5." into nextgenv2 2016-04-13 19:28:25 +00:00
hui su
6a7ddd84bb Speed-up in tx_size search
Do not consider 4x4 transform when the maximum possible transform
size is 32x32.

Overall encoding speed is increased by more than 10%. Compression
performance is neutral on lowres, midres, and hdres.

Change-Id: Ifac61c3c9f4b0ab392bffd4d1faa373d91014cf1
2016-04-13 10:19:00 -07:00
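
The pruning rule in 6a7ddd84bb can be pictured as below. This is only a sketch: the TxSize enum and both helper functions are stand-ins written for this example, and the real search loop has more conditions than this.

```c
/* Hypothetical transform size enum, ordered from smallest to largest. */
typedef enum { TX_4X4, TX_8X8, TX_16X16, TX_32X32 } TxSize;

/* Smallest transform size worth searching: 4x4 is skipped when the
 * largest allowed transform is 32x32. */
static TxSize first_tx_size_to_search(TxSize max_tx_size) {
  return max_tx_size == TX_32X32 ? TX_8X8 : TX_4X4;
}

/* Number of candidate transform sizes the search would visit. */
static int count_tx_candidates(TxSize max_tx_size) {
  return (int)max_tx_size - (int)first_tx_size_to_search(max_tx_size) + 1;
}
```

With this rule a 32x32 block evaluates three sizes (8x8, 16x16, 32x32) instead of four.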
hui su
b72aa72a90 ext-tx: use raster scan order for identity transform
coding gain of ext-tx:
screen_content 12.73% -> 13.05%

Change-Id: I5fc8cf0db84c3e56dd3cb7675e1d81c9c575bc57
2016-04-13 09:42:43 -07:00
Alex Converse
70bd058352 Merge "Fix the tree diagram comment." into nextgenv2 2016-04-13 16:14:18 +00:00
Yunqing Wang
885872f899 Fix Visual Studio build warning
Fixed warning C4244: '-=' : conversion from 'const double' to 'int',
possible loss of data.

Change-Id: Ic4691346037767b244e7f71248c2f871f92002f3
2016-04-13 09:09:08 -07:00
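
For context on warning C4244 in 885872f899: it fires when a compound assignment such as '-=' implicitly narrows a double result to int. One common way to silence it is an explicit cast, sketched below. The function name and values are made up for this example, and the commit message does not say which fix was actually applied.

```c
/* Subtracts a fractional step from an integer value.
 * Writing "value -= step;" compiles, but MSVC reports C4244 because the
 * double result is implicitly converted back to int. */
static int subtract_step(int value, const double step) {
  /* Cast the whole expression so the truncation is explicit and the
   * arithmetic is unchanged. Casting only 'step' would change results. */
  value = (int)(value - step);
  return value;
}
```

Note that casting only the operand, as in value -= (int)step, truncates before subtracting and can give a different answer (10 - 2.5 yields 7 with the form above but 8 with the operand cast).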
Geza Lore
c50aaf3049 Make ext-refs respect encoding flags.
The VP8_EFLAG_NO_UPD_LAST and VP8_EFLAG_NO_REF_LAST flags can be
passed to the encoder to signal that it should not update/reference
the LAST ref frame when encoding the current frame. With
--enable-ext-refs turned on, the new LAST2, LAST3 and LAST4 ref frames
could still be used or updated, which causes the
  VP10/ErrorResilienceTestLarge.DropFramesWithoutRecovery/{0,1,2}
tests to fail.

With this patch, if --enable-ext-refs is used, then
VP8_EFLAG_NO_UPD_LAST and VP8_EFLAG_NO_REF_LAST also apply to the
new LAST2, LAST3 and LAST4 ref frames, as well as the LAST ref frame.

Change-Id: If482b1c09bbaf914eca8e0348a2367bff261661d
2016-04-13 12:03:58 +01:00
Geza Lore
c6cf7a6111 Fix 2 warning when building with GCC 5.
These caused the following warning with GCC 5:
     warning: logical not is only applied to the left hand side of
     comparison [-Wlogical-not-parentheses]
     assert(!is_compound == (cm->reference_mode == SINGLE_REFERENCE));

Change-Id: If296aabb2311ceb7d903b395c1549ef81c2cbf9b
2016-04-13 10:49:52 +01:00
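
The warning quoted in c6cf7a6111 arises because '!' binds tighter than '==', so the expression already parses as (!is_compound) == (...). GCC warns in case the author meant !(is_compound == ...). One way to keep the meaning and state the intent is to add the parentheses explicitly, as in this sketch. The enum values and the helper function are stand-ins for this example, and the commit message does not show the exact change that was made.

```c
/* Hypothetical stand-ins for the codec's reference modes. */
enum { SINGLE_REFERENCE = 0, COMPOUND_REFERENCE = 1 };

/* Returns 1 when the compound flag agrees with the reference mode. */
static int mode_is_consistent(int is_compound, int reference_mode) {
  /* Parenthesizing the negation makes the intended grouping explicit
   * and avoids -Wlogical-not-parentheses. */
  return (!is_compound) == (reference_mode == SINGLE_REFERENCE);
}
```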
James Zern
ef17fc46f3 dct32x32_test: s/HAVE_NEON_ASM/HAVE_NEON/
vpx_idct32x32_1024_add_neon is implemented with intrinsics

Change-Id: I072b18248b97ee2634f06b2751ffa2ced85f8e5b
2016-04-13 00:08:28 -07:00
James Zern
9c2ed00c8c Merge changes from topic 'arm64'
* changes:
  configure: Detect aarch64 toolchains automatically
  configure: Add an arm64-linux-gcc target configuration
2016-04-13 03:11:39 +00:00
Alex Converse
c1729d12b8 Merge "ANS: Remove extra buffer size checks causing a false decode error." into nextgenv2 2016-04-13 01:37:05 +00:00
Marco
6a3cf099aa vp9: Adjust threshold for scene-change detection.
For 1 pass vbr.

Change-Id: I10b7eefc36d65c30844d205e139515bec7fed6af
2016-04-12 18:28:04 -07:00
Alex Converse
af56299119 Fix the tree diagram comment.
Clear up a multiline comment warning and clarify the comment.

Change-Id: Ie0277b4ed4a088a9751e6998f2aeae57d302e6d4
2016-04-12 16:57:08 -07:00
Hui Su
9e8cad3be7 Merge "Add vp10_ prefix to full_to_model_counts and fill_token_costs" into nextgenv2 2016-04-12 23:38:47 +00:00
Alex Converse
48b81a1a3c Merge "Increase active map test coverage from RT speeds 0-5 to 0-8." 2016-04-12 22:20:12 +00:00
Alex Converse
493a585273 ANS: Remove extra buffer size checks causing a false decode error.
The minimal ans partition size is now one byte. This is checked in
ans_read_init().

The read_is_valid() condition is handled by setup_token_decoder().

Change-Id: I7b202b896630bc4285532208bf7cf84567afe158
2016-04-12 15:19:30 -07:00
Yi Luo
6db95602e4 Merge "Optimized HBD block subtraction for all block sizes" into nextgenv2 2016-04-12 21:22:32 +00:00
Marco Paniconi
f81b0000f6 Merge "vp9: Fix to active_best for GF/ARF in 1 pass vbr." 2016-04-12 21:18:41 +00:00
Debargha Mukherjee
ec1365a0c9 Merge "Extend variance based partitioning to 128x128 superblocks" into nextgenv2 2016-04-12 19:42:35 +00:00
Debargha Mukherjee
ff72cca8bb Merge "Step towards making the 2-pass cq mode perceptual" 2016-04-12 19:42:06 +00:00
Yi Luo
0f80b1f754 Optimized HBD block subtraction for all block sizes
- Interface function takes a local MxN function to call based on the
  block size.
- Repeated calls (without cache line misses) show an improvement of
  ~63% to ~340%.
- Overall encoder speed improvement: ~0.9%.

Change-Id: Ieff8f3d192415c61d6d58d8b99bb2a722004823f
2016-04-12 12:04:43 -07:00
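
The dispatch pattern described in 0f80b1f754 (an interface function that picks a width-specific kernel based on block size) might look like the plain C sketch below. The real change uses SIMD kernels; every name here, the set of widths, and the single shared stride are assumptions made to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>

/* Kernel signature: subtract 'rows' rows of a fixed-width block. */
typedef void (*SubtractFn)(int rows, int16_t *diff, const uint16_t *src,
                           const uint16_t *pred, int stride);

/* Generates one fixed-width kernel per block width. */
#define DEFINE_SUBTRACT(W)                                             \
  static void subtract_w##W(int rows, int16_t *diff,                   \
                            const uint16_t *src, const uint16_t *pred, \
                            int stride) {                              \
    for (int r = 0; r < rows; ++r)                                     \
      for (int c = 0; c < (W); ++c)                                    \
        diff[r * stride + c] =                                         \
            (int16_t)(src[r * stride + c] - pred[r * stride + c]);     \
  }

DEFINE_SUBTRACT(4)
DEFINE_SUBTRACT(8)
DEFINE_SUBTRACT(16)

/* Chooses the kernel for a block width; NULL if unsupported. */
static SubtractFn pick_subtract(int cols) {
  switch (cols) {
    case 4: return subtract_w4;
    case 8: return subtract_w8;
    case 16: return subtract_w16;
    default: return NULL;
  }
}

/* Interface function: dispatches once, then runs the chosen kernel.
 * Returns 0 on success and -1 for an unsupported width. */
static int highbd_subtract_block(int rows, int cols, int16_t *diff,
                                 const uint16_t *src, const uint16_t *pred,
                                 int stride) {
  SubtractFn fn = pick_subtract(cols);
  if (!fn) return -1;
  fn(rows, diff, src, pred, stride);
  return 0;
}
```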
Alex Converse
ba5f7a514a Increase active map test coverage from RT speeds 0-5 to 0-8.
This test takes less than 100 ms for each of speeds 6-8.

Change-Id: Ibbeb3004a2607d25dcbf77cb5314ade87809e059
2016-04-12 11:14:10 -07:00
hui su
0792748646 Add vp10_ prefix to full_to_model_counts and fill_token_costs
Change-Id: I5e6c644fb09f7a80c88142dfdfa05cf5be260241
2016-04-12 11:06:47 -07:00
Angie Chiang
027d12b7d6 Merge changes I359aa49c,Ic8ca5afb into nextgenv2
* changes:
  Generalize txfm scale in highbd quantizer
  Parameterize transform scale for quantizer
2016-04-12 18:02:05 +00:00
Alex Converse
e7224b7866 Convert some vpx boolcoder calls back to vp10 generic calls.
Change-Id: I362f753ff42d4c4fb94df2419cdaad423d7a4229
2016-04-12 11:00:52 -07:00
Marco
3861b25be1 vp9: Fix to active_best for GF/ARF in 1 pass vbr.
Correct the setting of Q basis of GF/ARF in 1 pass vbr.

Existing logic would switch to using avg_QP of key frame if
avg_QP of inter is less than active worst (even if key frame is
not last frame).

Instead fix the logic (as per the comment) to use the lower of
active_worst_quality and avg_Q for inter as basis for GF/ARF
active_best_quality (unless last frame was key frame).

Increase in metrics: AvgPSNR/SSIM up by ~0.7/0.3 on ytlive set.

Change-Id: I9a628378ec6684bfda9457ebfc2384ef6d8579f7
2016-04-12 10:37:45 -07:00
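
The corrected selection rule in 3861b25be1 can be sketched as a small function. The function and parameter names are hypothetical. The branch taken when the last frame was a key frame is an assumption (falling back to the key-frame average), since the message only says that case is excluded.

```c
/* Picks the Q value used as the basis for GF/ARF active_best_quality.
 * All names are illustrative; this is not the encoder's actual code. */
static int gf_arf_q_basis(int last_frame_was_key, int avg_q_key,
                          int avg_q_inter, int active_worst_quality) {
  if (last_frame_was_key) {
    /* Assumption: right after a key frame, use the key-frame average. */
    return avg_q_key;
  }
  /* Otherwise use the lower of the inter-frame average Q and
   * active_worst_quality, as the commit message describes. */
  return avg_q_inter < active_worst_quality ? avg_q_inter
                                            : active_worst_quality;
}
```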
Martin Storsjo
819f3c805d configure: Detect aarch64 toolchains automatically
Change-Id: Icafda81dbc3323fa0afdba5f1c8758e812cc592a
2016-04-12 14:15:07 +03:00
Martin Storsjo
babd308b5e configure: Add an arm64-linux-gcc target configuration
Change-Id: I23efc07572b2406ce5d9283340aef5aee8326280
2016-04-12 14:15:01 +03:00
Geza Lore
61af8981b0 Extend variance based partitioning to 128x128 superblocks
Change-Id: I41edf266d5540a9b070a5e65bc397dd3da210507
2016-04-12 09:40:11 +01:00
Debargha Mukherjee
648538959d Merge "Use reduced transform set for 16x16" into nextgenv2 2016-04-11 23:32:29 +00:00
Alex Converse
a3a10a323b pickmode: only cost the skip flag once per prediction block
RTC speed 6:
File     Match  Avg     BDRate  Low     Mid     High
OVERALL  ✔      -0.040  -0.045  -0.031  -0.084  0.004

Screencast speed 6:
File     Match  Avg     BDRate  Low     Mid     High
OVERALL  ✔      1.115   -0.162  0.203   2.470   0.541

Change-Id: I46bbc11c89301015b5d3eac25294c709f23f0897
2016-04-11 15:39:52 -07:00
Debargha Mukherjee
c4da5d500e Use reduced transform set for 16x16
Speed increase for ext-tx by 20% for a BDRATE drop of 0.26%.
The ext-tx expt becomes -2.66% BDRATE (reduced from -2.92%) for
the lowres set.

It turns out that reducing the set of transforms for intra from
12 to 5 makes very little difference in coding performance (~0.04%).
Most of the performance drop comes from the reduction in the transform
set for inter. Currently there is a provision to control that with
a macro.

Change-Id: I7de05527bf72f96acc1e0ab8a74a849da0a141e5
2016-04-11 13:04:41 -07:00
Alex Converse
5b3d3b1909 Merge "Remove obsolete segment skip checks from tokenization." 2016-04-11 18:58:32 +00:00
Sarah Parker
33ccd0f85e Merge "Fix prune one and two to make compatible with new transforms" into nextgenv2 2016-04-11 17:12:28 +00:00
Yi Luo
fd367c243e Merge "Some cosmetic improvements since HBD variance 4x4 optimization" into nextgenv2 2016-04-11 16:04:02 +00:00
Yi Luo
4c792f2814 Merge "Add unit tests for HBD variance 4x4 SSE4.1 optimization" into nextgenv2 2016-04-11 16:03:38 +00:00
Paul Wilkins
1c187c4be0 Adjustment to prediction decay.
Adjustment to stop excessive prediction decay triggered by blocks
or frames with extremely low spatial complexity which rendered the
comparison of intra and inter coded errors meaningless.

This was causing much shorter than expected groups on some 4k
test content.

Change-Id: I3f2c64200ef6dcef4721fc9f2ec09e480056ffc2
2016-04-11 13:25:33 +01:00
Paul Wilkins
f659c7e99e Merge "Adjust motion component of prediction decay." 2016-04-11 10:27:30 +00:00
Paul Wilkins
c5a89b46b9 Merge "Trap very short arf group just before a kf." 2016-04-11 10:27:12 +00:00
Scott LaVarnway
ad47d1d194 Merge "VP9: Combine TileData with TileWorkerData" 2016-04-10 22:17:45 +00:00
Debargha Mukherjee
9930a00ed7 Merge "Refactor PC_TREE root handling." into nextgenv2 2016-04-09 13:33:53 +00:00
Debargha Mukherjee
38b26b0dc3 Merge "Make subpel masked motion work with upsampled refs" into nextgenv2 2016-04-09 13:30:09 +00:00
Debargha Mukherjee
c47c460f69 Step towards making the 2-pass cq mode perceptual
Uses a metric on fraction of smooth blocks derived from first pass
stats in a frame to adjust down the cq_level modestly in the cq mode.
The current implementation does not add much complexity, and is
fairly light in the adaptation.

Change-Id: Ic484e810d5bd51b7bb6b8945f378c7c3d9d27053
2016-04-09 06:24:18 -07:00