678 Commits

Author SHA1 Message Date
Debargha Mukherjee
56ab215dad Reduce transform options for ext-tx experiment
Reduces the transform options for INTRA as well as INTER when
transform size is 16x16 to not use any of the DSTs.
Thus, a total of 10 options are used for 16x16, while 4x4
and 8x8 still use 17 options.

derflr/hevchd actually improve a little, while hevcmr drops
a little.

About 10% speed improvement.

Change-Id: I920a182231e052cdd622f8bb67085c16c572cb1e
2015-11-23 12:58:48 -08:00
Jingning Han
c335bfeb56 Move n8_w and n8_h out of experiment flag
These primitive variables are commonly required by many other
experiments as well. The use of n4_w and n4_h was originally
introduced in the vp9 decoder implementation.

Change-Id: I93d701d891e3860f31150031e3b9a2b29a3993d2
2015-11-23 09:46:11 -08:00
Zoe Liu
3ec1601e37 Added 3 more reference frames for inter prediction.
Under the experiment of EXT_REFS: LAST2_FRAME, LAST3_FRAME, and
LAST4_FRAME.

Coding efficiency: derflr +1.601%; hevchr +1.895%
Speed: Encoder slowed down by ~75%

Change-Id: Ifeee5f049c2c1f7cb29bc897622ef88897082ecf
2015-11-20 17:00:24 -08:00
Angie Chiang
6e9ed38d1f Merge "Add vp10_inv_txfm2d" into nextgenv2 2015-11-20 18:22:49 +00:00
hui su
d894d34d04 Turn off tx type selection for intra blocks by default
Coding gain on derflr drops to +1.83%.

Change-Id: If68c429f09422a70513d9f1e8e36e10c928e034a
2015-11-18 23:16:25 -08:00
Angie Chiang
4fd0ba8f6f Add vp10_inv_txfm2d
Change-Id: Ib63062a52c688e65bae5eb0052ce69d73d96c9c5
2015-11-17 19:53:28 -08:00
hui su
66f2f65ef7 Merge MISC_FIXES
Remove MISC_FIXES flags except for the changes on MV precision, which
cause a 0.1% performance drop.

On derflr, the impact is -0.012%.

Change-Id: I0a74e5a212dd0cb827192a318c92a714c9681e45
2015-11-17 15:06:08 -08:00
Hui Su
83388fb0af Merge "refactor ext-intra" into nextgenv2 2015-11-13 21:19:27 +00:00
hui su
4aa50c17df refactor ext-intra
Coding gain remains about the same, while overall speed is
substantially increased.

Change-Id: I2989bebcfd21092cd6a02653d4df4a3bf6780874
2015-11-13 12:12:09 -08:00
Angie Chiang
35ec6d2b88 Merge changes Ifafbd497,I042bba27,Id6fd8558,Id5b79519 into nextgenv2
* changes:
  Add adst_dct config to vp10_inv_txfm2d_cfg
  Add adst_adst config to vp10_inv_txfm2d_cfg
  Add dct_adst config to vp10_inv_txfm2d_cfg
  Add dct_dct config to vp10_inv_txfm2d_cfg
2015-11-12 23:38:44 +00:00
Angie Chiang
7104079efb Add adst_dct config to vp10_inv_txfm2d_cfg
Change-Id: Ifafbd4974be44685ab2550ed159dbf0411b6f031
2015-11-11 18:02:42 -08:00
Angie Chiang
164ba2a2d8 Add adst_adst config to vp10_inv_txfm2d_cfg
Change-Id: I042bba27540ab2a3d8a00871980295e98f616480
2015-11-11 17:59:22 -08:00
Angie Chiang
db88473ea9 Add dct_adst config to vp10_inv_txfm2d_cfg
Change-Id: Id6fd8558452f64c4ac30d7cb656b659f0587b5d6
2015-11-11 17:55:35 -08:00
Angie Chiang
09c2809a50 Add dct_dct config to vp10_inv_txfm2d_cfg
Change-Id: Id5b795198552443a700413284a1015296e267dcf
2015-11-11 17:51:55 -08:00
Yaowu Xu
a08bfb778a Replace inline with INLINE
Change-Id: I37b5ed9fef0e97feabd856bd4c1b4c7869991a34
2015-11-10 16:09:09 -08:00
Yaowu Xu
72a6cb62ee Fix msvc compling
Change-Id: I5abd6d2fd198b3789732e81b23a5bac009af5290
2015-11-10 16:08:09 -08:00
Debargha Mukherjee
bc54f9dc00 Merge "Resolve conficts caused by master branch merging" into nextgenv2 2015-11-06 23:35:07 +00:00
Angie Chiang
c7c69d88af Merge changes I7ca0cc34,I97189d6e,I4e2b51cf,I21158867,I8d73beee into nextgenv2
* changes:
  Add adst_dct config to vp10_fwd_txfm2d_cfg
  Add adst_adst config to vp10_fwd_txfm2d_cfg
  Add dct_adst config to vp10_fwd_txfm2d_cfg
  Add dct_dct config to vp10_fwd_txfm2d_cfg
  Add vp10_fwd_txfm2d_8x8/16x16/32x32
2015-11-06 23:34:56 +00:00
Angie Chiang
e26c712ab2 Merge "Add vp10_fwd_txfm2d_4x4" into nextgenv2 2015-11-06 23:34:35 +00:00
hui su
707cd03658 Resolve conficts caused by master branch merging
Change-Id: I167e241b789331572581fcb0567ebe535b4b9345
2015-11-06 14:35:08 -08:00
Angie Chiang
45222e5b20 Add adst_dct config to vp10_fwd_txfm2d_cfg
Change-Id: I7ca0cc341ae36ac9f7aa24789f8872161b832b7b
2015-11-06 10:47:46 -08:00
Angie Chiang
786f1af891 Add adst_adst config to vp10_fwd_txfm2d_cfg
Change-Id: I97189d6e917929c756a3f89fe0ab66077a0a5436
2015-11-06 10:47:46 -08:00
Angie Chiang
634d0bdc7c Add dct_adst config to vp10_fwd_txfm2d_cfg
Change-Id: I4e2b51cf5b0dedb9ea1106747edb76835804fffc
2015-11-06 10:47:46 -08:00
Angie Chiang
51c0c35c6a Add dct_dct config to vp10_fwd_txfm2d_cfg
Change-Id: I21158867fb2b762d3632d0664ebe70c68d0953e1
2015-11-06 10:47:46 -08:00
Angie Chiang
f08141c734 Add vp10_fwd_txfm2d_8x8/16x16/32x32
Change-Id: I8d73beee5a619d26f3f8640a6679150d874522c4
2015-11-06 10:47:45 -08:00
Angie Chiang
ff7fe99342 Add vp10_fwd_txfm2d_4x4
Change-Id: I9bca3b1c76b64575366d71ab65ffef7264ce0c9b
2015-11-06 10:39:27 -08:00
Debargha Mukherjee
85514c40ae New interpolation experiment
Adds a new interpolation experiment.

Improves entropy coding to send the filter type only if
the motion vectors have subpel components.
Adds one new 8-tap smooth filter, and tweaks the others.
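
The conditional signalling described above can be sketched as follows. This is only an illustration, not code from the patch: the type and function names are hypothetical, and the sketch assumes motion vectors stored in 1/8-pel units, so that the low three bits of each component hold the fractional part.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: MVs are in 1/8-pel units, so 3 fractional bits. */
#define SUBPEL_BITS_SKETCH 3
#define SUBPEL_MASK_SKETCH ((1 << SUBPEL_BITS_SKETCH) - 1)

/* Hypothetical motion vector type for this sketch. */
typedef struct {
  int16_t row;
  int16_t col;
} mv_sketch;

/* Returns 1 if either component has a fractional (subpel) part. */
static int mv_has_subpel(mv_sketch mv) {
  return ((mv.row | mv.col) & SUBPEL_MASK_SKETCH) != 0;
}

/* Hypothetical decision helper: the filter type only needs to be
 * written to the bitstream when at least one of the block's motion
 * vectors points to a subpel position. Full-pel prediction does not
 * interpolate, so the filter choice carries no information. */
static int should_signal_interp_filter(const mv_sketch *mvs, int count) {
  assert(mvs != 0 && count > 0);
  for (int i = 0; i < count; ++i) {
    if (mv_has_subpel(mvs[i])) return 1;
  }
  return 0;
}
```

A decoder would mirror the same test and fall back to a default filter when nothing was signalled.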

derflr: +0.695%
hevcmr: +0.305%

About 5% encode slowdown. No visible impact for decoding.

Also makes the interpolation framework flexible to support both
strictly interpolating filters as well as non-interpolating
filters that filter integer offsets. This is mainly for
further experimentation and if not found useful the code will
be removed.

Change-Id: I8db9cde56ca916be771fe54a130d608bf10786e6
2015-11-06 09:51:34 -08:00
Hui Su
9b3ad185dc Merge "ext-intra experiment" into nextgenv2 2015-11-06 17:40:49 +00:00
Debargha Mukherjee
70e514ce78 Merge "Flip the result of the inverse transform for FLIPADST." into nextgenv2 2015-11-06 09:20:46 +00:00
Angie Chiang
b0df5e0f9e Add iadst32
Change-Id: I3a53ee51146d0bd4b0fe4b27c286e8c921f9823b
2015-11-04 14:23:56 -08:00
Angie Chiang
35486a6b88 Add iadst16
Change-Id: I093881aacaf9a070f78cc4eea2e8a6ede8a71792
2015-11-04 14:23:56 -08:00
Angie Chiang
0ca0cc240b Add iadst8
Change-Id: Ia58e4735d7d7bfd2ac55259c32705118c6745c6d
2015-11-04 14:23:56 -08:00
Angie Chiang
ba69089e65 Add iadst4
Change-Id: Ie419b2b1e939a41c30ed609e1ba46f5f6609b2a5
2015-11-04 14:23:56 -08:00
Angie Chiang
7467833401 Add idct32
Change-Id: I75412bdc4bd0d9c90e8b56e02e0e467a2d9957f9
2015-11-04 14:23:56 -08:00
Angie Chiang
d3cee565ad Add idct16
Change-Id: I8e5ba3a3f9b64ccbf038e371525e897774729b06
2015-11-04 14:23:56 -08:00
Angie Chiang
bd9db2f55b Add idct8
Change-Id: I8092a6f229b196c5c8b7dcd2dff8aaf68253e422
2015-11-04 14:23:56 -08:00
Angie Chiang
7d2b7b6944 Add idct4
Change-Id: I1d1b6822452772cec95160491c7bc6d3bba1f5c2
2015-11-04 14:23:56 -08:00
Angie Chiang
a9253a2029 Add fadst32
Change-Id: I77299f0e39fc7cef91e7e420513dbd05194f320a
2015-11-04 14:23:56 -08:00
Angie Chiang
a7d26f4e80 Add fadst16
Change-Id: I5175e39b5df73646488f74b2a9e4a463ae79d91a
2015-11-04 14:23:56 -08:00
Debargha Mukherjee
12fac1c281 Merge "Fix transform tables in C implementations." into nextgenv2 2015-11-04 21:11:38 +00:00
Angie Chiang
3813c2bc46 Merge "Add fadst8" into nextgenv2 2015-11-04 20:21:08 +00:00
Angie Chiang
498866b699 Merge "Add fadst4" into nextgenv2 2015-11-04 20:20:57 +00:00
Geza Lore
4f5108090a Flip the result of the inverse transform for FLIPADST.
When using FLIPADST, the vp10_inv_txfm_add functions used to flip
the destination array, add the result of the inverse transform to it,
and then flip the destination back. This has been replaced by
flipping the result of the inverse transform before adding it to the
destination. Up-Down flipping is done by negating the destination
stride, and starting from the bottom, so it should now be free.
Left-right flipping is done with the usual SSE2 instructions in the
optimized code.
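
The up-down flip via a negated stride can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the function name, the 8-bit pixel type, and the row-major residual layout are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a sum to the 8-bit pixel range. */
static uint8_t clip_pixel_sketch(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical helper: add a rows x cols residual (row-major, tightly
 * packed) to dest. When flipud is set, residual row 0 lands on the
 * bottom destination row: we start at the last row and walk upwards
 * with a negated stride, so no separate flipping pass is needed. */
static void add_residual_flipud(const int16_t *res, int rows, int cols,
                                uint8_t *dest, ptrdiff_t stride,
                                int flipud) {
  assert(rows > 0 && cols > 0);
  if (flipud) {
    dest += (rows - 1) * stride; /* start from the bottom row */
    stride = -stride;            /* and move upwards */
  }
  for (int r = 0; r < rows; ++r) {
    uint8_t *row = dest + r * stride;
    for (int c = 0; c < cols; ++c) {
      row[c] = clip_pixel_sketch(row[c] + res[r * cols + c]);
    }
  }
}
```

The inner loop is identical in both cases; only the starting pointer and the sign of the stride change, which is why the up-down flip costs nothing extra.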

The C functions match the SSE2 functions as expected, so the C functions
now do the flipping as well when required. Adding this cleanly required
some refactoring of the C functions, but there is no measurable
performance impact when ext-tx is not enabled.

Encode speedup with ext-tx enabled is about 3%.

Change-Id: I5b04e5d720f0b9f0d54fd8607a8764f2314c7234
2015-11-04 17:11:44 +00:00
Yaowu Xu
4aafd01861 Merge branch 'master' into nextgenv2 2015-11-04 05:00:05 -08:00
hui su
be3559ba07 ext-intra experiment
Currently there are two parts in this experiment: extra directional intra
prediction modes and the filter intra modes migrated from the nextgen branch.

Several macros are defined in "blockd.h" to provide controls of the experiment
settings. Setting "DR_ONLY" as 1 (default is 0) means we only use directional
modes, and skip the filter-intra modes; "EXT_INTRA_ANGLES" (default is 128)
defines the number of different angles we want to support; setting
"ANGLE_FAST_SEARCH" as 1 (default is 1) means we use fast sub-optimal search
for the best prediction angle, instead of exhaustive search. The fast search
is about 6 times faster than the exhaustive search, while preserving about
60% of the coding gains.
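
For reference, the default settings described above would look roughly like this. The macro names and default values are taken from this message; how they are actually written in "blockd.h" is an assumption, and the two helper functions are hypothetical, added only to show what each switch controls.

```c
#include <assert.h>

/* Defaults as stated in the commit message; the real definitions in
 * blockd.h may be written differently. */
#define DR_ONLY 0            /* 1: directional modes only, no filter-intra */
#define EXT_INTRA_ANGLES 128 /* number of supported prediction angles */
#define ANGLE_FAST_SEARCH 1  /* 1: fast sub-optimal angle search */

static_assert(EXT_INTRA_ANGLES > 0, "at least one angle is required");

/* Hypothetical helper: 1 when filter-intra modes are searched in
 * addition to the directional modes. */
static int filter_intra_modes_enabled(void) {
  return !DR_ONLY;
}

/* Hypothetical helper: 1 when the angle search is exhaustive. */
static int angle_search_is_exhaustive(void) {
  return !ANGLE_FAST_SEARCH;
}
```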

With extra directional prediction modes (fast search), we observe the following
coding gains (number in parentheses is for all-key-frame setting):
derflr +0.42%  (+1.79%)
hevclr +0.78%  (+2.19%)
hevcmr +1.20%  (+3.49%)
stdhd  +0.56%
Speed-wise, about 110% slower for key frames, and 30% slower overall.

The gains of filter intra modes mostly add up with the gains of directional
modes. The overall coding gain of this experiment:
derflr +0.94%
hevclr +1.46%
hevcmr +1.94%
stdhd  +1.58%

Change-Id: Ida9ad00cdb33aff422d06eb42b4f4e5f25df8a2a
2015-11-03 18:46:02 -08:00
Hui Su
3cbe767972 Merge "Generate intra prediction reference values only when necessary" 2015-11-03 20:55:14 +00:00
Geza Lore
2b39bcec29 Fix transform tables in C implementations.
These tables were out of sync with the indexing enum since the
refactoring in commit 4f16f119 (change 303389), due to the removal
of the ext_tx_to_txtype lookup table. This patch just puts them
back in order.

Change-Id: Ieb7d57654f61b99b511d54c9ba09abbd5e8d0d14
2015-11-03 17:10:51 +00:00
Jingning Han
6d43a53a0c Merge "Incorporate flexible tx type and tx partition in RD scheme" into nextgenv2 2015-11-03 16:43:48 +00:00
Yaowu Xu
2c32861814 Merge branch 'master' into nextgenv2 2015-11-03 05:00:04 -08:00
Jingning Han
4b594d3d00 Incorporate flexible tx type and tx partition in RD scheme
This commit hooks up the rate-distortion optimization system to
fully exploit recursive transform block partition and multiple
transform types. The compression performance of the two experiments
largely adds up. For the derf set, ext-tx provides an additional 2.1%
coding gain on top of the gain due to recursive transform block
partition (0.69%).

Change-Id: I1091fb9545f74e489a6a2489dc3c12f5abd05043
2015-11-02 17:40:05 -08:00