1015 Commits

Author SHA1 Message Date
Jingning Han
2ec5ed258a Refactor tokenization coding tree
Expand the tokenization tree writing to support per transform block
type coding in next CLs.

Change-Id: I3560e658f89cc500eb49603f95dd2b4e99045f5b
2015-11-24 16:01:51 -08:00
Debargha Mukherjee
6ef5d8c4ed Merge "Reduce transform options for ext-tx experiment" into nextgenv2 2015-11-24 21:30:10 +00:00
Zoe Liu
9c62f9282f Merge "Added 3 more reference frames for inter prediction." into nextgenv2 2015-11-24 19:47:03 +00:00
Debargha Mukherjee
56ab215dad Reduce transform options for ext-tx experiment
Reduces the transform optons for INTRA as well as INTER when
transform size is 16x16 to not use any of the DSTs.
Thus, a total of 10 options are used for 16x16, while 4x4
and 8x8 still uses 17 options.

derflr/hevchd actually improves a little, while hevcmr drops
a little.

About 10% speed improvement.

Change-Id: I920a182231e052cdd622f8bb67085c16c572cb1e
2015-11-23 12:58:48 -08:00
Yaowu Xu
c1629ca53b Merge branch 'master' into nextgenv2 2015-11-21 05:00:05 -08:00
Zoe Liu
3ec1601e37 Added 3 more reference frames for inter prediction.
Under the experiment of EXT_REFS: LAST2_FRAME, LAST3_FRAME, and
LAST4_FRAME.

Coding efficiency: derflr +1.601%; hevchr +1.895%
Speed: Encoder slowed down by ~75%

Change-Id: Ifeee5f049c2c1f7cb29bc897622ef88897082ecf
2015-11-20 17:00:24 -08:00
Alex Converse
b1fcd1751e Fix unsigned overflow in rd_variance_adjustment.
Found with clang -fsanitize=integer

Change-Id: I2538e7483cb2d5f06bceecbd3326bdd88bfecfa1
2015-11-19 15:00:59 -08:00
hui su
d894d34d04 Turn off tx type selection for intra blocks by default
Coding gain on derflr drops to +1.83%.

Change-Id: If68c429f09422a70513d9f1e8e36e10c928e034a
2015-11-18 23:16:25 -08:00
Hui Su
4d3cf45992 Merge "Merge MISC_FIXES" into nextgenv2 2015-11-18 01:08:21 +00:00
hui su
66f2f65ef7 Merge MISC_FIXES
Remove MISC_FIXES flags except for the changes on MV precision, which
has a 0.1% performance drop.

On derflr, the impact is -0.012%.

Change-Id: I0a74e5a212dd0cb827192a318c92a714c9681e45
2015-11-17 15:06:08 -08:00
hui su
af084fbec1 Fix some unused variable warnings
Change-Id: Ia7680ddf00dd50dd66bbb5753bae30b937988800
2015-11-17 10:40:25 -08:00
Jingning Han
5f9e089b1d Merge "Limit the reset range of inter_tx_size array" into nextgenv2 2015-11-16 16:58:08 +00:00
Jingning Han
4ae193eec7 Merge "Alternate reference frame" into nextgenv2 2015-11-16 16:04:14 +00:00
Jingning Han
0f34e35d26 Limit the reset range of inter_tx_size array
Reset the effective range of inter_tx_size, instead of the entire
array in the rate-distortion optimization loop.

Change-Id: Id453fbd6dddfe69f4e451ba8518c083326d5dd53
2015-11-15 20:56:04 -08:00
Hui Su
83388fb0af Merge "refactor ext-intra" into nextgenv2 2015-11-13 21:19:27 +00:00
hui su
4aa50c17df refactor ext-intra
Coding gain remains about the same, while overall speed is
substantially increased.

Change-Id: I2989bebcfd21092cd6a02653d4df4a3bf6780874
2015-11-13 12:12:09 -08:00
Jingning Han
140182b96c Alternate reference frame
This commit re-designs the alternate reference frame generation
process. It employs non-local mean approach to produce more stable
pixel estimation for alternate reference frame. It improves the
compression performance gains:
derf   0.5%
hevcmr 0.8%
stdhd  1.3%
hevchr 1.0%

The encoding time at speed 0 is not affected.

Change-Id: Iaa757f0da189ce93812d69617a81bf630d449848
2015-11-12 11:16:59 -08:00
Jingning Han
35b3bd3e3b Fix an encoding failure case when speed features are on
This commit fixes an encoding failure case triggered when early
termination feature is turned on for transform block size search.
It resolves the corresponding enc/dec mismatch issue.

Change-Id: I2c5b7d8b1efe25fe3810e6ed307f4b1865dede49
2015-11-10 16:04:00 -08:00
Yaowu Xu
b49ac0b160 Merge branch 'master' into nextgenv2
Change-Id: I8811bfd8fc132b9f515707e795bb6308e4bf263b
2015-11-09 09:52:18 -08:00
hui su
6ab6ac450b Use accurate bit cost for uv_mode in UV intra mode RD selection
On derflr, +0.1% for VP10; however, -0.03% on VP9.

Change-Id: I09c724232ede74254043d61d3cadc506256af0af
2015-11-06 14:45:43 -08:00
Debargha Mukherjee
85514c40ae New interpolation experiment
Adds a new interpolation experiment.

Improves entropy coding to send the filter type only if
the motion vectors have subpel components.
Adds one new 8-tap smooth filter, and tweaks the others.

derflr: +0.695%
hevcmr: +0.305%

About 5% encode slowdown. No visible impact for decoding.

Also makes the interpolation framework flexible to support both
strictly interpolating filters as well as non-interpolating
filters that filter integer offsets. This is mainly for
further experimentation and if not found useful the code will
be removed.

Change-Id: I8db9cde56ca916be771fe54a130d608bf10786e6
2015-11-06 09:51:34 -08:00
Hui Su
9b3ad185dc Merge "ext-intra experiment" into nextgenv2 2015-11-06 17:40:49 +00:00
Debargha Mukherjee
46d2cc5714 Merge "Eliminate copying for FLIPADST in fwd transforms." into nextgenv2 2015-11-06 08:37:25 +00:00
Debargha Mukherjee
12fac1c281 Merge "Fix transform tables in C implementations." into nextgenv2 2015-11-04 21:11:38 +00:00
Jingning Han
de00c163c7 Merge "Simplify txfm rate-distortion optimization" into nextgenv2 2015-11-04 19:31:03 +00:00
Jingning Han
493d02347c Simplify txfm rate-distortion optimization
This commit refactors the rate-distortion optimization scheme for
transform block coding. When both ext-tx and var-tx experiments
are turned on, the encoding time for bus_cif at 1000 kbps goes down
from 706377 ms to 666503 ms (5.6% speed-up). The coding statics
remain unchanged.

Change-Id: I20835db573725580aad79c16220f799ce01f2093
2015-11-04 10:25:48 -08:00
Yaowu Xu
4aafd01861 Merge branch 'master' into nextgenv2 2015-11-04 05:00:05 -08:00
hui su
be3559ba07 ext-intra experiment
Currently there are two parts in this experiment: extra directional intra
prediction modes and the filter intra modes migrated from the nextgen branch.

Several macros are defined in "blockd.h" to provide controls of the experiment
settings. Setting "DR_ONLY" as 1 (default is 0) means we only use directional
modes, and skip the filter-intra modes; "EXT_INTRA_ANGLES" (default is 128)
defines the number of different angles we want to support; setting
"ANGLE_FAST_SEARCH" as 1 (default is 1) means we use fast sub-optimal search
for the best prediction angle, instead of exhaustive search. The fast search
is about 6 times faster than the exhaustive search, while preserving about
60% of the coding gains.

With extra directional prediction modes (fast search), we observe the following
code gains (number in parentheses is for all-key-frame setting):
derflr +0.42%  (+1.79%)
hevclr +0.78%  (+2.19%)
hevcmr +1.20%  (+3.49%)
stdhd  +0.56%
Speed-wise, about 110% slower for key frames, and 30% slower overall.

The gains of filter intra modes mostly add up with the gains of directional
modes. The overall coding gain of this experiment:
derflr +0.94%
hevclr +1.46%
hevcmr +1.94%
stdhd  +1.58%

Change-Id: Ida9ad00cdb33aff422d06eb42b4f4e5f25df8a2a
2015-11-03 18:46:02 -08:00
Alex Converse
255bcf8697 Merge "misc fixes: Remove a wasted value." 2015-11-03 17:52:34 +00:00
Geza Lore
01bb4a318d Eliminate copying for FLIPADST in fwd transforms.
This patch eliminates the copying of data when using FLIPADST forward
transforms, by incorporating the necessary data flipping into the
load_buffer_* functions of the SSE2 optimized forward transforms. The
load_buffer_* functions are normally inlined, so the overhead of copying
the data is removed and the overhead of flipping is minimized. Left to
right flipping is still not free, as the columns need to be shuffled in
registers.

To preserve identity between the C and SSE2 implementations, the
appropriate C implementations now also do the data flipping as part of
the transform, rather than relying on the caller for flipping the input.

Overall speedup is about 1.5-2% in encode on my tests. Note that these
are only the forward transforms. Inverse transforms to come in a later
patch.

There are also a few code hygiene changes:
- Fixed some indents of switch statements.
- DCT_DCT transform now always use vp10_fht* functions, which dispatch
  to vpx_fdct* for DCT_DCT (some of them used to call vpx_fdct*
  directly, some of them used to call vp10_fht*).

Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
2015-11-03 17:10:55 +00:00
Geza Lore
2b39bcec29 Fix transform tables in C implementations.
These tables were out of sync with the indexing enum since the
refactoring in commit 4f16f119 (change 303389), due to the removal
of the ext_tx_to_txtype lookup table. This patch just puts them
back in order.

Change-Id: Ieb7d57654f61b99b511d54c9ba09abbd5e8d0d14
2015-11-03 17:10:51 +00:00
Jingning Han
696ee004a5 Re-work rate-distortion optimization scheme for transform coding
This commit re-works the rate-distortion optimization scheme for
transform coding. It improves the overall compression performance.
For derf set, the ext-tx experiment provides 2.27% coding gains,
and the new scheme that integrates multiple transform type selection
and recursive transform block partitioning provides a total of 3.24%
coding gains.

Change-Id: Ia1887c4c44b73dfb915d091d96660a99f09d5cc3
2015-11-03 09:03:53 -08:00
Jingning Han
4b594d3d00 Incorporate flexible tx type and tx partition in RD scheme
This commit hooks up the rate-distortion optimization system to
fully exploit recursive transform block partition and multiple
transform type. The compression performance of the two experiments
largely adds up. For derf set, ext-tx provides additional 2.1%
coding gains on top of the gains due to recursive transform block
partition (0.69%).

Change-Id: I1091fb9545f74e489a6a2489dc3c12f5abd05043
2015-11-02 17:40:05 -08:00
Jingning Han
4b0ef55f10 Fix block size computation in coeff token packing
Correctly compute the block size in bit-stream coefficient token
packing. This fixes an enc/dec mismatch at very high bit-rates.

Change-Id: I37bf084731dc660df0c695cad406ddcd0f9eb904
2015-11-02 14:55:55 -08:00
Jingning Han
94266f4f34 Make loop filter support recursive transform block partitioning
This commit allows the loop filter to account for the recursive
transform block partition when selecting the filter and mask.

Change-Id: I62b6c2dcc0497cbe1f264b03c46163f55d2c9752
2015-10-30 15:42:25 -07:00
Jingning Han
6727943ceb Refactor loop filter mask
This commit refactors the loop filter selection process to support
variable transform block sizes based filter mask. It disables the
multi-thread loop filter implementation to simplify the experiments.
The speed impact on speed 0 encoding is negligible.

Change-Id: Ia470b6da9ad833fe6eb72d2cbeda9296b21910ec
2015-10-30 15:25:16 -07:00
Jingning Han
47c7fd984e Fix a switch condition in select_tx_block
Change-Id: I3d90a0286c5ef559b91ad298db97e8990becf85f
2015-10-30 13:01:52 -07:00
Jingning Han
b86b76bb4a Merge "Support per transform block skip coding" into nextgenv2 2015-10-30 16:58:03 +00:00
Jingning Han
bfeac5e19c Support per transform block skip coding
Allow the encoder to drop individual transform block coding.

Change-Id: I2c2b2985254cb92baf891f03daa33f067279373b
2015-10-30 08:55:17 -07:00
Yaowu Xu
cca1b39586 Merge branch 'master' into nextgenv2 2015-10-30 05:00:05 -07:00
Jingning Han
366bf3c2b6 Merge "Reset txfm context condition for skip coded blocks" into nextgenv2 2015-10-30 02:15:34 +00:00
Jingning Han
981f09a1f1 Reset txfm context condition for skip coded blocks
If a block has all coefficients quantized to zero, the codec will
assume that it uses largest transform block size.

Change-Id: Icd4e8e7cdc4b6af6974f87169e50b040ebfe9020
2015-10-29 18:02:37 -07:00
Jingning Han
88b9e90a56 Turn off fixed tx size in frame header
Temporarily turn off the fixed transform size at frame level.

Change-Id: I94a6a3b18893909d33fb7fa91e73ee3568b537b2
2015-10-29 14:30:56 -07:00
Alex Converse
6f229b3e62 Merge "Shrink probability remap tables." 2015-10-29 19:58:24 +00:00
Jingning Han
3edad6e887 Enable entropy coding of recursive transform block partition
This commit enables the entropy coding of the recursive transform
block partition syntax.

Change-Id: I0c2509fb7b9822d12a721f9ebf9327fac83c777e
2015-10-29 11:06:46 -07:00
Debargha Mukherjee
8a4292441f Refactoring tx-types to add more flexibility
Allows inter and intra tx_types to have different sets of
transforms for different tx_size/sb_type combinations.

Change-Id: Ic0ac1daef7a9fb15c4210271e4d04cd36e5cec8e
2015-10-28 23:31:32 -07:00
Jingning Han
71c156070c Use precise distortion metric
Rework the rate distortion optimization pipeline. Use precise
distortion metric that accounts for the forward and inverse
transform rounding effect.

Change-Id: Ibe19ce9791ec3547739294cc3012dd9e11f4ea49
2015-10-28 11:47:14 -07:00
Jingning Han
4bfed0b32e Account for variable txfm sizes in coeff token packing
This commit makes the coefficient token packtization process account
for variable transform block sizes supported in a single processing
block. It fixes an enc/dec mismatch issue when var-tx, ext-tx, and
misc-fixes experiments are all turned on.

Change-Id: I2e8946e6f72de567603a568debbadad11196430c
2015-10-28 11:45:31 -07:00
Yaowu Xu
eb7b5f660d Merge branch 'master' into nextgenv2
Change-Id: I63dc39d1ec9ad2e2454da6f5956dcd4367b87190
2015-10-28 08:14:16 -07:00
Alex Converse
0f059d6d65 misc fixes: Remove a wasted value.
Remove delta index 254 from probability remapping and subexp coding.
Saves 1-bit when the delta index is 129.

Change-Id: I88aba565fc766b1769165be458d2efd3ce45817e
2015-10-27 12:10:25 -07:00