454 Commits

Author SHA1 Message Date
Angie Chiang
cf3ef18fc4 Merge "Remove double operation from tx_size selection" into nextgenv2 2016-04-18 18:11:36 +00:00
Angie Chiang
6de4a77df3 Remove double operation from tx_size selection
This CL fix the bug
rdopt.c:1687: choose_tx_size_from_rd: Assertion
`mbmi->tx_type == DCT_DCT' failed

It is caused by
1) mms register access before double operation
2) different compiler behaviors
code:
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 0

code:
  --target=x86-linux-gcc
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 1

I remove the double operation and test it with EXT_TX experiment.
The psnr change is around 0.05%, which is considered as noise level.

Change-Id: If8935c70c8603617fcfa8571accd30ccdda786a0
2016-04-18 11:00:13 -07:00
Jingning Han
c8312daad1 Refactor rd_variance_adjustment function
Compute the reconstruction variance in the prediction mode search.

Change-Id: Id9c7635a9c9f5383e61c0e427e95234211834301
2016-04-18 09:37:34 -07:00
Yue Chen
16a99e967c Merge "Optimization for EXT_INTER + OBMC combination" into nextgenv2 2016-04-17 18:54:33 +00:00
Yue Chen
321794c4d5 Optimization for EXT_INTER + OBMC combination
In the rd loop, check the perf of obmc, whose mv is copied from regular
inter predictor, when wedge interinter is better than regular inter
(previously it will force allow_obmc = 0). The condition of the early
termination before this step is relaxed to avoid skipping too many obmc
predictions. The rates of the overhead are properly calculated for these tools.

The logic of the bitstream syntax:
(a single ref) the interintra flag is sent first, only if it is 0, we
send the obmc flag;
(compound refs) the obmc flag is sent first, only if it is 0, we send
the wedge interinter flag

Coding gain
lowres: 0.428% (2.287%->2.715%)

Change-Id: I5f3a34640b398e313cbf84235c9fe2073eb2173f
2016-04-15 17:03:20 -07:00
Zoe Liu
9638ee1f4e Merge "Fix segfault with --cpu-used >= 3 and ext-refs." into nextgenv2 2016-04-15 16:41:15 +00:00
Geza Lore
77d197e635 Fix segfault with --cpu-used >= 3 and ext-refs.
With ext-ref enabled, it is possible that when trying to encode the
first true ALTREF frame after a keyframe, the previous ALTREF frame
(alias for the keyframe) is the same as one of the new LAST{2,3,4}
reference frames, and hence cpi->ref_frame_flags will have the ALTREF
bit clear, as computed by get_ref_frame_flags in encoder.c.

sf->alt_ref_search_fp forces the previous ALTREF frame to
be used as the only possible  reference when encoding a new ALTREF
frame, but due to cpi->ref_frame_flags, some buffers will not be
initialized (see rdopt.c:7689 yv12_mb), leading to a segfault.

get_ref_frame_flags in encoder.c has been changed to prefer to keep
the  LAST frame, then the ALTREF frame, then any of the LAST{2,3,4}
frames and then the GOLDEN frame in that order of preference in case
any of them are the same. This avoids the segfault and behaves the
same for the baseline.

Change-Id: I4da1991667614009da5d3061a6316c0d5dbc6c0c
2016-04-15 11:17:22 +01:00
Jingning Han
019683e963 Merge "Clean up motion vector precision check in the encoding process" into nextgenv2 2016-04-14 20:55:51 +00:00
Jingning Han
03a468f9ac Merge "Enable mode conversion in sub8x8 block" into nextgenv2 2016-04-14 19:01:15 +00:00
Jingning Han
6af8f63d96 Clean up motion vector precision check in the encoding process
Remove unnecessary motion vector precision check in the encoding
process.

Change-Id: Ica32933c7d138f499f36b1dedec14c894b27d85a
2016-04-14 11:37:19 -07:00
Jingning Han
cd39224cff Merge "Speed up dynamic motion vector referencing system" into nextgenv2 2016-04-14 16:16:43 +00:00
Jingning Han
885a81f468 Merge "Fix a few mis-use cases of MAX_MV_REF_CANDIDATES" into nextgenv2 2016-04-13 23:44:25 +00:00
Angie Chiang
716f0ea3cf Merge changes I92819356,I50b5a313,I807e60c6,I8a8df9fd into nextgenv2
* changes:
  Branch dct to new implementation for bd12
  Change dct32x32's range
  Fit dct's stage range into 32-bit when bitdepth is 12
  Pass tx_type into get_tx_scale
2016-04-13 23:24:41 +00:00
Hui Su
85a3f5b740 Merge "Speed-up in tx_size search" into nextgenv2 2016-04-13 23:02:21 +00:00
Jingning Han
9a1a8f1d8e Speed up dynamic motion vector referencing system
Skip transform type search in modes with ref_mv_idx > 0. This
brings down the additional encoding time cost due to the DMR system
from 32% to 17%, at minimal coding performance regression.

Change-Id: Ie82e1d2831a313c6f1e47f7da221b51345023eb3
2016-04-13 15:51:36 -07:00
Jingning Han
f33a0a8215 Fix a few mis-use cases of MAX_MV_REF_CANDIDATES
Fix several use cases where MAX_MV_REF_CANDIDATES is mixed up with
is_compound flag to avoid potential coding interruption.

Change-Id: Ifdee1ef8a81ef6d1c155315c6c6a3074aa7a8b5e
2016-04-13 15:16:55 -07:00
Jingning Han
e07dbaa2f5 Enable mode conversion in sub8x8 block
Convert the newmv mode into reference motion vector modes.

Change-Id: I51bd2543dafb70345c1340fba700b44f67f20853
2016-04-13 14:35:54 -07:00
hui su
6a7ddd84bb Speed-up in tx_size search
Do not consider 4x4 transform when the maximum possible transform
size is 32x32.

Overall encoding speed is increased by more than 10%. Compression
performance is neutral on lowres, midres, and hdres.

Change-Id: Ifac61c3c9f4b0ab392bffd4d1faa373d91014cf1
2016-04-13 10:19:00 -07:00
Angie Chiang
027d12b7d6 Merge changes I359aa49c,Ic8ca5afb into nextgenv2
* changes:
  Generalize txfm scale in highbd quantizer
  Parameterize transform scale for quantizer
2016-04-12 18:02:05 +00:00
Sarah Parker
33ccd0f85e Merge "Fix prune one and two to make compatible with new transforms" into nextgenv2 2016-04-11 17:12:28 +00:00
Debargha Mukherjee
38b26b0dc3 Merge "Make subpel masked motion work with upsampled refs" into nextgenv2 2016-04-09 13:30:09 +00:00
Sarah Parker
19e3c6415c Fix prune one and two to make compatible with new transforms
Update svm parameters with training data using new transforms
and remove DST from pruning functions.

Change-Id: I7bd1c4744455d571c1ecfb4cea14c25ac291f002
2016-04-08 11:47:48 -07:00
Debargha Mukherjee
c485b10416 Make subpel masked motion work with upsampled refs
Change-Id: Id483354e73e983793370b55a1a6a1f2dcd137dc9
2016-04-08 08:54:58 -07:00
Alex Converse
bb0e692151 Convert palette from double to float.
About 20% less time spent coding in vp10_k_means().

Change-Id: I5cf7605cde869a269776197bace70de353b07d83
2016-04-07 15:17:30 -07:00
Geza Lore
454989ff32 Make superblock size variable at the frame level.
The uncompressed frame header contains a bit to signal whether the
frame is encoded using 64x64 or 128x128 superblocks. This can vary
between any 2 frames.

vpxenc gained the --sb-size={64,128,dynamic} option, which allows the
configuration of the superblock size used (default is dynamic). 64/128
will force the encoder to always use the specified superblock size.
Dynamic would enable the encoder to choose the sb size for each
frame, but this is not implemented yet (dynamic does the same as 128
for now).

Constraints on tile sizes depend on the superblock size, the following
is a summary of the current bitstream syntax and semantics:

If both --enable-ext-tile is OFF and --enable-ext-partition is OFF:
     The tile coding in this case is the same as VP9. In particular,
     tiles have a minimum width of 256 pixels and a maximum width of
     4096 pixels. The tile width must be multiples of 64 pixels
     (except for the rightmost tile column). There can be a maximum
     of 64 tile columns and 4 tile rows.

If --enable-ext-tile is OFF and --enable-ext-partition is ON:
     Same constraints as above, except that tile width must be
     multiples of 128 pixels (except for the rightmost tile column).

There is no change in the bitstream syntax used for coding the tile
configuration if --enable-ext-tile is OFF.

If --enable-ext-tile is ON and --enable-ext-partition is ON:
     This is the new large scale tile coding configuration. The
     minimum/maximum tile width and height are 64/4096 pixels. Tile
     width and height must be multiples of 64 pixels. The uncompressed
     header contains two 6 bit fields that hold the tile width/heigh
     in units of 64 pixels. The maximum number of tile rows/columns
     is only limited by the maximum frame size of 65536x65536 pixels
     that can be coded in the bitstream. This yields a maximum of
     1024x1024 tile rows and columns (of 64x64 tiles in a 65536x65536
     frame).

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     Same applies as above, except that in the bitstream the 2 fields
     containing the tile width/height are in units of the superblock
     size, and the superblock size itself is also coded in the bitstream.
     If the uncompressed header signals the use of 64x64 superblocks,
     then the tile width/height fields are 6 bits wide and are in units
     of 64 pixels. If the uncompressed header signals the use of 128x128
     superblocks, then the tile width/height fields are 5 bits wide and
     are in units of 128 pixels.

The above is a summary of the bitstream. The user interface to vpxenc
(and the equivalent encoder API) behaves a follows:

If --enable-ext-tile is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the base 2 logarithm of the desired number of tile columns
     and tile rows. The actual number of tile rows and tile columns,
     and the particular tile width and tile height are computed by the
     codec ensuring all of the above constraints are respected.

If --enable-ext-tile is ON, but --enable-ext-partition is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the WIDTH and HEIGHT of the tiles in unit of 64 pixels.
     The valid values are in the range [1, 64] (which corresponds to
     [64, 4096] pixels in increments of 64.

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     If --sb-size=64 (default):
         The user interface is the same as in the previous point.
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 64 pixels, in the range [1, 64] (which corresponds
         to [64, 4096] pixels in increments of 64).
     If --sb-size=128 or --sb-size=dynamic:
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 128 pixels in the range [1, 32] (which corresponds
         to [128, 4096] pixels in increments of 128).

Change-Id: Idc9beee1ad12ff1634e83671985d14c680f9179a
2016-04-07 10:34:25 +01:00
Julia Robson
4300e50cce Fixing assertion in *Large unit tests
In certain cases the code was subtracting the obmc cost
despite it not having been added previously.
For example with ref_mv, supertx, ext_inter, obmc & ext_refs
enabled the following test was failing but now passes:
"VP10/ArfFreqTestLarge.MinArfFreqTest/33"

Change-Id: I966853f34c18d5a1d4c7a56fa201c1b02973fc88
2016-04-06 11:22:10 +01:00
Debargha Mukherjee
0fc82ea1cf Refactoring and cosmetic changes to ext-inter expt
Change-Id: Icd457480744b7734b3c412c9fed43be738373334
2016-04-05 15:16:18 -07:00
Angie Chiang
75ae90f7a9 Pass tx_type into get_tx_scale
Change-Id: I8a8df9fdefa492f66cf2cd29b0b081ad69b5d85e
2016-04-01 12:53:10 -07:00
Debargha Mukherjee
2a6389bb8b Merge "Fix interpolation values and decouple interintra" into nextgenv2 2016-03-31 21:47:10 +00:00
Debargha Mukherjee
2be211e971 Fix interpolation values and decouple interintra
Decouples interintra modes and probability models from regular
intra modes, to enable creating/optimizing new interintra modes.
Also, fixes interpolation values for 128x128 interintra and obmc.

Change-Id: I5c2016db49b8f029164e5fe84c6274d4e02ff90e
2016-03-31 12:12:51 -07:00
Debargha Mukherjee
6d3fc82b7f Merge changes Id20526d0,Iee08d975 into nextgenv2
* changes:
  Refactor loopfilter level arrays to 2D.
  Rename MI_BLOCK_SIZE and MI_MASK macros.
2016-03-31 18:48:20 +00:00
Jingning Han
aae7e0f6a4 Merge "Refactor the sub8x8 block motion search control" into nextgenv2 2016-03-31 15:50:38 +00:00
Geza Lore
511da8cbe5 Rename MI_BLOCK_SIZE and MI_MASK macros.
Rename MI_BLOCK_SIZE.* -> MAX_MIB_SIZE.* (MIB is for MI Block).
Rename MI_MASK.* -> MAX_MIB_MASK.*

There are no functional changes.

This is in preparation for coding the superblock size at the frame
level, which will require some of these constants to become variables.
The new names better reflect future semantics, and hence make the code
clearer.

Change-Id: Iee08d97554cf4cc16a5dc166a3ffd1ab91529992
2016-03-31 09:57:41 +01:00
Hui Su
cce6688c31 Merge "Set block size upper bound for Palette mode" into nextgenv2 2016-03-31 00:23:11 +00:00
Angie Chiang
64413a6ca7 Parameterize transform scale for quantizer
This is to facilitate changing transform scale later

Change-Id: Ic8ca5afba57d2489ebd191ccc40c1b31605a0d8c
2016-03-30 15:25:26 -07:00
hui su
cbb8be769d Set block size upper bound for Palette mode
Avoid buffer overflow in case of such new experiments as
128 x 128 superblock size.

Change-Id: Ib775f3925a85fc87227c0ddd9b6a6110a12ef196
2016-03-30 14:39:44 -07:00
Debargha Mukherjee
8d3a4aa891 Some fixes/speed-ups on inter-intra part of ext-inter
Fixes an issue with rectangular inter-intra blocks.
Includes various other refactoring and cleanups to enable fast mixing
of inter and intra predictors.
Uses only the best single inter reference so far for the inter-intra
search.

About 30% speed-up with a 0.1% hit in performance.

This is part one of overhauling on the ext-inter experiment. To be
continued in subsequent patches.

Change-Id: Id10ee100c78c6e00009a3a4f930a4435ef403a95
2016-03-30 14:39:29 -07:00
Geza Lore
552d5cd715 Extend superblock size fo 128x128 pixels.
If --enable-ext-partition is used at build time, the superblock size
(sometimes also referred to as coding unit (CU) size) is extended to
128x128 pixels.

Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a
2016-03-30 18:23:06 +01:00
Jingning Han
b6238b413e Refactor the sub8x8 block motion search control
Change-Id: Ia340e66e0a61403070adf8e4f18f00eab143f8f7
2016-03-29 09:53:55 -07:00
hui su
4ab00912c4 Palette mode: record selected transform type
Change-Id: I4c3d3224571176ac924d79ddfaba56990fc4000e
2016-03-28 20:43:59 -07:00
Jingning Han
78ee83125b Merge "Fix a rdcost computation issue in sub8x8 block mode search" into nextgenv2 2016-03-29 00:51:01 +00:00
Jingning Han
7279a4748f Merge "Rename run_rd_check to run_mv_search" into nextgenv2 2016-03-28 23:10:48 +00:00
Jingning Han
b534987110 Merge "Rework the predicted motion vector for sub8x8 block" into nextgenv2 2016-03-28 23:10:35 +00:00
Jingning Han
d133524e7c Fix a rdcost computation issue in sub8x8 block mode search
Compute the rate-distortion cost for sub8x8 blocks with integer
motion vectors.

Change-Id: I7dc034fcc4bec3850f26d1f9ae0595c91df1137e
2016-03-28 23:09:53 +00:00
Jingning Han
59d45d603b Rename run_rd_check to run_mv_search
Improve the readability in the related rate-distortion optimization
search control function of sub8x8 blocks.

Change-Id: I7f7456bf40a98aa5146abfe0488cda745b84d899
2016-03-28 21:59:10 +00:00
Jingning Han
0586460938 Rework the predicted motion vector for sub8x8 block
This commit makes the sub8x8 block to use its nearest neighbor's
motion vector as predicted motion vector for NEWMV mode. It improves
the coding performance by 0.12%.

Change-Id: I99e56715b327573ce7e8a26e3515a4984dadfd98
2016-03-28 14:58:17 -07:00
Angie Chiang
4144a11552 Merge "Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10" into nextgenv2 2016-03-28 19:20:48 +00:00
Angie Chiang
33833aefdd Merge "Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10" into nextgenv2 2016-03-28 18:11:47 +00:00
Angie Chiang
46b234478f Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10
Change-Id: I996c48a90d7d71b52594a91a35cb8712c7fc212e
2016-03-28 11:08:40 -07:00
Yue Chen
e63792e5cf A major speed up for obmc experiment
Skip checking obmc when regular inter predictor is not so good (the
rd-cost for Y residual is greater than the total rd of the best mode
so far.)

Performance change compared to full rd search:
  +0.006% lowres, -0.056% midres
Encoding time :
  1.14X baseline (was 1.42X)

Change-Id: I11350f955a20e1a2331be458537a915e09fbedf3
2016-03-25 14:06:52 -07:00