Commit Graph

374 Commits

Author SHA1 Message Date
Geza Lore
d29ec48504 Initialize dummy variables.
Valgrind flags these up as needed by handle_inter_mode.
Initializing fixes some assertion failures in the unit tests with
only ref-mv enabled.

Change-Id: I4d56c356692745dbecd9f790cdbb8dbfbaf72d55
2016-04-27 13:35:12 +01:00
Yue Chen
02e941d371 Merge "Remove double counting for mv costs" into nextgenv2 2016-04-26 21:40:13 +00:00
Yue Chen
34177e673d Remove double counting for mv costs
The bug is introduced by commit 1a0352d, in which mv costs are
counted twice in joint_motion_search() in ext_inter experiment.

Change-Id: Ibace453df999d3c2e781d73f1f0912038fee2d4e
2016-04-26 13:01:52 -07:00
Hui Su
3f7a709676 Merge "ext-intra: get rid of some floating operations." into nextgenv2 2016-04-26 18:53:33 +00:00
Hui Su
1e93a3e64e Merge "Keep track of zcoeff_blk in tx size/type search" into nextgenv2 2016-04-26 16:41:50 +00:00
hui su
ad50c226e6 ext-intra: get rid of some floating operations.
No performance changes.

Change-Id: Idd4043090fec09e57520bc970ed2e39e6f7e1a5e
2016-04-25 14:44:42 -07:00
Geza Lore
23c4116ebb Clear X87 register state before using double.
MMX and X87 floating point instructions cannot be mixed freely on
the 32 bit x86 architecture.

This fixes a lot of unit tests in the 32bit build with
--enable-ext-intra.

BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1196

Change-Id: I0e1c3565f4b9cb4fc2d716e94d9c40e68b36fac8
2016-04-25 10:30:20 -07:00
James Zern
0aa6435c45 vp10/rdopt: quiet unused variable warning
when CONFIG_REF_MV and CONFIG_EXT_INTER are enabled

Change-Id: I17fa2b5fe0e1878333099cc5fa2b1ee36636b4d3
2016-04-23 16:59:45 +00:00
Jingning Han
cdf989adb7 Silence compiler above-boundary warnings
Change-Id: I6d806f92e8d38d5b0b01bc8e0fd97bd8839c84df
2016-04-22 15:49:34 -07:00
Jingning Han
3f6ec144e5 Fix an enc/dec mismatch issue in ext-inter experiment
This commit fixes an encoding decision process issue that could
trigger enc/dec mismatch in the ext-inter experiment.

Change-Id: I6f10d1fd2fd1aa04e51df04c39a65cf72ac66c42
2016-04-22 20:49:29 +00:00
Debargha Mukherjee
53968c3917 Merge "Fix uninitialized blk_skip for VAR TX." into nextgenv2 2016-04-21 19:56:17 +00:00
Jingning Han
1a0352d18e Merge "Handle zero motion vector residual" into nextgenv2 2016-04-19 21:20:08 +00:00
hui su
7dffb43267 Keep track of zcoeff_blk in tx size/type search
Prevent potential problems when per transform block
zero forcing is re-enabled (a To-Do).

Change-Id: I03b0ab2a86d88058441f2ca18994cfd2e6329898
2016-04-19 11:45:11 -07:00
hui su
e43c21112d Enable optimize_b for intra blocks
Coding gain:
lowres  0.05%
midres  0.10%
hdres   0.18%

Change-Id: I508b150c02588f911a8ddddfe73c770f0819fe10
2016-04-19 09:50:45 -07:00
Geza Lore
7aa95be980 Fix uninitialized blk_skip for VAR TX.
x->blk_skip used to be uninitialized (leftover from encoding the
previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
higher --cpu-used or --rt options). This resulted in degraded coding
performance when using cm->tx_mode != TX_MODE_SELECT.

This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.

Also fixed an edge effect where encode_block in encodemb.c used the
formal width of the block (without cropping at the right edge), to
look up blk_skip, while select_tx_block in rdopt.c used the cropped
width to set blk_skip.

Change-Id: I76d0f49ac5ab3ab54203573e0d7fcfcc1c6aa10d
2016-04-19 17:00:20 +01:00
Geza Lore
8d64b53dc8 Revert "Fix uninitialized blk_skip for VAR TX."
This reverts commit e7b89d8835.
2016-04-19 15:41:56 +01:00
Geza Lore
e7b89d8835 Fix uninitialized blk_skip for VAR TX.
x->blk_skip used to be uninitialzied (leftover from encoding the
previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
higher --cpu-used or --rt options). This resulted in degraded coding
performance when uning cm->tx_mode != TX_MODE_SELECT.

This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.

Change-Id: If39062927446798c626fc93694b4e6a4f35fa5da
2016-04-19 14:22:48 +01:00
Jingning Han
ec2ffda599 Handle zero motion vector residual
This commit handles the zero motion vector residuals for single
and compound reference modes, respectively. It improves the coding
performance by 0.13% with no additional encoding complexity.

Change-Id: I16075a836025bd2746da2ff4698fb9261e4b08c1
2016-04-18 18:14:01 -07:00
Jingning Han
2aa6117bda Refactor transform selection process
This commit re-arranges the transform type and size selectio
process. It removes an unnecessary rate-distortion cost computation
step. Local experiments show that this speeds up the encoding
process by 6% for both the baseline and the ext-intra experiment.

Change-Id: Iab3b86a63a1e9e55548466791ed5d29a0575c1e7
2016-04-18 19:45:56 +00:00
Jingning Han
c5449d3eb7 Merge "Refactor rd_variance_adjustment function" into nextgenv2 2016-04-18 19:45:45 +00:00
Angie Chiang
cf3ef18fc4 Merge "Remove double operation from tx_size selection" into nextgenv2 2016-04-18 18:11:36 +00:00
Angie Chiang
6de4a77df3 Remove double operation from tx_size selection
This CL fix the bug
rdopt.c:1687: choose_tx_size_from_rd: Assertion
`mbmi->tx_type == DCT_DCT' failed

It is caused by
1) mms register access before double operation
2) different compiler behaviors
code:
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 0

code:
  --target=x86-linux-gcc
  int64_t a = INT64_MAX;
  double b = 1. * INT64_MAX;
  printf("a < b: %d\n", a < b);
result:
  a < b: 1

I remove the double operation and test it with EXT_TX experiment.
The psnr change is around 0.05%, which is considered as noise level.

Change-Id: If8935c70c8603617fcfa8571accd30ccdda786a0
2016-04-18 11:00:13 -07:00
Jingning Han
c8312daad1 Refactor rd_variance_adjustment function
Compute the reconstruction variance in the prediction mode search.

Change-Id: Id9c7635a9c9f5383e61c0e427e95234211834301
2016-04-18 09:37:34 -07:00
Yue Chen
16a99e967c Merge "Optimization for EXT_INTER + OBMC combination" into nextgenv2 2016-04-17 18:54:33 +00:00
Yue Chen
321794c4d5 Optimization for EXT_INTER + OBMC combination
In the rd loop, check the perf of obmc, whose mv is copied from regular
inter predictor, when wedge interinter is better than regular inter
(previously it will force allow_obmc = 0). The condition of the early
termination before this step is relaxed to avoid skipping too many obmc
predictions. The rates of the overhead are properly calculated for these tools.

The logic of the bitstream syntax:
(a single ref) the interintra flag is sent first, only if it is 0, we
send the obmc flag;
(compound refs) the obmc flag is sent first, only if it is 0, we send
the wedge interinter flag

Coding gain
lowres: 0.428% (2.287%->2.715%)

Change-Id: I5f3a34640b398e313cbf84235c9fe2073eb2173f
2016-04-15 17:03:20 -07:00
Zoe Liu
9638ee1f4e Merge "Fix segfault with --cpu-used >= 3 and ext-refs." into nextgenv2 2016-04-15 16:41:15 +00:00
Geza Lore
77d197e635 Fix segfault with --cpu-used >= 3 and ext-refs.
With ext-ref enabled, it is possible that when trying to encode the
first true ALTREF frame after a keyframe, the previous ALTREF frame
(alias for the keyframe) is the same as one of the new LAST{2,3,4}
reference frames, and hence cpi->ref_frame_flags will have the ALTREF
bit clear, as computed by get_ref_frame_flags in encoder.c.

sf->alt_ref_search_fp forces the previous ALTREF frame to
be used as the only possible  reference when encoding a new ALTREF
frame, but due to cpi->ref_frame_flags, some buffers will not be
initialized (see rdopt.c:7689 yv12_mb), leading to a segfault.

get_ref_frame_flags in encoder.c has been changed to prefer to keep
the  LAST frame, then the ALTREF frame, then any of the LAST{2,3,4}
frames and then the GOLDEN frame in that order of preference in case
any of them are the same. This avoids the segfault and behaves the
same for the baseline.

Change-Id: I4da1991667614009da5d3061a6316c0d5dbc6c0c
2016-04-15 11:17:22 +01:00
Jingning Han
019683e963 Merge "Clean up motion vector precision check in the encoding process" into nextgenv2 2016-04-14 20:55:51 +00:00
Jingning Han
03a468f9ac Merge "Enable mode conversion in sub8x8 block" into nextgenv2 2016-04-14 19:01:15 +00:00
Jingning Han
6af8f63d96 Clean up motion vector precision check in the encoding process
Remove unnecessary motion vector precision check in the encoding
process.

Change-Id: Ica32933c7d138f499f36b1dedec14c894b27d85a
2016-04-14 11:37:19 -07:00
Jingning Han
cd39224cff Merge "Speed up dynamic motion vector referencing system" into nextgenv2 2016-04-14 16:16:43 +00:00
Jingning Han
885a81f468 Merge "Fix a few mis-use cases of MAX_MV_REF_CANDIDATES" into nextgenv2 2016-04-13 23:44:25 +00:00
Angie Chiang
716f0ea3cf Merge changes I92819356,I50b5a313,I807e60c6,I8a8df9fd into nextgenv2
* changes:
  Branch dct to new implementation for bd12
  Change dct32x32's range
  Fit dct's stage range into 32-bit when bitdepth is 12
  Pass tx_type into get_tx_scale
2016-04-13 23:24:41 +00:00
Hui Su
85a3f5b740 Merge "Speed-up in tx_size search" into nextgenv2 2016-04-13 23:02:21 +00:00
Jingning Han
9a1a8f1d8e Speed up dynamic motion vector referencing system
Skip transform type search in modes with ref_mv_idx > 0. This
brings down the additional encoding time cost due to the DMR system
from 32% to 17%, at minimal coding performance regression.

Change-Id: Ie82e1d2831a313c6f1e47f7da221b51345023eb3
2016-04-13 15:51:36 -07:00
Jingning Han
f33a0a8215 Fix a few mis-use cases of MAX_MV_REF_CANDIDATES
Fix several use cases where MAX_MV_REF_CANDIDATES is mixed up with
is_compound flag to avoid potential coding interruption.

Change-Id: Ifdee1ef8a81ef6d1c155315c6c6a3074aa7a8b5e
2016-04-13 15:16:55 -07:00
Jingning Han
e07dbaa2f5 Enable mode conversion in sub8x8 block
Convert the newmv mode into reference motion vector modes.

Change-Id: I51bd2543dafb70345c1340fba700b44f67f20853
2016-04-13 14:35:54 -07:00
hui su
6a7ddd84bb Speed-up in tx_size search
Do not consider 4x4 transform when the maximum possible transform
size is 32x32.

Overall encoding speed is increased by more than 10%. Compression
performance is neutral on lowres, midres, and hdres.

Change-Id: Ifac61c3c9f4b0ab392bffd4d1faa373d91014cf1
2016-04-13 10:19:00 -07:00
Angie Chiang
027d12b7d6 Merge changes I359aa49c,Ic8ca5afb into nextgenv2
* changes:
  Generalize txfm scale in highbd quantizer
  Parameterize transform scale for quantizer
2016-04-12 18:02:05 +00:00
Sarah Parker
33ccd0f85e Merge "Fix prune one and two to make compatible with new transforms" into nextgenv2 2016-04-11 17:12:28 +00:00
Debargha Mukherjee
38b26b0dc3 Merge "Make subpel masked motion work with upsampled refs" into nextgenv2 2016-04-09 13:30:09 +00:00
Sarah Parker
19e3c6415c Fix prune one and two to make compatible with new transforms
Update svm parameters with training data using new transforms
and remove DST from pruning functions.

Change-Id: I7bd1c4744455d571c1ecfb4cea14c25ac291f002
2016-04-08 11:47:48 -07:00
Debargha Mukherjee
c485b10416 Make subpel masked motion work with upsampled refs
Change-Id: Id483354e73e983793370b55a1a6a1f2dcd137dc9
2016-04-08 08:54:58 -07:00
Alex Converse
bb0e692151 Convert palette from double to float.
About 20% less time spent coding in vp10_k_means().

Change-Id: I5cf7605cde869a269776197bace70de353b07d83
2016-04-07 15:17:30 -07:00
Geza Lore
454989ff32 Make superblock size variable at the frame level.
The uncompressed frame header contains a bit to signal whether the
frame is encoded using 64x64 or 128x128 superblocks. This can vary
between any 2 frames.

vpxenc gained the --sb-size={64,128,dynamic} option, which allows the
configuration of the superblock size used (default is dynamic). 64/128
will force the encoder to always use the specified superblock size.
Dynamic would enable the encoder to choose the sb size for each
frame, but this is not implemented yet (dynamic does the same as 128
for now).

Constraints on tile sizes depend on the superblock size, the following
is a summary of the current bitstream syntax and semantics:

If both --enable-ext-tile is OFF and --enable-ext-partition is OFF:
     The tile coding in this case is the same as VP9. In particular,
     tiles have a minimum width of 256 pixels and a maximum width of
     4096 pixels. The tile width must be multiples of 64 pixels
     (except for the rightmost tile column). There can be a maximum
     of 64 tile columns and 4 tile rows.

If --enable-ext-tile is OFF and --enable-ext-partition is ON:
     Same constraints as above, except that tile width must be
     multiples of 128 pixels (except for the rightmost tile column).

There is no change in the bitstream syntax used for coding the tile
configuration if --enable-ext-tile is OFF.

If --enable-ext-tile is ON and --enable-ext-partition is ON:
     This is the new large scale tile coding configuration. The
     minimum/maximum tile width and height are 64/4096 pixels. Tile
     width and height must be multiples of 64 pixels. The uncompressed
     header contains two 6 bit fields that hold the tile width/heigh
     in units of 64 pixels. The maximum number of tile rows/columns
     is only limited by the maximum frame size of 65536x65536 pixels
     that can be coded in the bitstream. This yields a maximum of
     1024x1024 tile rows and columns (of 64x64 tiles in a 65536x65536
     frame).

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     Same applies as above, except that in the bitstream the 2 fields
     containing the tile width/height are in units of the superblock
     size, and the superblock size itself is also coded in the bitstream.
     If the uncompressed header signals the use of 64x64 superblocks,
     then the tile width/height fields are 6 bits wide and are in units
     of 64 pixels. If the uncompressed header signals the use of 128x128
     superblocks, then the tile width/height fields are 5 bits wide and
     are in units of 128 pixels.

The above is a summary of the bitstream. The user interface to vpxenc
(and the equivalent encoder API) behaves a follows:

If --enable-ext-tile is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the base 2 logarithm of the desired number of tile columns
     and tile rows. The actual number of tile rows and tile columns,
     and the particular tile width and tile height are computed by the
     codec ensuring all of the above constraints are respected.

If --enable-ext-tile is ON, but --enable-ext-partition is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the WIDTH and HEIGHT of the tiles in unit of 64 pixels.
     The valid values are in the range [1, 64] (which corresponds to
     [64, 4096] pixels in increments of 64.

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     If --sb-size=64 (default):
         The user interface is the same as in the previous point.
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 64 pixels, in the range [1, 64] (which corresponds
         to [64, 4096] pixels in increments of 64).
     If --sb-size=128 or --sb-size=dynamic:
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 128 pixels in the range [1, 32] (which corresponds
         to [128, 4096] pixels in increments of 128).

Change-Id: Idc9beee1ad12ff1634e83671985d14c680f9179a
2016-04-07 10:34:25 +01:00
Julia Robson
4300e50cce Fixing assertion in *Large unit tests
In certain cases the code was subtracting the obmc cost
despite it not having been added previously.
For example with ref_mv, supertx, ext_inter, obmc & ext_refs
enabled the following test was failing but now passes:
"VP10/ArfFreqTestLarge.MinArfFreqTest/33"

Change-Id: I966853f34c18d5a1d4c7a56fa201c1b02973fc88
2016-04-06 11:22:10 +01:00
Debargha Mukherjee
0fc82ea1cf Refactoring and cosmetic changes to ext-inter expt
Change-Id: Icd457480744b7734b3c412c9fed43be738373334
2016-04-05 15:16:18 -07:00
Angie Chiang
75ae90f7a9 Pass tx_type into get_tx_scale
Change-Id: I8a8df9fdefa492f66cf2cd29b0b081ad69b5d85e
2016-04-01 12:53:10 -07:00
Debargha Mukherjee
2a6389bb8b Merge "Fix interpolation values and decouple interintra" into nextgenv2 2016-03-31 21:47:10 +00:00
Debargha Mukherjee
2be211e971 Fix interpolation values and decouple interintra
Decouples interintra modes and probability models from regular
intra modes, to enable creating/optimizing new interintra modes.
Also, fixes interpolation values for 128x128 interintra and obmc.

Change-Id: I5c2016db49b8f029164e5fe84c6274d4e02ff90e
2016-03-31 12:12:51 -07:00