678 Commits

Author SHA1 Message Date
Angie Chiang
dfa532cc2a Let txfm's constant bit be the same for each stage
Change-Id: I763f2924afca526db371231bca18b38879bdf793
2016-04-14 15:46:54 -07:00
Angie Chiang
02d23fbbf4 Fit adst/dct's stage range into 32-bit in bd12
Change-Id: Ie428c6f0655873de3e77e844a2f2e4203cf47dff
2016-04-14 15:44:05 -07:00
Jingning Han
525995a3d9 Apply motion vector precision check to candidate mv
This avoids repeatedly checking the candidate motion vector
precision level at the decoder end. The compression performance
varies at 0.01% level.

Change-Id: I4a88e95decd900d0cac9a0c2e70ba43ef7ecac38
2016-04-14 09:44:41 -07:00
Hui Su
436a6cc4e7 Merge "ext-tx: use raster scan order for identity transform" into nextgenv2 2016-04-13 23:52:35 +00:00
Angie Chiang
716f0ea3cf Merge changes I92819356,I50b5a313,I807e60c6,I8a8df9fd into nextgenv2
* changes:
  Branch dct to new implementation for bd12
  Change dct32x32's range
  Fit dct's stage range into 32-bit when bitdepth is 12
  Pass tx_type into get_tx_scale
2016-04-13 23:24:41 +00:00
hui su
b72aa72a90 ext-tx: use raster scan order for identity transform
coding gain of ext-tx:
screen_content 12.73% -> 13.05%

Change-Id: I5fc8cf0db84c3e56dd3cb7675e1d81c9c575bc57
2016-04-13 09:42:43 -07:00
Geza Lore
c50aaf3049 Make ext-refs respect encoding flags.
The VP8_EFLAG_NO_UPD_LAST and VP8_EFLAG_NO_REF_LAST flags can be
passed to the encoder to signal that it should not update/reference
the LAST ref frame when encoding the current frame. With
--enable-ext-refs turned on, the new LAST2, LAST3 and LAST4 ref frames
could still be used or updated, which causes the
  VP10/ErrorResilienceTestLarge.DropFramesWithoutRecovery/{0,1,2}
tests to fail.

With this patch, if --enable-ext-refs is used, then
VP8_EFLAG_NO_UPD_LAST and VP8_EFLAG_NO_REF_LAST also apply to the
new LAST2, LAST3 and LAST4 ref frames, as well as the LAST ref frame.
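
A minimal sketch of the rule described above. The flag bit values, the
frame-name tuples and the helper restricted_refs are hypothetical and
exist only for illustration; the real constants and the encoder logic
are not reproduced here.

```python
# Hypothetical bit positions, for illustration only (not the library's values).
NO_REF_LAST = 1 << 0
NO_UPD_LAST = 1 << 1

# Reference frames covered by the LAST flags without and with ext-refs.
LAST_FAMILY_BASE = ("LAST",)
LAST_FAMILY_EXT = ("LAST", "LAST2", "LAST3", "LAST4")


def restricted_refs(flags, ext_refs_enabled):
    """Return (frames that must not be referenced, frames that must not be updated)."""
    family = LAST_FAMILY_EXT if ext_refs_enabled else LAST_FAMILY_BASE
    no_ref = family if flags & NO_REF_LAST else ()
    no_upd = family if flags & NO_UPD_LAST else ()
    return no_ref, no_upd
```

With ext-refs enabled, restricted_refs(NO_UPD_LAST, True) marks all four
LAST-type frames as not updatable, which is the behaviour the failing
tests rely on.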

Change-Id: If482b1c09bbaf914eca8e0348a2367bff261661d
2016-04-13 12:03:58 +01:00
Angie Chiang
027d12b7d6 Merge changes I359aa49c,Ic8ca5afb into nextgenv2
* changes:
  Generalize txfm scale in highbd quantizer
  Parameterize transform scale for quantizer
2016-04-12 18:02:05 +00:00
Debargha Mukherjee
648538959d Merge "Use reduced transform set for 16x16" into nextgenv2 2016-04-11 23:32:29 +00:00
Debargha Mukherjee
c4da5d500e Use reduced transform set for 16x16
Speed increase for ext-tx by 20% for a BDRATE drop of 0.26%.
The ext-tx expt becomes -2.66% BDRATE (reduced from -2.92%) for
the lowres set.

It turns out that reducing the set of transforms for intra from
12 to 5 makes very little difference in coding performance (~0.04%).
Most of the performance drop comes from the reduction in transform
set for inter. Currently there is a provision to control that with
a macro.

Change-Id: I7de05527bf72f96acc1e0ab8a74a849da0a141e5
2016-04-11 13:04:41 -07:00
Debargha Mukherjee
9930a00ed7 Merge "Refactor PC_TREE root handling." into nextgenv2 2016-04-09 13:33:53 +00:00
hui su
f94d699c09 Changes to scan order neighbors
-Fix some bugs in row_scan and col_scan. In some cases, the above
or left neighbor was not considered even though it is available.

-When above or left neighbor is not available, try using the
top-left, top-right or bottom-left neighbor.

Compression improvement:
lowres   0.20%
midres   0.16%
hdres    0.20%

Change-Id: If521665589c7f29277b8e9223f21f4a8bf3fef39
2016-04-08 11:08:57 -07:00
hui su
b76118b736 Reformat scan order neighbors
Change-Id: Iafcd080612012b08f3cbff45335c12f434543f38
2016-04-08 10:50:13 -07:00
Geza Lore
f2be4f6058 Refactor PC_TREE root handling.
Change-Id: Id8b16c1b18bd6f909e72aae3fd582dd3503c88c6
2016-04-08 17:01:00 +01:00
hui su
69c7ad3407 Correct comments for scan order neighbors
Change-Id: I5e2dc39bf0ee8e501e4dd358be2e92ae50934593
2016-04-07 11:07:21 -07:00
Geza Lore
454989ff32 Make superblock size variable at the frame level.
The uncompressed frame header contains a bit to signal whether the
frame is encoded using 64x64 or 128x128 superblocks. This can vary
between any 2 frames.

vpxenc gained the --sb-size={64,128,dynamic} option, which allows the
configuration of the superblock size used (default is dynamic). 64/128
will force the encoder to always use the specified superblock size.
Dynamic would enable the encoder to choose the sb size for each
frame, but this is not implemented yet (dynamic does the same as 128
for now).

Constraints on tile sizes depend on the superblock size. The following
is a summary of the current bitstream syntax and semantics:

If both --enable-ext-tile is OFF and --enable-ext-partition is OFF:
     The tile coding in this case is the same as VP9. In particular,
     tiles have a minimum width of 256 pixels and a maximum width of
     4096 pixels. The tile width must be multiples of 64 pixels
     (except for the rightmost tile column). There can be a maximum
     of 64 tile columns and 4 tile rows.

If --enable-ext-tile is OFF and --enable-ext-partition is ON:
     Same constraints as above, except that tile width must be
     multiples of 128 pixels (except for the rightmost tile column).

There is no change in the bitstream syntax used for coding the tile
configuration if --enable-ext-tile is OFF.

If --enable-ext-tile is ON and --enable-ext-partition is OFF:
     This is the new large scale tile coding configuration. The
     minimum/maximum tile width and height are 64/4096 pixels. Tile
     width and height must be multiples of 64 pixels. The uncompressed
     header contains two 6-bit fields that hold the tile width/height
     in units of 64 pixels. The maximum number of tile rows/columns
     is only limited by the maximum frame size of 65536x65536 pixels
     that can be coded in the bitstream. This yields a maximum of
     1024x1024 tile rows and columns (of 64x64 tiles in a 65536x65536
     frame).

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     Same applies as above, except that in the bitstream the 2 fields
     containing the tile width/height are in units of the superblock
     size, and the superblock size itself is also coded in the bitstream.
     If the uncompressed header signals the use of 64x64 superblocks,
     then the tile width/height fields are 6 bits wide and are in units
     of 64 pixels. If the uncompressed header signals the use of 128x128
     superblocks, then the tile width/height fields are 5 bits wide and
     are in units of 128 pixels.
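
The field widths above can be sketched as follows. The function names
are hypothetical. One assumption is made loudly here: an n-bit field is
taken to express 2^n distinct sizes (for example by coding the size
minus one); the text above gives only the field widths and units, not
the exact encoding.

```python
def tile_size_field(sb_size):
    """Return (field width in bits, unit in pixels) for the tile width/height fields."""
    if sb_size == 64:
        return 6, 64
    if sb_size == 128:
        return 5, 128
    raise ValueError("superblock size must be 64 or 128")


def max_tile_pixels(sb_size):
    """Largest tile dimension expressible, assuming 2**bits distinct sizes."""
    bits, unit = tile_size_field(sb_size)
    return (1 << bits) * unit
```

Under that assumption both superblock sizes reach the same 4096-pixel
maximum: 64 units of 64 pixels, or 32 units of 128 pixels.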

The above is a summary of the bitstream. The user interface to vpxenc
(and the equivalent encoder API) behaves as follows:

If --enable-ext-tile is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the base 2 logarithm of the desired number of tile columns
     and tile rows. The actual number of tile rows and tile columns,
     and the particular tile width and tile height are computed by the
     codec ensuring all of the above constraints are respected.

If --enable-ext-tile is ON, but --enable-ext-partition is OFF:
     No change in the user interface. --tile-columns and --tile-rows
     specify the WIDTH and HEIGHT of the tiles in units of 64 pixels.
     The valid values are in the range [1, 64] (which corresponds to
     [64, 4096] pixels in increments of 64).

If both --enable-ext-tile is ON and --enable-ext-partition is ON:
     If --sb-size=64 (default):
         The user interface is the same as in the previous point.
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 64 pixels, in the range [1, 64] (which corresponds
         to [64, 4096] pixels in increments of 64).
     If --sb-size=128 or --sb-size=dynamic:
         --tile-columns and --tile-rows specify tile WIDTH and HEIGHT,
         in units of 128 pixels in the range [1, 32] (which corresponds
         to [128, 4096] pixels in increments of 128).
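
A minimal sketch of how the --tile-columns / --tile-rows values map to
pixels when --enable-ext-tile is ON, following the ranges listed above.
The function name and its parameters are hypothetical and are not part
of vpxenc or the encoder API.

```python
def tile_dimension_pixels(value, ext_partition, sb_size=64):
    """Convert a --tile-columns / --tile-rows value to pixels (ext-tile ON)."""
    if sb_size not in (64, 128, "dynamic"):
        raise ValueError("sb_size must be 64, 128 or 'dynamic'")
    # Units of 128 pixels apply only with ext-partition and sb-size 128/dynamic.
    unit = 128 if ext_partition and sb_size in (128, "dynamic") else 64
    max_value = 4096 // unit  # 64 for 64-pixel units, 32 for 128-pixel units
    if not 1 <= value <= max_value:
        raise ValueError(f"value must be in [1, {max_value}]")
    return value * unit
```

For example, a value of 32 with --sb-size=128 and a value of 64 with
--sb-size=64 both describe a 4096-pixel tile dimension.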

Change-Id: Idc9beee1ad12ff1634e83671985d14c680f9179a
2016-04-07 10:34:25 +01:00
Debargha Mukherjee
de3d15bb2c Merge "Refactoring and cosmetic changes to ext-inter expt" into nextgenv2 2016-04-06 01:19:06 +00:00
Debargha Mukherjee
0fc82ea1cf Refactoring and cosmetic changes to ext-inter expt
Change-Id: Icd457480744b7734b3c412c9fed43be738373334
2016-04-05 15:16:18 -07:00
Angie Chiang
ff8c490b9a Branch dct to new implementation for bd12
Change-Id: I9281935653aacce22ac3100f79fb956c249e2bf3
2016-04-04 12:40:10 -07:00
Angie Chiang
f1060f5bc4 Change dct32x32's range
Bitdepth 10/12:
Fit coefficient range into 32 bits
Fit coefficient * const range into 32 bits

Bitdepth 8:
Fit coefficient range into 16 bits
Fit coefficient * constant range into 32 bits

Change-Id: I50b5a3132e8a9f5155c971ab0f6eb52876d2b5ca
2016-04-04 11:21:11 -07:00
Angie Chiang
39b3c025fa Fit dct's stage range into 32-bit when bitdepth is 12
Change-Id: I807e60c6dcacc50c087adcbdb1df022f8541efc5
2016-04-04 11:13:44 -07:00
Geza Lore
f0290cd127 Refactor get_partition to be universal.
Change-Id: I3a2fe4073bb94c5afc24d9274e6edcdb3aed934f
2016-04-04 15:22:25 +01:00
Geza Lore
e0dbfdeedc Minor refactoring of partition type processing.
Change-Id: Idcb1e94298d4b7d8832d285548ec2d2ced4b2988
2016-04-04 14:51:10 +01:00
Debargha Mukherjee
2fba8189de Merge "Loopfilter fix" into nextgenv2 2016-04-01 17:48:09 +00:00
Angie Chiang
9f879b3c5f Merge "change vp10_fwd_txfm2d_#x#_sse2 to vp10_fwd_txfm2d_#x#_sse4_1" into nextgenv2 2016-04-01 17:25:23 +00:00
Angie Chiang
2c2b9bd455 Merge "Remove redundant code from vp10_fwd_txfm2d.c" into nextgenv2 2016-04-01 17:25:13 +00:00
Angie Chiang
1b755039c6 Merge "Simplify rounding in vp10_[fwd/inv]_txfm[1/2]d_#x#" into nextgenv2 2016-04-01 17:24:50 +00:00
Angie Chiang
0a9eedfbef Merge "Add vp10_fwd_txfm2d_sse2" into nextgenv2 2016-04-01 17:24:34 +00:00
Debargha Mukherjee
f7457f5e89 Loopfilter fix
Fixes mismatch introduced in
https://chromium-review.googlesource.com/#/c/336645

Change-Id: I15cded221c18dbf87b5029bc464e975d5c7c40e3
2016-03-31 19:57:42 -07:00
Yaowu Xu
a416d5bd2d Fix a build issue
Change-Id: Ifdb32c487632098496bf59fcc76c518f8f0426d2
2016-03-31 16:06:24 -07:00
Debargha Mukherjee
2a6389bb8b Merge "Fix interpolation values and decouple interintra" into nextgenv2 2016-03-31 21:47:10 +00:00
Debargha Mukherjee
2be211e971 Fix interpolation values and decouple interintra
Decouples interintra modes and probability models from regular
intra modes, to enable creating/optimizing new interintra modes.
Also, fixes interpolation values for 128x128 interintra and obmc.

Change-Id: I5c2016db49b8f029164e5fe84c6274d4e02ff90e
2016-03-31 12:12:51 -07:00
Geza Lore
10232eda8e Refactor loopfilter level arrays to 2D.
Change-Id: Id20526d0b6d1371dc9f45cb8b5f24b6974da7bc4
2016-03-31 15:52:12 +01:00
Geza Lore
511da8cbe5 Rename MI_BLOCK_SIZE and MI_MASK macros.
Rename MI_BLOCK_SIZE.* -> MAX_MIB_SIZE.* (MIB is for MI Block).
Rename MI_MASK.* -> MAX_MIB_MASK.*

There are no functional changes.

This is in preparation for coding the superblock size at the frame
level, which will require some of these constants to become variables.
The new names better reflect future semantics, and hence make the code
clearer.

Change-Id: Iee08d97554cf4cc16a5dc166a3ffd1ab91529992
2016-03-31 09:57:41 +01:00
Hui Su
cce6688c31 Merge "Set block size upper bound for Palette mode" into nextgenv2 2016-03-31 00:23:11 +00:00
Angie Chiang
c7c40d2329 Generalize txfm scale in highbd quantizer
Change-Id: I359aa49c09b244e0d44ebd09442e365a3d22556c
2016-03-30 15:25:26 -07:00
Angie Chiang
25520d8dc3 change vp10_fwd_txfm2d_#x#_sse2 to vp10_fwd_txfm2d_#x#_sse4_1
The time taken to run each function 20k times is listed below.

Note that the vp10_highbd_fdct#x#_sse2 versions are 16-bit
implementations with a range check; the rest are 32-bit
implementations.

vp10_fwd_txfm2d_4x4_c (2 ms)
vp10_fwd_txfm2d_8x8_c (9 ms)
vp10_fwd_txfm2d_16x16_c (45 ms)
vp10_fwd_txfm2d_32x32_c (233 ms)

vp10_fwd_txfm2d_4x4_sse4_1 (2 ms)
vp10_fwd_txfm2d_8x8_sse4_1 (3 ms)
vp10_fwd_txfm2d_16x16_sse4_1 (16 ms)
vp10_fwd_txfm2d_32x32_sse4_1 (80 ms)

vp10_highbd_fdct4x4_c (1 ms)
vp10_highbd_fdct8x8_c (3 ms)
vp10_highbd_fdct16x16_c (17 ms)
highbd_fdct32x32_c (160 ms)

vp10_highbd_fdct4x4_sse2 (0 ms)
vp10_highbd_fdct8x8_sse2 (2 ms)
vp10_highbd_fdct16x16_sse2 (8 ms)
highbd_fdct32x32_sse2 (105 ms)
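
The relative speed of the C and SSE4.1 versions of vp10_fwd_txfm2d can
be read off the timings above. This sketch only divides the listed
numbers; the dictionary and function names are hypothetical.

```python
# Timings in milliseconds for 20k runs, copied from the list above.
C_MS = {"4x4": 2, "8x8": 9, "16x16": 45, "32x32": 233}
SSE4_1_MS = {"4x4": 2, "8x8": 3, "16x16": 16, "32x32": 80}


def speedups(baseline_ms, optimized_ms):
    """Return the baseline/optimized time ratio for each transform size."""
    return {size: baseline_ms[size] / optimized_ms[size] for size in baseline_ms}
```

Calling speedups(C_MS, SSE4_1_MS) gives a ratio of about 3x for 8x8 and
larger. The 4x4 ratio is limited by the millisecond resolution of the
measurements.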

Change-Id: I24daf1e0d4d66e91e4ce61ef71cefa7b70ee90ce
2016-03-30 15:25:26 -07:00
Angie Chiang
c75f64780b Remove redundant code from vp10_fwd_txfm2d.c
Change-Id: I87ae5e93957616c0f5160a4f679e42f77092c33f
2016-03-30 15:25:26 -07:00
Angie Chiang
f2b311f580 Simplify rounding in vp10_[fwd/inv]_txfm[1/2]d_#x#
Change-Id: I24ce46e157dc5b9c0d75000a1a48e9c136ed4ee1
2016-03-30 15:25:26 -07:00
Angie Chiang
11d2bb5429 Add vp10_fwd_txfm2d_sse2
Change-Id: Idfbe3c7f5a7eb799c03968171006f21bf3d96091
2016-03-30 15:25:26 -07:00
Angie Chiang
64413a6ca7 Parameterize transform scale for quantizer
This is to facilitate changing transform scale later

Change-Id: Ic8ca5afba57d2489ebd191ccc40c1b31605a0d8c
2016-03-30 15:25:26 -07:00
hui su
cbb8be769d Set block size upper bound for Palette mode
Avoid buffer overflow with new experiments such as the
128x128 superblock size.

Change-Id: Ib775f3925a85fc87227c0ddd9b6a6110a12ef196
2016-03-30 14:39:44 -07:00
Debargha Mukherjee
8d3a4aa891 Some fixes/speed-ups on inter-intra part of ext-inter
Fixes an issue with rectangular inter-intra blocks.
Includes various other refactoring and cleanups to enable fast mixing
of inter and intra predictors.
Uses only the best single inter reference so far for the inter-intra
search.

About 30% speed-up with a 0.1% hit in performance.

This is part one of an overhaul of the ext-inter experiment. To be
continued in subsequent patches.

Change-Id: Id10ee100c78c6e00009a3a4f930a4435ef403a95
2016-03-30 14:39:29 -07:00
Debargha Mukherjee
91707ac79e Merge "Extend superblock size to 128x128 pixels." into nextgenv2 2016-03-30 20:55:32 +00:00
Geza Lore
552d5cd715 Extend superblock size to 128x128 pixels.
If --enable-ext-partition is used at build time, the superblock size
(sometimes also referred to as coding unit (CU) size) is extended to
128x128 pixels.

Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a
2016-03-30 18:23:06 +01:00
Debargha Mukherjee
e467627f33 Merge "Fix for ext_interp experiment" into nextgenv2 2016-03-30 14:44:39 +00:00
Yaowu Xu
37241e6f95 Merge "Merge branch 'masterbase' into nextgenv2" into nextgenv2 2016-03-29 16:05:53 +00:00
Julia Robson
068e799459 Fix for ext_interp experiment
Amends previous commit to also handle subsampling correctly.
Change ID of prev commit: I6b07e6cf9b287ba4b5bd6599af4a7412e50b3bdc

Was causing occasional failures for 422 streams due to accessing
elements beyond the extent of the bmi array.

Change-Id: I37ebabf4c01ca84bcd1851428172bdf753805d98
2016-03-29 16:09:49 +01:00
Yaowu Xu
c810740c36 Merge branch 'masterbase' into nextgenv2
Conflicts:
	vp9/encoder/vp9_encoder.c
	vpx_dsp/x86/convolve.h

Change-Id: I60c3532936bedd796a75dfe78245a95ec21e2e55
2016-03-28 17:44:28 -07:00
Angie Chiang
4144a11552 Merge "Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10" into nextgenv2 2016-03-28 19:20:48 +00:00