Allow for 3 quant profiles from entropy context
Refactored dq_offset bands to allow for re-optimization based on number
of quantization profiles
Change-Id: Ib8d7e8854ad4e0bf8745038df28833d91efcfbea
A tile copy mode is introduced, which allows a tile to use
another tile's coded data directly at the bitstream level. This
greatly reduces the bit rate in this use case: our tests
showed a 10% - 20% bit rate reduction.
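As a rough illustration of the idea only (not the bitstream syntax this
patch defines), the sketch below replaces a tile whose coded bytes match
an earlier tile with a reference to that tile. The function name and the
("data", ...) / ("copy", ...) record layout are hypothetical.

```python
def pack_tiles(coded_tiles):
    """Return one record per tile: ("data", bytes) or ("copy", source_index).

    Hypothetical layout for illustration; the real syntax is defined by
    the patch itself.
    """
    first_seen = {}  # coded bytes -> index of the first tile that carried them
    records = []
    for index, payload in enumerate(coded_tiles):
        if payload in first_seen:
            # Identical coded data already sent: point at the earlier tile.
            records.append(("copy", first_seen[payload]))
        else:
            first_seen[payload] = index
            records.append(("data", payload))
    return records


tiles = [b"\x01\x02", b"\x03", b"\x01\x02", b"\x03"]
records = pack_tiles(tiles)
```

Only the first occurrence of each payload is stored in full, which is
where the bit rate saving would come from when many tiles repeat.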
Change-Id: Icf5ae00320e27193b15ce95297720f8b6f5e7fd9
Added dq_off_index attribute to mbmi to allow for switching between
dequantization modes.
Reduced number of different dequantization modes from 5 to 3.
Changed dequant_val_nuq to allow for 3 dequant levels instead of 1.
Fixed lint errors
Change-Id: I7aee3548011aa4eee18adb09d835051c3108d2ee
In large-scale tile coding, when the number of tiles is large and the tile
size is small, using a fixed number of bytes in the tile header to store
tile-data size information, as the current VP9 codec does, incurs high
overhead for each tile. This patch implements 2 ways to lower that overhead
and adaptively determine the number of bytes needed for tile-data size
transmission.
A test on a clip with a tile size of 64x64 showed that the number
of bytes used for storing tile-data size was reduced from 4 to 1, which
substantially improved the compression ratio in large-scale tile coding.
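One plausible way to pick the byte count adaptively is sketched below:
choose the smallest field width that can hold the largest tile-data size
in the frame. This is an assumption for illustration, not necessarily
either of the 2 ways implemented by the patch; tile_size_bytes and
max_bytes are hypothetical names.

```python
def tile_size_bytes(tile_data_sizes, max_bytes=4):
    """Smallest number of bytes able to store every tile-data size.

    Illustrative rule only; the patch may use a different criterion.
    """
    largest = max(tile_data_sizes)
    for n in range(1, max_bytes + 1):
        if largest < (1 << (8 * n)):  # fits in an n-byte unsigned field
            return n
    raise ValueError("tile data too large for the size field")
```

Under this rule, a frame whose tiles all carry fewer than 256 bytes of
data needs a 1-byte size field per tile instead of a fixed 4 bytes.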
Change-Id: Ia02fc43eda67fa252fbc2554321957790f53f6fd
Changes ensure the wedge_partition, interintra and palette experiments
all work alongside the ext_coding_unit_size experiment.
Change-Id: I18f17acb29071f6fc6784e815661c73cc21144d6
When error resilient mode is on, the decoder resets the mode info structure
to zero once per frame. This makes the decoder about 10x slower if we decode
a single tile at a time. This patch resolves the issue by clearing (memset)
the mode info of only the tiles being decoded. Currently, decoding a whole
frame tile by tile is less than 2x slower than frame decoding.
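The difference between clearing the whole frame and clearing one tile can
be sketched as follows. The flat one-byte-per-unit buffer and the
(row0, row1, col0, col1) tile rectangle are simplifying assumptions, not
the decoder's actual mode info layout.

```python
def reset_tile_mode_info(mode_info, cols, tile):
    """Zero only the mode info units covered by one tile.

    mode_info: flat bytearray, one byte per unit, `cols` units per row.
    tile: (row0, row1, col0, col1), end-exclusive (hypothetical layout).
    """
    row0, row1, col0, col1 = tile
    width = col1 - col0
    for r in range(row0, row1):
        start = r * cols + col0
        mode_info[start:start + width] = bytes(width)  # per-row "memset"


mode_info = bytearray(b"\x01" * 16)  # 4x4 grid of stale entries
reset_tile_mode_info(mode_info, cols=4, tile=(1, 3, 2, 4))
```

The work per decoded tile is proportional to the tile area rather than
the frame area, which matters when the frame holds many small tiles.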
Change-Id: Ia3fd88d91a4e74e7bbbc6547d87c24d085a1533e
These changes have been made in preparation for the work on the
extended coding unit size experiment.
Change-Id: I83f289812426bb9aba6c4a5fedd2b0e0a4fe17cb
Under the CONFIG_LAST4_REF experiment. On the derflr test set, using
high bit depth (HBD), in average PSNR:
(1) LAST2+LAST3+LAST4 obtained +0.361% against LAST2+LAST3;
(2) LAST2+LAST3+LAST4 obtained +1.567% against baseline.
Change-Id: Ic8b14272de6a569df2b54418fa72b505e1ed3aad
This is under the CONFIG_LAST3_REF experiment, which can only be turned
on when the CONFIG_MULTI_REF experiment is on, i.e. LAST3_FRAME can only
be used when LAST2_FRAME is used. CONFIG_LAST3_REF will most likely
be merged into CONFIG_MULTI_REF once the performance improvement
is further confirmed.
On the derflr test set, using the average PSNR metric, with high bit
depth (HBD) on:
(1) LAST2 HBD obtained +0.579% against base HBD;
(2) LAST2 + LAST3 HBD obtained +0.591% against LAST2 HBD;
(3) LAST2 + LAST3 HBD obtained +1.173% against base HBD.
Change-Id: I1aa2b2e2d2c9834e5f8e61bf2d8818c7b1516669
This CL introduces a few macros plus code cleaning on the encoding of
the reference frames. Coding performance remains unchanged.
When encoding either the compound reference or the single reference
case, each bit has a different context, so the tree structure cannot
be applied to treat the combined bits as one symbol. As a next step, we
may explore sharing the same context across all the bits so that the
tree structure can be used.
Change-Id: I6916ae53c66be1a0b23e6273811c0139515484df
Under the CONFIG_MULTI_REF experiment. The current version shows the
following for LAST2 vs base in nextgen on the derflr test set:
(1) 8-bit: Average PSNR +0.53%
(worst: students_cif: -0.247%; best: mobile_cif: 1.902%)
(2) 12-bit HBD: Average PSNR +0.63%
(worst: pamphlet_cif: -0.213%, best: mobile_cif: 2.101%)
More tuning of the reference frame context design and default
probs is being conducted. This version is not guaranteed to
work with other experiments in nextgen. A separate CL will address
compatibility with all other experiments.
Change-Id: I7f40d2522517afc26ca389c995bad56989587f65
CONFIG_SR_MODE=1, enable SR mode
USE_POST_F=1, enable SR post filter
SR_USE_MULTI_F=1, enable SR post filter family
Not compatible with other experiments yet
Change-Id: I116f1d898cc2ff7dd114d7379664304907afe0ec
Limited the prediction extension to 8 pixels at each edge
Fixed a bug in the combination of wedge prediction and supertx
~10% speed up in decoder
derflr: -0.004
derflr+hbd: +0.002
hevcmr: +0.015
Change-Id: I777518896894a612c9704d3de0e7902bf498b0ea
A framework for alternate transforms for inter 32x32 and larger, based
on a dwt-dct hybrid, is implemented.
Further experiments are to be conducted with different
variations of hybrid dct/dwt or plain dwt, as well as super-resolution
mode.
Change-Id: I9a2bf49ba317e7668002cf1499211d7da6fa14ad
There are 4 entropy tables to choose from for the initial entropy table,
depending on the frame base q-index. The entropy tables are
trained on the derf, yt, and stdhd sets. About 0.2% gain on
the following test sets:
derflr 0.227%
yt 0.277%
stdhd 0.233%
hevclr 0.221%
hevcmr 0.155%
hevchr 0.182%
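The selection step can be sketched as a lookup of the q-index band. The
band edges below are hypothetical placeholders chosen only to split the
0..255 q-index range into 4 parts; the actual thresholds are defined in
the patch.

```python
import bisect

# Hypothetical band edges; NOT the thresholds used by the patch.
Q_THRESHOLDS = (64, 128, 192)


def select_entropy_table(base_qindex):
    """Return the index (0..3) of the initial entropy table to use."""
    if not 0 <= base_qindex <= 255:
        raise ValueError("base q-index out of range")
    # Number of band edges at or below base_qindex gives the table index.
    return bisect.bisect_right(Q_THRESHOLDS, base_qindex)
```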
Change-Id: I3fde846c47fc020e80c814897690b4cda1da569c
Change-Id: I460408372586c823974f945ed9fd8dcb0360fbaf
For the combination of this and removing NEWDV from the tree:
derflr: -0.101 screen_content: +0.053
The bulk of the decline in screen content efficiency is from the liquify
clip. This should be recoverable by further entropy tweaks.
Change-Id: I9d80152b8492e60a0367c31797fb6932fb09bba9
If the decoder is configured to decode a tile indexed above the
upper limit, the internal codec will clip the index to the upper
limit.
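A minimal sketch of the clipping rule, assuming zero-based tile indexes
so that the upper limit is num_tiles - 1. The function name is
hypothetical, and negative values are passed through unchanged here on
the assumption that they carry a separate meaning.

```python
def clip_tile_index(requested, num_tiles):
    """Clip a requested tile index to the last valid tile.

    Assumes zero-based indexes; negative values are left untouched.
    """
    return min(requested, num_tiles - 1)
```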
Change-Id: Icbc1bb7b14069ac009e0a2042dd66a46d6f76679
This commit allows the decoder to decode selective tiles according
to decoder configuration settings. To decode a single tile out of
the provided key frame bit-stream (test_kf.webm), set compiler
configuration:
--enable-experimental --enable-row-tile --enable-key-frame-tile
use the command:
vpxdec -o test_dec.y4m test_kf.webm --tile-row=1 --tile-column=2
where the tile's row and column indexes are 1 and 2, respectively.
To decode all row tiles inside a provided column index, use:
--tile-row=-1 --tile-column=2
To decode all column tiles inside a provided row index, use:
--tile-row=2 --tile-column=-1
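The selection semantics above can be sketched as follows, with -1 acting
as a wildcard for a whole row or column. The function name and the tile
counts are hypothetical, and treating two -1 values as "decode every
tile" is an assumption not stated above.

```python
def tiles_to_decode(tile_row, tile_col, num_rows, num_cols):
    """List the (row, col) tiles selected by --tile-row / --tile-column.

    A negative value selects every index along that dimension.
    """
    rows = range(num_rows) if tile_row < 0 else [tile_row]
    cols = range(num_cols) if tile_col < 0 else [tile_col]
    return [(r, c) for r in rows for c in cols]
```

For example, on a 3x4 tile grid, tiles_to_decode(-1, 2, 3, 4) selects the
three tiles in column 2.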
Change-Id: Ib73c266414dcee7eaab5d741b90d0058970dae56
With this patch, the ZEROMV mode is overloaded to represent
a single global dominant motion using one of three models:
1. True zero translation motion (as before)
2. A translation motion different from 0
3. A Rotation-zoom affine model where the predictor is warped
The actual model used is indicated at the frame level for
each reference frame.
A metric that computes the ratio of the error with a global
non-zero model to the error for zero motion is used on the
encoder side to determine whether to use one of the two
non-zero models.
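The encoder-side decision can be sketched as a ratio test. The threshold
value and the function name are hypothetical; the patch defines the
actual error measure and cutoff.

```python
def use_global_motion(error_global, error_zero, threshold=0.9):
    """Decide whether a non-zero global model should replace zero motion.

    error_global: prediction error with the global non-zero model.
    error_zero:   prediction error with zero motion.
    threshold:    hypothetical cutoff; lower ratios favor the global model.
    """
    if error_zero == 0:
        # Zero motion is already exact; nothing to gain.
        return False
    return error_global / error_zero < threshold
```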
Change-Id: I1f3d235b8860e543191237024a89041ff3aad689
This commit makes the bit-stream syntax support fast selective tile
decoding in a large scale tile array. It reduces the computational
complexity of computing the target tile offset in the bit-stream
from quadratic to linear, while keeping the stack space requirement
relatively small (on the order of 1024 bytes instead of 1M
bytes). The overhead cost due to tile separation remains identical.
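To show why a linear-time lookup is possible at all, the sketch below
assumes the total size of each tile row is available, so whole rows can
be skipped in one step. That layout is an assumption for illustration
and may differ from the syntax this commit actually defines; header
bytes are ignored for simplicity.

```python
def naive_tile_offset(tile_sizes, target_row, target_col):
    """Visit every tile before the target: O(rows * cols) steps."""
    offset = 0
    for r, row in enumerate(tile_sizes):
        for c, size in enumerate(row):
            if (r, c) == (target_row, target_col):
                return offset
            offset += size
    raise IndexError("target tile not found")


def fast_tile_offset(row_sizes, target_row_tile_sizes, target_row, target_col):
    """Skip whole rows, then tiles in the target row: O(rows + cols) steps."""
    offset = sum(row_sizes[:target_row])
    offset += sum(target_row_tile_sizes[:target_col])
    return offset


tile_sizes = [[5, 7, 3], [2, 9, 4], [6, 1, 8]]
row_sizes = [sum(row) for row in tile_sizes]  # assumed to be signalled per row
```

Both functions return the same offset; only the number of size fields
that must be read differs.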
Change-Id: Id60c6915733d33a627f49e167c57d2534e70aa96
If every transform block in a non-skipped block contains only zero data,
the decoder infers the skip flag. This affects the loopfilter. No real
encoder would do this though, so it is pointless. It also causes headaches
in HW implementations, as the loop filter cannot proceed until all TX
blocks in the block have been checked. There could be up to 768 of them in
64x64 4:4:4 with 4x4 transform.
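The figure of 768 follows from counting 4x4 transform blocks across the
three full-resolution planes of a 4:4:4 block. A short check, with a
hypothetical helper name:

```python
def max_transform_blocks(block_size=64, tx_size=4, planes=3):
    """Count transform blocks in a square block with no chroma subsampling."""
    per_plane = (block_size // tx_size) ** 2  # 16 x 16 = 256 for 64x64 / 4x4
    return planes * per_plane


count = max_transform_blocks()  # 3 planes * 256 blocks each
```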
Change-Id: I45a021d1f27ca7feefed2242605777e70ce7cabd
This commit allows the encoder to process tile coding per 64x64
block. The supported upper limit of tile resolution is the minimum
of frame size and 4096 in each dimension. To turn on, set
--experiment --row-tile
and compile.
It overrides the old --tile-columns and --tile-rows configurations.
These two parameters now tell the encoder the width and height of
a tile in units of 64x64 blocks. For example,
--tile-columns=1 --tile-rows=1
will make each tile contain a single 64x64 block.
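The resulting tile layout can be sketched as below, combining the
64x64-block units with the upper limit of min(frame size, 4096) in each
dimension. The function name is hypothetical, and how the encoder rounds
partial tiles at the frame edge is an assumption.

```python
def tile_layout(frame_w, frame_h, tile_columns, tile_rows):
    """Return (tile_w, tile_h, tiles_across, tiles_down) in pixels/counts.

    tile_columns / tile_rows are the tile width / height in 64x64 blocks.
    """
    tile_w = min(tile_columns * 64, frame_w, 4096)
    tile_h = min(tile_rows * 64, frame_h, 4096)
    tiles_across = -(-frame_w // tile_w)  # ceiling division
    tiles_down = -(-frame_h // tile_h)
    return tile_w, tile_h, tiles_across, tiles_down
```

With --tile-columns=1 --tile-rows=1 on a 1920x1080 frame, this gives
64x64 tiles arranged 30 across and 17 down.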
Change-Id: Id515749a05cfeb9e9d008291b76bdfb720de0948
This commit allows the internal codec to handle arbitrary tile sizes
in units of 64x64 pixel blocks.
Change-Id: I3ad24de392064645bebab887c94e1db957794916
Move the 2D tile info arrays to global variables. This resolves
the local function stack overflow caused by excessively large
tile info variables, and allows the internal operation to support
up to 1024 row and column tiles.
Change-Id: I6644cc929e5d3a778a5c03a712ebfc0b8729f576
This commit allows the codec to use up to row tiles (optionally
in combination with up to 64 column tiles per row tile). The
minimum tile size is set to a 256x256 pixel block.
Change-Id: I811ca93f0c5eba41e190f6c7c0f064d1083f530f
The max and min tile number reference should be used to support
both row and column tiles. This commit renames the previous col
prefix to avoid confusion.
Change-Id: I487bea43701af946b79023597a9a9a0516480380