This commit reworks the transform and quantization unit. It enables
the use of adaptive quantization for intra modes. This further
improves the compression performance:
lowres 0.36%
midres 0.79%
hdres 0.73%
The key frame coding performance is improved:
lowres 1.7%
midres 1.9%
hdres 3.3%
The overall coding gains are:
lowres 1.1%
midres 1.8%
hdres 2.3%
Change-Id: Iaec1a3a4c1d5eac883ab526ed076d957060479dd
This commit combines the uniform quantizer with trellis-based
coefficient level optimization. It improves the codebase compression
performance:
lowres 0.8%
midres 1.0%
hdres 1.6%
Note that the current trellis optimization unit is implemented in C
only, which slows down the overall quantization process. A number of
optimizations will follow.
Change-Id: Id441dd238e4844409d0f08f82604be777f3f5282
This experiment implements non-uniform quantization where
the width of the bins increases gradually to more closely
match a Laplacian distribution of the coefficients.
Performance Gain:
derflr: 0.15%
hevcmr: 0.675%
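
A minimal sketch of the idea of gradually widening bins, not the
experiment's actual code. The function name nuq_quantize and the linear
growth schedule (width = step + level * grow) are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical non-uniform quantizer: bin `level` has width
 * step + level * grow, so bins get wider as the magnitude increases.
 * With grow == 0 this degenerates to a uniform quantizer. */
int nuq_quantize(int32_t coeff, int step, int grow) {
  assert(step > 0 && grow >= 0);
  int32_t mag = coeff < 0 ? -coeff : coeff;
  int32_t edge = 0; /* lower edge of the current bin */
  int level = 0;
  for (;;) {
    int32_t width = step + level * grow;
    if (mag < edge + width) break; /* mag falls inside this bin */
    edge += width;
    ++level;
  }
  return coeff < 0 ? -level : level;
}
```

With step = 8 and grow = 2 the bins are [0, 8), [8, 18), [18, 30),
[30, 44), and so on, so large coefficients (which are rare under a
Laplacian model) share coarser bins.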
Change-Id: I25234244e3bcd94b87c1f77cf682190b61c8ef94
The assumption doesn't hold true in the current codebase. Remove
this speed feature to simplify the codebase.
Change-Id: I9b69f484c9b7cd612b825047cc5b2fce63ee0af7
"qc" in vp10_token_state is used to save quantized coefficients, this
commit changes the type from short to tran_low_t to properly reflect
the value range in highbitdepth builds.
This fixes an out-of-range bug when optimize_b is used in a
highbitdepth build.
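
A small sketch of why a 16-bit field is too narrow here. It assumes
tran_low_t is a 32-bit type in highbitdepth builds; token_state_sketch,
set_qc and qc_fits_in_short are hypothetical names, not the encoder's
own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: in a highbitdepth build tran_low_t is 32 bits wide. */
typedef int32_t tran_low_t;

/* Hypothetical stand-in for the token state; only the qc field matters. */
typedef struct {
  tran_low_t qc; /* quantized coefficient, previously a short */
} token_state_sketch;

/* Returns 1 if qc would fit in a 16-bit short, 0 if it is out of range. */
int qc_fits_in_short(tran_low_t qc) {
  return qc >= INT16_MIN && qc <= INT16_MAX;
}

/* Stores qc at full width, so large highbitdepth values are preserved. */
void set_qc(token_state_sketch *s, tran_low_t qc) {
  assert(s != NULL);
  s->qc = qc;
}
```

A value such as 40000 fails qc_fits_in_short but is stored unchanged by
set_qc once the field is tran_low_t.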
Change-Id: I914c6fd3d3f4b9d061f9ed7cc5f08a883ab59dcd
x->blk_skip used to be uninitialized (leftover from encoding the
previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
higher --cpu-used or --rt options). This resulted in degraded coding
performance when using cm->tx_mode != TX_MODE_SELECT.
This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.
Also fixed an edge effect where encode_block in encodemb.c used the
formal width of the block (without cropping at the right edge), to
look up blk_skip, while select_tx_block in rdopt.c used the cropped
width to set blk_skip.
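
The mismatch can be illustrated with a sketch that assumes blk_skip is
indexed in raster order with the block width as the stride. Both helper
names and the 4x4 units are assumptions for illustration, not the
encoder's actual code.

```c
#include <assert.h>

/* Hypothetical helper: width of a block in 4x4 units after cropping at
 * the right frame edge. */
int cropped_width_4x4(int formal_w, int blk_col_start, int frame_cols) {
  assert(formal_w > 0 && blk_col_start < frame_cols);
  int visible = frame_cols - blk_col_start;
  return visible < formal_w ? visible : formal_w;
}

/* Raster index into a per-block flag array such as blk_skip. */
int blk_skip_index(int row, int col, int stride) {
  assert(stride > 0 && col < stride);
  return row * stride + col;
}
```

If the writer uses the cropped width as the stride and the reader uses
the formal width, the same (row, col) maps to different entries for any
row > 0, so both sides have to agree on one width.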
Change-Id: I76d0f49ac5ab3ab54203573e0d7fcfcc1c6aa10d
x->blk_skip used to be uninitialized (leftover from encoding the
previous block), if cm->tx_mode != TX_MODE_SELECT (which is used with
higher --cpu-used or --rt options). This resulted in degraded coding
performance when using cm->tx_mode != TX_MODE_SELECT.
This fixes the VP10/EndToEndTestLarge.EndtoEndPSNRTest/40 unit test.
Change-Id: If39062927446798c626fc93694b4e6a4f35fa5da
Rename MI_BLOCK_SIZE.* -> MAX_MIB_SIZE.* (MIB is for MI Block).
Rename MI_MASK.* -> MAX_MIB_MASK.*
There are no functional changes.
This is in preparation for coding the superblock size at the frame
level, which will require some of these constants to become variables.
The new names better reflect future semantics, and hence make the code
clearer.
Change-Id: Iee08d97554cf4cc16a5dc166a3ffd1ab91529992
If --enable-ext-partition is used at build time, the superblock size
(sometimes also referred to as coding unit (CU) size) is extended to
128x128 pixels.
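
A sketch of how a build-time flag could select the superblock size. It
assumes the configure option maps to a CONFIG_EXT_PARTITION macro, that
the size without the option is the usual 64x64, and that one MI unit
covers 8x8 pixels. All SKETCH_ names are hypothetical.

```c
#include <assert.h>

/* Assumption: --enable-ext-partition defines CONFIG_EXT_PARTITION to 1.
 * Default it to 0 so the sketch compiles on its own. */
#ifndef CONFIG_EXT_PARTITION
#define CONFIG_EXT_PARTITION 0
#endif

#if CONFIG_EXT_PARTITION
#define SKETCH_MAX_SB_SIZE_LOG2 7 /* 128x128 superblocks */
#else
#define SKETCH_MAX_SB_SIZE_LOG2 6 /* 64x64 superblocks */
#endif
#define SKETCH_MAX_SB_SIZE (1 << SKETCH_MAX_SB_SIZE_LOG2)

/* Assumption: one mode-info (MI) unit covers 8x8 pixels. */
#define SKETCH_MI_SIZE_LOG2 3
#define SKETCH_MAX_MIB_SIZE \
  (1 << (SKETCH_MAX_SB_SIZE_LOG2 - SKETCH_MI_SIZE_LOG2))
#define SKETCH_MAX_MIB_MASK (SKETCH_MAX_MIB_SIZE - 1)

/* Rounds an MI position down to the origin of its superblock. */
int sb_origin_mi(int mi_pos) {
  assert(mi_pos >= 0);
  return mi_pos & ~SKETCH_MAX_MIB_MASK;
}
```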
Change-Id: Ie09cec6b7e8d765b7555ff5d80974aab60803f3a
Brings the following commits to vp10:
269428e Tie the bit cost scale to a define.
d13385c Switch to 9-bit rate cost constants built on a 256 probability denominator.
ad43a73 Fix a signed overflow in vp9 motion cost.
1c9b091 Fix some interger overflow errors
fac947d Restore previous motion search bit-error scale.
Change-Id: I598ba7ee7efcde18439c31dfa96b86cbf297a580
Various additional changes were made to make the experiment
compatible with misc_fixes.
derflr: +0.979%
hevcmr: +0.865%
With --enable-supertx the encoder is only about 10% slower than
without it. The decoder is about 30% slower.
Note this does not work with ext-tx or var-tx yet. That is
a TODO.
Change-Id: If25af4241a7a9efbd28f58eda3c4f044c7a7ef4b
1) Add VP10_XFORM_QUANT_SKIP_QUANT mode for vp10_xform_quant
2) Let encode_block call vp10_xform_quant so that its code flow
is clear
Change-Id: I122d5cf6a089f444ae018f3e4bf844be847e17ee
1) Add facades for the b/fp/dc versions of quantize so that their
interfaces are the same.
2) Merge the b/fp/dc versions of vp10_xform_quant into one function so
that the code flow in encodemb.c is clear
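
A minimal sketch of the facade pattern described above, not the actual
vp10 code. The quantizer bodies are simplified stand-ins and every name
(coeff_t, quant_facade_fn, sketch_xform_quant, the SKETCH_ enum) is
hypothetical. The point is only that one signature lets a single
function dispatch to all three variants.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t coeff_t; /* stand-in for tran_low_t */

/* Common facade signature shared by all three variants. */
typedef void (*quant_facade_fn)(const coeff_t *coeff, int n, int step,
                                coeff_t *qcoeff);

typedef enum {
  SKETCH_QUANT_B,
  SKETCH_QUANT_FP,
  SKETCH_QUANT_DC,
  SKETCH_QUANT_TYPES
} sketch_quant_type;

/* Quantizes one value by magnitude with the given rounding offset. */
static coeff_t quant_one(coeff_t c, int step, int round) {
  coeff_t mag = c < 0 ? -c : c;
  coeff_t q = (mag + round) / step;
  return c < 0 ? -q : q;
}

/* Stand-in "b" variant: truncating quantizer. */
void quant_b_facade(const coeff_t *coeff, int n, int step, coeff_t *qcoeff) {
  for (int i = 0; i < n; ++i) qcoeff[i] = quant_one(coeff[i], step, 0);
}

/* Stand-in "fp" variant: rounding quantizer. */
void quant_fp_facade(const coeff_t *coeff, int n, int step, coeff_t *qcoeff) {
  for (int i = 0; i < n; ++i) qcoeff[i] = quant_one(coeff[i], step, step / 2);
}

/* Stand-in "dc" variant: keeps only the first (DC) coefficient. */
void quant_dc_facade(const coeff_t *coeff, int n, int step, coeff_t *qcoeff) {
  for (int i = 0; i < n; ++i) qcoeff[i] = 0;
  if (n > 0) qcoeff[0] = quant_one(coeff[0], step, step / 2);
}

static const quant_facade_fn quant_table[SKETCH_QUANT_TYPES] = {
  quant_b_facade, quant_fp_facade, quant_dc_facade
};

/* Single entry point: the caller picks a variant, the flow stays linear. */
void sketch_xform_quant(sketch_quant_type type, const coeff_t *coeff, int n,
                        int step, coeff_t *qcoeff) {
  assert(type < SKETCH_QUANT_TYPES && step > 0);
  quant_table[type](coeff, n, step, qcoeff);
}
```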
Change-Id: Ib62d6215438fc2d07f4e7e72393f964832d6746f
Add inv_txfm and highbd_inv_txfm as facades of the inverse transforms
so that the code flow in encodemb.c is simpler
Change-Id: Iea45fd22dd8b173f8eb3919ca6502636f7bcfcf7
This patch eliminates the copying of data when using FLIPADST forward
transforms, by incorporating the necessary data flipping into the
load_buffer_* functions of the SSE2 optimized forward transforms. The
load_buffer_* functions are normally inlined, so the overhead of copying
the data is removed and the overhead of flipping is minimized. Left to
right flipping is still not free, as the columns need to be shuffled in
registers.
To preserve identity between the C and SSE2 implementations, the
appropriate C implementations now also do the data flipping as part of
the transform, rather than relying on the caller for flipping the input.
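
A scalar sketch of flipping while loading, in the spirit of the C-side
change described above. The name load_buffer_flip and its parameters
are hypothetical; the real load_buffer_* functions are SSE2 code and
are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Copies an n x n block from `input` (row stride `stride`) into `out`
 * (row-major, stride n), flipping up-down and/or left-right during the
 * load so that no separate flipped copy of the input is needed. */
void load_buffer_flip(const int16_t *input, int stride, int n, int flipud,
                      int fliplr, int16_t *out) {
  assert(n > 0 && stride >= n);
  for (int r = 0; r < n; ++r) {
    /* Up-down flip: read source rows bottom to top. */
    const int16_t *src = input + (flipud ? n - 1 - r : r) * stride;
    for (int c = 0; c < n; ++c) {
      /* Left-right flip: read source columns right to left. */
      out[r * n + c] = src[fliplr ? n - 1 - c : c];
    }
  }
}
```

The row flip only changes which source row is read. The column flip
reorders elements within a row, which corresponds to the in-register
shuffle cost mentioned for the SSE2 path.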
Overall speedup is about 1.5-2% in encode on my tests. Note that these
are only the forward transforms. Inverse transforms to come in a later
patch.
There are also a few code hygiene changes:
- Fixed some indents of switch statements.
- DCT_DCT transforms now always use vp10_fht* functions, which dispatch
to vpx_fdct* for DCT_DCT (some of them used to call vpx_fdct*
directly, some of them used to call vp10_fht*).
Change-Id: I93439257dc5cd104ac6129cfed45af142fb64574
Allows inter and intra tx_types to have different sets of
transforms for different tx_size/sb_type combinations.
Change-Id: Ic0ac1daef7a9fb15c4210271e4d04cd36e5cec8e
This commit fixes the merge conflicts between master and nextgenv2 and
disables early termination in choose_tx_size() to avoid test failures.
The test failures are pre-existing. Some of the issues were already
fixed in masterbase, so another merge will follow to bring in the fixes.
Change-Id: Ib71889661955e73aedbb4db49d8be70425281dcb
Some more testing of this patch would probably be useful, but I
think the basics of it should work fine now.
See issue 1035.
Change-Id: I4a36d58f671c5391cb09d564581784a00ed26245
This experiment allows using full above/right edges for all transform
sizes whenever available (for d45/d63), and adds bottom/left edges for
d207.
See issue 1043.
Change-Id: I5cf7f345e783e8539bb6b6d2c9972fb1d6d0a78b
This commit makes the rate-distortion optimization for chroma
component support the recursive transform block coding scheme.
Change-Id: I1bfed6d05b0ebb3905cb625222401e2ccbae10f3
This commit makes the transform, quantization, tokenization and
their corresponding inverse operations support recursive transform
block coding process.
Change-Id: I71f2ef3a7c2d3db7cfc63c1fd3f1337e8e0360b5
Add the row and column indices to the argument list of the unit
functions called by the foreach_transformed_block wrapper. This avoids
repeatedly deriving them from the block index.
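
A sketch of the calling convention, assuming a raster walk over
transform blocks measured in 4x4 units. The names foreach_txb_sketch,
visit_fn, visit_log and record_visit are hypothetical and only
illustrate passing the row and column alongside the block index.

```c
#include <assert.h>
#include <stddef.h>

/* Callback receives the block index plus its row and column directly. */
typedef void (*visit_fn)(int block, int blk_row, int blk_col, void *arg);

/* Walks a w4 x h4 region (in 4x4 units) in raster order using transform
 * blocks of tx4 x tx4 units. The walker already knows the row and
 * column, so the callback does not have to recompute them from `block`. */
void foreach_txb_sketch(int w4, int h4, int tx4, visit_fn visit, void *arg) {
  assert(w4 > 0 && h4 > 0 && tx4 > 0 && visit != NULL);
  int block = 0;
  for (int r = 0; r < h4; r += tx4) {
    for (int c = 0; c < w4; c += tx4) {
      visit(block, r, c, arg);
      block += tx4 * tx4; /* index advances by the 4x4 units covered */
    }
  }
}

/* Example callback: records how many blocks were seen and the last one. */
typedef struct {
  int count, last_block, last_row, last_col;
} visit_log;

void record_visit(int block, int blk_row, int blk_col, void *arg) {
  visit_log *log = (visit_log *)arg;
  log->count++;
  log->last_block = block;
  log->last_row = blk_row;
  log->last_col = blk_col;
}
```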
Change-Id: I42b3578eac258ebaba7a7c74f684de9abab521a6
Resolved Conflicts in the following files:
configure
vp10/common/idct.c
vp10/encoder/dct.c
vp10/encoder/encodemb.c
vp10/encoder/rdopt.c
Change-Id: I4cb3986b0b80de65c722ca29d53a0a57f5a94316