This commit enables a hybrid 1-D/2-D transform coding scheme and
the accompany entropy coding system. It currently uses hybrid
1-D/2-D DCT transform coding. It provides coding performance gains:
lowres_all 0.55%
hdres_all 0.43%
Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78
This sets up the interface for 3 speed features that progressively
eliminate a greater number of transforms in ext tx using
pre-trained support vector machines.
Each speed feature still needs to be implemented.
Change-Id: Ia508aeadc0cffdc080fb227f357a5d1dfbca08e2
This commit fixes an encoding issue related to var-tx and ref-mv
experiments that causes the codec to use random values for transform
block skip flag.
Change-Id: I8daa6d6b88ea45b5bbeb81b43dd0eeff545c8e5a
Make the RANS implementation operate on cumulative distribution
functions rather than individual probability distribution functions.
CDFs have shown themselves more flexible to work with.
Reduces decoding memory usage from scaling O(num_distributions *
symbol_resolution) to O(num_distributions).
No bitstream change. This is an purely implementation change.
Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
fdct16_sse2() was not bit-exact with C reference, fdct16().
The inconsistency was found by writing a unit test for
vp10_fht16x16_sse2(). Since the unit test needs a pending
change on the inherited base class. I will commit this unit
test after making a header file for this base class.
Passed the uncommitted unit test: vp10_fht16x16_test.cc.
Change-Id: If2b617883c633a3ea90c19e1d018240c8007102b
The sum of squared value of a block can overflow 32bit, this commit
changes to use int64_t to avoid the overflow issue.
Change-Id: I78fcd6999634f186f86d649cfce85d97a993d040
Up-sampled the reference frames to 8 times in each dimension using
the 8-tap interpolation filter. In sub-pixel motion search, use the
up-sampled reference frames to find the best matching blocks. This
largely improved the motion search precision, and thus, improved
the compression quality. There was no change in decoder side.
Borg test and speed test results:
1. On derflr set,
Overall PSNR gain: 1.306%, and SSIM gain: 1.512%.
Average speed loss on derf set was 6.0%.
2. On stdhd set,
Overall PSNR gain: 0.754%, and SSIM gain: 0.814%.
On hevchd set,
Overall PSNR gain: 0.465%, and SSIM gain: 0.527%.
Speed loss on HD clips was 3.5%.
Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f
Fixes some issues introduced by a merge of two patches.
Also decouples the temporal interpolation filter from the switchable
filters for now for ease of experimentation with both separately.
Change-Id: If1c7c08adf00e0cf818fe8d0d3656c26ea65eb32
Includes various cosmetic changes and refactoring including
naming the sharp filters differently (since they are no longer
8-tap).
Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a
The interintra experiment, which combines an inter prediction and an
inter prediction have been ported from the nextgen branch. The
experiment is merged into ext_inter, so there is no separate configure
option to enable it.
Change-Id: I0cc20cefd29e9b77ab7bbbb709abc11512320325
This commit uses 12-tap sharp filter to generate alter reference
frame. It improves the compression performance by
derf 0.45%
hevcmr 0.35%
stdhd 0.79%
No encoding time change is observed.
Change-Id: Ia5dc26d5aae6b9b0cb782e5a28dc5066eeeb2ec8
Implemented fdst8_sse2() function against C version: fdst8().
Added seven DST related hybrid transform types in vp10_fht8x8_sse2().
Replaced vp10_fht8x8_c() with vp10_fht8x8_sse2() in fwd_txfm_8x8().
Speedup: 18.1%, 11.5%, 22.0% based on speed test from
city_cif.y4m, garden_sif.y4m, mobile_cif.y4m.
Change-Id: Ia4aa1ea44c7a33e494f64ce843037f8703f975e3
Adds hooks to use 32x32 ext-tx. Also adds scan orders for the masked
transforms for 32x32.
Make macro USE_MSKTX_FOR_32X32 1 in blockd.h to support 32x32 masked
transforms for ext-tx.
Change-Id: Ie6564830266651fcafae2d536c274dafd664ce17
Instead of using model_rd_for_sb() to estimate the cost and make the
decision on bmc/obmc, we use super_block_yrd/uvrd() to calculate and
compare the real rd costs of bmc and obmc.
Average bit-rate reduction(%) of obmc experiment:
derflr/derfhd/hevcmr/hevchd
2.353/TBD/TBD/TBD
Before the optimization, the coding gain was:
1.582/1.109/1.600/1.164
Note: there is still some mysterious bug because that compared to
the previous version, the performance at low bit rate drops a lot.
Change-Id: I8dbee04a272190f10516a3953c1ae690f8136766
This commits adds a shift stage for FASTSSIM computaton when source
bit depth is different from working bit depth, to make sure metric
results are calculated in bit_depth consistent with source.
Change-Id: I997799634076ef7b00fd051710544681ed536185
Don't initialize first pass costs for a number of symbols where first
pass probabilities aren't initialized.
As a side effect, an illegal read in the ANS experiment is fixed.
https://bugs.chromium.org/p/webm/issues/detail?id=1089
Change-Id: I97438c357bd88f52f5a15c697031cf0c3cc8f510
This commit unifies the motion vector cost buffers for full pixel
and sub-pixel motion search. The new motion vector coding system
provides 0.5% coding gains for 720p and above sequences and 0.2%
for lower resolution sets.
Change-Id: I927ec81eadc39d11a3c12b375221a1ddd2e8bf24
This commit accounts for the context based probability model for
motion vector cost estimate in rate-distortion optimization.
Change-Id: Ia068a9395dcb4ecc348f128b17b8d24734660b83
This commit converts the scalar motion vector probability model
into vector format for later precise estimate.
Change-Id: I7008d047ecc1b9577aa8442b4db2df312be869dc