Implemented fdst8_sse2(), an SSE2 version of the C function fdst8().
Added seven DST related hybrid transform types in vp10_fht8x8_sse2().
Replaced vp10_fht8x8_c() with vp10_fht8x8_sse2() in fwd_txfm_8x8().
Speedup: 18.1%, 11.5%, 22.0% based on speed test from
city_cif.y4m, garden_sif.y4m, mobile_cif.y4m.
Change-Id: Ia4aa1ea44c7a33e494f64ce843037f8703f975e3
Adds hooks to use 32x32 ext-tx. Also adds scan orders for the masked
transforms for 32x32.
Set the macro USE_MSKTX_FOR_32X32 to 1 in blockd.h to support 32x32
masked transforms for ext-tx.
Change-Id: Ie6564830266651fcafae2d536c274dafd664ce17
These variable names were legacy from a previous version of this
function and in the current version they were confusingly backwards.
Change-Id: I4f6c1628f296fd5b650fd9c5e2d56d7daf66a3f6
This commit enables a context based motion vector entropy coding
conditioned on dynamic reference motion vector list. This (along with
the previous CL) improves the coding gains due to dynamic motion
vector referencing based entropy coding:
derf 0.1%
hevcmr 0.2%
stdhd 0.7%
hevchr 0.4%
No encoding time change was observed.
Change-Id: I179c723844079195f6952a12582996a3ca9e9914
Instead of using model_rd_for_sb() to estimate the cost and make the
decision on bmc/obmc, we use super_block_yrd/uvrd() to calculate and
compare the real rd costs of bmc and obmc.
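The decision described above can be sketched as follows. This is only an
illustration: the function names, the enum, and the fixed-point cost
formula J = D + lambda * R are assumptions made for the example, and the
rate and distortion inputs stand in for whatever super_block_yrd/uvrd()
would report for each mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the encoder's own types. */
typedef enum { MODE_BMC_SKETCH = 0, MODE_OBMC_SKETCH = 1 } MotionModeSketch;

/* Assumed Lagrangian cost J = D + lambda * R, with lambda stored in Q8
 * fixed point (256 means lambda == 1.0) and rounded to nearest. */
static int64_t rd_cost_sketch(int64_t rate, int64_t dist, int64_t lambda_q8) {
  assert(rate >= 0 && dist >= 0 && lambda_q8 >= 0);
  return dist + ((rate * lambda_q8 + 128) >> 8);
}

/* Pick the mode whose full (not model-estimated) RD cost is lower.
 * Ties keep the simpler BMC mode. */
static MotionModeSketch pick_motion_mode_sketch(int64_t rate_bmc,
                                                int64_t dist_bmc,
                                                int64_t rate_obmc,
                                                int64_t dist_obmc,
                                                int64_t lambda_q8) {
  const int64_t j_bmc = rd_cost_sketch(rate_bmc, dist_bmc, lambda_q8);
  const int64_t j_obmc = rd_cost_sketch(rate_obmc, dist_obmc, lambda_q8);
  return j_obmc < j_bmc ? MODE_OBMC_SKETCH : MODE_BMC_SKETCH;
}
```

With a small lambda the lower distortion of one mode tends to win; with a
large lambda the extra rate can outweigh it and flip the choice.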
Average bit-rate reduction(%) of obmc experiment:
derflr/derfhd/hevcmr/hevchd
2.353/TBD/TBD/TBD
Before the optimization, the coding gain was:
1.582/1.109/1.600/1.164
Note: there is still an unidentified bug: compared to the previous
version, the performance at low bit rates drops a lot.
Change-Id: I8dbee04a272190f10516a3953c1ae690f8136766
This commit adds a shift stage for FASTSSIM computation when the source
bit depth is different from the working bit depth, to make sure metric
results are calculated at a bit_depth consistent with the source.
Change-Id: I997799634076ef7b00fd051710544681ed536185
This commit adds the ability to shift down the working buffer when
source bit_depth is different from working bit_depth. It does so
by shifting down to be consistent with source bit_depth.
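A minimal sketch of such a shift-down step, assuming 16-bit sample
storage and a plain truncating right shift. The helper name and its
parameters are hypothetical and are not taken from the codec sources;
whether the real code rounds or truncates is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts n samples in place from working_bd bits
 * down to source_bd bits by dropping the low-order bits. */
static void shift_down_to_source_bd(uint16_t *buf, size_t n,
                                    int working_bd, int source_bd) {
  assert(buf != NULL);
  assert(working_bd >= source_bd && source_bd > 0 && working_bd <= 16);
  const int shift = working_bd - source_bd;
  for (size_t i = 0; i < n; ++i) {
    buf[i] = (uint16_t)(buf[i] >> shift);
  }
}
```

For example, with working_bd = 10 and source_bd = 8 each sample is shifted
right by 2, so a 10-bit value of 1023 becomes the 8-bit value 255.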
Change-Id: Idfdbfc614d73fe445d62e35e642cc7d75e9dc4ff
Don't initialize first pass costs for a number of symbols where first
pass probabilities aren't initialized.
As a side effect, an illegal read in the ANS experiment is fixed.
https://bugs.chromium.org/p/webm/issues/detail?id=1089
Change-Id: I97438c357bd88f52f5a15c697031cf0c3cc8f510
This commit unifies the motion vector cost buffers for full pixel
and sub-pixel motion search. The new motion vector coding system
provides 0.5% coding gains for 720p and above sequences and 0.2%
for lower resolution sets.
Change-Id: I927ec81eadc39d11a3c12b375221a1ddd2e8bf24
This commit extends the HBDMetricTests to handle testing for metric
computation where input source depth is different from working bit
depth.
Change-Id: I5d11101cc9603a3fd09e8439816bb982a0f1b654
Previously, we did 12-tap interpolation even when there was no sub
pixel. This could cause a bug because the decoder doesn't extend the
border when there is no sub pixel. In this situation, if we still do
interpolation, we will access the border extension, which doesn't
exist, and cause a memory error.
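The check this implies can be sketched as below. The struct, the helper
name, and the 1/8-pel motion vector precision (3 fractional bits) are
assumptions for illustration only; the actual precision and data types
in the codec may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed precision: motion vectors in 1/8-pel units. */
#define SUBPEL_BITS_SKETCH 3
#define SUBPEL_MASK_SKETCH ((1 << SUBPEL_BITS_SKETCH) - 1)

/* Hypothetical motion vector type for this example. */
typedef struct {
  int16_t row;
  int16_t col;
} MvSketch;

/* Returns true only if the vector has a fractional part in either
 * direction. When it returns false, the prediction is a plain block copy
 * and the interpolation filter (and thus the border extension it would
 * read) is not needed. */
static bool mv_has_subpel_sketch(const MvSketch *mv) {
  assert(mv != NULL);
  return (mv->row & SUBPEL_MASK_SKETCH) != 0 ||
         (mv->col & SUBPEL_MASK_SKETCH) != 0;
}
```

A caller would run the 12-tap filter only when this returns true and copy
the reference block directly otherwise.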
Change-Id: I55b879722f0a10c5d13261bd9617a75c826a2418
This commit accounts for the context based probability model for
motion vector cost estimate in rate-distortion optimization.
Change-Id: Ia068a9395dcb4ecc348f128b17b8d24734660b83
This commit converts the scalar motion vector probability model
into vector format for later precise estimate.
Change-Id: I7008d047ecc1b9577aa8442b4db2df312be869dc
This fixes a bug in HBD sum of squared error computation introduced
in #abd00505d1c658cc106bad51369197270a299f92.
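For reference, a sum of squared error over high-bit-depth samples can be
sketched as below. This is a generic illustration, not the fixed code
and not a description of the bug: the function name is hypothetical, and
16-bit sample storage with a 64-bit accumulator is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine: sum of squared differences between two
 * buffers of n high-bit-depth samples stored as uint16_t. */
static uint64_t hbd_sse_sketch(const uint16_t *a, const uint16_t *b,
                               size_t n) {
  assert(a != NULL && b != NULL);
  uint64_t sse = 0;
  for (size_t i = 0; i < n; ++i) {
    /* Widen before subtracting so the difference can be negative. */
    const int32_t diff = (int32_t)a[i] - (int32_t)b[i];
    /* Square in 64 bits so large differences cannot overflow. */
    sse += (uint64_t)((int64_t)diff * diff);
  }
  return sse;
}
```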
Change-Id: I9d4e8627eb8ea491bac44794c40c7f1e6ba135dc