1015 Commits

Author SHA1 Message Date
Jingning Han
a8dc9694a4 Hybrid 1-D/2-D transform coding
This commit enables a hybrid 1-D/2-D transform coding scheme and
the accompanying entropy coding system. It currently uses hybrid
1-D/2-D DCT transform coding. It provides coding performance gains:

lowres_all  0.55%
hdres_all   0.43%

Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78
2016-03-07 09:27:46 -08:00
Sarah Parker
df3849370a Merge "Adding speed feature interface for ext tx search" into nextgenv2 2016-03-07 16:32:55 +00:00
Sarah Parker
2ca7d42e7e Adding speed feature interface for ext tx search
This sets up the interface for 3 speed features that progressively
eliminate a greater number of transforms in ext tx using
pre-trained support vector machines.
Each speed feature still needs to be implemented.

Change-Id: Ia508aeadc0cffdc080fb227f357a5d1dfbca08e2
2016-03-04 10:27:21 -08:00
Jingning Han
04cb49385e Merge "Properly restore transform block skip flag in RD search" into nextgenv2 2016-03-03 23:30:58 +00:00
Jingning Han
7174d637e8 Properly restore transform block skip flag in RD search
This commit fixes an encoding issue related to var-tx and ref-mv
experiments that causes the codec to use random values for transform
block skip flag.

Change-Id: I8daa6d6b88ea45b5bbeb81b43dd0eeff545c8e5a
2016-03-03 13:52:49 -08:00
Yi Luo
6231b6b077 Merge "Fixed a computation bug in fdct16_sse2()" into nextgenv2 2016-03-03 20:05:36 +00:00
Alex Converse
6bbbe31656 ANS: Switch from PDFs to CDFs.
Make the RANS implementation operate on cumulative distribution
functions rather than individual probability distribution functions.
CDFs have shown themselves more flexible to work with.

Reduces decoding memory usage from scaling O(num_distributions *
symbol_resolution) to O(num_distributions).

No bitstream change. This is purely an implementation change.

Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
2016-03-03 09:32:54 +00:00
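
As an illustration of the CDF-based lookup described in 6bbbe31656, here is a minimal sketch of one rANS decode step driven by a cumulative table. It is not taken from the tree: the names RANS_PRECISION, cdf_find_symbol and rans_decode_step, the table layout (n_symbols + 1 entries, starting at 0 and ending at the total), and the omission of renormalization are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical total frequency (the "symbol resolution"). */
#define RANS_PRECISION 1024u

/* cdf has n_symbols + 1 entries: cdf[0] == 0, cdf[n_symbols] == RANS_PRECISION.
 * Returns the symbol s with cdf[s] <= slot < cdf[s + 1] by linear search,
 * so no per-distribution slot-to-symbol table is needed. */
static int cdf_find_symbol(const uint16_t *cdf, int n_symbols, uint32_t slot) {
  int s = 0;
  assert(slot < RANS_PRECISION);
  while (s + 1 < n_symbols && cdf[s + 1] <= slot) ++s;
  return s;
}

/* One rANS decode step (renormalization omitted for brevity). */
static int rans_decode_step(uint32_t *state, const uint16_t *cdf,
                            int n_symbols) {
  const uint32_t slot = *state % RANS_PRECISION;
  const int s = cdf_find_symbol(cdf, n_symbols, slot);
  const uint32_t start = cdf[s];            /* cumulative frequency below s */
  const uint32_t freq = cdf[s + 1] - start; /* frequency of s */
  *state = freq * (*state / RANS_PRECISION) + slot - start;
  return s;
}
```

In this sketch the symbol is found by searching the cumulative table directly, which is one way a decoder can avoid a lookup table whose size grows with the symbol resolution.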
Yi Luo
68d6a5073a Fixed a computation bug in fdct16_sse2()
fdct16_sse2() was not bit-exact with C reference, fdct16().
The inconsistency was found by writing a unit test for
vp10_fht16x16_sse2(). Since the unit test needs a pending
change on the inherited base class, I will commit this unit
test after making a header file for this base class.
Passed the uncommitted unit test: vp10_fht16x16_test.cc.

Change-Id: If2b617883c633a3ea90c19e1d018240c8007102b
2016-03-02 15:20:12 -08:00
Yunqing Wang
84f982080a Minor fix in header files
Move functions to be included in extern "C".

Change-Id: If57fa5eb7955763cf99e6839dde4d7221fad75ea
2016-03-01 13:16:03 -08:00
Yaowu Xu
5c613ea881 Fix an overflow issue for HBD
The sum of squared values of a block can overflow 32 bits; this commit
switches to int64_t to avoid the overflow.

Change-Id: I78fcd6999634f186f86d649cfce85d97a993d040
2016-03-01 09:44:04 -08:00
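
To make the overflow in 5c613ea881 concrete: with 12-bit samples a single squared value can reach 4095 * 4095 = 16,769,025, and a 64x64 block has 4096 of them, so the sum can reach 68,685,926,400, well beyond INT32_MAX (2,147,483,647). The sketch below shows a 64-bit accumulator. The function name sum_squares_i64 and the int16_t input type are assumptions for illustration, not the code changed by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums squared values using a 64-bit accumulator.
 * Each product is widened to int64_t before the multiply, so neither the
 * individual squares nor the running total can wrap at 32 bits. */
static int64_t sum_squares_i64(const int16_t *values, size_t n) {
  int64_t sse = 0;
  size_t i;
  assert(values != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    sse += (int64_t)values[i] * values[i];
  }
  return sse;
}
```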
Yunqing Wang
342a368fd4 Do sub-pixel motion search in up-sampled reference frames
Up-sample the reference frames by a factor of 8 in each dimension using
the 8-tap interpolation filter. In sub-pixel motion search, use the
up-sampled reference frames to find the best matching blocks. This
substantially improves the motion search precision and, in turn, the
compression quality. There is no change on the decoder side.

Borg test and speed test results:
1. On derflr set,
Overall PSNR gain: 1.306%, and SSIM gain: 1.512%.
Average speed loss on derf set was 6.0%.
2. On stdhd set,
Overall PSNR gain: 0.754%, and SSIM gain: 0.814%.
On hevchd set,
Overall PSNR gain: 0.465%, and SSIM gain: 0.527%.
Speed loss on HD clips was 3.5%.

Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f
2016-02-29 12:14:47 -08:00
Debargha Mukherjee
db084506d8 A build fix and some other cosmetic changes
Fixes some issues introduced by a merge of two patches.
Also decouples the temporal interpolation filter from the switchable
filters for now for ease of experimentation with both separately.

Change-Id: If1c7c08adf00e0cf818fe8d0d3656c26ea65eb32
2016-02-29 10:20:52 -08:00
Debargha Mukherjee
48589e8d07 Merge "Some refactoring and cleanups of interp filter" into nextgenv2 2016-02-29 15:55:48 +00:00
Jingning Han
0fc0c1a32d Merge "Enable improved temporal filter in ext-interp experiment" into nextgenv2 2016-02-27 01:22:15 +00:00
Debargha Mukherjee
bab2912b5e Some refactoring and cleanups of interp filter
Includes various cosmetic changes and refactoring including
naming the sharp filters differently (since they are no longer
8-tap).

Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a
2016-02-26 15:42:49 -08:00
Jingning Han
95d35a4a0b Enable improved temporal filter in ext-interp experiment
It improves the coding performance by 0.3%.

Change-Id: I9703abd705ceacdf9e7424428e5120253cadcc18
2016-02-26 21:59:51 +00:00
Geza Lore
7ded038af5 Port interintra experiment from nextgen.
The interintra experiment, which combines an inter prediction and an
intra prediction, has been ported from the nextgen branch. The
experiment is merged into ext_inter, so there is no separate configure
option to enable it.

Change-Id: I0cc20cefd29e9b77ab7bbbb709abc11512320325
2016-02-26 13:01:51 -08:00
Debargha Mukherjee
3287f5519e Merge "Hooks to use 32x32 masked transforms for ext-tx" into nextgenv2 2016-02-26 20:54:37 +00:00
Yi Luo
b347c3c5e5 Merge "Implemented DST 8x8 with SSE2 intrinsics." into nextgenv2 2016-02-26 19:10:00 +00:00
Jingning Han
2b7196a8bb Merge "Use sharp filter for alter reference frame generation" into nextgenv2 2016-02-26 16:24:59 +00:00
Yaowu Xu
a570cefcf8 Merge "Extend vpxssim to handle more HBD combinations" into nextgenv2 2016-02-26 15:57:40 +00:00
Jingning Han
72eda13e50 Use sharp filter for alter reference frame generation
This commit uses the 12-tap sharp filter to generate the alternate
reference frame. It improves the compression performance by
derf    0.45%
hevcmr  0.35%
stdhd   0.79%

No encoding time change is observed.

Change-Id: Ia5dc26d5aae6b9b0cb782e5a28dc5066eeeb2ec8
2016-02-25 14:20:38 -08:00
James Zern
ac4c37c684 vp9/10: fix forced keyframes w/alt-refs enabled
in 1-pass encodes. issues with 2-pass as well as other forced flags
persist.

Change-Id: Ic7ceb906fccea6456d5df96483c10cacd46e01c7
2016-02-24 15:56:37 -08:00
Yi Luo
0353f596e9 Implemented DST 8x8 with SSE2 intrinsics.
Implemented fdst8_sse2() function against C version: fdst8().
Added seven DST related hybrid transform types in vp10_fht8x8_sse2().
Replaced vp10_fht8x8_c() with vp10_fht8x8_sse2() in fwd_txfm_8x8().
Speedup: 18.1%, 11.5%, 22.0% based on speed test from
city_cif.y4m, garden_sif.y4m, mobile_cif.y4m.

Change-Id: Ia4aa1ea44c7a33e494f64ce843037f8703f975e3
2016-02-24 14:58:01 -08:00
Debargha Mukherjee
da2d4a7afc Hooks to use 32x32 masked transforms for ext-tx
Adds hooks to use 32x32 ext-tx. Also adds scan orders for the masked
transforms for 32x32.
Set the macro USE_MSKTX_FOR_32X32 to 1 in blockd.h to support 32x32
masked transforms for ext-tx.

Change-Id: Ie6564830266651fcafae2d536c274dafd664ce17
2016-02-24 13:08:37 -08:00
Debargha Mukherjee
389efb289e Adds an utility macro ROUNDZ_POWER_OF_TWO
This macro works for the shift parameter being 0.
The ROUND_POWER_OF_TWO macro does not.

Change-Id: I8434d2933892e09bbc0d2dafc934d0c3637df347
2016-02-24 12:35:29 -08:00
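
The difference described in 389efb289e can be sketched as follows. A round-then-shift macro that adds 1 << (n - 1) before shifting is undefined for n == 0, because it shifts by -1; a zero-safe variant can return the value unchanged in that case. The definitions below are assumed forms written for illustration, not copied from the tree, and roundz_checked is a hypothetical wrapper.

```c
#include <assert.h>

/* Assumed conventional form: rounds value / 2^n to nearest.
 * Undefined for n == 0 because of the 1 << -1 term. */
#define ROUND_POWER_OF_TWO(value, n) (((value) + (1 << ((n) - 1))) >> (n))

/* Assumed zero-safe form: falls back to the unshifted value when n == 0. */
#define ROUNDZ_POWER_OF_TWO(value, n) \
  ((n) ? (((value) + (1 << ((n) - 1))) >> (n)) : (value))

/* Hypothetical checked wrapper for non-negative values and 0 <= n <= 30. */
static int roundz_checked(int value, int n) {
  assert(value >= 0);
  assert(n >= 0 && n <= 30);
  return ROUNDZ_POWER_OF_TWO(value, n);
}
```

With these definitions, a caller whose shift amount is computed at run time and may be 0 can use the zero-safe form without a separate branch.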
Yaowu Xu
aa6c754635 Merge remote-tracking branch 'webm/master' into nextgenv2 2016-02-24 10:53:17 -08:00
Debargha Mukherjee
c1e51beba6 Merge "Experiment to use image domain dist in baseline." into nextgenv2 2016-02-24 18:30:50 +00:00
Yue Chen
02e734168c Merge "Optimizing obmc rd decision by checking the real rd cost" into nextgenv2 2016-02-23 23:05:06 +00:00
Yue Chen
a614262edb Optimizing obmc rd decision by checking the real rd cost
Instead of using model_rd_for_sb() to estimate the cost and make the
decision on bmc/obmc, we use super_block_yrd/uvrd() to calculate and
compare the real rd costs of bmc and obmc.

Average bit-rate reduction(%) of obmc experiment:
derflr/derfhd/hevcmr/hevchd
2.353/TBD/TBD/TBD
Before the optimization, the coding gain was:
1.582/1.109/1.600/1.164

Note: there is still an unidentified bug: compared to the previous
version, the performance at low bit rates drops a lot.

Change-Id: I8dbee04a272190f10516a3953c1ae690f8136766
2016-02-23 14:16:12 -08:00
Alex Converse
05f33142f5 Merge "Port "Better workaround for Bug 1089." to vp10 (nextgenv2)." into nextgenv2 2016-02-23 17:53:57 +00:00
Geza Lore
3c4b56c4dd Experiment to use image domain dist in baseline.
Change-Id: Ib29f510289716b5ab5c7d74d32a450c190308a83
2016-02-23 09:35:40 -08:00
Yaowu Xu
272dbaa13f Merge "Cleanup psnr.h" into nextgenv2 2016-02-23 17:13:34 +00:00
Angie Chiang
5340d1424d Merge "Merge 12sharp filter into ext-interp" into nextgenv2 2016-02-23 01:26:23 +00:00
Yaowu Xu
ec6b8d8b76 Merge "Add shift stage in FASTSSIM computation" into nextgenv2 2016-02-23 00:43:18 +00:00
Angie Chiang
e4af6a42a7 Merge 12sharp filter into ext-interp
Change-Id: I7df48e7f3b57f212798ef4be86f28aed928fc3e0
2016-02-22 16:26:38 -08:00
Yaowu Xu
eeaf8e6b6c Extend vpxssim to handle more HBD combinations
Change-Id: I38426d946b74c9090a265d34b89e2db6693927c2
2016-02-22 16:09:08 -08:00
Yaowu Xu
38cfc45e07 Cleanup psnr.h
Change-Id: Id026e72ee655ee5bd645a89e378da0d462be367d
2016-02-22 15:37:40 -08:00
Yaowu Xu
d1c5cd4a30 Add shift stage in FASTSSIM computation
This commit adds a shift stage for FASTSSIM computation when the source
bit depth is different from the working bit depth, to make sure metric
results are calculated at a bit depth consistent with the source.

Change-Id: I997799634076ef7b00fd051710544681ed536185
2016-02-22 14:58:10 -08:00
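
One way to picture the shift stage from d1c5cd4a30 is a pre-pass that brings samples from the working bit depth back down to the source bit depth before the metric runs. The sketch below assumes a plain truncating right shift and 16-bit sample buffers; the function name shift_to_source_depth and its parameters are hypothetical, and the actual commit may apply the shift differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-pass: maps samples stored at working_bd bits down to
 * source_bd bits with a truncating right shift (no rounding assumed). */
static void shift_to_source_depth(const uint16_t *src, uint16_t *dst,
                                  size_t n, int working_bd, int source_bd) {
  size_t i;
  int shift;
  assert(working_bd >= source_bd);
  shift = working_bd - source_bd;
  for (i = 0; i < n; ++i) {
    dst[i] = (uint16_t)(src[i] >> shift);
  }
}
```

When the two depths match, the shift is 0 and the samples pass through unchanged.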
Yaowu Xu
af3a8381ef Merge "Move psnrhvs function declaration to psnr.h" into nextgenv2 2016-02-22 18:46:39 +00:00
Alex Converse
9fce131de8 Port "Better workaround for Bug 1089." to vp10 (nextgenv2).
Don't initialize first pass costs for a number of symbols where first
pass probabilities aren't initialized.

As a side effect, an illegal read in the ANS experiment is fixed.

https://bugs.chromium.org/p/webm/issues/detail?id=1089

Change-Id: I97438c357bd88f52f5a15c697031cf0c3cc8f510
2016-02-22 10:19:03 -08:00
Jingning Han
404c512786 Merge "Unify motion vector cost system" into nextgenv2 2016-02-22 17:38:00 +00:00
Jingning Han
a10814e11e Merge "Account context based prob model for motion vector cost estimate" into nextgenv2 2016-02-22 17:37:42 +00:00
Jingning Han
1f984a5a63 Merge "Vectorize motion vector probability models" into nextgenv2 2016-02-22 17:37:29 +00:00
Jingning Han
682dad0ec7 Merge "Store predicted motion vectors" into nextgenv2 2016-02-22 17:14:05 +00:00
Yaowu Xu
6e695da2d9 Move psnrhvs function declaration to psnr.h
From "ssim.h"

Change-Id: Ie53378794149ef8a844b4eb47ad4f08579de4b60
2016-02-22 08:38:49 -08:00
Jingning Han
fec5988657 Unify motion vector cost system
This commit unifies the motion vector cost buffers for full pixel
and sub-pixel motion search. The new motion vector coding system
provides 0.5% coding gains for 720p and above sequences and 0.2%
for lower resolution sets.

Change-Id: I927ec81eadc39d11a3c12b375221a1ddd2e8bf24
2016-02-21 22:21:28 -08:00
Jingning Han
03c01bc3c0 Account context based prob model for motion vector cost estimate
This commit accounts for the context based probability model for
motion vector cost estimate in rate-distortion optimization.

Change-Id: Ia068a9395dcb4ecc348f128b17b8d24734660b83
2016-02-19 16:32:51 -08:00
Yi Luo
961668c91c Merge "Initial SSE2 function fdst4_sse2()." into nextgenv2 2016-02-20 00:31:31 +00:00
Jingning Han
df59bb8986 Vectorize motion vector probability models
This commit converts the scalar motion vector probability model
into vector format for later precise estimate.

Change-Id: I7008d047ecc1b9577aa8442b4db2df312be869dc
2016-02-19 16:20:41 -08:00