Commit Graph

15670 Commits

Author SHA1 Message Date
Jingning Han
79c5a533cd Merge "Hybrid 1-D/2-D transform coding" into nextgenv2 2016-03-07 19:15:44 +00:00
Jingning Han
a8dc9694a4 Hybrid 1-D/2-D transform coding
This commit enables a hybrid 1-D/2-D transform coding scheme and
the accompany entropy coding system. It currently uses hybrid
1-D/2-D DCT transform coding. It provides coding performance gains:

lowres_all  0.55%
hdres_all   0.43%

Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78
2016-03-07 09:27:46 -08:00
Sarah Parker
df3849370a Merge "Adding speed feature interface for ext tx search" into nextgenv2 2016-03-07 16:32:55 +00:00
Debargha Mukherjee
1815961469 Merge "Add 128 pixel variance and SAD functions" into nextgenv2 2016-03-07 16:02:05 +00:00
Geza Lore
938b8dfc73 Extend convolution functions to 128x128 for ext-partition.
Change-Id: I7f7e26cd1d58eb38417200550c6fbf4108c9f942
2016-03-07 11:39:27 +00:00
Hui Su
5e5bef6c18 Merge "Cleanup in get_uv_tx_size" into nextgenv2 2016-03-05 07:42:26 +00:00
hui su
c3c1c6f405 Cleanup in get_uv_tx_size
Change-Id: Ia2aa7558f9f53da7dff970b30fe0a94958159ffb
2016-03-04 16:53:19 -08:00
Yue Chen
10cdeab42a Fix a bug in obmc prediction
For left side obmc, the input of the mask function is corrected as
the column coordinate.
Also, minor fixes for a compiler warning.

Change-Id: Ia981ef443d5b0285a93d73e5c7ab83f8c3a23464
2016-03-04 15:54:14 -08:00
Yi Luo
267f73a1f7 Added vp10_fht4x4_sse2() unit test
Inherited class TransformTestBase to derived class VP10Trans4x4HT.
Employed RunCoeffCheck() to test vp10_fht4x4_sse2() against
C reference vp10_fht4x4_c().
fdst4_sse2() related seven hybrid transform cases are covered
 in this test.
Wrote a header file for test base class. Some modification to
make sure the base class can be used for 8x8, 16x16, 32x32 cases.
All related tests passed.

Change-Id: I6b19a39d3ea30b657847781e78e73b829998a57a
2016-03-04 14:19:30 -08:00
Sarah Parker
2ca7d42e7e Adding speed feature interface for ext tx search
This sets up the interface for 3 speed features that progressively
eliminate a greater number of transforms in ext tx using
pre-trained support vector machines.
Each speed feature still needs to be implemented.

Change-Id: Ia508aeadc0cffdc080fb227f357a5d1dfbca08e2
2016-03-04 10:27:21 -08:00
Jingning Han
351ca31238 Merge "Apply mv precision check to reference mv candidate" into nextgenv2 2016-03-04 16:54:27 +00:00
Jingning Han
04cb49385e Merge "Properly restore transform block skip flag in RD search" into nextgenv2 2016-03-03 23:30:58 +00:00
Jingning Han
7174d637e8 Properly restore transform block skip flag in RD search
This commit fixes an encoding issue related to var-tx and ref-mv
experiments that causes the codec to use random values for transform
block skip flag.

Change-Id: I8daa6d6b88ea45b5bbeb81b43dd0eeff545c8e5a
2016-03-03 13:52:49 -08:00
Yi Luo
6231b6b077 Merge "Fixed a computation bug in fdct16_sse2()" into nextgenv2 2016-03-03 20:05:36 +00:00
Debargha Mukherjee
7d2618bc70 Make sharp filter 10 tap and makes sharp2 sharper
There is a ~0.1% gain.

Various experiments with different kinds of windowing functions to
follow.

Change-Id: I0787fddca53607ab39e53f919066839301938e68
2016-03-03 12:01:55 -08:00
Geza Lore
697bf5beff Add 128 pixel variance and SAD functions
Change-Id: I8fde245b32c9e586683a28aa6925da0b83850b39
2016-03-03 10:24:29 +00:00
Alex Converse
6bbbe31656 ANS: Switch from PDFs to CDFs.
Make the RANS implementation operate on cumulative distribution
functions rather than individual probability distribution functions.
CDFs have shown themselves more flexible to work with.

Reduces decoding memory usage from scaling O(num_distributions *
symbol_resolution) to O(num_distributions).

No bitstream change. This is an purely implementation change.

Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
2016-03-03 09:32:54 +00:00
Jingning Han
13fb7c1b88 Apply mv precision check to reference mv candidate
This allows the codec to use effective motion vector as the candidate
to produce the reference motion vector list.

Change-Id: Ib90be705fe28200c13376d6d7741800a61f13043
2016-03-02 20:14:07 -08:00
Yi Luo
68d6a5073a Fixed a computation bug in fdct16_sse2()
fdct16_sse2() was not bit-exact with C reference, fdct16().
The inconsistency was found by writing a unit test for
vp10_fht16x16_sse2().  Since the unit test needs a pending
change on the inherited base class.  I will commit this unit
test after making a header file for this base class.
Passed the uncommitted unit test: vp10_fht16x16_test.cc.

Change-Id: If2b617883c633a3ea90c19e1d018240c8007102b
2016-03-02 15:20:12 -08:00
hui su
ebc6e058db Fix a bug in vp10_predict_intra_block
Avoid mistakenly setting "have_right" as 0 for UV channel in blocks
of width no larger than 8.

Change-Id: Ic2b031e32f967a23fd118a052bf9edd7d5a3abe6
2016-03-02 11:22:09 -08:00
Debargha Mukherjee
339ef0ce7a Merge "Adds masked variance and sad functions for wedge" into nextgenv2 2016-03-02 03:28:39 +00:00
Debargha Mukherjee
1d69ceee5c Adds masked variance and sad functions for wedge
Adds masked variance and sad functions needed for wedge
prediction modes to come.

Change-Id: I25b231bbc345e6a494316abb0a7d5cd5586a3a54
2016-03-01 17:28:56 -08:00
Yaowu Xu
9425616615 Merge "Fix a unused function warning with var_tx on" into nextgenv2 2016-03-02 01:11:17 +00:00
Hui Su
90fe1cffbf Merge "Fix a couple of minor bugs in vp10_has_right and vp10_has_bottom" into nextgenv2 2016-03-02 00:33:38 +00:00
Yunqing Wang
84f982080a Minor fix in header files
Move functions to be included in extern "C".

Change-Id: If57fa5eb7955763cf99e6839dde4d7221fad75ea
2016-03-01 13:16:03 -08:00
Yaowu Xu
3d89d059dc Merge "Fix an overflow issue for HBD" into nextgenv2 2016-03-01 19:22:48 +00:00
Yaowu Xu
0cfa89c0eb Fix a unused function warning with var_tx on
Change-Id: I1e65d7e1586d8c7c65bb150b1a928cf3adf97366
2016-03-01 11:05:48 -08:00
hui su
935a837c01 Fix a couple of minor bugs in vp10_has_right and vp10_has_bottom
The above-right and left-bottom pixels were sometimes not used even
though they are available. Results on lowres_all and hdres_all are
mostly neutral.

Change-Id: Ic13533dd498442ad5592b83bb5fabf053cc8e8f0
2016-03-01 10:09:04 -08:00
Yaowu Xu
5c613ea881 Fix an overflow issue for HBD
The sum of squared value of a block can overflow 32bit, this commit
changes to use int64_t to avoid the overflow issue.

Change-Id: I78fcd6999634f186f86d649cfce85d97a993d040
2016-03-01 09:44:04 -08:00
Angie Chiang
7667733991 Update obmc counts in multithread mode
Change-Id: I0743e00dad9d36a87870c480922f5ae904bd5c9d
2016-02-29 17:09:02 -08:00
Yunqing Wang
342a368fd4 Do sub-pixel motion search in up-sampled reference frames
Up-sampled the reference frames to 8 times in each dimension using
the 8-tap interpolation filter. In sub-pixel motion search, use the
up-sampled reference frames to find the best matching blocks. This
largely improved the motion search precision, and thus, improved
the compression quality. There was no change in decoder side.

Borg test and speed test results:
1. On derflr set,
Overall PSNR gain: 1.306%, and SSIM gain: 1.512%.
Average speed loss on derf set was 6.0%.
2. On stdhd set,
Overall PSNR gain: 0.754%, and SSIM gain: 0.814%.
On hevchd set,
Overall PSNR gain: 0.465%, and SSIM gain: 0.527%.
Speed loss on HD clips was 3.5%.

Change-Id: I300ebaafff57e88914f3dedc8784cb21d316b04f
2016-02-29 12:14:47 -08:00
Debargha Mukherjee
db084506d8 A build fix and some other cosmetic changes
Fixes some issues introduced by a merge of two patches.
Also decouples the temporal interpolation filter from the switchable
filters for now for ease of experimentation with both separately.

Change-Id: If1c7c08adf00e0cf818fe8d0d3656c26ea65eb32
2016-02-29 10:20:52 -08:00
Debargha Mukherjee
48589e8d07 Merge "Some refactoring and cleanups of interp filter" into nextgenv2 2016-02-29 15:55:48 +00:00
Hui Su
95428a5926 Merge "Fix compiler warnings" into nextgenv2 2016-02-27 05:04:02 +00:00
Jingning Han
0fc0c1a32d Merge "Enable improved temporal filter in ext-interp experiment" into nextgenv2 2016-02-27 01:22:15 +00:00
Jingning Han
dca86af8f4 Merge "Unify frame border extension operation" into nextgenv2 2016-02-27 01:22:03 +00:00
Debargha Mukherjee
bab2912b5e Some refactoring and cleanups of interp filter
Includes various cosmetic changes and refactoring including
naming the sharp filters differently (since they are no longer
8-tap).

Change-Id: Ida5a19ca0daa9f6a64a6734394c685b2a4a2564a
2016-02-26 15:42:49 -08:00
Jingning Han
95d35a4a0b Enable improved temporal filter in ext-interp experiment
It improves the coding performance by 0.3%.

Change-Id: I9703abd705ceacdf9e7424428e5120253cadcc18
2016-02-26 21:59:51 +00:00
Jingning Han
d1d11fc6dd Unify frame border extension operation
This commit unifies the encoder and decoder border extension and
motion compensated prediction process. Remove the decoder specific
flow to simplify the development flow.

Change-Id: I9c43bbe6d7c017e6da2db6a62c5bf3d0af7ccfce
2016-02-26 13:58:53 -08:00
hui su
4aeabf1b0d Fix compiler warnings
Change-Id: Id7240260cec471a3f8d0986b9c8df06efda925f9
2016-02-26 13:52:49 -08:00
Geza Lore
7ded038af5 Port interintra experiment from nextgen.
The interintra experiment, which combines an inter prediction and an
inter prediction have been ported from the nextgen branch. The
experiment is merged into ext_inter, so there is no separate configure
option to enable it.

Change-Id: I0cc20cefd29e9b77ab7bbbb709abc11512320325
2016-02-26 13:01:51 -08:00
Debargha Mukherjee
3287f5519e Merge "Hooks to use 32x32 masked transforms for ext-tx" into nextgenv2 2016-02-26 20:54:37 +00:00
Yi Luo
b347c3c5e5 Merge "Implemented DST 8x8 with SSE2 intrinsics." into nextgenv2 2016-02-26 19:10:00 +00:00
Jingning Han
2b7196a8bb Merge "Use sharp filter for alter reference frame generation" into nextgenv2 2016-02-26 16:24:59 +00:00
Jingning Han
83ecafbd95 Merge "Enable context based motion vector entropy coding" into nextgenv2 2016-02-26 16:24:49 +00:00
Yaowu Xu
a570cefcf8 Merge "Extend vpxssim to handle more HBD combinations" into nextgenv2 2016-02-26 15:57:40 +00:00
Jingning Han
72eda13e50 Use sharp filter for alter reference frame generation
This commit uses 12-tap sharp filter to generate alter reference
frame. It improves the compression performance by
derf    0.45%
hevcmr  0.35%
stdhd   0.79%

No encoding time change is observed.

Change-Id: Ia5dc26d5aae6b9b0cb782e5a28dc5066eeeb2ec8
2016-02-25 14:20:38 -08:00
Hui Su
1226e734a0 Merge "Add test for screen content coding tools in end to end test" into nextgenv2 2016-02-25 03:47:03 +00:00
Angie Chiang
8878fa4f9a convolve8 sse2 test
This experiment shows that when frame size is 64x64
vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2's speed are similar.
However when frame size becomes 1024x1024
vpx_highbd_convolve8_sse2 is around 50% slower than vpx_convolve8_sse2
we think the bottleneck is from memory IO

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms)

VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms)

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms)

Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943
2016-02-24 17:01:20 -08:00
hui su
827e1b3fef Add test for screen content coding tools in end to end test
Test screen content coding tools (currently only palette) at
speed 1 and two-pass.

Change-Id: I3c467aee1cd9c366c65a3abfdccfafa0416b59b7
2016-02-24 15:27:07 -08:00