1473 Commits

Author SHA1 Message Date
Debargha Mukherjee
9b88762b17 Refactor 1D transforms
In preparation for adding more 1D variants with ADST/FlipADST/etc.

BDRATE actually improves by 0.21% on lowres.

Change-Id: I2fa4720c69fe001fa666119a284dfc6b17fffab2
2016-03-14 22:30:09 -07:00
Yunqing Wang
5f5552d846 Optimize HBD up-sampled prediction functions
Optimized 2 up-sampled reference prediction functions in high-bit
depth case. This reduced the HBD encoding time by 3%.

Change-Id: I8663ffb5234f5e70168c0fc9ca676309fe8e98f2
2016-03-14 19:04:33 -07:00
Yue Chen
66e6fb84de Merge "Speed up rd selection in OBMC experiment" into nextgenv2 2016-03-15 00:14:06 +00:00
Yue Chen
b5f8b70ce5 Speed up rd selection in OBMC experiment
Instead of testing all interpfilter-BMC/OBMC combinations, we choose
the best interpolation filter based on regular inter prediction.

Reduction in encoding time: ~10%
Drop in performance gain: 0.08% lowres, 0.04% midres

Change-Id: Ifc19097a918ac76b529db9af4c60e2c70e93f7ad
2016-03-14 15:36:44 -07:00
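
The search reduction described in the commit above can be sketched as follows. This is an illustration only, not the encoder's code: the filter labels, the `rd_cost` callback, and the toy cost table are all hypothetical. It shows why the fast path evaluates fewer combinations and why it can miss the joint optimum, which matches the small drop in gain the commit reports.

```python
def exhaustive_search(filters, rd_cost):
    """Evaluate every (filter, use_obmc) pair: 2 * len(filters) combinations."""
    candidates = [(f, obmc) for f in filters for obmc in (False, True)]
    return min(candidates, key=lambda c: rd_cost(*c))


def fast_search(filters, rd_cost):
    """Pick the filter using regular prediction only, then try OBMC with it.

    Only len(filters) + 1 distinct combinations are evaluated.
    """
    best_filter = min(filters, key=lambda f: rd_cost(f, False))
    return min([(best_filter, False), (best_filter, True)],
               key=lambda c: rd_cost(*c))


# Hypothetical rate-distortion costs, chosen so the two searches disagree.
TOY_COSTS = {
    ("regular", False): 10, ("regular", True): 9,
    ("smooth", False): 12, ("smooth", True): 7,
    ("sharp", False): 11, ("sharp", True): 10,
}


def toy_rd_cost(interp_filter, use_obmc):
    return TOY_COSTS[(interp_filter, use_obmc)]


if __name__ == "__main__":
    names = ["regular", "smooth", "sharp"]
    print(exhaustive_search(names, toy_rd_cost))
    print(fast_search(names, toy_rd_cost))
```
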
Jingning Han
a2c87a3dda Turn off 32x32 transform type selection
Temporarily disable transform type selection for 32x32 transform
block size. This speeds up the encoding process. For bus at CIF
150 frames, the encoding time goes from 896s -> 762s (a 15% reduction).
The compression performance for lowres set is improved by 0.15%,
and -0.029% for hdres.

Change-Id: If239b272970eb302150bec13b8cf192fbe045332
2016-03-14 11:31:03 -07:00
Yunqing Wang
91b8236cdd Merge "Add high-precision sub-pixel search as a speed feature" into nextgenv2 2016-03-12 02:26:36 +00:00
Angie Chiang
46cd6ee9bd Merge "Fix sub8x8 interpolation full pixel bug" into nextgenv2 2016-03-12 01:45:27 +00:00
Yunqing Wang
e6e2d886d3 Add high-precision sub-pixel search as a speed feature
Using the up-sampled reference frames in sub-pixel motion search is
enabled as a speed feature for good-quality mode speed 0 and speed 1.

Change-Id: Ieb454bf8c646ddb99e87bd64c8e74dbd78d84a50
2016-03-11 16:32:11 -08:00
Debargha Mukherjee
e38e2ad86e Merge "Fix an overflow in highbitdepth loop restoration" into nextgenv2 2016-03-11 21:48:37 +00:00
Hui Su
f0e0a7e7e9 Merge "Complete (mostly) migration of palette mode" into nextgenv2 2016-03-11 19:52:41 +00:00
Hui Su
571072b84b Merge "Fix a bug in ext-intra experiment" into nextgenv2 2016-03-11 19:52:34 +00:00
Debargha Mukherjee
7ea59de69c Fix an overflow in highbitdepth loop restoration
Change-Id: Ie20cd35a4c96443c0de234d2cf097187a70ec8dd
2016-03-11 11:48:24 -08:00
Hui Su
f7fbc54bd1 Merge "Fix compiler warnings" into nextgenv2 2016-03-11 19:47:39 +00:00
hui su
8102aeb368 Fix a bug in ext-intra experiment
Change-Id: I6fab352eb1f7d9c5dc783a4d4d878b6b42838ca2
2016-03-11 10:23:51 -08:00
hui su
8fce4b8543 Fix compiler warnings
Change-Id: I00314ec296e8368f1239a556b3a55feac9cec7ae
2016-03-11 10:13:08 -08:00
Jingning Han
68d9a14e9f Merge "Enable hybrid 1-D/2-D transform coding for highbd setting" into nextgenv2 2016-03-11 18:09:11 +00:00
hui su
78b0bd0a0d Complete (mostly) migration of palette mode
Coding gain on screen_content is 12.2% (was 6.6%).

Some features, such as frame-level color buffer and adaptive
entropy coding, are coming in future patches.

Change-Id: I2658cf5ec0cbb02cff685475759f3b68c9807697
2016-03-11 09:56:21 -08:00
Sarah Parker
09368fcf99 Filling in speed feature functions for ext tx search
Filled in prune one and prune two. Prune three is still
being experimented with.

Change-Id: Ic07f828c448e86cacb0369aa3a9a0feb2edae054
2016-03-10 14:08:13 -08:00
Debargha Mukherjee
ce4b35d510 Merge "Adds compound wedge prediction modes" into nextgenv2 2016-03-10 17:44:45 +00:00
Jingning Han
c453ae53d0 Enable hybrid 1-D/2-D transform coding for highbd setting
This commit enables the hybrid 1-D/2-D transform coding scheme for
high bit-depth setting. It improves the compression performance of
ext-tx experiment by 0.98% for lowres_all set.

Change-Id: Ic27f5037f2c36b095a93b9f15dbae34bdcdf00aa
2016-03-10 08:58:07 -08:00
Debargha Mukherjee
f34deab243 Adds compound wedge prediction modes
Incorporates wedge compound prediction modes.

Change-Id: Ie73b54b629105b9dcc5f3763be87f35b09ad2ec7
2016-03-10 07:19:54 -08:00
Jingning Han
ccc809f30c Merge "Fix an assertion condition in transform type search" into nextgenv2 2016-03-10 00:20:30 +00:00
Yi Luo
431e35913e Merge "Implemented DST 16x16 SSE2 intrinsics optimization" into nextgenv2 2016-03-09 22:27:44 +00:00
Jingning Han
240ae9729e Merge "Add horizontal and vertical scan order for 1-D transform" into nextgenv2 2016-03-09 20:47:06 +00:00
Angie Chiang
836e83c49f Fix sub8x8 interpolation full pixel bug
Change-Id: I5df744dc6b21ed9dbbf6ddf38004f2a9e88b7d00
2016-03-09 11:15:19 -08:00
Jingning Han
02734b6457 Fix an assertion condition in transform type search
Change-Id: I442475e559be2acdc1c2a3e5ca021b3de77adda5
2016-03-09 19:07:23 +00:00
Jingning Han
e0413094fb Add horizontal and vertical scan order for 1-D transform
This commit enables the 1-D transform to use Manhattan grid vertical
and horizontal scan order for transform coefficient entropy coding.

Enabled in inter prediction mode, the hybrid 1D/2D transform coding
scheme outperforms the 2D-DCT based coding system used in VP9 by
lowres_all  1.7%
hdres_all   1.4%

As one coding option, in addition to the existing 17 other transform
types in ext-tx experiment, the 1D/2D hybrid transform improves
the coding gains:
lowres_all  2.2% -> 3.0%

Change-Id: I9cefa9d9e38224546d0afd67feecd9f8d4a16ab0
2016-03-09 10:58:23 -08:00
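
A minimal sketch of what horizontal and vertical scan orders over a square coefficient block can look like. It assumes that the scans are plain row-by-row and column-by-column traversals of raster indices; the function names are hypothetical and the codec's actual scan tables may differ.

```python
def horizontal_scan(size):
    """Visit coefficients row by row; returns raster indices r * size + c."""
    return [r * size + c for r in range(size) for c in range(size)]


def vertical_scan(size):
    """Visit coefficients column by column; returns raster indices."""
    return [r * size + c for c in range(size) for r in range(size)]


if __name__ == "__main__":
    print(horizontal_scan(4))
    print(vertical_scan(4))
```
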
hui su
954e560f9e Refactor entropy coding of transform size
No performance change.

Change-Id: If35125fed909d89235b303514f77a33183bb36b3
2016-03-08 16:46:00 -08:00
Yi Luo
50a164a1f6 Implemented DST 16x16 SSE2 intrinsics optimization
- Implemented fdst16_sse2(), fdst16_8col() against C version: fdst16().
- Turned on 7 DST related hybrid txfm types in vp10_fht16x16_sse2().
- Replaced vp10_fht16x16_c() with vp10_fht16x16_sse2() in
  fwd_txfm_16x16().
- Added vp10_fht16x16_sse2() unit test against C version:
  vp10_fht16x16_c() (--gtest_filter=*VP10Trans16x16*).
- Unit test passed.
- Speed improvement: 2.4%, 3.2%, 3.2%, for city_cif.y4m, garden_sif.y4m,
  and mobile_cif.y4m.

Change-Id: Ib30a67ce5d5964bef143d588d0f8fa438be8901f
2016-03-08 14:56:38 -08:00
Yaowu Xu
de661cdbc5 Merge "Fix several MSVC compiler warning/errors" into nextgenv2 2016-03-08 16:44:17 +00:00
Yaowu Xu
28eb784e46 Fix several MSVC compiler warning/errors
Change-Id: Iccaacee9b7a66b016b5747a3902c236888ad4ba1
2016-03-07 17:00:03 -08:00
Yue Chen
043b698a87 Merge "Calculate the distortion in pixel domain for sub8x8 rd selection" into nextgenv2 2016-03-08 00:13:46 +00:00
Yue Chen
ef8f7c1211 Calculate the distortion in pixel domain for sub8x8 rd selection
Pixel domain distortion calculation is enabled for the rd loop of
inter sub8x8 and intra 4x4 cases.

Coding gain: 0.124% derflr, 0.122% derfhd

Change-Id: I43b47fe81b4f5ccc1c66bc626bd310c413a1ed87
2016-03-07 14:49:22 -08:00
Alex Converse
76d4fdd391 Merge "ANS: Switch from PDFs to CDFs." into nextgenv2 2016-03-07 20:51:45 +00:00
Debargha Mukherjee
6adfba7c0f Merge "Make sharp filter 10 tap and makes sharp2 sharper" into nextgenv2 2016-03-07 19:51:42 +00:00
Jingning Han
79c5a533cd Merge "Hybrid 1-D/2-D transform coding" into nextgenv2 2016-03-07 19:15:44 +00:00
Jingning Han
a8dc9694a4 Hybrid 1-D/2-D transform coding
This commit enables a hybrid 1-D/2-D transform coding scheme and
the accompanying entropy coding system. It currently uses hybrid
1-D/2-D DCT transform coding. It provides coding performance gains:

lowres_all  0.55%
hdres_all   0.43%

Change-Id: I2b30dcafd21eb2bb3371f6e854cbab440a4dfa78
2016-03-07 09:27:46 -08:00
Sarah Parker
df3849370a Merge "Adding speed feature interface for ext tx search" into nextgenv2 2016-03-07 16:32:55 +00:00
Hui Su
5e5bef6c18 Merge "Cleanup in get_uv_tx_size" into nextgenv2 2016-03-05 07:42:26 +00:00
hui su
c3c1c6f405 Cleanup in get_uv_tx_size
Change-Id: Ia2aa7558f9f53da7dff970b30fe0a94958159ffb
2016-03-04 16:53:19 -08:00
Yue Chen
10cdeab42a Fix a bug in obmc prediction
For left side obmc, the input of the mask function is corrected to be
the column coordinate.
Also, minor fixes for a compiler warning.

Change-Id: Ia981ef443d5b0285a93d73e5c7ab83f8c3a23464
2016-03-04 15:54:14 -08:00
Sarah Parker
2ca7d42e7e Adding speed feature interface for ext tx search
This sets up the interface for 3 speed features that progressively
eliminate a greater number of transforms in ext tx using
pre-trained support vector machines.
Each speed feature still needs to be implemented.

Change-Id: Ia508aeadc0cffdc080fb227f357a5d1dfbca08e2
2016-03-04 10:27:21 -08:00
Jingning Han
351ca31238 Merge "Apply mv precision check to reference mv candidate" into nextgenv2 2016-03-04 16:54:27 +00:00
Jingning Han
04cb49385e Merge "Properly restore transform block skip flag in RD search" into nextgenv2 2016-03-03 23:30:58 +00:00
Jingning Han
7174d637e8 Properly restore transform block skip flag in RD search
This commit fixes an encoding issue related to var-tx and ref-mv
experiments that causes the codec to use random values for transform
block skip flag.

Change-Id: I8daa6d6b88ea45b5bbeb81b43dd0eeff545c8e5a
2016-03-03 13:52:49 -08:00
Yi Luo
6231b6b077 Merge "Fixed a computation bug in fdct16_sse2()" into nextgenv2 2016-03-03 20:05:36 +00:00
Debargha Mukherjee
7d2618bc70 Make sharp filter 10 tap and makes sharp2 sharper
There is a ~0.1% gain.

Various experiments with different kinds of windowing functions to
follow.

Change-Id: I0787fddca53607ab39e53f919066839301938e68
2016-03-03 12:01:55 -08:00
Alex Converse
6bbbe31656 ANS: Switch from PDFs to CDFs.
Make the RANS implementation operate on cumulative distribution
functions rather than individual probability distribution functions.
CDFs have shown themselves more flexible to work with.

Reduces decoding memory usage from scaling O(num_distributions *
symbol_resolution) to O(num_distributions).

No bitstream change. This is purely an implementation change.

Change-Id: I4e18d3a0a3d37a36a61487c3d778f9d088b0b374
2016-03-03 09:32:54 +00:00
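
The memory claim in the commit above can be illustrated with a small sketch. It is not the RANS implementation from the commit: the function names and the toy distribution are hypothetical, and the per-slot table is assumed to be what a PDF-style decoder would build. A per-slot table needs one entry per unit of symbol resolution, while a CDF needs one entry per symbol plus one, and both give the same symbol for every slot.

```python
from bisect import bisect_right


def build_cdf(pdf):
    """Return cumulative frequencies [0, p0, p0 + p1, ..., total]."""
    cdf = [0]
    for p in pdf:
        cdf.append(cdf[-1] + p)
    return cdf


def build_slot_table(pdf):
    """PDF-style lookup: one table entry per probability slot."""
    table = []
    for symbol, p in enumerate(pdf):
        table.extend([symbol] * p)
    return table


def symbol_from_cdf(cdf, slot):
    """Find the symbol s with cdf[s] <= slot < cdf[s + 1]."""
    return bisect_right(cdf, slot) - 1


if __name__ == "__main__":
    pdf = [4, 2, 1, 1]  # toy frequencies summing to a resolution of 8
    cdf = build_cdf(pdf)
    table = build_slot_table(pdf)
    for slot in range(sum(pdf)):
        print(slot, table[slot], symbol_from_cdf(cdf, slot))
```
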
Jingning Han
13fb7c1b88 Apply mv precision check to reference mv candidate
This allows the codec to use the effective motion vector as the candidate
to produce the reference motion vector list.

Change-Id: Ib90be705fe28200c13376d6d7741800a61f13043
2016-03-02 20:14:07 -08:00
Yi Luo
68d6a5073a Fixed a computation bug in fdct16_sse2()
fdct16_sse2() was not bit-exact with C reference, fdct16().
The inconsistency was found by writing a unit test for
vp10_fht16x16_sse2().  Since the unit test needs a pending
change on the inherited base class, I will commit this unit
test after making a header file for this base class.
Passed the uncommitted unit test: vp10_fht16x16_test.cc.

Change-Id: If2b617883c633a3ea90c19e1d018240c8007102b
2016-03-02 15:20:12 -08:00