1473 Commits

Author SHA1 Message Date
Jingning Han
7279a4748f Merge "Rename run_rd_check to run_mv_search" into nextgenv2 2016-03-28 23:10:48 +00:00
Jingning Han
b534987110 Merge "Rework the predicted motion vector for sub8x8 block" into nextgenv2 2016-03-28 23:10:35 +00:00
Jingning Han
d133524e7c Fix a rdcost computation issue in sub8x8 block mode search
Compute the rate-distortion cost for sub8x8 blocks with integer
motion vectors.

Change-Id: I7dc034fcc4bec3850f26d1f9ae0595c91df1137e
2016-03-28 23:09:53 +00:00
Jingning Han
59d45d603b Rename run_rd_check to run_mv_search
Improve the readability in the related rate-distortion optimization
search control function of sub8x8 blocks.

Change-Id: I7f7456bf40a98aa5146abfe0488cda745b84d899
2016-03-28 21:59:10 +00:00
Jingning Han
0586460938 Rework the predicted motion vector for sub8x8 block
This commit makes the sub8x8 block to use its nearest neighbor's
motion vector as predicted motion vector for NEWMV mode. It improves
the coding performance by 0.12%.

Change-Id: I99e56715b327573ce7e8a26e3515a4984dadfd98
2016-03-28 14:58:17 -07:00
Angie Chiang
4144a11552 Merge "Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10" into nextgenv2 2016-03-28 19:20:48 +00:00
Alex Converse
7b22c1d433 Force the VPX boolcoder trees in the ANS test.
Change-Id: I282f958c35aabcdfaf1077f8909c56c999420937
2016-03-28 12:10:08 -07:00
Yunqing Wang
89a8174fec Merge "Make set_reference control API work in VP9 and VP10" into nextgenv2 2016-03-28 18:33:03 +00:00
Hui Su
14f2d03b4b Merge "Fix assertion fail in build_intra_predictors" into nextgenv2 2016-03-28 18:14:47 +00:00
Angie Chiang
33833aefdd Merge "Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10" into nextgenv2 2016-03-28 18:11:47 +00:00
Angie Chiang
46b234478f Use vp10_[fwd/inv]_txfm2d_add_32x32 for bd 10
Change-Id: I996c48a90d7d71b52594a91a35cb8712c7fc212e
2016-03-28 11:08:40 -07:00
Alex Converse
72e29c3a73 Merge changes I3c72a2d8,I9905f3a8 into nextgenv2
* changes:
  Add pluggable bitwriters.
  Add pluggable bitreaders.
2016-03-28 16:59:18 +00:00
Yunqing Wang
9aaa3c933c Make set_reference control API work in VP9 and VP10
Moved the API patch from NextGen to NextGenv2 and also added this
API to VP10. An example was included. To try it, for example, run
the following command:
$ examples/vpx_cx_set_ref vp10 352 288 in.yuv out.ivf 4 30

Change-Id: Ib56bc3d365e530cfc8d859a13ddbf4c007907b81
2016-03-28 09:55:24 -07:00
Hui Su
9ab6e589b0 Merge "Fixes for Palette mode" into nextgenv2 2016-03-28 16:44:34 +00:00
hui su
f24b91c9e1 Fix assertion fail in build_intra_predictors
Change-Id: Id6683b9593b52aa0d159f8f013782d9e0bd07206
2016-03-28 09:37:54 -07:00
Yi Luo
80bc1d1d90 Merge "8x8/16x16 HT types V_DCT to H_FLIPADST SSE2 optimization" into nextgenv2 2016-03-28 15:39:24 +00:00
hui su
8a128c2a72 Fixes for Palette mode
This patch fixes 2 issues in Palette mode:
1. More memory is needed in PALETTE_BUFFER for 444 video format.
2. A merge issue caused by
https://chromium-review.googlesource.com/#/c/333940/7

Change-Id: I2aedc7dfdfb6b66fbd600189ec6e1e2cc6120d40
2016-03-25 18:16:44 -07:00
Yi Luo
770bf71503 8x8/16x16 HT types V_DCT to H_FLIPADST SSE2 optimization
- Wrote function: fidtx8_sse2() and fidtx16_sse2().
- Turned on vp10_fht8x8_sse2()/vp10_fht16x16_sse2() for new types.
- Updated 8x8/16x16 unit tests for accuracy/speed.
- Running 20K times with random numbers and getting through
  tx type from V_DCT to H_FLIPADST, SSE2 speed improvement:
  8x8: ~131%
  16x16: ~66%

Change-Id: Ibbb707e932a08fec3b1f423a7dab280a1d696c9a
2016-03-25 16:48:19 -07:00
Yue Chen
e63792e5cf A major speed up for obmc experiment
Skip checking obmc when regular inter predictor is not so good (the
rd-cost for Y residual is greater than the total rd of the best mode
so far.)

Performance change compared to full rd search:
  +0.006% lowres, -0.056% midres
Encoding time :
  1.14X baseline (was 1.42X)

Change-Id: I11350f955a20e1a2331be458537a915e09fbedf3
2016-03-25 14:06:52 -07:00
Alex Converse
d5c6b83431 Merge "Fix memory leak and slopiness around the uncompressed ANS buffer." into nextgenv2 2016-03-25 20:08:01 +00:00
Yunqing Wang
916bdfd9ac Merge "Recover tile coding performance" into nextgenv2 2016-03-25 19:18:58 +00:00
Alex Converse
30097af4ea Fix memory leak and slopiness around the uncompressed ANS buffer.
Change-Id: Ic9ed1f88f5550b69a45a0fdc71aae5864db7e178
2016-03-25 11:11:07 -07:00
Alex Converse
65bea98d74 Add pluggable bitwriters.
This will make the code change for a pure ANS experiment manageable.

Change-Id: I3c72a2d8e75afa2cc8e56992ee91f4760202f4d4
2016-03-25 11:02:41 -07:00
Alex Converse
efd566ff93 Add pluggable bitreaders.
This will make the code change for a pure ANS experiment manageable.

Change-Id: I9905f3a89f492a4346860463a72fa8c52aac4c8e
2016-03-25 11:02:41 -07:00
Hui Su
f9d77d66e6 Merge "Speed up ext-intra" into nextgenv2 2016-03-25 17:52:33 +00:00
Yunqing Wang
bdcc14051b Recover tile coding performance
After porting tile coding from VP9 to VP10, some performance
degradation was seen because of the difference between VP9 and
Vp10 baseline. This patch disabled some features in VP10 while
tile coding is turned on. Also, an encoder control API was added
back for this use case.

Change-Id: I8f736db8388408a8cc35320a2f80abb02906571c
2016-03-25 09:05:25 -07:00
hui su
c85a68123f Speed up ext-intra
Skip filtered intra modes search in inter frame when DC mode is
worse than the best mode so far.

With ext-intra enabled, the overall speed is increased by 20~40%;
performance drop is 0.03% on lowres and 0.05% on midres.

Change-Id: I75d2503b067cf5e46e3533b97fb01497e125baa7
2016-03-24 21:43:18 -07:00
Yi Luo
4970388c23 4x4 hybrid transform type V_DCT to H_FLIPADST SSE2 optimization
- Added function fidtx4_sse2().
- Turned on vp10_fht4x4_sse2() for these tx types.
- Updated 4x4 unit test for speed/accuracy.
- 4x4 Unit test passed.
- Running 20K times with random numbers for tx type from
  V_DCT to H_FLIPADST, SSE2 against C, speed improves ~46%.

Change-Id: I828088b7f98dc0f5939a72e3fcd6cb0b8d8dd8bf
2016-03-24 15:09:18 -07:00
Jingning Han
fa76102929 Merge "Fix an enc/dec mismatch issue in DRL experiment" into nextgenv2 2016-03-24 19:02:13 +00:00
Jingning Han
4823dc364e Fix an enc/dec mismatch issue in DRL experiment
This was broken due the leakage between consecutive CLs.

Change-Id: I08ba8c67a42871d9488729ed854845641aa7ca30
2016-03-24 09:48:54 -07:00
Geza Lore
490ba1ad25 Port large scale tile coding features from nextgen.
If configured with --enable-ext-tile, the codec uses an alternative
tile coding syntax in the bitstream. Changes include::
 - The maximum number of tile rows and columns is extended to 1024
   each.
 - The minimum tile width/height is 64 pixels (1 superblock).
 - A tile copy mode is added where a tile directly reuse the coded
   data of a previous tile
 - The meaning of the tile-columns and tile-rows codec parameters are
   overloaded to mean tile-width and tile-height in units of 64
   pixels.
 - All tiles should now be independent, including rows within the
   same columns, so large scale parallel, or independent decoding is
   possible.
 - vpxdec also gained the options to decode only a particular tile,
   tile row, or tile column.

Changes without --enable-ext-tile:
 - All tiles should now be independent, including rows within the
   same columns, so large scale parallel, or independent decoding is
   possible.
 - vpxenc default tile configuration changed to use 1 tile column.

Change-Id: I0cd08ad550967ac18622dae5e98ad23d581cb33e
2016-03-24 09:26:05 +00:00
Angie Chiang
b4334460cb Merge "Call vp10_fwd_txfm_4x4 in encode_inter_mb_segment" into nextgenv2 2016-03-24 00:38:04 +00:00
Yi Luo
ea94451f20 Merge "Misc. updates for highbd changes" into nextgenv2 2016-03-23 22:43:47 +00:00
Yi Luo
659c2c98e1 Misc. updates for highbd changes
- Use Makefile to control the build for highbd_fwd_txfm_sse4.c.
- Fixed hybrid transform (HT) types due to recent update.
- Added new unit test cases for highbd HT.

Change-Id: Ifd768a9b429a8c21ed40c1de8152fb5ac71e2f90
2016-03-23 12:10:52 -07:00
Jingning Han
1fcb5fc755 Refactor motion vector residual coding process
This commit separates the predicted motion vector from the nearestmv
motion vector in the coding process for both regular and sub8x8
block sizes.

Change-Id: I703490513b0194e6669ebf719352db015facb3e1
2016-03-23 12:10:38 -07:00
Angie Chiang
d9a0cbb1b7 Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10
Change-Id: Ie35bdbd7aafae693e3106d7ccbbdd8e65ee8800c
2016-03-23 12:05:12 -07:00
Angie Chiang
2b93fde9da Call vp10_fwd_txfm_4x4 in encode_inter_mb_segment
Change-Id: Ieabe5534e5f4fb3f2d751a3cfc682208b3913715
2016-03-23 11:43:45 -07:00
Yi Luo
deb33056d1 Merge "Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode - Setup function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1 intrinsics optimization. - Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(), and fdct4x4_sse4_1(). - Used logic right shift to avoid coeff memory write/read. - Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only. - Improved overall encoding performance >2.3% for 50 frames sequence, park_joy_1080p_12.y4m, in which, --input-bit-depth=12, --bit-depth=12, 50 frames. - Unit test passed." into nextgenv2 2016-03-23 18:30:40 +00:00
Hui Su
daf2fb42e6 Merge "Add "entropy" experiment" into nextgenv2 2016-03-23 17:50:57 +00:00
Alex Converse
a06e39a945 Merge "Add buf_ans.h to the Makefile." into nextgenv2 2016-03-23 16:27:13 +00:00
Alex Converse
b5454b245a Merge "Add some ANS helpers needed to replace the vpx bool coder with pure ANS." into nextgenv2 2016-03-23 16:21:58 +00:00
Hui Su
13501fe45f Merge "Small speed up for super_block_uvrd" into nextgenv2 2016-03-23 16:16:46 +00:00
Yi Luo
977dccd12c Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode
- Setup function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1
  intrinsics optimization.
- Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(),
  and fdct4x4_sse4_1().
- Used logic right shift to avoid coeff memory write/read.
- Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only.
- Improved overall encoding performance >2.3% for 50 frames
  sequence, park_joy_1080p_12.y4m, in which, --input-bit-depth=12,
  --bit-depth=12, 50 frames.
- Unit test passed.

Change-Id: Idd6dc6e472cbbf235f0ade4f66fbe859a860a004
2016-03-23 09:13:45 -07:00
Debargha Mukherjee
7a3bae768e Merge "Porting ext_partition experiment from nextgen" into nextgenv2 2016-03-23 04:58:38 +00:00
Alex Converse
6b9cb8c489 Add some ANS helpers needed to replace the vpx bool coder with pure ANS.
Change-Id: I32b63fca020c410cef16e93379b4e6e281ccbccd
2016-03-22 16:23:23 -07:00
Yue Chen
2613b5e9d6 Merge "Refactor prediction functions of OBMC" into nextgenv2 2016-03-22 21:06:16 +00:00
Julia Robson
5cce322a09 Porting ext_partition experiment from nextgen
This has been ported under ext_partition_types because it is due
to be combined with the coding_unit_size experiment which is
already being ported under ext_partition

Change-Id: I47af869ae123ddf0aa99160dac644059d14266ee
2016-03-22 12:29:01 -07:00
Alex Converse
b00c09026c Wrap write_modes functions with macros to avoid ifdefs at all the callsites.
Change-Id: I5a960bf63ec404f0fbfe6a404f436ef4122a219d
2016-03-22 10:02:23 -07:00
Angie Chiang
9d380d8872 Merge "mv vp10_fwd_txfm2d_#x# into vp10_rtcd.h" into nextgenv2 2016-03-22 01:07:56 +00:00
Angie Chiang
063e965d7d Merge "Passing TXFM_TYPE instead of func pointer" into nextgenv2 2016-03-22 01:07:42 +00:00