generic-library/vpx

Author	SHA1	Message	Date
Yi Luo	770bf71503	8x8/16x16 HT types V_DCT to H_FLIPADST SSE2 optimization - Wrote function: fidtx8_sse2() and fidtx16_sse2(). - Turned on vp10_fht8x8_sse2()/vp10_fht16x16_sse2() for new types. - Updated 8x8/16x16 unit tests for accuracy/speed. - Running 20K times with random numbers and getting through tx type from V_DCT to H_FLIPADST, SSE2 speed improvement: 8x8: ~131% 16x16: ~66% Change-Id: Ibbb707e932a08fec3b1f423a7dab280a1d696c9a	2016-03-25 16:48:19 -07:00
Yue Chen	e63792e5cf	A major speed up for obmc experiment Skip checking obmc when regular inter predictor is not so good (the rd-cost for Y residual is greater than the total rd of the best mode so far.) Performance change compared to full rd search: +0.006% lowres, -0.056% midres Encoding time : 1.14X baseline (was 1.42X) Change-Id: I11350f955a20e1a2331be458537a915e09fbedf3	2016-03-25 14:06:52 -07:00
Alex Converse	d5c6b83431	Merge "Fix memory leak and slopiness around the uncompressed ANS buffer." into nextgenv2	2016-03-25 20:08:01 +00:00
Yunqing Wang	916bdfd9ac	Merge "Recover tile coding performance" into nextgenv2	2016-03-25 19:18:58 +00:00
Alex Converse	30097af4ea	Fix memory leak and slopiness around the uncompressed ANS buffer. Change-Id: Ic9ed1f88f5550b69a45a0fdc71aae5864db7e178	2016-03-25 11:11:07 -07:00
Alex Converse	65bea98d74	Add pluggable bitwriters. This will make the code change for a pure ANS experiment manageable. Change-Id: I3c72a2d8e75afa2cc8e56992ee91f4760202f4d4	2016-03-25 11:02:41 -07:00
Hui Su	f9d77d66e6	Merge "Speed up ext-intra" into nextgenv2	2016-03-25 17:52:33 +00:00
Yunqing Wang	bdcc14051b	Recover tile coding performance After porting tile coding from VP9 to VP10, some performance degradation was seen because of the difference between VP9 and Vp10 baseline. This patch disabled some features in VP10 while tile coding is turned on. Also, an encoder control API was added back for this use case. Change-Id: I8f736db8388408a8cc35320a2f80abb02906571c	2016-03-25 09:05:25 -07:00
hui su	c85a68123f	Speed up ext-intra Skip filtered intra modes search in inter frame when DC mode is worse than the best mode so far. With ext-intra enabled, the overall speed is increased by 20~40%; performance drop is 0.03% on lowres and 0.05% on midres. Change-Id: I75d2503b067cf5e46e3533b97fb01497e125baa7	2016-03-24 21:43:18 -07:00
Yi Luo	4970388c23	4x4 hybrid transform type V_DCT to H_FLIPADST SSE2 optimization - Added function fidtx4_sse2(). - Turned on vp10_fht4x4_sse2() for these tx types. - Updated 4x4 unit test for speed/accuracy. - 4x4 Unit test passed. - Running 20K times with random numbers for tx type from V_DCT to H_FLIPADST, SSE2 against C, speed improves ~46%. Change-Id: I828088b7f98dc0f5939a72e3fcd6cb0b8d8dd8bf	2016-03-24 15:09:18 -07:00
Jingning Han	fa76102929	Merge "Fix an enc/dec mismatch issue in DRL experiment" into nextgenv2	2016-03-24 19:02:13 +00:00
Jingning Han	4823dc364e	Fix an enc/dec mismatch issue in DRL experiment This was broken due to the leakage between consecutive CLs. Change-Id: I08ba8c67a42871d9488729ed854845641aa7ca30	2016-03-24 09:48:54 -07:00
Geza Lore	490ba1ad25	Port large scale tile coding features from nextgen. If configured with --enable-ext-tile, the codec uses an alternative tile coding syntax in the bitstream. Changes include: - The maximum number of tile rows and columns is extended to 1024 each. - The minimum tile width/height is 64 pixels (1 superblock). - A tile copy mode is added where a tile directly reuses the coded data of a previous tile - The meaning of the tile-columns and tile-rows codec parameters are overloaded to mean tile-width and tile-height in units of 64 pixels. - All tiles should now be independent, including rows within the same columns, so large scale parallel, or independent decoding is possible. - vpxdec also gained the options to decode only a particular tile, tile row, or tile column. Changes without --enable-ext-tile: - All tiles should now be independent, including rows within the same columns, so large scale parallel, or independent decoding is possible. - vpxenc default tile configuration changed to use 1 tile column. Change-Id: I0cd08ad550967ac18622dae5e98ad23d581cb33e	2016-03-24 09:26:05 +00:00
Angie Chiang	b4334460cb	Merge "Call vp10_fwd_txfm_4x4 in encode_inter_mb_segment" into nextgenv2	2016-03-24 00:38:04 +00:00
Yi Luo	ea94451f20	Merge "Misc. updates for highbd changes" into nextgenv2	2016-03-23 22:43:47 +00:00
Yi Luo	659c2c98e1	Misc. updates for highbd changes - Use Makefile to control the build for highbd_fwd_txfm_sse4.c. - Fixed hybrid transform (HT) types due to recent update. - Added new unit test cases for highbd HT. Change-Id: Ifd768a9b429a8c21ed40c1de8152fb5ac71e2f90	2016-03-23 12:10:52 -07:00
Jingning Han	1fcb5fc755	Refactor motion vector residual coding process This commit separates the predicted motion vector from the nearestmv motion vector in the coding process for both regular and sub8x8 block sizes. Change-Id: I703490513b0194e6669ebf719352db015facb3e1	2016-03-23 12:10:38 -07:00
Angie Chiang	d9a0cbb1b7	Use vp10_[fwd/inv]_txfm2d_add_#x# for bd 10 Change-Id: Ie35bdbd7aafae693e3106d7ccbbdd8e65ee8800c	2016-03-23 12:05:12 -07:00
Angie Chiang	2b93fde9da	Call vp10_fwd_txfm_4x4 in encode_inter_mb_segment Change-Id: Ieabe5534e5f4fb3f2d751a3cfc682208b3913715	2016-03-23 11:43:45 -07:00
Yi Luo	deb33056d1	Merge "Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode - Setup function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1 intrinsics optimization. - Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(), and fdct4x4_sse4_1(). - Used logic right shift to avoid coeff memory write/read. - Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only. - Improved overall encoding performance >2.3% for 50 frames sequence, park_joy_1080p_12.y4m, in which, --input-bit-depth=12, --bit-depth=12, 50 frames. - Unit test passed." into nextgenv2	2016-03-23 18:30:40 +00:00
Hui Su	daf2fb42e6	Merge "Add "entropy" experiment" into nextgenv2	2016-03-23 17:50:57 +00:00
Alex Converse	b5454b245a	Merge "Add some ANS helpers needed to replace the vpx bool coder with pure ANS." into nextgenv2	2016-03-23 16:21:58 +00:00
Hui Su	13501fe45f	Merge "Small speed up for super_block_uvrd" into nextgenv2	2016-03-23 16:16:46 +00:00
Yi Luo	977dccd12c	Highbd fht4x4 SSE4.1 optimization for DCT_DCT mode - Setup function vp10_highbd_fht4x4_sse4_1 for highbd SSE4.1 intrinsics optimization. - Wrote SSE4.1 functions: load_buffer_4x4(), write_buffer_4x4(), and fdct4x4_sse4_1(). - Used logic right shift to avoid coeff memory write/read. - Turned on vp10_highbd_fht4x4_sse4_1 for DCT_DCT mode only. - Improved overall encoding performance >2.3% for 50 frames sequence, park_joy_1080p_12.y4m, in which, --input-bit-depth=12, --bit-depth=12, 50 frames. - Unit test passed. Change-Id: Idd6dc6e472cbbf235f0ade4f66fbe859a860a004	2016-03-23 09:13:45 -07:00
Debargha Mukherjee	7a3bae768e	Merge "Porting ext_partition experiment from nextgen" into nextgenv2	2016-03-23 04:58:38 +00:00
Alex Converse	6b9cb8c489	Add some ANS helpers needed to replace the vpx bool coder with pure ANS. Change-Id: I32b63fca020c410cef16e93379b4e6e281ccbccd	2016-03-22 16:23:23 -07:00
Yue Chen	2613b5e9d6	Merge "Refactor prediction functions of OBMC" into nextgenv2	2016-03-22 21:06:16 +00:00
Julia Robson	5cce322a09	Porting ext_partition experiment from nextgen This has been ported under ext_partition_types because it is due to be combined with the coding_unit_size experiment which is already being ported under ext_partition Change-Id: I47af869ae123ddf0aa99160dac644059d14266ee	2016-03-22 12:29:01 -07:00
Alex Converse	b00c09026c	Wrap write_modes functions with macros to avoid ifdefs at all the callsites. Change-Id: I5a960bf63ec404f0fbfe6a404f436ef4122a219d	2016-03-22 10:02:23 -07:00
Yue Chen	b5083af67a	Merge "Refactor transform type-size search function" into nextgenv2	2016-03-22 00:58:44 +00:00
Jingning Han	4df51c8de4	Merge "Refactor sub8x8 reference motion vector search function" into nextgenv2	2016-03-22 00:07:45 +00:00
Jingning Han	bfdcccd8a1	Merge "Rework the DRL syntax entropy coding system" into nextgenv2	2016-03-22 00:07:36 +00:00
Yue Chen	2e3f77316d	Refactor prediction functions of OBMC Merge the functions that generate prediction by above/left predictors for the encoder and the decoder. Change-Id: I57e53a8f2eb8d3028c4ed0c9abdcbf00503f95a0	2016-03-21 17:04:13 -07:00
Yue Chen	7c1f6d1862	Refactor transform type-size search function Decompose choose_tx_size_from_rd into three functions that determine the transform coding rd at different stages. Besides the original function, txfm_yrd() calculates the rd for fixed size and type. choose_tx_size_fix_type() fixes the type and searches for the size. It can enable other experiments to do restricted tx searches so as to reduce the impact on speed. Similar refactoring is done for select_tx_type_yrd() in VAR_TX. Performance change in baseline is trivial: 0.014/0.001/-0.020 for lowres/midres/hdres. Change-Id: I2ecbf6066329be088ec1bfb69013b657b14b8afe	2016-03-21 16:12:05 -07:00
Yaowu Xu	cbfc15b11b	Merge "Properly set rate_nocoef when pallete mode is used" into nextgenv2	2016-03-21 20:44:17 +00:00
Debargha Mukherjee	c28dbdf665	Merge "Adds 1D transforms for ADST/FlipADST to make 16" into nextgenv2	2016-03-21 20:40:21 +00:00
Alex Converse	d324c6b025	Write MB tokens using the forward buffered ANS writer. This allows sharing more code paths with the rest of the code and allows for easier compatibility with the other experiments. Change-Id: Id288b533805a4d0657ec2f17542f2e6ad23ebdb4	2016-03-21 18:43:14 +00:00
Alex Converse	109ef96a5f	Merge "Add a placeholder forward buffered ANS coder." into nextgenv2	2016-03-21 18:41:32 +00:00
Debargha Mukherjee	1b17559327	Adds 1D transforms for ADST/FlipADST to make 16 Makes a set of 16 transforms total, adding all 1D combinations of ADST and FlipADST, and removing all DST transforms. lowres, midres both improve by about 0.1% and hdres by -0.378% in BDRATE but with fewer transforms that are also simpler. Further experiments to continue later. Change-Id: I7348a4c0e12078fdea5ae3a2d36a89a319ffcc6e	2016-03-21 11:19:36 -07:00
Yaowu Xu	c96c3fa2b3	Properly set rate_nocoef when pallete mode is used Change-Id: Iff04c82b3d3b5cf2c7700717c3c3d678bbbb9f9b	2016-03-21 11:07:53 -07:00
Jingning Han	66df6e7c7f	Refactor sub8x8 reference motion vector search function Rework the interface to allow codec store the reference motion vector list information for coding process. Change-Id: I47e26587f6c0808655e4626f316ec7614a7ad8ed	2016-03-21 10:02:08 -07:00
Jingning Han	5c9d315572	Rework the DRL syntax entropy coding system This commit re-designs the probability model for the syntax elements of the dynamic motion vector referencing system. Change-Id: Icfb8203c7e8f64e10e99f5890e25e6f6b15fe5d1	2016-03-21 09:52:33 -07:00
Jingning Han	4914ae4622	Merge "Enable dynamic motion vector referencing for newmv mode" into nextgenv2	2016-03-19 00:40:04 +00:00
Debargha Mukherjee	3c065ac46a	Merge "Refactor bsse and skip_txfm in MACROBLOCK." into nextgenv2	2016-03-18 23:51:40 +00:00
Debargha Mukherjee	05029a47a1	Merge "Refactor save_context restore_context in rd_pick_partition." into nextgenv2	2016-03-18 23:51:06 +00:00
Debargha Mukherjee	0ac48f8f65	Merge "Refactor mbmi->inter_tx_size to 2D array." into nextgenv2	2016-03-18 23:50:25 +00:00
Sarah Parker	0adb805db9	Merge "Remove prune three from speed features" into nextgenv2	2016-03-18 21:29:24 +00:00
Sarah Parker	fab5454a16	Remove prune three from speed features Not getting good results for this feature, will try again when transforms are frozen. Change-Id: Id12396786cb9369ad34d0bd845f7beba3a037726	2016-03-18 13:06:40 -07:00
Alex Converse	44ce668063	Add a placeholder forward buffered ANS coder. This buffered ANS coder supports coding the symbols in forward (decode) order. Rather than windowing or growing the buffer, right now this coder merely asserts that the buffer will never overflow. This approach should allow ANS to be used as a drop in replacement for other entropy coders rather than requiring complicated reversal logic throughout the codebase. Change-Id: I6689271233d0e22fea94c51950415dad5af96598	2016-03-18 19:33:45 +00:00
Yaowu Xu	42e5c2ad8a	Two minor logic fixes Change-Id: I1d5624fb2f34f87a55613036851034ec7c2d0b76	2016-03-18 11:48:19 -07:00

1015 Commits