generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	bc7acb134b	Consistent names for inter mode probabilities and encodings. Renaming vp9_sb_mv_ref_tree to vp9_inter_mode_tree, and vp9_sb_mv_ref_encoding_array to vp9_inter_mode_encodings. Change-Id: I0e91fbf81350d3ec5a2599064c74089b5d06133a	2013-07-19 10:40:04 -07:00
Yaowu Xu	37d901a47a	Merge "Add best_rd breakout to keyframe partition selection also."	2013-07-18 17:50:39 -07:00
Yaowu Xu	67fb0679ee	Merge "Merge scale_factors and scale_factors_uv."	2013-07-18 17:50:34 -07:00
Yaowu Xu	55b52e32da	Merge "Do in-place UV intra mode selection."	2013-07-18 17:50:07 -07:00
Yaowu Xu	51972d1279	Merge "Change break statement in a 2d loop to a return statement."	2013-07-18 17:49:58 -07:00
Dmitry Kovalev	92f4198d52	Merge "Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT)."	2013-07-18 17:29:05 -07:00
Dmitry Kovalev	0b562b2d3d	Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT). Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34	2013-07-18 15:12:46 -07:00
Ronald S. Bultje	96e4db2660	Add best_rd breakout to keyframe partition selection also. Change-Id: I96b8058f6dfecf8aa3e152cdcbfd7e10071fbbc9	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	5ebe503f04	Merge scale_factors and scale_factors_uv. This prevents a duplicate memcpy of a 128-byte struct every time set_scale_factors() is called (which is a lot), thus leading to a decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block RD/partition loop. Overall, this decreases encoding time of the first 50 frames of bus @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5% overall speedup. We can likely get more gains by removing the copy of the other struct (and replacing it with an indexing) as well. Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	df4b4fab26	Do in-place UV intra mode selection. This means we only do UV intra mode selection if we find any intra mode to actually be useful at all; in addition, we only do UV intra mode selection for the transform sizes that were selected, rather than all sizes available in this partition. First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this change. Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	e54a5782b9	Change break statement in a 2d loop to a return statement. The break statement only breaks out of the nested loop, not the top-level loop, so it doesn't always work as intended. Changing it to a return statement does what's intended. Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	2d4929e340	Remove motion vectors from PARTITION_INFO. The same information already exists in union b_mode_info. Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d	2013-07-18 14:10:52 -07:00
Ronald S. Bultje	9da67da04a	Merge "Fix bug where we don't choose any mode in RD selection."	2013-07-18 12:47:50 -07:00
Ronald S. Bultje	247197d57b	Fix bug where we don't choose any mode in RD selection. This could happen during golden overlay frame coding from a previous alt-ref frame if the special overlay code was triggered. Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868	2013-07-18 12:13:15 -07:00
Ronald S. Bultje	4f5815290c	Merge "Fix bug which skips zeromv even if near/nearest is not 0,0."	2013-07-18 10:06:51 -07:00
Ronald S. Bultje	deb7456058	Fix bug which skips zeromv even if near/nearest is not 0,0. Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d	2013-07-18 09:35:19 -07:00
Jingning Han	ced3c20165	Use mv_check_bounds in sub8x8 rd loop Make the use of mv_check_bounds consistent for mvs of both ref_frame[0] and ref_frame[1]. Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211	2013-07-17 17:13:51 -07:00
Ronald S. Bultje	facecd80da	Merge "Add a best_yrd shortcut in splitmv mode search."	2013-07-17 16:11:13 -07:00
Ronald S. Bultje	056111c822	Merge "Skip redundant nearest/near/zero encodes in splitmv."	2013-07-17 16:10:51 -07:00
Ronald S. Bultje	0b1eba25b2	Merge "Skip nearest/near/zero redundant encodes."	2013-07-17 16:10:41 -07:00
Ronald S. Bultje	607424449c	Merge "Best_rd breakout in rd partition search."	2013-07-17 16:10:22 -07:00
Yaowu Xu	6ac5b7db2c	Merge "changed mode checking order"	2013-07-17 14:44:40 -07:00
Dmitry Kovalev	a7a1e96136	Merge changes Ieffea49e,Idf610746 * changes: Removing two unused arguments from vp9_inc_mv signature. Changing signature of vp9_get_pred_probs_tx_size.	2013-07-17 14:44:20 -07:00
Ronald S. Bultje	c6917528a5	Add a best_yrd shortcut in splitmv mode search. Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min6.2 to 1min5.9, i.e. 0.5% faster overall. Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683	2013-07-17 14:21:57 -07:00
Ronald S. Bultje	161c995658	Skip redundant nearest/near/zero encodes in splitmv. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min7.3 to 1min6.2, i.e. 1.7% faster overall. Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987	2013-07-17 13:53:48 -07:00
Yaowu Xu	42facc292d	changed mode checking order Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2	2013-07-17 13:43:50 -07:00
Ronald S. Bultje	8fea880b6f	Skip nearest/near/zero redundant encodes. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8 to 1min7.3, i.e. 8% faster. Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62	2013-07-17 11:33:15 -07:00
Ronald S. Bultje	9f427bfe98	Best_rd breakout in rd partition search. About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which goes from 1min36 to 1min24. Results become slightly better (+0.2% on derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in super_block_yrd(). Overall speed change (on derfraw300) is roughly -13%. This can probably be improved further by caching best_yrd between partition searches. Also, we might be able to get more speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not just at the sb8x8 level. Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b	2013-07-17 09:56:46 -07:00
Ronald S. Bultje	83c7e13a6b	Do a skip-block check for sub8x8 partitions also. +0.2% SSIM and glbPSNR on derfraw300. Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d	2013-07-17 09:46:47 -07:00
Yunqing Wang	df90d58f4f	Speed up motion estimation using small partitions' result (experiment) Current partition checking starts from small sizes, and then goes up to large sizes. This experiment uses the small partitions' motion estimation result, which is already available, to speed up the large partition's motion estimation. We can decide to skip some partition checks if they are unlikely choices. We could use the motion vector (MV) result as current partition's prediction MV, limit the search range and reference frame. Current result at speed 1: psnr loss: 1.19% for stdhd, 0.287% for derf. speed gain: 14% for sunflower(hd), 11% for akiyo. Further improvement will be done later. Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab	2013-07-17 09:11:47 -07:00
Paul Wilkins	d66eab15dd	Merge "Move uv intra mode selection in rd loop."	2013-07-17 05:19:26 -07:00
Paul Wilkins	154c34a3ee	Merge "Limit transform sizes searched for uv intra."	2013-07-17 03:40:11 -07:00
Paul Wilkins	2ee338ce3b	Move uv intra mode selection in rd loop. Use an estimate based on DC_PRED for intra uv cost within the rd loop then only do a full uv mode analysis if an intra mode is chosen. Significant speed gains in some cases. Currently only enabled for speed 2 pending speed/quality tests. Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0	2013-07-17 11:11:21 +01:00
Paul Wilkins	6c667f0ffe	Limit transform sizes searched for uv intra. Apply limit if search_method == USE_LARGESTALL to the range of UV tx sizes searched. Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c	2013-07-17 11:08:55 +01:00
Paul Wilkins	5f4722c75f	Merge "Minor cleanup in code to find uv tx_size."	2013-07-17 02:50:09 -07:00
Jingning Han	a142d6fc93	Skip redundant motion search in 4x4 level rd loop This commit makes the encoder perform motion search only once per reference frame type for each 4x4/4x8/8x4 block. For bus_cif at 2000 kbps, the runtime goes from 253812ms -> 217817ms (14% speed-up) for speed 0. Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92	2013-07-16 17:21:11 -07:00
Dmitry Kovalev	5b65a71cdc	Changing signature of vp9_get_pred_probs_tx_size. Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c	2013-07-16 16:34:54 -07:00
Paul Wilkins	30d2ea45ce	Minor cleanup in code to find uv tx_size. Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e	2013-07-16 18:27:33 +01:00
Dmitry Kovalev	ca75f1255f	Removing and moving around constant definitions. Removing unused and duplicated constants, moving them from .h to .c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f	2013-07-15 19:26:30 -07:00
Jingning Han	faff6ed0fb	Skip duplicate block encoding in the rd loop This speed feature allows the encoder to largely remove the spatial dependency between blocks inside a 64x64 superblock, thereby removing the need to repeatedly encode superblocks per partition type in the rate-distortion optimization loop. A major challenge lies in the intra modes tested in the rate-distortion optimization loop. The subsequent blocks do not have access to the reconstructed boundary pixels without the intermediate coding steps. This was resolved by using the original pixels for intra prediction in the rd loop, followed by an appropriately designed distortion modeling on the quantization parameters. Experiments also suggested that the performance impact is more discernible at lower bit-rate/psnr settings. Hence a quantizer dependent threshold is applied to deactivate skip of block coding. For bus_cif at 2000 kbps, speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB performance loss. speed 1: runtime 65312ms -> 61536ms, (7% speed-up) at 0.04dB performance loss. This operation is currently turned on in settings of speed 1. Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec	2013-07-15 11:08:58 -07:00
Yaowu Xu	fb754b182f	Fix a build issue Change-Id: I23a75c495ed7ea917d7f312bef0990e20a6b53d9	2013-07-12 11:38:44 -07:00
Deb Mukherjee	94c481f9f1	Some minor cleanups for efficiency Implements some of the helper functions more efficiently with lookups rather than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f	2013-07-12 10:22:56 -07:00
Ronald S. Bultje	ee09dd9949	Remove unused function block_error(). Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4	2013-07-11 17:14:03 -07:00
Dmitry Kovalev	8c05e59065	Calling is_inter_mode() instead of custom code. Change-Id: Iccd4ab95ea51a6d57ed43947f2fd7ad92e8979cf	2013-07-11 14:14:47 -07:00
Dmitry Kovalev	c4ad3273c7	Moving segmentation related vars into separate struct. Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03	2013-07-11 11:57:57 -07:00
Jingning Han	18803f9cc4	Fix tx_type bug in intra4x4 rd loop This commit fixed the mis-use of the tx_type for inverse transform in intra4x4 rate-distortion optimization loop. It improves the overall coding performance. Change-Id: I7fe9953175b74890357dbcee33c138573766e980	2013-07-10 15:49:49 -07:00
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Yaowu Xu	e52eec490c	Merge "Add a feature to reduce chroma intra mode search"	2013-07-10 11:35:47 -07:00
Ronald S. Bultje	b1df674a99	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83	2013-07-10 09:27:56 -07:00
Yaowu Xu	bed27a960a	Add a feature to reduce chroma intra mode search Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8	2013-07-10 08:59:18 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
Yaowu Xu	205efbc153	Revert "Remove memcpy() in handle_inter_mode() filter selection." This reverts commit `fcf7998a47`. Change-Id: Ic6532223faec9f1483b78adb2e37b79c7b1a0efb	2013-07-09 17:42:10 -07:00
Ronald S. Bultje	204d1b7058	Merge "Unbreak lossless."	2013-07-09 09:54:48 -07:00
Ronald S. Bultje	059c0ba5d4	Unbreak lossless. Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1	2013-07-09 09:46:37 -07:00
Dmitry Kovalev	1c65c580d6	Merge "Refactoring setup_pre_planes function."	2013-07-08 20:08:05 -07:00
Ronald S. Bultje	8fde07a3ae	Don't recalculate mv_ref costs for each block/partition. Changes cost_mv_ref() into doing a LUT into pre-calculated cost arrays instead. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall. Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	fcf7998a47	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	ed995afba1	Make frame-wide filter-type decision fully RD-based. Overall, on all test sets, this gains about +0.2% on all metrics. City is a clip where this really hurts (-1.0% on all metrics), I'm not quite sure why yet. Maybe interesting to look into in the future. Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78	2013-07-08 16:22:37 -07:00
Deb Mukherjee	d9b62160a0	Implements several heuristics to prune mode search Skips mode searches for intra and compound inter modes depending on the best mode so far and the reference frames. The various heuristics to be used are selected by bits from a flag. The previous direction based intra mode search pruning is also absorbed in this framework. Specifically the flags and their impact are: 1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique directional modes and TM_PRED if the best so far is an inter mode) derfraw300: -0.15%, 10% speedup 2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153 mode search if the best so far is not one of the closest hor/vert/diagonal directions. derfraw300: -0.05%, about 9% speedup 3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode search if the best so far is an intra mode) derfraw300: -0.06%, about 7-8% speedup 4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search if the best single ref inter mode does not have the same ref as one of the two references being tested in the compound mode) derfraw300: -0.56%, about 10% speedup Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495	2013-07-08 12:17:12 -07:00
Paul Wilkins	ef0ca2deaa	Merge "Fix to comp_inter_joint_search_thresh feature."	2013-07-04 03:27:00 -07:00
Dmitry Kovalev	f72e072555	Refactoring setup_pre_planes function. Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63	2013-07-03 17:42:01 -07:00
Jingning Han	68172dbede	Merge "Enable early termination in rd search"	2013-07-03 14:20:41 -07:00
Jingning Han	2bd6fe08f8	Enable early termination in rd search This commit allows the encoder to detect the cumulative rate-distortion cost per transformed block inside a partition. If the cumulative rd cost is already above the best rd value, it terminates the remaining operations and continues to the next prediction mode test. It reduces the runtime of bus at target bit-rate 2000 from 308 seconds to 266 seconds, i.e., about 13% speed-up at no performance penalty. Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a	2013-07-03 12:54:18 -07:00
Paul Wilkins	f58b44ad62	Fix to comp_inter_joint_search_thresh feature. When this is 0 (BLOCK_SIZE_AB4X4) we want to do the inter joint search for all sizes. Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88	2013-07-03 16:58:34 +01:00
Paul Wilkins	72c5778ec5	Added two new skip experiments. sf->unused_mode_skip_lvl. Tests modes as normal for all sizes at or below the given level. At larger sizes it skips all modes that were not chosen at any smaller size. Hence setting BLOCK_SIZE_SB64X64 is in effect off. Setting BLOCK_SIZE_AB4X4 will only consider modes that were chosen for one or more 4x4 blocks at larger sizes. sf->reference_masking. Do a test encode of the NONE partition at one size and create a reference frame mask based on the best rd choice. In the full search only allow this reference frame. Currently it is testing 64x64 and repeats this in the full search. This does not work well with Jim's Partition code just now and is disabled by default. Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd	2013-07-03 16:56:06 +01:00
Dmitry Kovalev	be77f6bbbf	Removing redundant struct from union b_mode_info. Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83	2013-07-02 16:51:57 -07:00
Deb Mukherjee	37501d687c	Speed feature to binary search dir intramodes This speed feature will skip searching the directional intra prediction modes D63, D117, D27, D153 if the best intra mode so far is not one of the diagonal, horizontal or vertical directions closest to the respective directions being tested. In other words, this implements a sort of binary search in the angular domain. Speedup: about 9-10% Results: -0.05% only on derfraw300. Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2	2013-07-02 14:07:19 -07:00
Deb Mukherjee	8d3d2b76f3	Tx size selection enhancements (1) Refines the modeling function and uses that to add some speed features. Specifically, instead of using a flag use_largest_txfm as a speed feature, an enum tx_size_search_method is used, of which two of the types are USE_FULL_RD and USE_LARGESTALL. Two other new types are added: USE_LARGESTINTRA (use largest only for intra) USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for inter) (2) Another change is that the framework for deciding transform type is simplified to use a heuristic count based method rather than an rd based method using txfm_cache. In practice the new method is found to work just as well - with derf only -0.01 down. The new method is more compatible with the new framework where certain rd costs are based on full rd and certain others are based on modeled rd or are not computed. In this patch the existing rd based method is still kept for use in the USE_FULL_RD mode. In the other modes, the count based method is used. However the recommendation is to remove it eventually since the benefit is limited, and will remove a lot of complications in the code (3) Finally a bug is fixed with the existing use_largest_txfm speed feature that causes mismatches when the lossless mode and 4x4 WH transform is forced. Results on derf: USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a pretty good compromise) USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction (currently the benefit of modeling is limited for txfm size selection, but keeping this enum as a placeholder). USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing use_largest_txfm speed feature). Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936	2013-07-02 13:54:00 -07:00
Ronald S. Bultje	3cc6eb7c00	Merge "Make get_coef_context() branchless."	2013-07-02 11:48:15 -07:00
Jingning Han	b91a1586a3	Calculate rd cost per transformed block Compute the rate-distortion cost per transformed block, and cumulate the cost through all blocks inside a partition. This allows encoder to detect if the cumulative rd cost is already above the best rd cost, thereby enabling early termination in the rate-distortion optimization search. Change-Id: I0a856367a9a7b6dd0b466e7b767f54d5018d09ac	2013-07-02 09:58:46 -07:00
Paul Wilkins	b7cd01ed73	Revert "New motion threshold factor - speed feature." This reverts commit `1377278180`. Also fixes a spelling mistake. Change-Id: I5be8aa4d8d3c0323d4a6f41968a7b2c048949c3f	2013-07-02 15:06:40 +01:00
Ronald S. Bultje	26b6318de8	Make get_coef_context() branchless. This should significantly speedup cost_coeffs(). Basically what the patch does is to make the neighbour arrays padded by one item to prevent an eob check in get_coef_context(), then it populates each col/row scan and left/top edge coefficient with two times the same neighbour - this prevents a single/double context branch in get_coef_context(). Lastly, it populates neighbour arrays in pixel order (rather than scan order), so we don't have to dereference the scantable to get the correct neighbours. Total encoding time of first 50 frames of bus (speed 0) at 1500kbps goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase. Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56	2013-07-01 16:34:10 -07:00
Yaowu Xu	ba3b2604f0	Merge "Quantize (64-bit only, for now) SSSE3 SIMD."	2013-07-01 15:58:57 -07:00
Ronald S. Bultje	7353ceab9d	Quantize (64-bit only, for now) SSSE3 SIMD. Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only, it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904	2013-07-01 11:36:07 -07:00
Paul Wilkins	1377278180	New motion threshold factor - speed feature. Added a speed feature that focuses only on thresholds for new motion modes. Moved sf->comp_inter_joint_search_thresh into speed 1. This has ~+0.4% impact on quality at speed 0 as our quality reference baseline. Slight adjustment to baseline thresholds. Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5	2013-07-01 12:11:21 +01:00
Ronald S. Bultje	bc70c60b25	Merge "fixed a bug where sse is not populated"	2013-06-29 07:42:41 -07:00
Yaowu Xu	f853e662b7	fixed a bug where sse is not populated Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc	2013-06-28 17:10:22 -07:00
Ronald S. Bultje	d00b8e5f82	Inline vp9_get_coef_context() (and remove vp9_ prefix). Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896	2013-06-28 10:40:21 -07:00
Ronald S. Bultje	e3ce2b2ab3	Minor change to prevent one level of dereference in cost_coeffs(). 4x4: 234 -> 236 cycles 8x8: 878 -> 888 cycles 16x16: 3664 -> 3550 cycles 32x32: 18134 -> 17392 cycles Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78	2013-06-28 10:29:07 -07:00
Ronald S. Bultje	91d223bd5c	Some minor optimizations for cost_coeffs(). Cycle timings for first 3 frames of bus (speed 0) at 1500kbps: 4x4: 298 -> 234 cycles 8x8: 1227 -> 878 cycles 16x16: 4906 -> 3664 cycles 32x32: 23426 -> 18134 cycles Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster. Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95	2013-06-28 10:29:02 -07:00
Ronald S. Bultje	af660715c0	Make coefficient skip condition an explicit RD choice. This commit replaces zrun_zbin_boost, a method of biasing non-zero coefficients following runs of zero-coefficients to be rounded towards zero, with an explicit skip-block choice in the RD loop. The logic is basically that if individual coefficients should be rounded towards zero (from a RD point of view), the trellis/optimize loop should take care of it. If whole blocks should be zero (from a RD point of view), a single RD check is much more efficient than a complete serialization of the quantization loop. Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim. SIMD for quantize will follow in a separate patch. Results for other test sets pending. Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4	2013-06-28 10:28:49 -07:00
Yaowu Xu	8b9eea0a34	Minor cleanups Change-Id: I379617c1c731a686b3f7e032b8805860c1055b12	2013-06-28 09:19:50 -07:00
Paul Wilkins	05ffdf2625	Merge "Auto adapt step size feature."	2013-06-27 02:28:41 -07:00
Paul Wilkins	59af9049d3	Merge "Start adaptive threshold for each mode at max."	2013-06-27 02:28:36 -07:00
Paul Wilkins	5bcf069c6b	Merge "Change meaning of cpi->sf.first_step and rename."	2013-06-27 02:28:21 -07:00
Jingning Han	fc1cfd8e32	Merge "Make intra predictor reference buffer configurable"	2013-06-26 19:02:02 -07:00
Jingning Han	861cb06c67	Make intra predictor reference buffer configurable This commit enables configurable reference buffer pointer for intra predictor. This allows later removal of spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1	2013-06-26 17:17:21 -07:00
Paul Wilkins	9f3ab83486	Auto adapt step size feature. Also tweaks to other features and experiments with what is on and off at different speed settings. Change-Id: I3e1d0be0d195216bf17c2ac5df67f34ce0b306b2	2013-06-26 19:48:39 +01:00
Dmitry Kovalev	49dee16879	Merge "Using get_plane_block_{width, height} instead of custom code."	2013-06-26 10:23:27 -07:00
Paul Wilkins	689957e3ad	Start adaptive threshold for each mode at max. Each frame we reset all adaptive thresholds to MAX rather than base. As modes are picked their thresholds drop down. Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8	2013-06-26 17:04:47 +01:00
Paul Wilkins	e606cac046	Change meaning of cpi->sf.first_step and rename. Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size and changed its meaning such that it is a delta applied to reduce the default first step size (>> x) in the motion search rather than an absolute value. The default first step size is already changed according to the image dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size now applies a further correction from the default. Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d	2013-06-26 17:04:06 +01:00
Jingning Han	d19ea3861d	Refactor intra predictor block Remove vp9_intra4x4_predict(). Use the common intra prediction function for all block sizes. Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560	2013-06-25 16:33:13 -07:00
Dmitry Kovalev	dc0f457c94	Using get_plane_block_{width, height} instead of custom code. Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9	2013-06-25 14:11:18 -07:00
Dmitry Kovalev	87ee34aacb	Removing unused code. Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1	2013-06-25 10:17:19 -07:00
Dmitry Kovalev	f27f76dfb3	Transforming scale_mv_component_q4 into scale_mv_q4 function. Using MV instead of int_mv for function arguments. Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6	2013-06-21 15:34:29 -07:00
Ronald S. Bultje	54b2a59623	Implement SSE2 block_error. Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68	2013-06-21 12:54:52 -07:00
Yaowu Xu	ee07a261a0	rename variables to avoid build error in MSVC Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34	2013-06-20 18:31:48 -07:00
Deb Mukherjee	7947a33d72	Improving model rd with variance and quant step Improves the rd modeling function and implements them using interpolation from a table which is a little faster. Also uses sse as input to the modeling function rather than var - since there is no dc prediction used and as a result the sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff	2013-06-20 10:06:28 -07:00
Jim Bankoski	1f94b97694	convert all speed things to speed features Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a	2013-06-20 09:42:44 -07:00
Yaowu Xu	12180c8329	Remove unnecessary copying of probs. Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c	2013-06-18 23:02:27 -07:00
Deb Mukherjee	4ad96115cd	Some cleanups in rd motion search No bitstream or output change - only cosmetics. Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7	2013-06-13 17:25:23 -07:00
Deb Mukherjee	f18328cbf1	Adds a zero check in model_rd function Avoids divide-by-zero when variance is 0. Change-Id: I3c7f526979046ff7d17714ce960fe81d6e1442a0	2013-06-10 17:04:47 -07:00
John Koleszar	717d744a01	Fix use of get_uv_tx_size in loopfilter Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98	2013-06-10 11:40:57 -07:00
Paul Wilkins	de6ec27d1a	Rd check on segment level reference mode. Do not allow the rd code to check compound modes if a segment level reference frame is selected. Change-Id: I95f0c57789e0eaceed7caf227e94b4ba3130a06c	2013-06-10 11:03:15 -07:00
Ronald S. Bultje	b12a8dac98	Allow non-zeromv if ref_frame=intra with segmentation skip/ref enabled. Change-Id: Ib5a95bb6ab643b276df3faa9bf99595e4a69ff18	2013-06-10 10:55:10 -07:00
Tero Rintaluoma	86bb6df005	Fixed point reference picture scaling Fixed point scaling factors are calculated once for each reference frame by using integer division. Otherwise fixed point scaling routines are used in all scaling calculations. This makes it possible to calculate fixed point scaling factors on device driver software and pass them to hardware and thus avoid division on hardware. TODO: - Missing check for maximum frame dimensions (currently scaling uses 14 bits) - Missing check for maximum scaling ratio (upscaling 16:1, downscaling 2:1) Problems: - Straightforward fixed point implementation can cause error +-1 compared to integer division (i.e. in x_step_q4). Should only be an issue for frames larger than 16k. Change-Id: I3cf4dabd610a4dc18da3bdb31ae244ebaf5d579c	2013-06-10 08:07:55 -07:00
Deb Mukherjee	21401942b0	Coding tx-size selection by use of spatial context Adds coding of transform size within a frame by use of context of transform sizes selected in left and above blocks. Also incorporates code for generating stats. TODO: generate and incorporate new default stats Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6	2013-06-07 16:07:58 -07:00
Paul Wilkins	340c7a48e6	Change to segment ref frame feature. Simplify feature to only support a single reference frame instead of a mask. Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb	2013-06-07 21:42:22 +01:00
Deb Mukherjee	3ee1a21a42	Coding updates for tx-size selection Changes to the coding of transform sizes, along with forward and backward probability updates. Results: derf300: +0.241% Context based coding of transform sizes will be in a separate patch. Change-Id: I97241d60a926f014fee2de21fa4446ca56495756	2013-06-07 08:54:00 -07:00
Ronald S. Bultje	6ef805eb9d	Change ref frame coding. Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95	2013-06-06 17:28:09 -07:00
Ronald S. Bultje	ad34368786	New intra mode and partitioning probabilities. Split partition probabilities between keyframes and non-keyframes, since they are fairly different. Also have per-blocksize interframe y intramode probabilities, since these vary heavily between different blocksizes. Lastly, replace default probabilities for partitioning and intra modes with new ones generated from current codec. Replace counts with actual probabilities also. Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f	2013-06-06 10:45:30 -07:00
Jingning Han	d03e974fbd	Bug fix in rd_pick_inter_mode_sb_ Fix the calculation of step size in height. Change-Id: I0e0c0175f141f5a41214ae51cef233d13942d3c5	2013-06-06 10:04:26 -07:00
Paul Wilkins	26e24b1dd7	Merge "Rd thresholds change with block size." into experimental	2013-06-06 09:27:44 -07:00
Paul Wilkins	02590a5b1b	Merge "Turn off compound inter search refinement for good quality." into experimental	2013-06-06 09:27:31 -07:00
Jim Bankoski	b4c4f64862	signs reverted Change-Id: Ieface458c83eb6e7ee95595d9fc662f372117c9a	2013-06-06 08:59:22 -07:00
Paul Wilkins	c3316c2bc5	Rd thresholds change with block size. Added structures to support independent rd thresholds for different block sizes (and set experimental block size correction factors). Added structure to allow dynamic adaptation of thresholds per mode and per block size basis depending on how often the mode/block size combination is seen (currently fixed factor). Removed some unused variables. TODO - Adaptation of thresholds based on how often each mode is chosen. - The baseline mode values could also be adjusted based on the block size (e.g. for a particular intra mode use a low threshold for 4x4 prediction blocks but a relatively high value for 64x64). Change-Id: Iddee65ff3324ee309815ae7c1c5a8584720e7568	2013-06-06 15:45:53 +01:00
Paul Wilkins	c880e02f97	Turn off compound inter search refinement for good quality. Turn this feature off for some modes in "good" quality. Change-Id: I3f262d62cca8f01736b977af1465291e8be29f0a	2013-06-06 15:44:25 +01:00
Jim Bankoski	5a88271b09	don't tokenize & encode tokens for blocks in UMV This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9	2013-06-06 06:10:25 -07:00
Jingning Han	61e6586230	Merge "Fix UV intra coding rd loop" into experimental	2013-06-05 21:47:00 -07:00
Jingning Han	f04b15486a	Fix UV intra coding rd loop This commit makes the coding/reconstruction operations of intra coding rate-distortion loop for UV components consistent with those of the encoding process. key frame coding gains: derf: 0.11% stdhd: 0.42% Change-Id: I8d49f83924a320e3689ef2d60096c49d7f0c7a40	2013-06-05 21:18:02 -07:00
Deb Mukherjee	30226a658f	Cosmetic renaming VP9_MVREFS to VP9_INTER_MODES NO bitstream change Change-Id: I79f6146dac5fdd157051b6f8dc611c0b7b5e5f7f	2013-06-05 11:24:01 -07:00
Jingning Han	513d326d75	Merge "Make sb intra rd search consistent with encoding" into experimental	2013-06-04 14:59:05 -07:00
Jingning Han	51b6e73a68	Make sb intra rd search consistent with encoding This commit makes operations of the superblock intra coding rate distortion optimization consistent with those used in the encoding process. Given the test prediction mode and transform size, the rd optimizer encodes and reconstructs each transformed block of the superblock consecutively, then computes the total rate-distortion costs associated with the current superblock to select the coding decisions. It achieves coding performance gains: derf 0.353% yt 1.111% Change-Id: I0da2eb7a71361dfb8c1384927fc536b0c2790d07	2013-06-04 13:54:48 -07:00
Dmitry Kovalev	6a961e7dc8	Merge "Replacing memcpy with struct assignment." into experimental	2013-06-03 14:32:05 -07:00
Jingning Han	9068bce4e7	Put iterative motion search under speed control Enable iterative motion search for compound inter-inter prediction of block sizes 4x4/4x8/8x4 only when best coding quality is selected. The iterative motion search provides about 0.1% gains for derf and stdhd at this point, at the expense of longer runtime. Change-Id: Idc03e7f827e51f1bb8d269bc3752ee297a6bbfe5	2013-06-03 09:18:57 -07:00
Dmitry Kovalev	3b9ec31eaf	Replacing memcpy with struct assignment. Change-Id: Ib557cc6351404b9e178e95a545883eb3666f11f0	2013-05-31 16:00:32 -07:00
Dmitry Kovalev	317d832d38	Merge "Adding plane_block_width and plane_block_height functions." into experimental	2013-05-31 15:28:45 -07:00
Deb Mukherjee	0048ec2329	Costing fixes related to trellis optimization Migrates costing changes/fixes from the rebalance expt to the head without the expt on. Rebased. Change-Id: I51677d62f77ed08aca8d21a4c9a13103eb8de93f Results: derfraw300: +0.126%	2013-05-31 13:56:32 -07:00
Dmitry Kovalev	120a878199	Adding plane_block_width and plane_block_height functions. Change-Id: I02c17fb733c0f3c22dc3167c3d3182797415f1ae	2013-05-31 12:31:49 -07:00
Ronald S. Bultje	a288cb3b10	Merge "Merge all various transform size data trackers into single variables." into experimental	2013-05-31 09:59:24 -07:00
Scott LaVarnway	1e025dbfd1	Merge "Moved use_prev_in_find_mv_refs check to frame level" into experimental	2013-05-31 09:35:51 -07:00
Ronald S. Bultje	e9d68a5e36	Merge all various transform size data trackers into single variables. Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db	2013-05-31 09:18:59 -07:00
Jim Bankoski	21595f8e38	Merge "Creates a new speed 1:" into experimental	2013-05-30 20:36:05 -07:00
Jim Bankoski	ced21bd6a6	Creates a new speed 1: This speed 1 - uses variance threshold stolen from static-thresh to determine split. Any superblock with greater than the variance set by static thresh * quantizer index squared is split. In addition transform size is set to largest size less than or equal to partition size, sub pixel filter is set to normal, and only 12 modes are used at all. Change-Id: If7a2858ee70f96d1eb989c04fd87a332b147abef	2013-05-30 19:53:00 -07:00
Ronald S. Bultje	16482bddf7	Merge "Remove splitmv." into experimental	2013-05-30 19:07:12 -07:00
Ronald S. Bultje	d2205f92c3	Merge changes I98c18fe5,I80c37cff into experimental * changes: Remove i4x4_pred. Remove unused table.	2013-05-30 19:06:44 -07:00
Ronald S. Bultje	e6485581fe	Remove splitmv. We leave it in rdopt.c as a local define for now - this can be removed later. In all other places, we remove it, thereby slightly decreasing the size of some arrays in the bitstream. Change-Id: Ic2a9beb97a4eda0b086f62c039d994b192f99ca5	2013-05-30 17:21:01 -07:00
Ronald S. Bultje	1efa79d32f	Remove i4x4_pred. It remains as a local define in rdopt.c so we can distinguish between split and non-split modes in the RD loop, but disappears outside that scope in the codec. Change-Id: I98c18fe5ab7e4fbd1d6620ec5695e2ea20513ce9	2013-05-30 16:44:58 -07:00
Ronald S. Bultje	f5827699bf	Merge "Merge all intra mode coding trees into a single one." into experimental	2013-05-30 11:27:51 -07:00
Jingning Han	5e97862a71	Merge "Enable iterative motion search for 4x4 inter pred" into experimental	2013-05-30 11:02:10 -07:00
Ronald S. Bultje	98c192ae83	Merge all intra mode coding trees into a single one. Also merge all counters. This removes a few unused probability updates from the bitstream. Change-Id: I20f58853e9dac84d8c0d9703ae012c55917516eb	2013-05-30 09:58:53 -07:00
Jim Bankoski	e987f03acd	Merge "valgrind - txfm_thresh not set" into experimental	2013-05-30 09:34:48 -07:00
Deb Mukherjee	c98bfcfbbb	Merge "Balancing coef-tree to reduce bool decodes" into experimental	2013-05-30 08:10:47 -07:00
Jim Bankoski	ecf023f6e4	Merge "fix valgrind warning" into experimental	2013-05-30 08:04:49 -07:00
Jingning Han	87626a8f6e	Enable iterative motion search for 4x4 inter pred This commit enables iterative motion search for 4x4/4x8/8x4 block size compound inter-inter prediction. WIP: borg run testing Change-Id: I2b318db4a03cdca5a8002b3fa6c0fa89b129288b	2013-05-30 10:49:35 +01:00
Ronald S. Bultje	17544d1478	Merge "Remove some unused code related to macroblock/splitmv coding." into experimental	2013-05-29 17:35:05 -07:00
Jingning Han	5c05fbf6bb	Merge "Refactor 4x4 block level rd loop" into experimental	2013-05-29 16:35:02 -07:00
Deb Mukherjee	b8b3f1a46d	Balancing coef-tree to reduce bool decodes This patch changes the coefficient tree to move the EOB to below the ZERO node in order to save number of bool decodes. The advantages of moving EOB one step down as opposed to two steps down in the other parallel patch are: 1. The coef modeling based on the One-node becomes independent of the tree structure above it, and 2. Fewer context/counter increases are needed. The drawback is that the potential savings in bool decodes will be less, but assuming that 0s are much more predominant than 1's the potential savings is still likely to be substantial. Results on derf300: -0.237% Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242	2013-05-29 16:25:52 -07:00
Jim Bankoski	aae78c8ac7	valgrind - txfm_thresh not set For 4x4 blocks valgrind points out the cache was uninitialized. This resolves the issue by setting it. Change-Id: I22733000da048643762813a84fbda66d8e4040d2	2013-05-29 13:56:08 -07:00
Jingning Han	d0a3872019	Refactor 4x4 block level rd loop This commit makes clean-ups in the rate-distortion loop for 4x4, 4x8, and 8x4 block sizes for the use of iterative motion search. Removed unnecessary use of bmi in handle_inter_mode. Deprecated loop over labels in the 4x4/4x8/8x4 block rd search. Change-Id: I71203dbb68b65e66f073b37abd90d82ef5ae6826	2013-05-29 13:44:52 -07:00
Scott LaVarnway	353642bc53	Moved use_prev_in_find_mv_refs check to frame level This patch checks at the frame level to see if the previous mode info context can be used. This patch eliminates the flag check that was done for every mode and removes another check that was done prior to every vp9_find_mv_refs(). Change-Id: I9da5e18b7e7e28f8b1f90d527cad087073df2d73	2013-05-29 16:42:23 -04:00
Jim Bankoski	5e5470b254	fix valgrind warning scales for second reference frame vars are uninitialized if the second ref frame is one of those disallowed by refframeflags Change-Id: I4ce42de391178c1699dcaede18c5f12c84993c61	2013-05-29 12:34:10 -07:00
Jingning Han	84deeddbaf	Merge "Refactor rd loop for inter modes" into experimental	2013-05-29 10:55:23 -07:00
Jingning Han	6c97bba403	Merge "further clean-ups on intra4x4 coding" into experimental	2013-05-29 10:55:14 -07:00
Sami Pietila	88a4d4c510	Residual coding to cache energy class of tokens. Proposal for tuning the residual coding by changing how the context from previous tokens is calculated. Storing the energy class of previous tokens instead of the token itself eases the critical path of HW implementations. Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab	2013-05-29 15:21:01 +01:00
Ronald S. Bultje	4487f5a690	Remove some unused code related to macroblock/splitmv coding. Change-Id: Ic40d56fb162f4e201547dfae33e62ccd9e865889	2013-05-29 06:29:56 -07:00
Jingning Han	94d700e763	Refactor rd loop for inter modes This commit pulls the iterative motion search for compound inter- inter out from handle_inter_mode_ as a separate function. Hence, it is applicable to 4x4/4x8/8x4 level compound inter search to be enabled later. Also edit the rd loop for 4x4 inter block sizes for cosmetic purpose. Change-Id: Ibc71a11cbe5a26cd52faba01026cf8446cf4d2b4	2013-05-28 16:31:33 -07:00
Jingning Han	4729a6f389	further clean-ups on intra4x4 coding Removed one 4x4 prediction step that was unnecessary in the rd loop. Removed an unused modecosts estimate from encoder side. Change-Id: I65221a52719d6876492996955ef04142d2752d86	2013-05-28 11:19:05 -07:00
Yaowu Xu	601bab4fde	Merge "a few clean-ups" into experimental	2013-05-27 15:16:21 -07:00
Ronald S. Bultje	cba8e16e93	Decrease scope of frame_mv argument to handle_inter_mode(). Change-Id: I81c637c61ecc33cb66beb59a2a33166d66b9a0a2	2013-05-27 14:16:45 -07:00
Yaowu Xu	2b96ffe025	a few clean-ups 1. remove prediction mode conversion 2. unified bmode, same for key and non-key frame 3. set I4X4_PRED count for pdf to 0, as I4X4_PRED is no longer coded ever. It is determined by ref_frame and block partition Change-Id: If5b282957c24339b241acdb9f2afef85658fe47d	2013-05-27 13:53:56 -07:00
Ronald S. Bultje	f188bf1c3d	Remove unused mode_index argument from handle_inter_mode(). Change-Id: I07b8c15f33e6e7c63dd0033c18c4ac5c0303cf32	2013-05-27 08:49:17 -07:00
Ronald S. Bultje	5cac66078e	Remove splitmv. Also do per-partition motion vector referencing in <sb8x8 partitions, and adjust mvref finding for sub8x8 partitions. Change-Id: Id3ed1ed4d2a8910d11d327db6cc63b8eb79f941f	2013-05-26 14:40:49 -07:00
Jingning Han	826efc838c	Fix a bug in intra4x4 level rd loop This commit fixed an uninitialized value use in the intra 4x4/8x4/4x8 rate-distortion loop. Change-Id: I5c25b3536b59e4f5fbb23cf85baf93b2ccec7d72	2013-05-23 17:44:33 -07:00
Jingning Han	ae10319520	Make comp_inter_inter support 4x4 partition coding This commit refactors the iterative motion search for compound inter-inter mode, to make it support all partition types including 4x4/4x8/8x4 block sizes. Change-Id: I5f1212b0f307377291763e45c6bdc9693b5f04c8	2013-05-23 13:13:42 +01:00
Paul Wilkins	33ecd6ad54	Merge Scatter Scan experiment. Removal from under configure flag. A bit of renaming. Change-Id: I2213229dfe852001dfec16b149f47c52ce88f3aa	2013-05-23 13:09:27 +01:00
Jingning Han	7ac5ac52f9	Merge 4x4 block level partition into codebase Move 4x4/4x8/8x4 partition coding out of experimental list. This commit fixed the unit test failure issues. It also resolved the merge conflicts between 4x4 block level partition and iterative motion search for comp_inter_inter. Change-Id: I898671f0631f5ddc4f5cc68d4c62ead7de9c5a58	2013-05-23 11:58:50 +01:00
Deb Mukherjee	ddb2309568	Merge "Using 128 entry look up table for coef models" into experimental	2013-05-22 10:38:35 -07:00
Jingning Han	d2cacdc530	Merge "Make the intra rd search support 8x4/4x8" into experimental	2013-05-22 10:00:15 -07:00
Deb Mukherjee	de4d682ca4	Using 128 entry look up table for coef models Reverts to using 128-entry LUT for the coef models rather than 48 to ease hardware implementation. Also incorporates some cleanups including removing various hooks to support different lookup tables based on block_type and ref_type. Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6	2013-05-22 08:44:31 -07:00
Paul Wilkins	0b713f8c18	Merge CONFIG_COMP_INTER_JOINT_SEARCH. Merge this experiment so that it is under a speed feature flag not a configuration flag. Change-Id: I536f7f125a4ff5149bb3a64f791e835c324535fd	2013-05-22 11:23:31 +01:00
Jingning Han	f153a5d063	Make the intra rd search support 8x4/4x8 This commit allows the rate-distortion optimization of intra coding capable of supporting 8x4 and 4x8 partition settings. It enables the entropy coding of intra modes in key frame using a unified contextual probability model conditioned on its above/left prediction modes. Coding performance: derf 0.464% Change-Id: Ieed055084e11fcb64d5d5faeb0e706d30268ba18	2013-05-21 21:03:00 -07:00
John Koleszar	ddf13be8ef	Merge "Initial version of alpha channel support" into experimental	2013-05-21 17:29:51 -07:00
Deb Mukherjee	7a645e4e12	Merging the model coef prob experiment Merges the experiment. Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9	2013-05-21 14:44:38 -07:00
Scott LaVarnway	1db6373267	Merge "WIP: 4x4 idct/recon merge" into experimental	2013-05-21 10:45:53 -07:00
Dmitry Kovalev	4ac70bd7d3	Adding get_ref_frame_idx function. Change-Id: I4f1a4eca6794cda78d00512196caacd5567e2dcc	2013-05-20 16:09:00 -07:00
Deb Mukherjee	39a90bc8e8	Updating the model coef experiment Cleans up the experiment. Actually uses reduced counts for backward updates, and reduced number of probabilities in the context. No change in bitstream when the experiment is on. Between expt on and off: derfraw300 is down only -0.062% (which is better than when expts were run previously). Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5	2013-05-20 12:46:36 -07:00
Scott LaVarnway	ba48a11130	WIP: 4x4 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22	2013-05-20 13:03:17 -04:00
Jingning Han	810b612c23	Enable bit-stream support to 8x4 and 4x8 partition The recursive partition type search is enabled down to 4x4, 4x8 and 8x4, followed by the corresponding rate-distortion optimization for the per-partition encoding mode decisions. The bit-stream writing/reading synchronized in supporting the rectangular partition of 8x8 block. This provides above 1% coding performance gains on derf. To do next: 1. re-design the rate-distortion loop for inter prediction below 8x8. 2. re-design the rate-distortion loop for intra prediction below 4x4. 3. make the loop-filter aware of rectangular partition of 8x8 block. 4. clean the unused probability models. 5. update default probability values. Change-Id: Idd41a315b16879db08f045a322241f46f1d53f20	2013-05-19 14:59:04 -07:00
John Koleszar	679e4abdd5	Initial version of alpha channel support This is a mostly-working implementation of an extra channel in the bitstream. Configure with --enable-alpha to test. Notable TODOs: - Add extra channel to all mismatch tests, PSNR, SSIM, etc - Configurable subsampling - Variable number of planes (currently always uses all 4) - Loop filtering - Per-plane lossless quantizer - ARNR support This implementation just uses the same contents as the Y channel for the A channel, due to lack of content and general pain in playing back 4 channel content. A later patch will use the actual alpha channel passed in from outside the codec. Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592	2013-05-16 22:21:09 -07:00
Jingning Han	8e3d0e4d7d	Add building blocks for 4x8/8x4 rd search These building blocks enable rate-distortion optimization search over block sizes of 8x4 and 4x8. Need to convert them into mmx/sse forms. Change-Id: I570ea2d22d14ceec3fe3575128d7dfa172a577de	2013-05-16 10:41:29 -07:00
Jingning Han	8468a5c1a0	Fix the transform type selection in 4x4 partition This commit allows proper transform type (DCT/ADST) selection in the settings of partition 4x4 level. Change-Id: Iec6f922a46480d777e7ca9142a99e8c131f0077b	2013-05-15 16:09:58 -07:00
Jingning Han	1f26840fbf	Enable recursive partition down to 4x4 This commit allows the rate-distortion optimization recursion at encoder to go down to 4x4 block size. It deprecates the use of I4X4_PRED and SPLITMV syntax elements from bit-stream writing/reading. Will remove the unused probability models in the next patch. The partition type search and bit-stream are now capable of supporting the rectangular partition of 8x8 block, i.e., 8x4 and 4x8. Need to revise the rate-distortion parts to get these two partition tested in the rd loop. Change-Id: I0dfe3b90a1507ad6138db10cc58e6e237a06a9d6	2013-05-14 12:39:56 -07:00
Yunqing Wang	dee12bdf8f	Merge "Do joint motion search iteratively" into experimental	2013-05-14 10:18:11 -07:00
Yunqing Wang	60456083e9	Do joint motion search iteratively Allow motion search multiple times iteratively, and break out the loop if this search couldn't find better motion vectors. Limit the maximum number of search to 2. Tests results: 1. stdhd set: 0.311%(overall psnr); 0.346%(ssim). positive gain on 10 out of 16 clips(best: 2.746% on sunflower; worst: -0.434% on old_town_cross). 2. derf set: 0.016%(overall psnr); 0.062%(ssim). positive gain on half of the clips(best: 0.499% on bowing; worst: -0.387 on city). Change-Id: Ibf0a51776d4caf7707be0586346db08128117559	2013-05-13 12:14:09 -07:00
Jingning Han	e996c9c5f1	Merge "Force bsize for UV in I4X4 and SPLITMV" into experimental	2013-05-13 10:51:39 -07:00
Paul Wilkins	e5f715201a	Change to band calculation. Change band calculation back to simpler model based on the order in which coefficients are coded in scan order not the absolute coefficient positions. With the scatter scan experiment enabled the results appear broadly neutral on derf (-0.028) but up a little on std-hd (+0.134). Without the scatterscan experiment on the results were up on derf as well. Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791	2013-05-13 17:21:49 +01:00
Jingning Han	4c2c350309	Force bsize for UV in I4X4 and SPLITMV Use 4x4 block coding for UV components arbitrarily in I4X4_PRED and SPLITMV coding modes. This is a temporary solution to enable bit-stream support for recursive partition down to 4x4 block size. Will separate the functionalities of 4x4 block coding rate-distortion out from those of superblocks. Change-Id: I03dc15d5897014f175f3f2c91e9b266091d56797	2013-05-11 13:39:16 -07:00
Yunqing Wang	9755d9fda2	Remove unused mdcounts mdcounts seems no longer used. Change-Id: Idd8162e8acfa3f5be7a18767156cc79ccbc2bdee	2013-05-10 11:02:22 -07:00
Yunqing Wang	9f5811c2da	Add joint motion search in comp_inter_inter mode(experiment) In current code, motion vectors got from single prediction mode are used in compound prediction mode directly. These motion vectors may not give accurate prediction since they are searched independently. In this patch, we took Pascal's suggestion, and did joint motion search in compound prediction mode to find better motion vectors in this situation. Test results: Overall PSNR: 0.570%(derf), 0.918%(stdhd); SSIM: 0.572%(derf), 1.009%(stdhd); The encoder is a little slower. This can be improved since some c code is used in motion search. Change-Id: Ib30c9240f6c56c9b070867b4ca89412a76d9f3c6	2013-05-10 10:15:43 -07:00
Dmitry Kovalev	f0911886f3	Merge "Renaming 'Speed' to 'speed' inside VP9_COMP struct." into experimental	2013-05-08 16:35:35 -07:00
Dmitry Kovalev	8f4e9ac8bc	Removing y_to_uv_block_size and y_bsizet_to_block_size functions. Change-Id: I49527ff8dd8bef1074c18a964fed2a575f0b118a	2013-05-08 15:23:42 -07:00
Dmitry Kovalev	4be190d9d0	Renaming 'Speed' to 'speed' inside VP9_COMP struct. Change-Id: I4374b5af40ee9082ddf7956a9756a15ad9ad5436	2013-05-08 14:35:42 -07:00
John Koleszar	14a5c7285b	Make switchable filter search subsampling-aware Makes the temporary storage of the filtered data agnostic to the number of planes and how they're subsampled. Change-Id: I12f352cd69a47ebe1ac622af30db29b49becb7f4	2013-05-07 21:57:00 -07:00
John Koleszar	7465f52f81	Merge "Make setup_pred_block subsampling-aware." into experimental	2013-05-07 21:53:31 -07:00
Dmitry Kovalev	80997b3aa2	Merge "Adding get_switchable_rate function." into experimental	2013-05-07 17:10:48 -07:00
Paul Wilkins	a14ae84749	Deprecate code_zerogroup experiment. Delete code under the CONFIG_CODE_ZEROGROUP flag. Change-Id: I5fe6c7b42a5da9b73118e33594301da4129f320a	2013-05-07 16:52:55 -07:00
Dmitry Kovalev	455816231e	Adding get_switchable_rate function. Change-Id: I71311a14f8d7f48508b250f25d1d0914c6a1ac72	2013-05-07 16:52:04 -07:00
Paul Wilkins	1ed57a6a62	Deprecate comp_interintra_pred experiment. Delete code under the CONFIG_COMP_INTERINTRA_PRED flag. Change-Id: I3d1079cf46305c08f7e11d738596ea112e7b547f	2013-05-07 16:24:08 -07:00
Paul Wilkins	8c1b516d10	Deprecate the newbintramode experiment. Clean out code relating to newbintramode. Change-Id: Ie91f4f156cdf60ce0da8ca407c1c9cb00c7d0705	2013-05-07 16:00:59 -07:00
Jingning Han	cf8b5a09ed	Add building blocks for partition down to 4x4 Macro ab4x4 contains experiments for recursive partition down to 4x4 block size. Change-Id: Ic727842fa98a4df9fd51e0025a545dc76a5c76c1	2013-05-07 12:11:51 -07:00
John Koleszar	e559e14fa6	Make setup_pred_block subsampling-aware. Code previously set up the pointers by scaling by MI_UV_SIZE, which is 4:2:0 only. Change-Id: Ic13a92895cff018ec1345736746ed84cb31e6e31	2013-05-07 11:47:45 -07:00
Jingning Han	776c1482a3	Merge SB8X8 into the codebase Pull sb8x8 out of experimental list. verified via borg run tests. Fixed unit test failures. Change-Id: I12a4bbd17395930580c048ab68becad1ffe46e76	2013-05-07 09:08:25 -07:00
Dmitry Kovalev	2e5f0084f3	Adding model_rd_for_sb function. Iterating over all planes in the loop instead of custom y,uv code inside handle_inter_mode function. Change-Id: I301f9276d6d544c2fd7203d84f1318ac80ea625d	2013-05-06 12:42:53 -07:00
Jingning Han	8e1c97cf73	Fix a unit test failure of sb8x8 on scaling ref Disable the use of scaled reference frame for motion search in SPLITMV mode. This fixes the unit test failure issue triggered when merging sb8x8 from experimental list. Change-Id: I02ac25fd8db8d5762f8fee29513b947189875fa0	2013-05-06 10:28:18 -07:00
Ronald S. Bultje	f7fa367094	Fix first-pass intra4x4 for sb8x8 experiment. Change-Id: I1df17f45721c690d157800daa6a0b377e3d32bc2	2013-05-04 15:49:41 -07:00
Ronald S. Bultje	842c573e04	Merge "Fix overflow in RD error calculation code." into experimental	2013-05-03 18:03:06 -07:00
John Koleszar	6c622e2783	Merge "Separate transform and quant from vp9_encode_sb" into experimental	2013-05-03 17:19:01 -07:00
John Koleszar	4529c68b3b	Separate transform and quant from vp9_encode_sb This allows removing a large number of transform size specific functions, as well as supporting 444/alpha by routing all code through the subsampling-aware path. Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4	2013-05-03 12:14:50 -07:00
Ronald S. Bultje	ee808e52bd	Fix overflow in RD error calculation code. Change-Id: I61ef1f198c876f9f79787ea7d7385a862cfbae19	2013-05-03 10:33:07 -07:00
Dmitry Kovalev	7ab2d7bf55	Removing MAXF macro and using MAX instead. Change-Id: I51c53692b1150005645bf362c5e5a8275178a8fd	2013-05-02 11:57:16 -07:00
Ronald S. Bultje	f37d8400db	Store splitmv modes in context after 8x8 rd loop. Change-Id: I07aa89a67e0ac5f99ef0c448553dbc46b0ed27f2	2013-05-01 17:13:23 -07:00
Ronald S. Bultje	b6c2d872f0	Fix some crashes in sb8x8 experiment. Change-Id: I390bb1cedc835f439fd5dd6cda6572b29cbb139c	2013-05-01 14:45:27 -07:00
Dmitry Kovalev	79590f186c	Merge "Cleaning up encoder segmentation code." into experimental	2013-04-30 17:49:55 -07:00
Ronald S. Bultje	d068d869b9	sb8x8 integration in rd loop. Work-in-progress, not yet ready for review. TODO items: - bitstream writing (encoder) and reading (decoder) - decoder reconstruction Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081	2013-04-30 16:13:20 -07:00
Dmitry Kovalev	51a73fbba2	Merge "Consistent names for quant-related functions and variables." into experimental	2013-04-30 10:19:48 -07:00
Dmitry Kovalev	ee97da2c03	Cleaning up encoder segmentation code. Moving code from vp9_pack_bitstream to new function encode_segmentation. Change-Id: I1f1e59a1f038618ad95162b7db4b6f8164850ea8	2013-04-29 16:07:17 -07:00
Ronald S. Bultje	2dbaa4f4f4	Change above/left_context to use an 8x8 basis. Output changes slightly because of a minor bug in (at least) the sb32x16 block2above tx16x16 tables that previously existed in vp9_blockd.c. Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc	2013-04-29 10:37:25 -07:00
Dmitry Kovalev	5a5a1f25a8	Consistent names for quant-related functions and variables. Change-Id: I3a6d601e90e8740b9c26dd0afbfe9d467b75d367	2013-04-26 12:30:20 -07:00
Ronald S. Bultje	1a46b30ebe	Grow MODE_INFO array to use an 8x8 basis. Change-Id: I087e08e7909a406b71715b8525c104208daa6889	2013-04-26 11:57:17 -07:00
John Koleszar	bb41ab4a0c	Remove BLOCKD structure All members can be referenced from their per-plane counterparts, and removes assumptions about 24 blocks per macroblock. Change-Id: I7ff2fa72d22c29163eb558981c8193765a8113d9	2013-04-26 10:35:54 -07:00
John Koleszar	4f55c5618a	Remove destination pointers from BLOCKD Access these members from MACROBLOCKD instead. Change-Id: I7907230dd473ff12ebe182b9280d8b7f12a888c4	2013-04-26 10:14:07 -07:00
John Koleszar	4b27eb1f18	Merge "quantize: make 4x4, 8x8 common with larger transforms" into experimental	2013-04-26 09:08:49 -07:00
Scott LaVarnway	57f180b388	Removed bmi from blockd This originally was "Removed update_blockd_bmi()". Now, this patch removed bmi from blockd and uses the bmi found in mode_info_context. Eliminates unnecessary bmi copies between blockd and mode_info_context. Change-Id: I287a4972974bb363f49e528daa9b2a2293f4bc76	2013-04-26 10:19:43 -04:00
John Koleszar	a672351af9	quantize: make 4x4, 8x8 common with larger transforms There were 4 variants of the quantize loop in vp9_quantize.c, now there is 1. Change-Id: Ic853393411214b32d46a6ba53769413bd14e1cac	2013-04-25 14:44:54 -07:00
Ronald S. Bultje	18f29ff581	Remove duplicate code in RD handle_inter_mode() function. Change-Id: I552d53f7e7331e9246d8a32d6c6dcc0cfa0cbeb0	2013-04-25 14:21:21 -07:00
Ronald S. Bultje	c849eaca59	Use b_width/height_log2 instead of mb_ where appropriate. Basic assumption: when talking about transform units, use b_; when talking about macroblock indices, use mb_. Change-Id: Ifd163f595d4924ff892de4eb0401ccd56dc81884	2013-04-25 14:20:59 -07:00
John Koleszar	a99e1aa8ca	Remove predictor pointers from BLOCKD Access these members from MACROBLOCKD instead. Change-Id: I2574622e577bb9feede47f6b7ccbb11f3e928ca8	2013-04-25 12:04:07 -07:00
John Koleszar	6c0c6b86c1	Remove diff from BLOCKD The underlying storage for these buffers is in the per-plane MACROBLOCKD area, so read it from there directly. Change-Id: Id6bd835117fdd9dea07db95ad06eff9f12afaaf7	2013-04-25 11:57:22 -07:00
John Koleszar	15255eef82	Move dequant from BLOCKD to per-plane MACROBLOCKD This data can vary per-plane, but not per-block. Change-Id: I1971b0b2c2e697d2118e38b54ef446e52f63c65a	2013-04-25 11:57:20 -07:00
John Koleszar	4bd0f4f646	Remove BLOCK structure All members can be referenced from their per-plane counterparts, and removes assumptions about 24 blocks per macroblock. Change-Id: I593fb0715e74cd84b48facd1c9b18c3ae1185d4b	2013-04-25 11:33:17 -07:00
Dmitry Kovalev	61a47da869	Adding is_inter_mode function. Change-Id: I2d32d46002cb92c63050c2b8328865c406103621	2013-04-25 10:23:00 -07:00
Jingning Han	b0e3b3df18	Move sbsegment out of experimental list Move rectangular superblock coding out of experimental list. Change-Id: I96c37547d122330d666a67b4bf577ae54547857f	2013-04-24 15:19:17 -07:00
Jingning Han	ff2b8aa2c9	Contextual entropy coding of partition syntax This commit enables selecting probability models for recursive block partition information syntax, depending on its above/left partition information, as well as the current block size. These conditional probability models are reasonably stationary and consistent across frames, hence the backward adaptive approach is used to maintain and update the contextual models. It achieves coding performance gains (on top of enabling rectangular block sizes): derf: 0.242% yt: 0.391% hd: 0.376% stdhd: 0.645% Change-Id: Ie513d9673337f0d27abd65fb566b711d0844ec2e	2013-04-24 14:23:14 -07:00
John Koleszar	bc30736f9b	Merge "Remove coeff from BLOCK" into experimental	2013-04-23 17:42:12 -07:00
John Koleszar	aa6a36b062	Merge "Convert coeff to per-plane MACROBLOCK data" into experimental	2013-04-23 17:41:59 -07:00
John Koleszar	48f3e66e16	Remove coeff from BLOCK Lookup the data per-plane from the MACROBLOCK struct. Change-Id: I9253c4d3cf886aa9ab4aeab23a2156bfcf994ede	2013-04-23 16:39:21 -07:00
John Koleszar	138ec38cab	Convert coeff to per-plane MACROBLOCK data This commit moves the coeff storage from the MACROBLOCK struct to its per-plane part. The next commit will remove the coeff member from the BLOCK structure so that it is consistently accessed per-plane. Also refactors vp9_sb_block_error_c and vp9_sb_uv_block_error_c to be variable subsampling aware. Change-Id: I18c30f87f27c3a012119b6c1970d5fa499804455	2013-04-23 16:28:17 -07:00
John Koleszar	4f35e3e1c1	Merge "Move src_diff to per-plane MACROBLOCK data" into experimental	2013-04-23 16:24:08 -07:00
Dmitry Kovalev	d0d1094a05	Merge "Adding get_scan_{4x4, 8x8, 16x16} functions." into experimental	2013-04-23 12:44:51 -07:00
John Koleszar	cbd1315ac4	Move src_diff to per-plane MACROBLOCK data First in a series of commits making certain MACROBLOCK members addressable per-plane. This commit also refactors the block subtraction functions vp9_subtract_b, vp9_subtract_sby_c, etc to be loops-over-planes and variable subsampling aware. Change-Id: I371d092b914ae0a495dfd852ea1a3d2467be6ec3	2013-04-23 12:18:51 -07:00
Deb Mukherjee	611b26bbe0	Merge "Removing the implicit compound inter experiment" into experimental	2013-04-22 23:22:28 -07:00
Deb Mukherjee	735febf1ce	Removing the implicit compound inter experiment Removing this experiment for now, since it has been broken with the latest code changes. Change-Id: I1be2181b56de490fcb577f5905b5e147a8ed82d8	2013-04-22 16:46:54 -07:00
Jim Bankoski	366ff224ef	Merge "new version of speed 1" into experimental	2013-04-22 16:42:33 -07:00
Jim Bankoski	e7bddba149	new version of speed 1 This version of speed 1 only disables modes at higher resolution that had distortions >2x the best mode we found... The hope is that this could be a replacement for speed 0 ... Change-Id: I7421f1016b8958314469da84c4dccddf25390720	2013-04-22 15:42:41 -07:00
Dmitry Kovalev	5de7e16ca2	Adding get_scan_{4x4, 8x8, 16x16} functions. Change-Id: Id4306ef6d65d4a3984aed50b775bdf48d4f6c438	2013-04-22 14:08:41 -07:00
John Koleszar	a443447b8b	Move pre, second_pre to per-plane MACROBLOCKD data Continue moving framebuffers to per-plane data. Change-Id: I237e5a998b364c4ec20316e7249206c0bff8631a	2013-04-22 12:05:24 -07:00
Deb Mukherjee	f12509f640	Merge "Removes the code_nonzerocount experiment" into experimental	2013-04-22 11:53:14 -07:00
Deb Mukherjee	0aa79be7d5	Removes the code_nonzerocount experiment This patch does not seem to give any benefits. Change-Id: I9d2b4091d6af3dfc0875f24db86c01e2de57f8db	2013-04-22 10:58:49 -07:00
Deb Mukherjee	6ce718eb18	Merge "End of orientation zero group experiment" into experimental	2013-04-22 10:33:12 -07:00
Deb Mukherjee	70d9f116fd	End of orientation zero group experiment Adds an experiment that codes an end-of-orientation symbol for every eligible zero encountered in scan order. This cleans out various other sub-experiments that were part of the original patch, which will be later included if found useful. Results are slightly positive on all sets (0.1 - 0.2% range). Change-Id: I57765c605fefc7fb9d1b57f1b356843602abefaf	2013-04-22 09:27:59 -07:00
John Koleszar	6d5ac8f2e1	reconinter: remove unnecessary functions, params Removes the redundant dst pointers from vp9_build_inter_predictors_sb{y,uv} and the remaining mb specific functions. Change-Id: I7b6bf439d9394b85ea79b4fe61a3ffc1025720da	2013-04-22 08:20:54 -07:00
John Koleszar	fa8ddbd2a6	Merge "Move dst to per-plane MACROBLOCKD data" into experimental	2013-04-19 16:33:45 -07:00
John Koleszar	d12376aa2c	Move dst to per-plane MACROBLOCKD data First in a series of commits moving the framebuffers pointers to per-plane data, so that they can be indexed numerically rather than by name. Change-Id: I6e0d60fd4d51e6375c384eb7321776564df21775	2013-04-19 16:16:10 -07:00
Yunqing Wang	25edb68100	Merge "Remove unused parameters in handle_inter_mode" into experimental	2013-04-19 14:12:43 -07:00
Paul Wilkins	fb754fd37e	Merge "Mv ref candidates cut to 2." into experimental	2013-04-19 14:09:44 -07:00
Dmitry Kovalev	3689122b1c	Merge "Fixing member names inside TOKENVALUE and TOKENEXTRA structs." into experimental	2013-04-19 10:09:04 -07:00
Jim Bankoski	35b1d2e38f	Merge "catch all for new block sizes" into experimental	2013-04-19 09:57:38 -07:00
Jim Bankoski	afb04eb211	catch all for new block sizes Just make sure we don't stop them from testing in speed 1. Change-Id: Iec9b3dba0a32616ff7a451207e0f54b81bb72575	2013-04-19 09:48:56 -07:00
Jim Bankoski	6d82fe219d	Merge "set up a new speed 1" into experimental	2013-04-19 08:28:35 -07:00
Paul Wilkins	de80da39dc	Mv ref candidates cut to 2. Further simplification of mvref search to return only the top two candidates. Distance weights removed as the test order reflects distance anyway. Change-Id: I0518cab7280258fec2058670add4f853fab7b855	2013-04-19 16:13:53 +01:00
Jim Bankoski	b6ef0823c5	set up a new speed 1 slightly worse results for faster encodes Change-Id: I25ea82a18ce20635dbcd328808c1d05ac1f58fd7	2013-04-19 08:04:57 -07:00
Paul Wilkins	92e8a3f514	Simplification of MVref search. As we are no longer able to sort the candidate mvrefs in both encoder and decoder and given that the cost of explicit signalling has proved prohibitive, it no longer makes sense to find more than 2 candidates. This patch: Modifies and simplifies add_candidate_mv() Removes the forced addition of a 0 vector in the MAX_MV_REF_CANDIDATES-1 position (in preparation to reducing MAX_MV_REF_CANDIDATES to 2). Re-orders the addition of candidates slightly. This actually gives small gains (circa 0.2% on std-hd) A subsequent patch will remove NEW_MVREF experiment, reduce MAX_MV_REF_CANDIDATES to 2 and remove distance weights as these are implicit now in the order. Change-Id: I3dbe1a6f8a1a18b3c108257069c22a1141a207a4	2013-04-19 11:19:59 +01:00
Dmitry Kovalev	77f4697a13	Fixing member names inside TOKENVALUE and TOKENEXTRA structs. Change-Id: I183ec5819d4d80966c92db36db75b8c3be0d381d	2013-04-18 16:18:08 -07:00
Jingning Han	f0b065e946	Merge "Make the use of pred buffers consistent in MB/SB" into experimental	2013-04-18 15:24:55 -07:00
Jingning Han	6f43ff5824	Make the use of pred buffers consistent in MB/SB Use in-place buffers (dst of MACROBLOCKD) for macroblock prediction. This makes the macroblock buffer handling consistent with those of superblock. Remove predictor buffer MACROBLOCKD. Change-Id: Id1bcd898961097b1e6230c10f0130753a59fc6df	2013-04-18 14:59:36 -07:00
Dmitry Kovalev	a8d903e539	Merge "Replacing VP9_COMBINEENTROPYCONTEXTS macro with function." into experimental	2013-04-18 14:26:34 -07:00
Dmitry Kovalev	8b20aa2337	Merge "Renaming y1dc_delta_q, uvdc_delta_q, uvac_delta_q fields from VP9Common." into experimental	2013-04-18 14:26:06 -07:00
Yunqing Wang	e304160885	Remove unused parameters in handle_inter_mode Removed 2 unused parameters. Change-Id: Ic2862569313c404047072b268c3d2be3f635492c	2013-04-18 11:55:46 -07:00
Ronald S. Bultje	e693472236	Fairly basic integration of rectangular blocks in encoding RD loop. Adds RD integration for 32x16, 16x32, 64x32 and 32x64 rectangular blocks. Derf almost +0.6%, HD a little over +1.0%, STDHD +1.3%. Change-Id: Id651fdb6a655fdbb5c47009757e63317acfb88a5	2013-04-17 09:25:06 -07:00
Dmitry Kovalev	9087d6d470	Replacing VP9_COMBINEENTROPYCONTEXTS macro with function. Change-Id: I3bbc31840af69481e1d9bb4427c9ee25abf82946	2013-04-16 15:30:28 -07:00
Dmitry Kovalev	1ad7c1f250	Renaming y1dc_delta_q, uvdc_delta_q, uvac_delta_q fields from VP9Common. New names are y_dc_delta_q, uv_dc_delta_q, uv_ac_delta_q. Change-Id: I4acae1fc23a4697ce2c5a5becb8dc28ef0a4b552	2013-04-16 15:05:52 -07:00
John Koleszar	e3cfe4e89e	Remove the mb_no_coeff_skip flag This flag was added to VP8 to allow a mode where MB-level skipping was not allowed, saving a bit per mb. It was never used in practice, and hasn't been tested in VP9, so remove it. Change-Id: Id450ec6904c6d06c1919508e7efc52d05cde5631	2013-04-16 12:36:16 -07:00
Dmitry Kovalev	a0d9309eab	Removing TRUE and FALSE macro definitions. Using regular 0 and 1 constants now. Change-Id: Ie763503cbb727847cc8f1d6506cd6f2ee607f056	2013-04-15 15:24:39 -07:00
Ronald S. Bultje	33a8df085d	Fix lingering x->skip settings if static_threshold is used. Keyframes don't set this variable, so it would use the last set values from inter frames. Change-Id: Ie1ef45ece2c44b21b5d55f6cea9f7d6e7a445692	2013-04-15 13:39:07 -07:00
Jingning Han	aaf33d7df5	Add rectangular block size variance/sad functions. With this, the RD loop properly supports rectangular blocks. Change-Id: Iece79048fb4e84741ee1ada982da129a7bf00470	2013-04-15 13:39:07 -07:00
Ronald S. Bultje	15eac18c4e	Make filter RD code and encode breakout variance size-independent. Static threshold results slightly up (+0.1% on derf), probably b/c we now take the filter (sharp/lowpass) into account for the breakout decision. Change-Id: I9f597601da434205142afd05f32690e7ba8fd690	2013-04-15 13:38:35 -07:00
Jingning Han	3ba9dd4165	Enable inter predictor for rectangular block size Combine superblock inter predictors into a unified function that allows configurable block width and height. The inter predictions of block sizes smaller than 16x16 are handled differently. To be continued on merging them later. Change-Id: I14075959dd5e221f00c205c99ca35c1c31ef728e	2013-04-12 11:51:58 -07:00
Yaowu Xu	7de5edd14a	Rename B_PRED to I4X4_PRED So it is consistent with I8x8_PRED. Change-Id: Iefa65124b2419690d83e526c611129c0ede29d11	2013-04-12 09:23:58 -07:00
Jingning Han	815e95fbeb	Make intra predictor support rectangular blocks The intra predictor supports configurable block sizes. It can handle intra prediction down to 4x4 sizes, when enabled in BLOCK_SIZE_TYPE. Change-Id: I7399ec2512393aa98aadda9813ca0c83e19af854	2013-04-11 16:45:57 -07:00
Scott LaVarnway	cff266bbef	Merge "WIP: removing predictor buffer usage from decoder" into experimental	2013-04-11 15:24:33 -07:00
Ronald S. Bultje	69902c6bf0	Merge "Merge pick_sb_modes and pick_sb64_modes." into experimental	2013-04-11 15:06:37 -07:00
Scott LaVarnway	6189f2bcb1	WIP: removing predictor buffer usage from decoder This patch will use the dest buffer instead of the predictor buffer. This will allow us in future commits to remove the extra mem copy that occurs in the dequant functions when eob == 0. We should also be able to remove extra params that are passed into the dequant functions. Change-Id: I7241bc1ab797a430418b1f3a95b5476db7455f6a	2013-04-11 13:55:18 -07:00
John Koleszar	c2bd46bf45	tokenize: convert skippable functions Use the common block walker to calculate skippability. Change-Id: I6721e42f065df237426c91c1d871ec226ba7cdcb	2013-04-11 12:27:37 -07:00
Ronald S. Bultje	605ff051f7	Merge pick_sb_modes and pick_sb64_modes. Change-Id: Iad69e7a3b7e470acf6094f6a52e7da69066fd552	2013-04-11 09:33:49 -07:00
Ronald S. Bultje	33d94a843f	Remove copying of coefficients and predictor in i8x8 RD loop. The resulting values are never used. Change-Id: I688caf30da9aab87aa280cce913eda4f33172293	2013-04-10 17:39:03 -07:00
Ronald S. Bultje	8fb5be48a6	Make usage of sb_type independent of literal values. Change-Id: I0d12f9ef9d960df0172a1377f8e5236eb6d90492	2013-04-10 17:38:57 -07:00
Ronald S. Bultje	b4f6098ef7	Make RD superblock mode search size-agnostic. Merge various super_block_yrd and super_block_uvrd versions into one common function that works for all sizes. Make transform size selection size-agnostic also. This fixes a slight bug in the intra UV superblock code where it used the wrong transform size for txsz > 8x8, and stores the txsz selection for superblocks properly (instead of forgetting it). Lastly, it removes the trellis search that was done for 16x16 intra predictors, since trellis is relatively expensive and should thus only be done after RD mode selection. Gives basically identical results on derf (+0.009%). Change-Id: If4485c6f0a0fe4038b3172f7a238477c35a6f8d3	2013-04-10 16:50:30 -07:00
Ronald S. Bultje	1932828d19	Merge "Make SB coding size-independent." into experimental	2013-04-10 08:51:58 -07:00
Ronald S. Bultje	a3874850dd	Make SB coding size-independent. Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code gives identical encoder results before and after. There are a few macros for rectangular block sizes under the sbsegment experiment; this experiment is not yet functional and should not yet be used. Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728	2013-04-09 21:28:27 -07:00
Jingning Han	12bf0796e6	Clamp inferred motion vectors only Clamp only the motion vectors inferred from neighboring reference macroblocks. The motion vectors obtained through motion search in NEWMV mode are constrained during the search process, which allows a relatively larger referencing region than the inferred mvs. Hence further clamping the best mv provided by the motion search may affect the efficacy of NEWMV mode. Synchronized the decoding process. The decoded mvs in NEWMV modes should be guaranteed to fit in the effective range. Put a mv range clamping function there for security purpose. This improves the coding performance of high motion sequences, e.g., derf set: foreman 0.233% husky 0.175% icd 0.135% mother_daughter 0.337% pamphlet 0.561% stdhd set: blue_sky 0.408% city 0.455% also saw sunflower goes down by -0.469%. Change-Id: I3fcbba669e56dab779857a8126a91b926e899cb5	2013-04-08 11:37:03 -07:00
John Koleszar	fa135d7b9e	Merge changes Ibbfa68d6,Idb76a0e2 into experimental * changes: Move EOB to per-plane data Move qcoeff, dqcoeff from BLOCKD to per-plane data	2013-04-05 15:56:50 -07:00
Ronald S. Bultje	36c3a67c20	Remove full-pixel-related code. This is a VP8-only feature (part of profile 3) that is unsupported in VP9. Change-Id: I78016eede8d9c834d44d4c517f3e8b8fc2a378b1	2013-04-05 12:50:19 -07:00
John Koleszar	05a79f2fbf	Move EOB to per-plane data Continue migrating data from BLOCKD/MACROBLOCKD to the per-plane structures. Change-Id: Ibbfa68d6da438d32dcbe8df68245ee28b0a2fa2c	2013-04-04 21:30:23 -07:00
John Koleszar	4c05a051ab	Move qcoeff, dqcoeff from BLOCKD to per-plane data Start grouping data per-plane, as part of refactoring to support additional planes, and chroma planes with other-than 4:2:0 subsampling. Change-Id: Idb76a0e23ab239180c818025bae1f36f1608bb23	2013-04-04 16:30:57 -07:00
Deb Mukherjee	73031aaa7d	Bugfix in encode_inter_mb_segment_8x8 Fixes an indexing bug. Looks like the bug has been there for a while. Change-Id: I9fc04b0c30754bcb47366ad94a08112925600c4d	2013-04-04 11:07:19 -07:00
John Koleszar	a417a6e32c	Merge "Removing redundant function arguments." into experimental	2013-04-01 21:09:48 -07:00
Deb Mukherjee	e3955007df	Merge "Framework changes in nzc to allow more flexibility" into experimental	2013-03-29 15:57:27 -07:00
Deb Mukherjee	fe9b5143ba	Framework changes in nzc to allow more flexibility The patch adds the flexibility to use standard EOB based coding on smaller block sizes and nzc based coding on larger blocksizes. The tx-sizes that use nzc based coding and those that use EOB based coding are controlled by a function get_nzc_used(). By default, this function uses nzc based coding for 16x16 and 32x32 transform blocks, which seem to bridge the performance gap substantially. All sets are now lower by 0.5% to 0.7%, as opposed to ~1.8% before. Change-Id: I06abed3df57b52d241ea1f51b0d571c71e38fd0b	2013-03-28 09:33:50 -07:00
Ronald S. Bultje	9eea9fa206	Fix mix-up in pt token indexing. This fixes uninitialized reads in the trellis, and probably makes the trellis do something again. Change-Id: Ifac8dae9aa77574bde0954a71d4571c5c556df3c	2013-03-28 09:24:29 -07:00
Dmitry Kovalev	17cddb4e26	Removing redundant function arguments. Almost all arguments for vp9_build_inter32x32_predictors_sb and vp9_build_inter64x64_predictors_sb can be deduced from the first macroblock argument. Change-Id: I5d477a607586d05698d5b3b9b9bc03891dd3fe83	2013-03-27 16:19:27 -07:00
Ronald S. Bultje	7c70145914	Merge "Add col/row-based coefficient scanning patterns for 1D 8x8/16x16 ADSTs." into experimental	2013-03-26 19:17:08 -07:00
Ronald S. Bultje	3c77ab4c0f	Merge "Redo banding for all transforms." into experimental	2013-03-26 19:16:44 -07:00
Ronald S. Bultje	c6efbbcfe4	Merge "Use above/left (instead of previous in scan-order) as token context." into experimental	2013-03-26 19:16:24 -07:00
Deb Mukherjee	23144d2345	Implicit weighted prediction experiment Adds an experiment to use a weighted prediction of two INTER predictors, where the weight is one of (1/4, 3/4), (3/8, 5/8), (1/2, 1/2), (5/8, 3/8) or (3/4, 1/4), and is chosen implicitly based on consistency of the predictors to the already reconstructed pixels to the top and left of the current macroblock or superblock. Currently the weighting is not applied to SPLITMV modes, which default to the usual (1/2, 1/2) weighting. However the code is in place controlled by a macro. The same weighting is used for Y and UV components, where the weight is derived from analyzing the Y component only. Results (over compound inter-intra experiment) derf: +0.18% yt: +0.34% hd: +0.49% stdhd: +0.23% The experiment suggests bigger benefit for explicitly signaled weights. Change-Id: I5438539ff4485c5752874cd1eb078ff14bf5235a	2013-03-26 16:58:56 -07:00
Ronald S. Bultje	d9094d8fd3	Add col/row-based coefficient scanning patterns for 1D 8x8/16x16 ADSTs. These are mostly just for experimental purposes. I saw small gains (in the 0.1% range) when playing with this on derf. Change-Id: Ib21eed477bbb46bddcd73b21c5c708a5b46abedc	2013-03-26 16:46:13 -07:00
Ronald S. Bultje	3120dbddb1	Redo banding for all transforms. Now that the first AC coefficient in both directions use the same DC as their context, there no longer is a purpose in letting both have their own band. Merging these two bands allows us to split bands for some of the very high-frequency AC bands. In addition, I'm redoing the banding for the 1D-ADST col/row scans. I don't think the old banding made any sense at all (it merged the last coefficient of the first row/col in the same band as the first two of the second row/col), which was clearly an oversight from the band being applied in scan-order (rather than in their actual position). Now, coefficients at the same position will be in the same band, regardless what scan order is used. I think this makes most sense for the purpose of banding, which is basically "predict energy for this coefficient depending on the energy of context coefficients" (i.e. pt). After full re-training, together with previous patch, derf gains about 1.2-1.3%, and hd/stdhd gain about 0.9-1.0%. Change-Id: I7a0cc12ba724e88b278034113cb4adaaebf87e0c	2013-03-26 16:46:13 -07:00
Ronald S. Bultje	790fb13215	Use above/left (instead of previous in scan-order) as token context. Pearson correlation for above or left is significantly higher than for previous-in-scan-order (absolute values depend on position in scan, but in general, we gain about 0.1-0.2 by using either above or left; using both basically just makes this even better). For eob branch skipping, we continue to use the previous token in scan order. This helps about 0.9% on derf after re-training on a limited data set. Full re-training and results on larger-resolution clips are pending. Note that this commit breaks trellis, so we can probably get further gains out of it by fixing trellis at some later point. Change-Id: Iead68e296fc3a105cca746b5e3da9555d6010cfe	2013-03-26 16:46:09 -07:00
Dmitry Kovalev	56f3a2c663	Code cleanup: lower case variable names. Renaming Width to width, Height to height and Version to version in several structs and function signatures. Change-Id: I084c3f7e747cb2ce3345aff27a3dff9b13a87543	2013-03-20 16:41:30 -07:00
Paul Wilkins	d8ffee4526	Changes to rd error_per_bit calculation. Specifically changes to retain more precision especially at low Q through to the point of use. Change-Id: Ief5f010f2ca4daaabef49520e7edb46c35daf397	2013-03-18 23:07:51 +00:00
Paul Wilkins	ef179bce61	Merge "Adapt ARNR filter length and strength." into experimental	2013-03-18 12:00:39 -07:00
John Koleszar	c5b317057b	Merge "Fix pulsing issue with scaling" into experimental	2013-03-18 11:57:36 -07:00
Paul Wilkins	cdb322dd72	Adapt ARNR filter length and strength. Adjust the filter length and strength for each ARF group based on a measure of difficulty (the boost) and the active q range. Remove lower limit on RDMULT value. Average gains on the different sets in range 0.4%-0.9%. However the ARNR changes give a very big boost on a few clips. Eg. Soccer ~5%, in derf set and Cyclist ~ 10% in the std-hd set Change-Id: I2078d78798e27ad2bcc2b32d703ea37b67412ec4	2013-03-18 16:17:04 +00:00
Deb Mukherjee	b1921b2f08	Context-pred fix to not use top/left on edges This fix resolves some of the mismatch issues being seen recently. While this is the right thing to do when tiling is used for this experiment, it is not the underlying cause of the mismatches. Something else is causing writing outside of the allowable frame area in the encoder leading to this mismatch. Change-Id: If52c6f67555aa18ab8762865384e323b47237277	2013-03-16 09:26:52 -07:00
John Koleszar	9b7be88883	Fix pulsing issue with scaling Updates the YV12_BUFFER_CONFIG structure to be crop-aware. The existing width/height parameters are left unchanged, storing the width and height aligned to a 16 byte boundary. The cropped dimensions are added as new fields. This fixes a nasty visual pulse when switching between scaled and unscaled frame dimensions due to a mismatch between the scaling ratio and the 16-byte aligned sizes. Change-Id: Id4a3f6aea6b9b9ae38bdfa1b87b7eb2cfcdd57b6	2013-03-13 19:10:10 -07:00
Deb Mukherjee	a28139c849	Continued experiment with nonzero count Adds probability updates for extra bits for the nzcs, code for getting nzc stats, plus some minor cleanups and fixes. Change-Id: If2814e7f04fb52f5025ad9f400f3e6c50a00b543	2013-03-08 16:37:08 -08:00
Ronald S. Bultje	0643c3f133	Merge "Add support for tx_select in i8x8 encoding in keyframes." into experimental	2013-03-08 16:25:27 -08:00
Jingning Han	2a5278bdbd	Extend diff MV limit from +/-256 to +/-1024 Increase the motion search range by 4x. Change MV_CLASS tree of the entropy coding to allow two additional mv classes to cover the extended motion vector limit. The codec determines the effective motion search range conditioned on the actual frame dimension. It provides coding gains: stdhd 0.39% yt 0.56% hd 0.47% Major coding performance gains are packed in several sequences with intense motion activities, e.g., ped_1080p gains 7% at high bit-rates, and on average 3%. TODO: Need to further tune the rate control and motion search units. Change-Id: Ib842540a6796fbee5a797809433ef6a477c6d78d	2013-03-08 10:04:36 -08:00
Ronald S. Bultje	b41dee8428	Add support for tx_select in i8x8 encoding in keyframes. Also enable tx_select for keyframes. Change-Id: Iadb1231d9fa7af0c8dce3d9b41830b93a302479e	2013-03-08 09:28:46 -08:00
Ronald S. Bultje	d3724abe9f	Re-add support for ADST in superblocks. This also changes the RD search to take account of the correct block index when searching (this is required for ADST positioning to work correctly in combination with tx_select). Change-Id: Ie50d05b3a024a64ecd0b376887aa38ac5f7b6af6	2013-03-07 11:19:10 -08:00
Deb Mukherjee	eb6ef2417f	Coding non-zero count rather than EOB for coeffs This patch revamps the entropy coding of coefficients to code first a non-zero count per coded block and correspondingly remove the EOB token from the token set. STATUS: Main encode/decode code achieving encode/decode sync - done. Forward and backward probability updates to the nzcs - done. Rd costing updates for nzcs - done. Note: The dynamic programming approach used in trellis quantization is not exactly compatible with nzcs. A suboptimal approach has been used instead where branch costs are updated to account for changes in the nzcs. TODO: Training the default probs/counts for nzcs Change-Id: I951bc1e22f47885077a7453a09b0493daa77883d	2013-03-07 07:20:30 -08:00
Ronald S. Bultje	4209bba462	Merge changes Ifacbf5a0,Ibad7c3dd into experimental * changes: vpxenc: actually report mismatch on stderr. Make superblocks independent of macroblock code and data.	2013-03-05 11:17:14 -08:00
Ronald S. Bultje	111ca42133	Make superblocks independent of macroblock code and data. Split macroblock and superblock tokenization and detokenization functions and coefficient-related data structs so that the bitstream layout and related code of superblock coefficients looks less like it's a hack to fit macroblocks in superblocks. In addition, unify chroma transform size selection from luma transform size (i.e. always use the same size, as long as it fits the predictor); in practice, this means 32x32 and 64x64 superblocks using the 16x16 luma transform will now use the 16x16 (instead of the 8x8) chroma transform, and 64x64 superblocks using the 32x32 luma transform will now use the 32x32 (instead of the 16x16) chroma transform. Lastly, add a trellis optimize function for 32x32 transform blocks. HD gains about 0.3%, STDHD about 0.15% and derf about 0.1%. There's a few negative points here and there that I might want to analyze a little closer. Change-Id: Ibad7c3ddfe1acfc52771dfc27c03e9783e054430	2013-03-04 16:34:36 -08:00
Dmitry Kovalev	49b697d327	Merge "Code cleanup." into experimental	2013-03-04 15:41:15 -08:00
Dmitry Kovalev	135428e954	Code cleanup. Removing redundant 'extern' keyword, lowercase variable names. Change-Id: I608e8d8579aba8981f5fac3493f77b4481b13808	2013-03-01 17:39:31 -08:00
John Koleszar	69c67c9531	Merge master branch into experimental Picks up some build system changes, compiler warning fixes, etc. Change-Id: I2712f99e653502818a101a72696ad54018152d4e	2013-03-01 11:06:05 -08:00
Ronald S. Bultje	90932399b4	Merge "Move eob from BLOCKD to MACROBLOCKD." into experimental	2013-02-27 11:39:16 -08:00
Ronald S. Bultje	e8c74e2b70	Move eob from BLOCKD to MACROBLOCKD. Consistent with VP8. Change-Id: I8c316ee49f072e15abbb033a80e9c36617891f07	2013-02-27 11:00:55 -08:00
John Koleszar	800ad0b886	Use ref_frame_map vice active_ref_idx on the encoder This patch makes the encoder's use of ref_frame_map and active_ref_idx consistent with the decoder. ref_frame_map[] maps a reference buffer index to its actual location in the yv12_fb array, since many references may share an underlying buffer. active_ref_idx[] mirrors cpi->{lst,gld,alt}_fb_idx, holding the active references in each slot. This also fixes a bug in setup_buffer_inter() where the incorrect reference was used to populate the scaling factors. Change-Id: Id3728f6d77cffcd27c248903bf51f9c3e594287e	2013-02-27 08:22:40 -08:00
John Koleszar	77f88e97fa	Combined motion compensation with scaled predictors This patch extends the previous support for using references of a different resolution in ZEROMV mode to all inter prediction modes. Subpixel based best-mv scoring is disabled when the reference frame differs in resolution from the current frame. Change-Id: Id4dc3e5e6692de98d9857fd56bfad3ac57e944ac	2013-02-27 08:22:39 -08:00
John Koleszar	eb939f45b8	Spatial resampling of ZEROMV predictors This patch allows coding frames using references of different resolution, in ZEROMV mode. For compound prediction, either reference may be scaled. To test, I use the resize_test and enable WRITE_RECON_BUFFER in vp9_onyxd_if.c. It's also useful to apply this patch to test/i420_video_source.h: --- a/test/i420_video_source.h +++ b/test/i420_video_source.h @@ -93,6 +93,7 @@ class I420VideoSource : public VideoSource { virtual void FillFrame() { // Read a frame from input_file. + if (frame_ != 3) if (fread(img_->img_data, raw_sz_, 1, input_file_) == 0) { limit_ = frame_; } This forces the frame that the resolution changes on to be coded with no motion, only scaling, and improves the quality of the result. Change-Id: I1ee75d19a437ff801192f767fd02a36bcbd1d496	2013-02-26 23:54:23 -08:00
Ronald S. Bultje	96d260515a	Merge "Merge cnvcontext experiment." into experimental	2013-02-26 19:39:39 -08:00
Ronald S. Bultje	db54e6774f	Merge "Minor cosmetics in rdopt." into experimental	2013-02-26 19:39:28 -08:00
John Koleszar	25686fc22d	Merge "Refactor inter recon functions to support scaling" into experimental	2013-02-26 11:45:28 -08:00
Dmitry Kovalev	998bed1d2c	Merge "Changing pitch value meaning for fht and iht transforms." into experimental	2013-02-26 10:44:15 -08:00
Ronald S. Bultje	b1641150b1	Merge cnvcontext experiment. Change-Id: I35e64998b25694a3bb4a62164bba3c03c1db4bc7	2013-02-26 10:40:15 -08:00
Ronald S. Bultje	71539eae2a	Minor cosmetics in rdopt. Change-Id: I62497dcf2074b4bb4787bf660e727e5cf1bf3472	2013-02-26 10:40:11 -08:00
Ronald S. Bultje	c4ae97911a	Merge "make cost_coeffs to use combined context" into experimental	2013-02-26 10:32:01 -08:00
John Koleszar	6a4f708c25	Refactor inter recon functions to support scaling Ensure that all inter prediction goes through a common code path that takes scaling into account. Removes a bunch of duplicate 1st/2nd predictor code. Also introduces a 16x8 mode for 8x8 MVs, similar to the 8x4 trick we were doing before. This has an unexpected effect with EIGHTTAP_SMOOTH, so it's disabled in that case for now. Change-Id: Ia053e823a8bc616a988a0af30452e1e75a739cba	2013-02-26 10:03:29 -08:00
Dmitry Kovalev	9bf3f75168	Changing pitch value meaning for fht and iht transforms. Pitch now means the number of elements, not the number of bytes. Change-Id: Idb9f2f012e39b09d596a3cc1802305a80b7c13af	2013-02-25 18:19:55 -08:00
Yaowu Xu	ecb03e9a3f	make cost_coeffs to use combined context Change-Id: Ia15f4244595fab49bffda0c651a750a8a9481d28	2013-02-25 17:01:33 -08:00
Jingning Han	77a3becf92	clean up forward and inverse hybrid transform Rebased. Remove the old matrix multiplication transform computation. The 16x16 ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16 300/0 in vp9/common/vp9_blockd.h. Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f	2013-02-25 09:16:12 -08:00
Ronald S. Bultje	0c9e2e9a1d	Split coefficient token tables intra vs. inter. Change-Id: I5416455f8f129ca0f450d00e48358d2012605072	2013-02-23 07:33:46 -08:00
Paul Wilkins	c17672a33d	Further changes to coefficient contexts. This patch alters the balance of context between the coefficient bands (reflecting the position of coefficients within a transform block) and the energy of the previous token (or tokens) within a block. In this case the number of coefficient bands is reduced but more previous token energy bands are supported. Some initial rebalancing of the default tables has been done by running multiple derf clips at multiple data rates using the ENTROPY_STATS macro. Further balancing needs to be done using larger image formats, especially in regard to the bigger transform sizes which are not as well represented in encodings of smaller image formats. Change-Id: If9736e95c391e711b04aef6393d26f60f36e1f8a	2013-02-23 07:29:09 -08:00
Jingning Han	c67a20994f	Merge "Forward butterfly hybrid transform" into experimental	2013-02-22 09:20:26 -08:00
Paul Wilkins	b5f3cb6e37	Merge "Experimental removal of over quant code" into experimental	2013-02-22 08:44:40 -08:00
Paul Wilkins	dbf4942046	Experimental removal of over quant code The over quant code was added in VP8 post bitstream freeze to allow compression to lower data rates. In VP9 the real quantizer range has been greatly extended anyway. Change-Id: I5d384fa5e9a83ef75a3df34ee30627bd21901526	2013-02-22 14:00:51 +00:00
Jingning Han	babbd5d170	Forward butterfly hybrid transform This patch includes 4x4, 8x8, and 16x16 forward butterfly ADST/DCT hybrid transform. The kernel of 4x4 ADST is sin((2k+1)(n+1)/(2N+1)). The kernel of 8x8/16x16 ADST is of the form sin((2k+1)(2n+1)/4N). Change-Id: I8f1ab3843ce32eb287ab766f92e0611e1c5cb4c1	2013-02-21 18:24:28 -08:00
Deb Mukherjee	048f593703	Merge "Refactoring of switchable filter search for speed" into experimental	2013-02-21 09:23:50 -08:00
Deb Mukherjee	28b1db9278	Refactoring of switchable filter search for speed Refactors the switchable filter search in the rd loop to improve encode speed. Uses a piecewise approximation to a closed form expression to estimate rd cost for a Laplacian source with a given variance and quantization step-size. About 40% encode time reduction is achieved. Results (on a feb 12 baseline) show a slight drop: derf: -0.019% yt: +0.010% std-hd: -0.162% hd: -0.050% Change-Id: Ie861badf5bba1e3b1052e29a0ef1b7e256edbcd0	2013-02-20 18:34:42 -08:00
Ronald S. Bultje	aa84c16da2	Minor cosmetic cleanups. Change-Id: I13d8ae754827368755575dd699a087b3b11f5b16	2013-02-15 17:21:16 -08:00
Ronald S. Bultje	ebfdaa0e0b	Prevent filling transform size cache with uninitialized values. The 32x32 value in case of splitmv was uninitialized. This leads to all kinds of erratic behaviour down the line. Also fill in dummy values for superblocks in keyframes (the values are currently unused, but we run into integer overflows anyway, which makes detecting bad cases harder). Lastly, in case we did not find any RD value at all, don't set tx_diff to INT_MIN, but instead set it to zero (since if we couldn't find a mode, it's unlikely that any particular transform would have made that worse or better; rather, it's likely equally bad for all tx_sizes). Change-Id: If236fd3aa2037e5b398d03f3b1978fbbc5ce740e	2013-02-15 17:21:16 -08:00
Ronald S. Bultje	5bb103c486	Merge "Remove Y2 and Y-no-DC token types from the bitstream." into experimental	2013-02-15 17:11:20 -08:00
Jingning Han	e343732a92	Fixed a subtle issue that breaks encoding process This issue breaks the encoding process of the codebase. The effect emerges only in particular test sequence at certain bit-rates and frame limits. Change-Id: I02e080f2a49624eef9a21c424053dc2a1d902452	2013-02-15 14:49:30 -08:00
Ronald S. Bultje	3af36ea8cc	Remove Y2 and Y-no-DC token types from the bitstream. Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042	2013-02-15 14:06:30 -08:00
Ronald S. Bultje	46dff5d233	Remove some Y2-related code. Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78	2013-02-15 14:06:25 -08:00
Scott LaVarnway	ae886d6bff	Moved vp9_get_coef_band to header file allowing the compiler to inline. Change-Id: I66e5caf5e7fefa68a223ff0603aa3f9e11e35dbb	2013-02-14 12:27:25 -08:00
Paul Wilkins	9255ad107f	Abstract selection of coef band. This patch abstracts the selection of the coefficient band context into a function as a precursor to further experiments with the coefficient context. It also removes the large per TX size coefficient band structures and uses a single matrix for all block sizes within the test function. This may have an impact on quality (results to follow) but is only an intermediate step in the process of redefining the context. Also the quality impact will be larger initially because the default tables will be out of step with the new banding. In particular the 4x4 will in this case only use 7 bands. If needed we can add back block size dependency localized within the function, but this can follow on after the other changes to the definition of the context. Change-Id: Id7009c2f4f9bb1d02b861af85fd8223d4285bde5	2013-02-13 19:01:25 +00:00
Paul Wilkins	0d284ffed1	Abstract the selection of coefficient context. This is an initial step to facilitate experimentation with changes to the prior token context used to code coefficients to take better account of the energy of preceding tokens. This patch merely abstracts the selection of context into two functions and does not alter the output. Change-Id: I117fff0b49c61da83aed641e36620442f86def86	2013-02-13 18:56:30 +00:00
Paul Wilkins	afa57bfc97	Merge "Remove NEWCOEFCONTEXT experiment." into experimental	2013-02-13 10:41:13 -08:00
Yaowu Xu	f01b08c96c	Merge "enable bitstream lossless support" into experimental	2013-02-13 10:26:58 -08:00
Yaowu Xu	d3de97794f	Merge "fix the lossless experiment" into experimental	2013-02-13 09:54:35 -08:00
Yaowu Xu	17db5d00be	enable bitstream lossless support 1. Added a bit in frame header to indicate if a frame is encoded in lossless mode, so decoder does not make the decision based on Q0 2. Minor changes to make sure that lossy coding works same as when the lossless experiment is not enabled. 3. Renamed function pointers for transforms to be consistent, using prefix fwd_txm and inv_txm for forward and inverse respectively To encode in lossless mode, use "--lossless=1 --min-q=0 --max-q=0" with vpxenc. Change-Id: Ifae53b26d2ffbe378d707e29d96817b8a5e6c068	2013-02-13 09:24:39 -08:00
Yaowu Xu	16f25f9dc8	fix the lossless experiment Change-Id: I95acfc1417634b52d344586ab97f0abaa9a4b256	2013-02-13 09:20:26 -08:00
Paul Wilkins	6a9f0c61a4	Remove NEWCOEFCONTEXT experiment. Removal of the NEWCOEFCONTEXT experiment to reduce code clutter and make it easier to experiment with some other changes to the coefficient coding context. Change-Id: Icd17b421384c354df6117cc714747647c5eb7e98	2013-02-13 15:12:17 +00:00
Paul Wilkins	649be94cf0	Removal of Hybrid DWT/DCT experiment. Removal of experiment to simplify code base for other changes. Change-Id: If0a33952504558511926ad212bc311fc2bffb19a	2013-02-13 15:08:48 +00:00
John Koleszar	1d60b6bcb5	Merge "Replace as_mv struct with array" into experimental	2013-02-12 13:59:04 -08:00
Jingning Han	57e995ff9c	butterfly inverse 4x4 ADST fixed format issues. Implement the inverse 4x4 ADST using 9 multiplications. For this particular dimension, the original ADST transform can be factorized into simpler operations, hence is retained. Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960	2013-02-11 10:42:39 -08:00
John Koleszar	7ca517f755	Replace as_mv struct with array Replace as_mv.{first, second} with a two element array, so that they can easily be processed with an index variable. Change-Id: I1e429155544d2a94a5b72a5b467c53d8b8728190	2013-02-08 20:23:35 -08:00
John Koleszar	dc836109e4	Merge "Pass macroblock index to pick inter functions" into experimental	2013-02-08 20:20:37 -08:00
Ronald S. Bultje	639b863d22	Make cost_coeffs() more efficient. Cache the constant offset in one variable to prevent re-loading that in each loop iteration, and mark the function as inline so we can use the fact that the transform size is always known in the caller. Almost 1% faster encoding overall. Change-Id: Id78325a60b025057d8f4ecd9003a74086ccbf85a	2013-02-08 16:32:24 -08:00
John Koleszar	6125a1ed81	Pass macroblock index to pick inter functions Pass the current mb row and column around rather than the recon_yoffset and recon_uvoffset, since those offsets will change from predictor to predictor, based on the reference frame selection. Change-Id: If3f9df059e00f5048ca729d3d083ff428e1859c1	2013-02-08 14:25:40 -08:00
John Koleszar	6dfc95fe63	Merge changes Icd1a2a5a,I204d17a1,I3ed92117 into experimental * changes: Initial support for resolution changes on P-frames Avoid allocating memory when resizing frames Adds a test for the VP8E_SET_SCALEMODE control	2013-02-08 14:20:05 -08:00
John Koleszar	3de8ee6ba1	Merge changes Ife0d8147,I7d469716,Ic9a5615f into experimental * changes: Restore SSSE3 subpixel filters in new convolve framework Convert subpixel filters to use convolve framework Add 8-tap generic convolver	2013-02-08 13:19:47 -08:00
John Koleszar	393b485627	Initial support for resolution changes on P-frames Allows inter-frames to change resolution. Currently these are almost equivalent to keyframes, as only intra prediction modes are allowed, but without the other context resets that occur on keyframes. Change-Id: Icd1a2a5af0d9462cc792588427b0a1f5b12e40d3	2013-02-08 12:20:30 -08:00
Ronald S. Bultje	5cfd82bcaf	Use fdct8x4 instead of fdct4x4 where the block size allows it. This allows for faster SIMD implementations in the future (currently there is no speed impact). Change-Id: I732647e9148b5dcb44e6bc8728138f0141218329	2013-02-06 16:13:02 -08:00
Ronald S. Bultje	aac73df1a7	Use configure checks for various inline keywords. Change-Id: I8508f1a3d3430f998bb9295f849e88e626a52a24	2013-02-06 16:12:56 -08:00
John Koleszar	31cbe2ed9a	Eliminate tautology Unreachable code that does nothing anyway removed forever. Change-Id: I14105d2dd9dbc9d558f36464055e350dbeb45488	2013-02-06 08:22:59 -08:00
Paul Wilkins	8b4e9c5925	Merge "Change definition of NearestMV." into experimental	2013-02-06 04:06:31 -08:00
Ronald S. Bultje	1407bdc243	[WIP] Add column-based tiling. This patch adds column-based tiling. The idea is to make each tile independently decodable (after reading the common frame header) and also independently encodable (minus within-frame cost adjustments in the RD loop) to speed up hardware & software en/decoders if they use multi-threading. Column-based tiling has the added advantage (over other tiling methods) that it minimizes realtime use-case latency, since all threads can start encoding data as soon as the first SB-row worth of data is available to the encoder. There is some test code that does random tile ordering in the decoder, to confirm that each tile is indeed independently decodable from other tiles in the same frame. At tile edges, all contexts assume default values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode), and motion vector search and ordering do not cross tiles in the same frame. Tile independence is not maintained between frames ATM, i.e. tile 0 of frame 1 is free to use motion vectors that point into any tile of frame 0. We support 1 (i.e. no tiling), 2 or 4 column-tiles. The loopfilter crosses tile boundaries. I discussed this briefly with Aki and he says that's OK. An in-loop loopfilter would need to do some sync between tile threads, but that shouldn't be a big issue. Results: with tiling disabled, we go up slightly because of improved edge use in the intra4x4 prediction. With 2 tiles, we lose about 1% on derf, ~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5% on derf, ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is concentrated in the low-bitrate end of clips, and most of it is because of the loss of edges at tile boundaries and the resulting loss of intra predictors. TODO: - more tiles (perhaps allow row-based tiling also, and max. 8 tiles)? - maybe optionally (for EC purposes), motion vectors themselves should not cross tile edges, or we should emulate such borders as if they were off-frame, to limit error propagation to within one tile only. This doesn't have to be the default behaviour but could be an optional bitstream flag. Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f	2013-02-05 15:43:03 -08:00
John Koleszar	7a07eea13f	Convert subpixel filters to use convolve framework Update the code to call the new convolution functions to do subpixel prediction rather than the existing functions. Remove the old C and assembly code, since it is unused. This causes a 50% performance reduction on the decoder, but that will be resolved when the asm for the new functions is available. There is no consensus for whether 6-tap or 2-tap predictors will be supported in the final codec, so these filters are implemented in terms of the 8-tap code, so that quality testing of these modes can continue. Implementing the lower complexity algorithms is a simple exercise, should it be necessary. This code produces slightly better results in the EIGHTTAP_SMOOTH case, since the filter is now applied in only one direction when the subpel motion is only in one direction. Like the previous code, the filtering is skipped entirely on full-pel MVs. This combination seems to give the best quality gains, but this may be indicative of a bug in the encoder's filter selection, since the encoder could achieve the result of skipping the filtering on full-pel by selecting one of the other filters. This should be revisited. Quality gains on derf positive on almost all clips. The only clip that seemed to be hurt at all datarates was football (-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR, 0.347% SSIM. Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff	2013-02-05 14:23:17 -08:00
Paul Wilkins	81043e8d62	Change definition of NearestMV. This commit makes the NearestMV match the chosen best reference MV. It can be a 0,0 or non-zero vector, which means the compound nearest mv mode can combine a 0,0 and a non-zero vector. Change-Id: I2213d09996ae2916e53e6458d7d110350dcffd7a	2013-02-05 17:03:25 +00:00
Deb Mukherjee	a53be60904	Merge "Adding a frame parallel decoding mode" into experimental	2013-01-30 12:03:45 -08:00
Ronald S. Bultje	3febf9707d	Default superblock skip flag to 32x32 for skip-blocks. This is identical to the later decisions made in encode_superblock(). This commit doesn't actually change anything, but makes the mbmi state more consistent between the RD loop and the final encode result. Change-Id: I9e735afb7c5a52e5b61728cb88c67ef9b9bf59be	2013-01-29 21:46:31 -08:00
Ronald S. Bultje	b90996c51b	Reset skip flag in superblock RD loop. This is the superblock equivalent of commit `290b83a`. Change-Id: Ib3945dd9e992fa9ec1fdea5a11e17a3cc0e37637	2013-01-29 21:42:56 -08:00
Ronald S. Bultje	5a9da2d906	Merge "Fix block pointer corruption in intra8x8 prediction with 4x4 transform." into experimental	2013-01-29 12:49:42 -08:00
Paul Wilkins	5d1c62c639	Merge "Segment Skip Flag" into experimental	2013-01-29 09:29:26 -08:00
Ronald S. Bultje	ffc2e4f4af	Fix block pointer corruption in intra8x8 prediction with 4x4 transform. The RD loop would change the pointer after the first mode (DC) was tested, leading to corrupt block objects being provided for the others. This would essentially render the i8x8 predictor useless. Change-Id: I16c5906ca64fb34878ac32ce59af8974e4582bb8	2013-01-29 09:18:47 -08:00
Paul Wilkins	0ff9b033b0	Segment Skip Flag First step in simplifying the segment mode and segment EOB flags into a simpler segment skip flag that implies 0,0 mv and EOB at position 0. Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5	2013-01-28 17:28:04 +00:00
Deb Mukherjee	dfd89f2eab	Adding a frame parallel decoding mode Adds a flag to disable features that would inhibit frame parallel decoding. This includes backward adaptation and MV sorting based on search in ref frame buffer. Also includes some minor clean-ups. Change-Id: I434846717a47b7bcb244b37ea670c5cdf776f14d	2013-01-25 17:16:19 -08:00
Ronald S. Bultje	3ca5b35ce5	Merge "Remove "update_context" variable from VP9_COMP context." into experimental	2013-01-25 09:43:42 -08:00
Ronald S. Bultje	0a7b3953f0	Remove "update_context" variable from VP9_COMP context. The variable is always zero. Change-Id: Id5cdbecad543bca465a5b1d471badaec7e112c8d	2013-01-24 16:28:53 -08:00
Deb Mukherjee	01cafaab1d	Adds an error-resilient mode with test Adds an error-resilient mode where frames can be continued to be decoded even when there are errors (due to network losses) on a prior frame. Specifically, backward updates are turned off and probabilities of various symbols are reset to defaults at the beginning of each frame. Further, the last frame's mvs are not used for the mv reference list, and the sorting of the initial list based on search on previous frames is turned off as well. Also adds a test where an arbitrary set of frames are skipped from decoding to simulate errors. The test verifies (1) that if the error frames are droppable - i.e. frame buffer updates have been turned off - there are no mismatch errors for the remaining frames after the error frames; and (2) if the error-frames are non droppable, there are not only no decoding errors but the mismatch PSNR between the decoder's version of the post-error frames and the encoder's version is at least 20 dB. Change-Id: Ie6e2bcd436b1e8643270356d3a930e8989ff52a5	2013-01-23 21:56:15 -08:00
Frank Galligan	9ca907b53e	libvpx: Fix some warnings. Change-Id: If8be8b9d28a29631f29c46daea8a226ab3580610	2013-01-18 09:51:57 -08:00
John Koleszar	da832a80e4	Start to anonymize reference frames Remove lst_fb_idx, gld_fb_idx, alt_fb_idx, refresh_last_frame, refresh_golden_frame, refresh_alt_ref_frame from common. Gold/Alt are encode side conventions. From the decoder's perspective, we want to be dealing with numbered references. Updates to active_ref 2 signal mode context switches, vestigial from refresh_alt_ref_frame. This needs some clean up to make sense with increased numbers of reference frames, as well as reimplementing the swapping of alt/golden which was previously done using the buffer-to-buffer copy mechanism removed in an earlier commit. Change-Id: I7334445158b7666f9295d2a2dd22aa03f4485f58	2013-01-16 14:06:23 -08:00
Yaowu Xu	9bf73f46f9	fix a number of issues that cause failures during master jenkins verification process Change-Id: I3722b8753eaf39f99b45979ce407a8ea0bea0b89	2013-01-14 18:32:32 -08:00
John Koleszar	24bc1a7189	Use INT64_MAX instead of LLONG_MAX These variables have the type int64_t, not long long. long long could be a larger type than 64 bits. Emulate INT64_MAX for older versions of MSVC, and remove the unreferenced vpx_ports/vpxtypes.h Change-Id: Ideaca71838fcd3849d816d5ab17aa347c97d03b0	2013-01-14 15:57:21 -08:00
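The portability point in the commit above can be sketched in C as follows. This is a minimal illustration, not the code from the commit: the fallback definition stands in for the emulation the message mentions for older MSVC, and `best_rd_cost` is a hypothetical helper that shows why an `int64_t` accumulator should start from `INT64_MAX` and not from `LLONG_MAX`.

```c
#include <stdint.h>

/* Fallback for toolchains whose <stdint.h> does not provide the limit
 * macro (assumption: this mirrors the "emulation" for older MSVC). */
#ifndef INT64_MAX
#define INT64_MAX 9223372036854775807LL
#endif

/* Hypothetical helper: smallest cost in an array of int64_t RD costs.
 * The sentinel matches the variable's type exactly, so it stays correct
 * even on a platform where long long is wider than 64 bits. */
static int64_t best_rd_cost(const int64_t *costs, int n) {
  int64_t best = INT64_MAX;
  for (int i = 0; i < n; ++i) {
    if (costs[i] < best) best = costs[i];
  }
  return best;
}
```

With an empty array the helper returns `INT64_MAX`, which callers can treat as "no mode found".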
Ronald S. Bultje	c9071601a2	Remove compound intra-intra experiment. This experiment gives little gains and adds relatively much code complexity (and it hinders other experiments), so let's get rid of it. Change-Id: Id25e79a137a1b8a01138aa27a1fa0ba4a2df274a	2013-01-14 15:47:25 -08:00
Paul Wilkins	e2c696a7aa	Merge "Fix compiler warnings" into experimental	2013-01-14 14:20:57 -08:00
Yaowu Xu	113005b11d	Fix compiler warnings The warnings caused verify failure with gerrit for several commits Change-Id: I030df8638bd69b8783a3ac58e720ff9f0bfd546c	2013-01-14 13:56:52 -08:00
Ronald S. Bultje	290b83ab62	Reset x->skip for each iteration in the RD loop. This prevents ill-defined behaviour, such as setting x->skip for a mode that is excluded because of frame-level flags (e.g. filter selection, compound prediction selection), then not breaking out of the RD loop because the mode is not allowed, but keeping the flag on. Whatever mode is iterated through next in the RD loop will then carry this flag, and all sorts of bad stuff happen, such as x->skip being set on intra pred modes. Change-Id: I5bec46b36e38292174acb1c564b3caf00a9b4b9a	2013-01-14 12:44:32 -08:00
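The failure mode described in the commit above can be reproduced in a toy loop. Everything in this sketch is hypothetical (the `Macroblock` and `ModeCandidate` types, the `pick_mode` function and its fields are stand-ins, not encoder code). It only shows why clearing the flag at the top of each iteration stops a rejected mode from leaking its skip state into the next one.

```c
#include <limits.h>

/* Hypothetical, heavily simplified stand-ins for encoder state. */
typedef struct { int skip; } Macroblock;

typedef struct {
  int allowed;     /* 0 if frame-level flags exclude this mode */
  int wants_skip;  /* 1 if evaluating this mode sets x->skip */
  long long rd;    /* rate-distortion cost of the mode */
} ModeCandidate;

/* Returns the index of the cheapest allowed mode, or -1 if none. */
static int pick_mode(Macroblock *x, const ModeCandidate *modes, int n,
                     int *best_skip) {
  long long best_rd = LLONG_MAX;
  int best = -1;
  for (int i = 0; i < n; ++i) {
    x->skip = 0;  /* the fix: start every iteration from a clean flag */
    if (modes[i].wants_skip) x->skip = 1;
    if (!modes[i].allowed) continue;  /* rejected, flag may still be set */
    if (modes[i].rd < best_rd) {
      best_rd = modes[i].rd;
      best = i;
      *best_skip = x->skip;
    }
  }
  return best;
}
```

Without the reset, a disallowed first mode that sets the flag would cause the next mode to be recorded with `skip == 1` even though it never asked for it.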
Paul Wilkins	d27ae620bc	Remove INT64_MAX references. Replace INT64_MAX references with LLONG_MAX for windows build. Change-Id: Ib8b45c1e9c15c043b2f54c27ed83b8682b2be34f	2013-01-11 19:45:26 +00:00
Ronald S. Bultje	aa2effa954	Merge tx32x32 experiment. Change-Id: I615651e4c7b09e576a341ad425cf80c393637833	2013-01-10 08:23:59 -08:00
Ronald S. Bultje	6884a83f06	Merge superblocks64 experiment. Change-Id: If6c88752dffdb566f8d4322f135145270716fb8e	2013-01-09 17:21:40 -08:00
Adrian Grange	7d6b5425d7	New prediction filter This patch removes the old pred-filter experiment and replaces it with one that is implemented using the switchable filter framework. If the pred-filter experiment is enabled, three interpolation filters are tested during mode selection: the standard 8-tap interpolation filter, a sharp 8-tap filter and a (new) 8-tap smoothing filter. The 6-tap filter code has been preserved for now and if the enable-6tap experiment is enabled (in addition to the pred-filter experiment) the original 6-tap filter replaces the new 8-tap smooth filter in the switchable mode. The new experiment applies the prediction filter in cases of a fractional-pel motion vector. Future patches will apply the filter where the mv is pel-aligned and also to intra predicted blocks. Change-Id: I08e8cba978f2bbf3019f8413f376b8e2cd85eba4	2013-01-09 12:00:39 -08:00
Deb Mukherjee	4b7304ee68	Adds 64x64 hybrid dct/dwt transform This is to add to the 64x64 transform experiment as an alternative to a 64x64 DCT. Two levels of wavelet decomposition are used on a 64x64 block, followed by 16x16 DCT on the four lowest subbands. The highest three subbands are left untransformed after the first level DWT. Change-Id: I3d48d5800468d655191933894df6b46e15adca56	2013-01-08 14:05:58 -08:00
Ronald S. Bultje	4455036cfc	Merge superblocks (32x32) experiment. Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587	2013-01-08 12:54:45 -08:00
John Koleszar	879cb7d962	Merge vp9-preview changes into experimental branch Incorporate vp9-preview changes by merging master branch into experimental. Conflicts: test/test.mk vp9/common/vp9_filter.c vp9/common/vp9_idctllm.c vp9/common/vp9_invtrans.h vp9/common/vp9_mbpitch.c vp9/common/vp9_rtcd_defs.sh vp9/common/vp9_systemdependent.h vp9/common/vp9_type_aliases.h vp9/common/x86/vp9_asm_stubs.c vp9/common/x86/vp9_subpixel_mmx.asm vp9/decoder/vp9_decodframe.c vp9/decoder/vp9_dequantize.c vp9/decoder/vp9_dequantize.h vp9/decoder/vp9_onyxd_int.h vp9/encoder/vp9_bitstream.c vp9/encoder/vp9_encodeframe.c vp9/encoder/vp9_rdopt.c Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5	2013-01-08 10:19:59 -08:00
Ronald S. Bultje	c13d9fef42	Re-enable support for static_threshold (encode_breakout). Change-Id: Ibd7380f478d3127f9db91d0a4fd2fd0dfde961ab	2013-01-07 11:02:14 -08:00
Ronald S. Bultje	e6216d163a	Don't use tx32x32 for macroblocks. Change-Id: Ib674e0153ca360867ab7a20ba291ac9171a01250	2013-01-07 09:40:19 -08:00
Ronald S. Bultje	c3941665e9	64x64 blocksize support. 3.2% gains on std/hd, 1.0% gains on hd. Change-Id: I481d5df23d8a4fc650a5bcba956554490b2bd200	2013-01-05 18:20:25 -08:00
Adrian Grange	81d1171fd4	Fix mode selection infinite loop bug Mode selection for SBs could enter an infinite loop because the interpolation filter mode index was not being reset correctly. Change-Id: I4bbe726f29ef5b6836e94884067c46084713cc11	2013-01-04 09:00:47 -08:00
Yaowu Xu	df7ce5a711	Merge "make cost_coeffs() and tokenize_b() consistent" into experimental	2013-01-03 09:57:07 -08:00
Yaowu Xu	818f5698fb	Merge "Merge cost_coeffs_2x2() into cost_coeffs()" into experimental	2013-01-03 09:33:21 -08:00
Yaowu Xu	83664f457b	make cost_coeffs() and tokenize_b() consistent Change-Id: I7cdb5c32a1400f88ec36d08ea982e38b77731602	2013-01-03 09:31:47 -08:00
Adrian Grange	259b800832	New interpolation filter selection algorithm Old Scheme: When SWITCHABLE filter selection is enabled the encoder evaluates the use of each interpolation filter type and selects the best one to use at the MB level. A frame- level flag can be set to force the use of a particular filter type for all MBs in a frame if it is more efficient to encode that way. The logic here involved a Q dependent threshold that assumed that the second 8-tap filter was a high-pass filter. However, this requires a trip around the recode loop. If the frame-level flag indicates use of a particular filter, the other filters are not evaluated in the pick_mode loop. New Scheme: Each filter type is evaluated at the MB level and a record of the best filter is kept, irrespective of what filter is signaled at the frame-level. Once all MBs have been encoded, a decision is made as to what frame-level mode to set for the next frame. If one filter is used by 80% or more of the MBs, then this filter is forced since it is assumed that this will be more efficient if the next frame has similar characteristics. i.e. there is a one-frame lag between measuring the filter selection and setting the frame-level mode to use. Change-Id: I6a7e7ced8f27e120fafb99db2dc9c6293f8d20f7	2013-01-03 08:12:43 -08:00
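The "New Scheme" decision in the commit above can be sketched as a function that runs once per frame over the per-MB filter counts. The enum names and `pick_frame_filter` are hypothetical and the real encoder's bookkeeping may differ. The only rule taken from the message is that a filter used by 80% or more of the MBs is forced for the next frame.

```c
/* Hypothetical filter identifiers; SWITCHABLE means "signal per MB". */
enum {
  FILTER_REGULAR,
  FILTER_SHARP,
  FILTER_SMOOTH,
  NUM_FILTERS,
  FILTER_SWITCHABLE = NUM_FILTERS
};

/* counts[i] = number of MBs in the frame just encoded whose best filter
 * was i. Returns the frame-level mode to use for the NEXT frame. */
static int pick_frame_filter(const unsigned counts[NUM_FILTERS]) {
  unsigned total = 0;
  for (int i = 0; i < NUM_FILTERS; ++i) total += counts[i];
  if (total == 0) return FILTER_SWITCHABLE;  /* nothing measured */

  for (int i = 0; i < NUM_FILTERS; ++i) {
    /* counts[i] / total >= 0.8, written without floating point */
    if (counts[i] * 5u >= total * 4u) return i;
  }
  return FILTER_SWITCHABLE;
}
```

Because the counts come from the frame that was just encoded, the decision lags by one frame, as the message notes.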
Yaowu Xu	bd28510ef9	Merge cost_coeffs_2x2() into cost_coeffs() Remove special case function cost_coeffs_2x2() and change function cost_coeffs() to handle the 2nd order haar block as it already handles all other block types. Change-Id: I2aac6f81ee0ae9e03d6a8da4f8681d69b79ce41f	2013-01-03 08:00:00 -08:00
Paul Wilkins	cad4a91429	Change INT64_MAX to LLONG_MAX This is needed to make the windows build work after the removal of vp9_type_aliases.h. Change-Id: I8addf38e9f3c8b864e0e30a8916a26e0264dd02c	2013-01-02 18:06:00 +00:00
Paul Wilkins	313d1100af	Added update-able mv-ref probabilities. Part of NEW_MVREF experiment. Added update-able probabilities. Change-Id: I5a4fcf4aaed1d0d1dac980f69d535639a3d59401	2013-01-02 14:22:11 +00:00
John Koleszar	5ebe94f9f1	Build fixes to merge vp9-preview into master Various fixups to resolve issues when building vp9-preview under the more stringent checks placed on the experimental branch. Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07	2012-12-26 11:21:09 -08:00
Deb Mukherjee	08f0c7cc9c	New previous coef context experiment Adds an experiment to derive the previous context of a coefficient not just from the previous coefficient in the scan order but from a combination of several neighboring coefficients previously encountered in scan order. A precomputed table of neighbors for each location for each scan type and block size is used. Currently 5 neighbors are used. Results are about 0.2% positive using a strategy where the max coef magnitude from the 5 neighbors is used to derive the context. Change-Id: Ie708b54d8e1898af742846ce2d1e2b0d89fd4ad5	2012-12-19 18:49:39 -08:00
Ronald S. Bultje	4cca47b538	Use standard integer types for pixel values and coefficients. For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da	2012-12-18 15:31:19 -08:00
Ronald S. Bultje	5cab8b7a18	Merge "Give 4x4 scan and coef_band tables a _4x4 suffix." into experimental	2012-12-18 14:17:46 -08:00
Yunqing Wang	779c5f28a8	Fix uninitialized warning Fixed uninitialized warning for txfm_size. Change-Id: I42b7e802c3e84825d49f34e632361502641b7cbf	2012-12-18 13:19:04 -08:00
Ronald S. Bultje	8986eb5c26	Give 4x4 scan and coef_band tables a _4x4 suffix. This matches the names of tables for all other transform sizes. Change-Id: Ia7681b7f8d34c97c27b0eb0e34d490cd0f8d02c6	2012-12-18 10:49:10 -08:00
Paul Wilkins	d8f5d1b257	Problem of over smoothing with intra modes. In some cases intra modes in inter frames give an over smoothed appearance. Especially with noisy but flat content. Also in some cases there were problems with key frame sizing again with very flat but noisy content. These are temporary changes to help alleviate the visual problems but will almost certainly hurt metric results especially at the very low data rate end. Change-Id: I11549179a19277ffc283d9788bc70168f2a8bdc9	2012-12-17 11:54:17 +00:00
Deb Mukherjee	7fa3deb1f5	Build fixes with the super blocks and 32x32 expts Change-Id: I3c751f8d57ac7d3b754476dc6ce144d162534e6d	2012-12-13 12:18:38 -08:00
Scott LaVarnway	b575394e21	Improved vp9_ihtllm_c As suggested by Yaowu, we can use eob to reduce the complexity of the vp9_ihtllm_c function. For the 1080p test clip used, the decoder performance improved by 17%. Change-Id: I32486f2f06f9b8f60467d2a574209aa3a3daa435	2012-12-12 15:49:39 -08:00
Ronald S. Bultje	39de1e14ed	Merge "Consistently use get_prob(), clip_prob() and newly added clip_pixel()." into experimental	2012-12-12 10:34:14 -08:00
Ronald S. Bultje	4d0ec7aacd	Consistently use get_prob(), clip_prob() and newly added clip_pixel(). Add a function clip_pixel() to clip a pixel value to the [0,255] range of allowed values, and use this wherever appropriate (e.g. prediction, reconstruction). Likewise, consistently use the recently added function clip_prob(), which calculates a binary probability in the [1,255] range. If possible, try to use get_prob() or its sister get_binary_prob() to calculate binary probabilities, for consistency. Since in some places, this means that binary probability calculations are changed (we use {255,256}*count0/(total) in a range of places, and all of these are now changed to use (256*count0+(total>>1))/total), this changes the encoding result, so this patch warrants some extensive testing. Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628	2012-12-12 10:01:19 -08:00
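The helpers named in the commit above can be sketched in C as follows. The function names come from the message, but the bodies are an illustration, not the committed code. In particular, the two-count signature of `get_binary_prob` and the return value of 128 for an empty count are assumptions.

```c
#include <stdint.h>

/* Clamp a reconstructed or predicted value to the 8-bit range [0, 255]. */
static uint8_t clip_pixel(int val) {
  return (uint8_t)(val > 255 ? 255 : (val < 0 ? 0 : val));
}

/* Clamp a probability to [1, 255] so that neither branch of a binary
 * symbol ever has probability zero. */
static uint8_t clip_prob(int p) {
  return (uint8_t)(p > 255 ? 255 : (p < 1 ? 1 : p));
}

/* Rounded probability of the "0" branch on a 256 scale:
 * (256 * n0 + total / 2) / total, then clamped to [1, 255].
 * Assumption: return an even 128 when nothing has been counted. */
static uint8_t get_binary_prob(unsigned n0, unsigned n1) {
  const unsigned total = n0 + n1;
  if (total == 0) return 128;
  return clip_prob((int)((256u * n0 + (total >> 1)) / total));
}
```

Adding `total >> 1` before dividing rounds to nearest, not down, which is why the encoded output changes wherever the older truncating forms were replaced.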
Yaowu Xu	0c35b27689	Merge "clean up tokenize_b() and stuff_b()" into experimental	2012-12-11 13:51:56 -08:00
Yaowu Xu	899f0fc126	clean up tokenize_b() and stuff_b() Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5	2012-12-11 13:32:16 -08:00
Paul Wilkins	d124465975	Further changes to mv reference code. Some further changes and refactoring of mv reference code and selection of center point for searches. Mainly relates to not passing so many different local copies of things around. Some place holder comments. Change-Id: I309f10ffe9a9cde7663e7eae19eb594371c8d055	2012-12-10 17:31:51 +00:00
Ronald S. Bultje	885cf816eb	Introduce vp9_coeff_probs/counts/stats/accum types. Use these, instead of the 4/5-dimensional arrays, to hold statistics, counts, accumulations and probabilities for coefficient tokens. This commit also re-allows ENTROPY_STATS to compile. Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5	2012-12-07 16:09:59 -08:00
Ronald S. Bultje	c456b35fdf	32x32 transform for superblocks. This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds code all over the place to wrap that in the bitstream/encoder/decoder/RD. Some implementation notes (these probably need careful review): - token range is extended by 1 bit, since the value range out of this transform is [-16384,16383]. - the coefficients coming out of the FDCT are manually scaled back by 1 bit, or else they won't fit in int16_t (they are 17 bits). Because of this, the RD error scoring does not right-shift the MSE score by two (unlike for 4x4/8x8/16x16). - to compensate for this loss in precision, the quantizer is halved also. This is currently a little hacky. - FDCT and IDCT is double-only right now. Needs a fixed-point impl. - There are no default probabilities for the 32x32 transform yet; I'm simply using the 16x16 luma ones. A future commit will add newly generated probabilities for all transforms. - No ADST version. I don't think we'll add one for this level; if an ADST is desired, transform-size selection can scale back to 16x16 or lower, and use an ADST at that level. Additional notes specific to Debargha's DWT/DCT hybrid: - coefficient scale is different for the top/left 16x16 (DCT-over-DWT) block than for the rest (DWT pixel differences) of the block. Therefore, RD error scoring isn't easily scalable between coefficient and pixel domain. Thus, unfortunately, we need to compute the RD distortion in the pixel domain until we figure out how to scale these appropriately. Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b	2012-12-07 14:45:05 -08:00
Deb Mukherjee	8b92f1e023	Supports inter-intra prediction with superblocks Adds support for compound inter-intra prediction with superblocks. Also, fixes a bug that disabled intra modes for superblocks. Change-Id: I4d711317e1bc19df8c2f32dc645429f7fff31036	2012-12-01 15:19:55 -08:00
Deb Mukherjee	6632330702	Adds switchable filters with superblocks Allows switchable filters to be used without mismatch when the superblock experiment is on. Also removes spurious clamping code in decodemv.c which causes rare encode/decode mismatches. Change-Id: I809d9ee0b2859552b613500b539a615515b863ae	2012-11-30 09:37:08 -08:00
Jim Bankoski	9f9370425b	warnings in various experiments Change-Id: Ib5106d4772450f8026f823dd743f162ab833b1d6	2012-11-30 07:31:37 -08:00
Yaowu Xu	ff2f9de828	Merge changes Iaa67bcf1,Ibea3bc80 into experimental * changes: more warning cleanup unused variables & warnings	2012-11-29 09:34:10 -08:00
Yaowu Xu	6431007df3	Merge "minor fix to eob check for setting CONTEXT" into experimental	2012-11-29 09:27:00 -08:00
Yaowu Xu	7ab1d3e49f	minor fix to eob check for setting CONTEXT Previously, the "!=" check is logically incorrect when eob is at 0 and effective coefficient starting position is 1. This commit should have no effect on bitstream. Change-Id: I6ce3a847c7e72bfbe4f7c74f88e3310c6b9b6d30	2012-11-29 09:10:15 -08:00
Jim Bankoski	a802f5e783	unused variables & warnings Change-Id: Ibea3bc80eb26a975faaa60268bbc93237f82bc57	2012-11-29 09:02:47 -08:00
Jim Bankoski	030e268a90	ihtllm moves to rtcd clears up some warnings Change-Id: I9899637497c6ad7519f098e055ab98580ae6d688	2012-11-29 07:19:38 -08:00
Jim Bankoski	13dbf1fb17	more rtcd cleanup Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7	2012-11-28 17:27:15 -08:00
Deb Mukherjee	0742b1e4ae	Fixing 8x8/4x4 ADST for intra modes with tx select This patch allows use of 8x8 and 4x4 ADST correctly for Intra 16x16 modes and Intra 8x8 modes when the block size selected is smaller than the prediction mode. Also includes some cleanups and refactoring. Rebase. Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a	2012-11-28 16:21:12 -08:00
Jim Bankoski	c67873989f	fixed includes to be fully specified Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243	2012-11-28 10:53:17 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00

1200 Commits