generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	dde8069e57	Splitting partition_probs array into two arrays. We only update partition_probs for inter frames but they are constant for key frames. It is not necessary to have constants inside frame context and copy them every time. This change reduces FRAME_CONTEXT size by at least 48 bytes. Change-Id: If70a53be51043f37fe7d113853217937710932a7	2013-11-04 14:26:16 -08:00
Jingning Han	4efa6a0176	Fix the use case of plane_block_idx in sub8x8 RD This commit fixes the use case of plane_block_idx, which determines the plane (Y/U/V) index based on block index. When block idx >= 4 in sub8x8 block loop, it should be of chroma components. Change-Id: I072705aa7b35445524ac607089ca8ce54b7ba478	2013-11-03 11:19:33 -08:00
Jingning Han	2de7cbe980	Add second ref frame check back in rdcost hist Update best_inter_rd and best_inter_ref_frame only in single ref frame case. Change-Id: Id56825b231a62d6852bd83811410c05a7569f715	2013-11-03 09:15:41 -08:00
Dmitry Kovalev	0e1756330b	Merge "Removing is_intra_mode() function."	2013-10-31 18:06:53 -07:00
Yunqing Wang	d03b3cbdd7	Merge "Fix x_offset_q4/y_offset_q4 calculation"	2013-10-31 09:47:54 -07:00
Jingning Han	a2a5c6f821	Merge "Enable all-zero coeff block index for sub8x8 blk"	2013-10-30 16:00:57 -07:00
Dmitry Kovalev	6761872e49	Replacing (SWITCHABLE_FILTERS + 1) with SWITCHABLE_FILTER_CONTEXTS. Change-Id: I9781a62bc1a4cd9176554d1271d87dbcafda9cb0	2013-10-30 14:40:34 -07:00
Jingning Han	8c8381d543	Enable all-zero coeff block index for sub8x8 blk This commit makes zcoeff_blk cache the case where the entire block is quantized to be zero (without applying zero-forcing) in the rate-distortion optimization loop, and skip the forward DCT, quantization, inverse DCT, and reconstruction process in the encode_block stage. It now works for all the block sizes, including sub8x8 blocks. Change-Id: I5ae60a9c436ba3637d11666733554bec4580ef98	2013-10-30 14:01:06 -07:00
Yunqing Wang	9ed2d0a577	Fix x_offset_q4/y_offset_q4 calculation "<< SUBPEL_BITS" needs to be added in the calculation. Call set_scaled_offsets() to calculate x_offset_q4 and y_offset_q4. Change-Id: Ied130ea771510e918f51cd1dc3abe57f4c0962b5	2013-10-29 17:46:55 -07:00
Dmitry Kovalev	e6dcf2aeb6	Fixing wrongly initialized tx_type variable. Wrong value was used in get_tx_type_4x4() function, so making initialization before that call. Change-Id: Ief30bb1e0c03b2f23d993bbf9ae18d7150ba9a83	2013-10-29 12:18:53 -07:00
Dmitry Kovalev	aa76cd1e49	Removing is_intra_mode() function. It is enough to check just block type: intra or inter. Intra block implies intra prediction mode, and inter block implies inter mode. Change-Id: I3cf98731a3935f670a3cd8e2b2443483eb944be4	2013-10-28 20:00:55 -07:00
Dmitry Kovalev	8253532c2d	Cleaning up vp9_regular_quantize_b_4x4. Passing scan & iscan as parameters, adding useful local variables. Change-Id: Ia2a87906941db9557350d273669ce5c3cdb7235d	2013-10-28 14:28:28 -07:00
James Zern	58a0f6dbdd	vp9: add TileInfo replaces use of cur_tile_mi_(row|col)_(start|end) by VP9_COMMON, making it less stateful and more reusable for parallel tile decoding Change-Id: I1df09382b4567a0e5f4434825d47c79afe2399be	2013-10-28 20:54:43 +01:00
Dmitry Kovalev	710ca1fe36	Merge changes I1868fb75,I9ff504c6 * changes: Renaming INTERPOLATIONFILTERTYPE to INTERPOLATION_TYPE. Adding VP9_FRAME_MARKER constant.	2013-10-24 10:08:19 -07:00
Yunqing Wang	93ec31dff6	Merge "Improve scale_factors struct"	2013-10-24 09:13:41 -07:00
Dmitry Kovalev	ad867fe237	Renaming INTERPOLATIONFILTERTYPE to INTERPOLATION_TYPE. Change-Id: I1868fb75ed88bfa65c1c2ca24677d65f2894d713	2013-10-23 17:45:52 -07:00
Jingning Han	ae0e747d6a	Merge "Use inter mode flag in super_block_yrd"	2013-10-23 13:52:05 -07:00
Jingning Han	f3b31380fa	Use inter mode flag in super_block_yrd Use a flag variable to determine if coded in inter mode, thus avoiding multiple inter mode checks in super_block_yrd. Change-Id: I0ef998b2811c38e185a2e0583f0f636cee45d2cf	2013-10-23 11:18:23 -07:00
Yunqing Wang	175c313a12	Improve scale_factors struct The ref's scale_factors are set at frame level, and then copied for each partition block. Since the struct members are mostly constant, this patch separated the constant and non-constant members, and reduced struct copying. This gave 0.5% ~ 1.4% decoder speed gain. Change-Id: I94043bf5a6995c8042da52e5c661818dfa6f6d4c	2013-10-22 13:10:22 -07:00
Dmitry Kovalev	ec414372e8	Removing quantize_b_4x4 function pointer. The pointer was assigned only once with vp9_regular_quantize_b_4x4, calling this function directly now. Also removing unused declarations: prototype_quantize_block prototype_quantize_block_pair prototype_quantize_mb vp9_regular_quantize_b_4x4_pair vp9_regular_quantize_b_8x8 Change-Id: I14325bc2f082336820671eafbc06126651b79f73	2013-10-22 13:09:36 -07:00
Dmitry Kovalev	9f09618bd4	Merge "Using stride (# of elements) instead of pitch (bytes) in fdct4x4."	2013-10-22 13:05:24 -07:00
Dmitry Kovalev	fa57135b2c	Merge "Removing NUM_ prefix from constant names."	2013-10-22 11:34:28 -07:00
Jingning Han	c807949408	Prevent left_block_mode stepping into left tile This commit uses left_available flag to decide if the left mode_info struct is available for left_block_mode. As discussed with James Zern (jzern@), this prevents the codec from fetching mode_info from blocks in the left tile, which although effectively not used might present concerns for multi-threaded tile decoding. This is NOT a bit-stream change. Change-Id: I1dc8cf1bcbf056688eee27c7bc5706ac4b4e0125	2013-10-22 09:02:41 -07:00
Dmitry Kovalev	190c2b4591	Using stride (# of elements) instead of pitch (bytes) in fdct4x4. Just making fdct consistent with iht/idct/fht functions which all use stride (# of elements) as input argument. Change-Id: I0ba3c52513a5fdd194f1e7e2901092671398985b	2013-10-21 15:27:35 -07:00
Dmitry Kovalev	1e05c9a7e6	Merge "Cleanup: using cm variable instead of cpi->common."	2013-10-21 14:30:01 -07:00
Jingning Han	deb10ac6f9	Merge "Make memory alloc in pick_mode_context bsize aware"	2013-10-21 11:45:59 -07:00
Dmitry Kovalev	a698e52926	Cleanup: using cm variable instead of cpi->common. Change-Id: Iab334b5fd51dfa7e7f29963f8bdc62fd7355e56d	2013-10-21 11:10:11 -07:00
Dmitry Kovalev	d1b65c6bda	Moving allow_high_precision_mv from MACROBLOCKD to VP9_COMMON. This value is a global frame-level flag, not a macroblock-level. Change-Id: Ie8c5790a931150741c2167c00c3e3dd2cf26744d	2013-10-21 10:12:14 -07:00
Dmitry Kovalev	6d2a0da7a7	Removing NUM_ prefix from constant names. Renames for consistency with other constants: NUM_FRAME_TYPES -> FRAME_TYPES NUM_PARTITION_CONTEXTS -> PARTITION_CONTEXTS Change-Id: I3db30acb2868eb0a424237c831087b2e264ec47f	2013-10-18 17:44:19 -07:00
Yaowu Xu	db1045f2c0	Merge "Use lookup table to simplify logic"	2013-10-18 12:55:24 -07:00
Jingning Han	72033fcff8	Make memory alloc in pick_mode_context bsize aware This commit makes the buffer allocation of zcoeff_blk array in pick_mode_context block size aware. It calculates the number of 4x4 blocks in the partition and assigns the memory space accordingly. This process (and the uninitialization) is done once for each encoding pass. It allows memory copy of smaller buffer when possible. For football at 600kbps, the runtimes improve by about 1%: speed 1, 45961ms -> 45472ms speed 2, 23863ms -> 23598ms Change-Id: Id2ca24906fa89f46fa5fe742ec4b8efc2a61f877	2013-10-18 12:42:44 -07:00
Yaowu Xu	30d1ec38a7	Use lookup table to simplify logic In deciding the transform size for a given block in a given TX_MODE. Change-Id: I1467da09853e69cd320695a24c04e19a2f3d04fb	2013-10-17 14:54:16 -07:00
Dmitry Kovalev	2726f383cd	Adding allow_hp as an argument to mv search functions. Making this change in order to move allow_high_precision_mv field from MACROBLOCKD structure to VP9_COMMON (because it is a frame level flag). Change-Id: I1d006ba36d938e0caf4d40fa051e2e38df9c1108	2013-10-17 14:02:04 -07:00
Guillaume Martres	7fd2561d64	Merge changes I6226456d,I97925178,I766c4b74 * changes: Use a separate MODE_INFO stream for each tile column Get rid of "this_mi", use "mi_8x8[0]" everywhere instead Make the static_segmentation feature work again	2013-10-16 17:05:39 -07:00
Guillaume Martres	acf0d56f0b	Get rid of "this_mi", use "mi_8x8[0]" everywhere instead The only case where they were intentionally pointing to different structures was in mbgraph, and this didn't have the expected behavior because both of these pointers are used interchangeably through the code Change-Id: I979251782f90885fe962305bcc845bc05907f80c	2013-10-16 16:24:03 -07:00
Dmitry Kovalev	9deb614a57	Adding get_band_translate() function. Moving code that gets band_translate array from get_scan_and_band() function to get_band_translate() function. Renaming get_scan_and_band() to get_scan(). Change-Id: I43047c205a1ca2a6e24be44db39dc04b7a385008	2013-10-16 15:11:42 -07:00
Guillaume Martres	e55f60240a	Implement variance-based adaptive quantization This should be similar to what x264 does with --aq-mode 1. It works well with clips like parkjoy and touhou (http://x264.nl/developers/Dark_Shikari/LosslessTouhou.mkv). At low bitrates, the segmentation signaling overhead may negate the benefits of this feature. (PGW) Default changed to feature OFF to allow provisional merge. Change-Id: I938abf9bb487e1d4ad3b0264ea03d9826275c70b	2013-10-16 11:55:13 +01:00
Alexander Voronov	d6a59fb12c	Updated encoder to handle intra-only frames Updated the encoder to handle frames that are coded intra-only. Intra-only frames must be non-showable, that is, the "show frame" flag must be set to 0 in the frame header. Tested by forcing the ARF frames to be coded intra-only. Note: The rate control code will need to be modified to account for intra-only frames better than they are currently handled. Change-Id: I6a9dd5337deddcecc599d3a44a7431909ed21079	2013-10-15 16:44:02 -07:00
Jingning Han	3f52cfa130	Merge "Re-design all-zero-coeff block index buffer use"	2013-10-15 16:23:38 -07:00
Jingning Han	8e3ce1a9e3	Re-design all-zero-coeff block index buffer use Use the zcoeff_blk buffer of PICK_MODE_CONTEXT to store the indexes of all-zero-coeff block of the current best mode. Remove the temporary buffer best_zcoeff_blk defined in the rate-distortion optimization loop. This improves the speed performance by about 0.5% in all speed settings. Change-Id: Ie3e15988ddfa581eafa2e19a8228d3fe4a46095c	2013-10-15 10:54:06 -07:00
Jingning Han	a0d8ec7b76	Merge "Move token_cache from cost_coeffs to MACROBLOCK"	2013-10-14 13:05:32 -07:00
Jingning Han	f60a3910c4	Move token_cache from cost_coeffs to MACROBLOCK This commit moves token_cache buffer into macroblock struct, instead of defining as a local variable in cost_coeffs. This avoids repeatedly re-allocating memory space in the rate-distortion optimization loop. The runtime at speed 0 reduces: bus 2000kbps, 161692ms to 159951ms football 600kbps, 229505ms to 225821ms Change-Id: If7da6b0b6d8c5138a16271a33c4548fba33d8840	2013-10-14 10:45:56 -07:00
Dmitry Kovalev	107897cf05	Merge "Consistent names for inverse hybrid transforms (1 of 2)."	2013-10-11 15:33:00 -07:00
Deb Mukherjee	c222b96bfd	Merge "Change in rddiv parameter to make it a power of 2"	2013-10-11 13:53:59 -07:00
Dmitry Kovalev	7ef573914d	Consistent names for inverse hybrid transforms (1 of 2). Renames: vp9_short_iht4x4_add -> vp9_iht4x4_16_add vp9_short_iht8x8_add -> vp9_iht8x8_64_add vp9_short_iht16x16_add_c -> vp9_iht16x16_256_add Change-Id: Ibca7a188fd062b196787ac5efc1ea545e7f166c0	2013-10-11 13:31:32 -07:00
Deb Mukherjee	d9655e42b8	Change in rddiv parameter to make it a power of 2 Converts the constant rddiv parameter to 128 (from 100) and implements RDCOST with bit-shift rather than multiplication. Other parameters are also adjusted to roughly keep the same balance between Rate and Distortion. There is a slight speed-up of about 0.5-1% (at speed 0) as tested on football_cif. There is a slight change in performance due to small change in the parameters. derfraw300: +0.033% stdhdraw250: +0.102% Change-Id: I70ac69f58fa71c83108f68fe41796cd19d1fc760	2013-10-11 10:43:02 -07:00
Yaowu Xu	8b175679be	Masking intra mode choice adaptively The commit changes to mask available intra prediction modes for test based on prediction block size. With this patch, encoding time of CpuUsed 2 reduces by 10% to 20% for HD clips with a compression drop of 0.2% Change-Id: I65f320f1237c0f5ae3a355bf7caf447f55625455	2013-10-11 10:29:53 -07:00
Jingning Han	54e702b5d7	Merge "Restore mode skip feature in sub8x8 rd loop"	2013-10-11 09:21:06 -07:00
Yaowu Xu	e2d6e37a54	Merge "change to avoid out-of-range computation"	2013-10-10 13:38:16 -07:00
Jingning Han	09aca3089f	Merge "Re-design rate-distortion cost tracking buffers"	2013-10-10 12:57:31 -07:00
Jingning Han	fc19243ced	Re-design rate-distortion cost tracking buffers This commit re-designs the per transformed block rate-distortion costs tracking buffers. It removes redundant buffer usage, makes the needed context memory allocation per VP9_COMP instance and reuses the same buffer sets inside the rate-distortion optimization search loop, thereby avoiding repeatedly requiring memory space. It reduces speed 0 runtime: bus at 2000 kbps from 166763ms to 158967ms, football at 600 kbps from 246614ms to 234257ms. Both about 5% speed-up. Local tests suggest about 2% to 5% speed-up for speed 1 and 2 settings. This does not change compression performance. Change-Id: I363514c5276b5cf9a38c7251088ffc6ab7f9a4c3	2013-10-10 11:03:44 -07:00
Yaowu Xu	b47cef056e	change to avoid out-of-range computation Change-Id: Id5e31833a0ef40de9f64c2f5674af7083233bf14	2013-10-10 11:01:50 -07:00
Dmitry Kovalev	1e8fc24af8	Merge "Removing inv_txm4x4_1_add and inv_txm4x4_add function pointers."	2013-10-10 10:49:27 -07:00
Deb Mukherjee	2b055dfe3f	Merge "Adjustment to mv cost parameters"	2013-10-10 09:08:58 -07:00
Jingning Han	be6ae20510	Merge "Fix intra dist model of skip_encode feature"	2013-10-10 09:00:20 -07:00
Deb Mukherjee	e4b0fce41c	Adjustment to mv cost parameters Increases these parameters. There is a small efficiency gain. Change-Id: Ie5f0ddb39c907d335e0dafa5eb112365a81f4542 derfraw300: +0.091% stdhdraw250: +0.238%	2013-10-09 23:14:25 -07:00
Jingning Han	013db649fa	Fix intra dist model of skip_encode feature The intra mode distortion adjustment for skip_encode feature was broken in the refactoring cc91851. This commit fixes it and tunes the distortion models used therein. Change-Id: I0d676e82f8e855536a90cf9b3e3fdefafcd886c6	2013-10-09 16:05:50 -07:00
Deb Mukherjee	d6aae4d456	Merge "Clean-ups in rdopt.c"	2013-10-09 12:10:20 -07:00
Deb Mukherjee	eb8b1cd764	Clean-ups in rdopt.c Some minor cleanups in preparation for experimentation with some encode parameters and thresholds Change-Id: I449d66da97eae0a7acdf4aae374e2f9111342056	2013-10-09 11:32:03 -07:00
Jingning Han	03fe08ca30	Deprecate the use of PARTITION_INFO from encoder Use b_mode_info to store the inter prediction mode of sub8x8 block, in replacement of the use of partition_info. Remove redundant buffer update for partition_info. For bus_cif at 2000 kbps, this seems to make speed 0 about 1% faster. Change-Id: Id1b3be45e75a24fb4b42335ac480c23e440978f6	2013-10-09 09:23:52 -07:00
Dmitry Kovalev	c983c966cb	Removing inv_txm4x4_1_add and inv_txm4x4_add function pointers. We already have itxm_add member in MACROBLOCKD structure. Both inv_txm4x4_1_add and inv_txm4x4_add are just its special cases for different eob values. But eob logic is already implemented in vp9_iwht4x4_add and vp9_idct4x4_add (that's why also removing inverse_transform_b_4x4_add). Change-Id: I80bec9b6f7d40c5e5033c613faca5c819c3e6326	2013-10-08 11:27:56 -07:00
Dmitry Kovalev	8d3ef287a2	Merge "Removing redundant vp9_pt_energy_class declarations."	2013-10-08 10:54:48 -07:00
Jim Bankoski	08feefbe7b	easy to fix cpplint issue in rdopt.c Change-Id: Id093816146de0d100f0c6ae2542aaa427dbab2d8	2013-10-07 17:03:29 -07:00
Jingning Han	c8f481fa3d	Restore mode skip feature in sub8x8 rd loop This commit restores the mode skip feature in the sub8x8 rd loop. Change-Id: I5496ee32053f572b8961b549e9ecd4f1360824de	2013-10-07 14:20:34 -07:00
Dmitry Kovalev	23cc1cd8e6	Removing redundant vp9_pt_energy_class declarations. Declaring vp9_pt_energy_class in vp9_entropy.h instead of many external places. Change-Id: I66e8a3fc119a43f88d130d0dae4133c825a047a3	2013-10-07 14:11:01 -07:00
Dmitry Kovalev	272adbbec4	Using inter_mode_offset_function instead of duplicated code. Change-Id: I8de865cd1deca07b5c92c225782f0867367e9a11	2013-10-07 13:18:46 -07:00
Jingning Han	1ab60f7bfb	Merge "Remove redundant second_ref_frame check in sub8x8"	2013-10-04 09:04:11 -07:00
Paul Wilkins	8abd92f12f	Remove mode_skip_start and mask code for sub 8x8 This code serves no purpose in the re-factored sub 8x8 code. Change-Id: I5364986224d1a28b71bcb046ec8557a3d14aaa47	2013-10-04 14:26:17 +01:00
Dmitry Kovalev	d975804e9a	Merge "Replacing duplicated code with get_scan_and_band call."	2013-10-03 18:58:40 -07:00
Dmitry Kovalev	8b34437522	Replacing duplicated code with get_scan_and_band call. Change-Id: I2cc3684f416a63dc99b9303109f9850f34a470d5	2013-10-03 17:46:28 -07:00
Jingning Han	2952b7d1fb	Remove redundant second_ref_frame check in sub8x8 This commit removes the redundant second reference frame check in the rate-distortion optimization loop for sub8x8 blocks. Change-Id: I13a57a6f624c4a9bcef02ff2a867fa30d8b44a93	2013-10-03 14:02:12 -07:00
Jingning Han	b9daef91d8	Use vp9_zero in sub8x8 RD optimization loop Change-Id: Ic23a705e48cadaa7151f2bd8536d56636cb973e3	2013-10-03 12:34:25 -07:00
Jingning Han	4093192ec9	Change b_mode_info definition from union to struct This commit defines b_mode_info as a struct type. This will allow us to further remove the use of PARTITION_INFO in the encoding process. Change-Id: I975b0f7d557b5e0f66545a61b472def76b671cce	2013-10-03 12:34:11 -07:00
Jingning Han	793c2d8429	Remove unused variables in inter_mode rd loops Remove redundant variable definition/use in rate-distortion search loop for regular and sub8x8 blocks, respectively. Change-Id: Ic0eb3660bb6851ba2eb8d702ba9fd11595000d01	2013-10-03 12:34:11 -07:00
Jingning Han	a55625873f	Merge "Refactor inter mode rate-distortion search"	2013-10-03 12:19:53 -07:00
Jingning Han	11abab356e	Refactor inter mode rate-distortion search This commit separates the rate-distortion optimization loop of superblocks from that of sub8x8 blocks. This allows a better design of the rate-distortion optimization search loop for each setting. It also removes the use of SPLITMV and I4X4_PRED therein. No performance change in speed 0 settings. For bus@CIF at 2000kbps, the speed 1 runtime goes from 48009ms to 43894ms (about 10% faster). The overall compression performance on derf changed by -0.021%. Speed 2 runtime goes from 27114ms to 28700ms (6% slower), while the overall coding efficiency goes up by 1.629% for derf, 1.236% for yt. Change-Id: Ie6bdfa0a370148dd60bd800961077f7e97e67dd4	2013-10-03 11:36:49 -07:00
Dmitry Kovalev	9250d1529c	Using vp9_zero instead of vpx_memset. Change-Id: I9a0d0e9c3459954aa7b9c68f92cc5d56385ebd18	2013-10-03 10:59:36 -07:00
Paul Wilkins	6253cc9279	Speed setting review. Substantial reworking of the speed vs quality trade offs for speed 1 and 2. In this patch I am attempting to freeze the "quality" meaning of speeds 1 and 2 relative to speed 0 so that in future we can better evaluate progress. I am targeting: Speed 1 quality ~-5% vs speed 0. Speed 2 quality ~-10% vs speed 0. It is inevitable that quality will still fluctuate a little as we adjust settings and add new features, but we will attempt to keep as close as possible to these values. Above speed 2 things will remain a bit more fluid for now. In this patch speed 1 is approximately 4-5x as fast as speed 0. This is similar to before but the quality hit is a lot less. Likewise speed 2 is approximately 2x as fast as speed 1 but is similar in quality to the previous speed 1 configuration. Also slight change to behavior of FLAG_EARLY_TERMINATE to ensure all reference frames get at least one rd test. Important for very low variance regions. WIP: Added a new speed level with old speed 4 becoming speed 5. Speed 3 and 4 tradeoffs still WIP Change-Id: Ic7a38dd7b5b63ab1501f9352411972f480ac6264	2013-10-03 10:23:28 +01:00
Dmitry Kovalev	b927620231	Merge "Using is_inter_block and has_second_ref functions."	2013-09-29 12:14:41 -07:00
Dmitry Kovalev	29815ca729	Merge "Moving from int_mv* to MV* (3)."	2013-09-29 12:13:16 -07:00
Dmitry Kovalev	7343681675	Merge "Removing vp9_get_coef_neighbors_handle function."	2013-09-29 12:01:36 -07:00
Dmitry Kovalev	209c6cbf8f	Removing vp9_get_coef_neighbors_handle function. Change-Id: I6be72c8b048d1ccc7ef43764cf84c32360098970	2013-09-27 14:11:13 -07:00
Guillaume Martres	2b426969c3	Simplify RDMULT and RDDIV derivation Don't divide RDMULT and RDDIV by 100 when RDMULT > 1000. This was probably done to avoid overflow when the rd cost was stored in a 32 bits integer but this is not the case anymore. This change will make it easier to support multiple quantizers per frame. derf compression gain at speed 0: 0.037% Change-Id: Ibeeb9b7cfa1a132a7af41bc90fc07a3bba0857f6	2013-09-26 13:55:16 -07:00
Dmitry Kovalev	eda4e24c0d	Using is_inter_block and has_second_ref functions. Change-Id: I60dee58a4fd24d3c4f3c101a49d30e217309f43a	2013-09-25 19:03:04 -07:00
Dmitry Kovalev	8266da1cd1	Moving from int_mv* to MV* (3). Change-Id: I9795d0937bc07793c13d067281995e0750f694d9	2013-09-25 16:44:19 -07:00
Dmitry Kovalev	f9e2140cab	Merge "Moving from int_mv* to MV* (2)."	2013-09-25 16:12:13 -07:00
Dmitry Kovalev	2b5670238b	Merge "Replacing txfm with tx."	2013-09-25 15:57:56 -07:00
Dmitry Kovalev	d445945a84	Adding vp9_get_entropy_contexts function. Change-Id: Ife0dd29fb4ad65c7e12ac5f1db8cea4ed81de488	2013-09-24 17:26:05 -07:00
Dmitry Kovalev	d0365c4a2c	Replacing txfm with tx. Renaming txfm_stepdown_count to tx_stepdown_count and max_txfm_size to max_tx_size. Change-Id: Ifc173e22c78240e561a57c4c741b64b1b8fc6fef	2013-09-24 17:24:35 -07:00
Dmitry Kovalev	b87696ac37	Moving from int_mv* to MV* (2). Updating fractional_mv_step_fp and fractional_mv_step_comp_fp function types. Change-Id: I601c4378bc39ac3ffd4e295d9cbd8e1f74829d46	2013-09-24 12:48:12 -07:00
Dmitry Kovalev	30888742f4	Merge "Moving from int_mv to MV."	2013-09-24 12:25:56 -07:00
Yaowu Xu	71cfaaa689	Merge "Replace memcpy with vpx_memcpy"	2013-09-24 11:35:03 -07:00
Yaowu Xu	9be0bb19df	Replace memcpy with vpx_memcpy Also removed obsolete comment Change-Id: Iae1664777d76383639c637ee786e0d50fc45819a	2013-09-24 10:56:06 -07:00
Yaowu Xu	ff1ae7f713	Prevent using uninitialized value in RD decision INT64_MAX may be assigned as RDCOST when RDCOST computation is skipped for speed; this commit prevents INT64_MAX from being used as real RDCOST in transform size decision. Change-Id: I89a945134191bbdea1f1431ade70424ac079eaac	2013-09-24 10:53:01 -07:00
Jingning Han	9bcd750565	Merge "Enable per transformed block zero coeffs forcing"	2013-09-24 09:18:17 -07:00
Jingning Han	24ad692572	Merge "Calculate rd cost per transformed block"	2013-09-24 09:18:03 -07:00
Jingning Han	a517343ca3	Enable per transformed block zero coeffs forcing This commit enables forcing all coefficients zero per transformed block, when its rate-distortion cost is lower than regular coeff quantization. The overall performance improvement (including its parent patch on calculating rd cost per transformed block) at speed 1: derf: 0.298% yt: 0.452% hd: 0.741% stdhd: 0.006% Change-Id: I66005fe0fd7af192c3eba32e02fd6d77952accb5	2013-09-23 10:39:35 -07:00
Jingning Han	78fbb10642	Calculate rd cost per transformed block This commit makes the rate-distortion optimization loop evaluate the rd costs of regular quantization and all zero coeffs, per transformed block. It improves speed 1 compression performance: derf: 0.245% yt: 0.515% For a large partition that consists of multiple transformed blocks, this allows more flexibility to selectively force a portion of them coded as all zero coeffs, as will be continued in the next patches. Change-Id: I211518be4179747b57375696f017d1160cc91851	2013-09-20 12:40:17 -07:00
Dmitry Kovalev	e51e7a0e8d	Moving from int_mv to MV. Converting vp9_mv_bit_cost, mv_err_cost, and mvsad_err_cost functions for now. Change-Id: I60e3cc20daef773c2adf9a18e30bc85b1c2eb211	2013-09-20 13:52:43 +04:00
Jingning Han	44b708b4c4	Remove redundant mv_pred use for sub8x8 blocks The sub8x8 blocks have their own motion vector reference scheme. The mv_pred is only used for blocks of sizes 8x8 and above, to find the starting point for motion search. This change does not change any coding behavior. It makes the encoding process slightly faster. (0.5% speed-up for local test on speed 1.) Change-Id: I746ee6ef0eac19aa3621be014afa12be8d82cbb9	2013-09-19 10:32:44 -07:00
Yaowu Xu	014acfa2af	fix integer overflow errors Change-Id: I76f440a917832c02d7a727697b225bac66b99f56	2013-09-19 08:14:26 -07:00
Dmitry Kovalev	cda802ac86	Merge "Removing redundant coef calculation + cleanup."	2013-09-19 00:28:31 -07:00
Dmitry Kovalev	98cf0145b1	Removing redundant coef calculation + cleanup. Adding temp variable for &x->plane[0], inlining src_diff values. Change-Id: I24c08a5425a6da6fd66f5b0278f2fce74f9989b2	2013-09-18 16:20:10 +04:00
Dmitry Kovalev	245ca04bab	Fixing typo in the encoder. Change-Id: I168efdc366eecf638694f357ccad2f4eba7e2fdb	2013-09-18 12:02:22 +04:00
Yaowu Xu	85fd8bdb01	Merge "Silence a bunch of MSVC warnings"	2013-09-17 17:10:58 -07:00
Jingning Han	c437bbcde0	Clean up second ref check in sub8x8 rd loop This commit cleans up the second reference check in the rate-distortion optimization loop of sub8x8 blocks. Change-Id: Ife68feaa4cddbfad2878c9b44d3012788d634f97	2013-09-17 15:59:49 -07:00
Yaowu Xu	a783da80e7	Silence a bunch of MSVC warnings Change-Id: I16633269582a640809dca27572bbe99efa6369fc	2013-09-17 12:08:51 -07:00
Yaowu Xu	eeae6f946d	fix a problem where an invalid mv used in search The commit added reset of pred_mv at the beginning of each SB64x64 partition mv search, also limited the usage of pred_mv only when search on the largest partition is already done. This is to fix a crash at speed 1/2 encoder where an invalid mv is used in mv search. Change-Id: I39010177da76d054e3c90b7899a44feb2e3a5b1b	2013-09-16 12:49:27 -07:00
Jingning Han	c4826c5941	Adaptive motion search control This commit enables adaptive constraint on motion search range for smaller partitions, given the motion vectors of collocated larger partition as a candidate initial search point. It makes speed 0 runtime of bus at CIF and 2000 kbps go from 167s down to 162s (3% speed-up), at 0.01dB performance gains. In the settings of speed 1, this makes the runtime go from 33687 ms to 32142 ms (4.5% speed-up), at 0.03dB performance gains. Compression performance wise, it gains at speed 1: derf 0.118% yt 0.237% hd 0.203% stdhd 0.438% Change-Id: Ic8b34c67810d9504a9579bef2825d3fa54b69454	2013-09-13 13:58:10 -07:00
Paul Wilkins	5d8642354e	Merge "Fix VP9_mode_order[]"	2013-09-13 09:19:31 -07:00
Scott LaVarnway	8fc95a1b11	Merge "New mode_info_context storage -- undo revert"	2013-09-13 08:56:20 -07:00
Paul Wilkins	1407cf8588	Fix VP9_mode_order[] Mis-merge of the following change managed to break mode order and delete two mode options (new alt ref and near alt ref) It also created a situation where we could test two undefined modes off the end of the VP9_mode_order[] data structure. "clang warnings : remove split and i4x4_pred fake modes" "Change Id: I8ef3c*" Initial testing on Akiyo at speed 2. 101.35 44.567 44.447 improves to 96.82 44.915 44.815 Approx 0.3-0.4db gain and 2.5% size reduction Change-Id: Icff813e7c0778d140ad4f0eea18cf1ed203c4e34	2013-09-13 13:33:26 +01:00
Jim Bankoski	9ee9918dad	fix clang warning in rdopt either missed this or it crept back in Change-Id: I6cc1519d09e558be7250254c25bde2ae720555ea	2013-09-12 06:39:42 -07:00
Jim Bankoski	7fb42d909e	clang warnings : remove split and i4x4_pred fake modes Change-Id: I8ef3c7c0f08f0f1f4ccb8ea4deca4cd8143526ee	2013-09-11 16:34:55 -07:00
Scott LaVarnway	ac6093d179	New mode_info_context storage -- undo revert mode_info_context was stored as a grid of MODE_INFO structs. The grid now consists of pointers to MODE_INFO structs. The MODE_INFO structs are now stored as a stream (decoder only), eliminating unnecessary copies and is a little more cache friendly. Change-Id: I031d376284c6eb98a38ad5595b797f048a6cfc0d	2013-09-11 13:45:44 -04:00
Yunqing Wang	939791a129	Modify encode breakout for static frames Thanks to Paul for the suggestions. While turning on static-thresh for static-image videos, a big jump in bitrate was seen. In this patch, we detect static frames in the video using first-pass stats. For different cases, disable encode breakout or reduce encode breakout threshold to limit the skipping. More modifications need to be done to break the incorrect partition picking pattern for static frames while skipping happens. Change-Id: Ia25f47041af0f04e229c70a0185e12b0ffa6047f	2013-09-10 09:06:03 -07:00
Paul Wilkins	4f660cc018	Modified mode skip functionality. A previous speed feature skipped modes not used in earlier partitions but this no longer worked as intended following changes to the partition coding order and in conjunction with some other speed features (especially speed 2 and above). This modified mode skip feature sets a mask after the first X modes have been tested in each partition depending on the reference frame of the current best case. This patch also makes some changes to the order modes are tested to fit better with this skip functionality. Initial testing suggests speed and rd hit count improvements of up to 20% at speed 1. Quality results. (derf -1.9%, std hd +0.23%). Change-Id: Idd8efa656cbc0c28f06d09690984c1f18b1115e1	2013-09-10 13:30:10 +01:00
Ivan Maltz	20abe595ec	Merge "API extensions and sample app for spatial scalable encoder"	2013-09-09 16:57:01 -07:00
Ivan Maltz	01b35c3c16	API extensions and sample app for spatial scalable encoder Sample app: vp9_spatial_scalable_encoder vpx_codec_control extensions: VP9E_SET_SVC VP9E_SET_WIDTH, VP9E_SET_HEIGHT, VP9E_SET_LAYER VP9E_SET_MIN_Q, VP9E_SET_MAX_Q expanded buffer size for vp9_convolve modified setting of initial width in vp9_onyx_if.c so that layer size can be set prior to initial encode Default number of layers set to 3 (VPX_SS_DEFAULT_LAYERS) Number of layers set explicitly in vpx_codec_enc_cfg.ss_number_layers Change-Id: I2c7a6fe6d665113671337032f7ad032430ac4197	2013-09-09 15:57:56 -07:00
James Zern	54a03e20dd	Revert "New mode_info_context storage" This reverts commit `dae17734ec` Encode crashes, leaks and increases integer overflow errors. Change-Id: I595aa2649bb8d0b6552ff91652837a74c103fda2	2013-09-09 13:37:01 -07:00
Scott LaVarnway	dae17734ec	New mode_info_context storage mode_info_context was stored as a grid of MODE_INFO structs. The grid now consists of a pointer to a MODE_INFO struct and an "in the image" flag. The MODE_INFO structs are now stored as a stream, eliminating unnecessary copies and is a little more cache friendly. For the test clips used, the decoder performance improved by ~4.3% (1080p) and ~9.7% (720p). Patch Set 2: Re-encoded clips with latest. Now ~1.7% (1080p) and 5.9% (720p). Change-Id: I846f29e88610fce2523ca697a9a9ef2a182e9256	2013-09-06 12:33:34 -04:00
Yunqing Wang	0ca7855f67	Use correct bit cost while static-thresh is on While static-thresh is on, we only need to transmit skip flag if skip = 1. The cost of skip bit is added to the total rate cost. Change-Id: I64e73e482bc297eba22907026298a15fa8cc3920	2013-08-30 15:25:13 -07:00
Paul Wilkins	1f4bf79d65	Added per pixel inter rd hit count stats Added some code to output normalized rd hit count stats. In effect this approximates to the average number of rd operations/tests per pixel for the sequence. The results are not quite accurate and I have not bothered to account for partial SB64s at frame edges and for key frames. However they do give some idea of the number of modes / prediction methods being tested for each pixel across the different partition sizes. This indicates how much scope there is for further gains either by reducing the number of partitions examined or the modes per partition through heuristics. Patch 3 moved place where count incremented so partial rd tests that are aborted with INT_MAX return are also counted. Example numbers for first 50 frames of Akiyo. Speed 0 ~84.4 rd operations / pixel Speed 1 ~28.8 Speed 2 ~11.9 Change-Id: Ib956e787e12f7fa8b12d3a1a2f6cda19a65a6cb8	2013-08-30 00:13:51 +01:00
Yaowu Xu	ee961599e1	Merge "Fixed potential overflows"	2013-08-29 15:43:26 -07:00
Dmitry Kovalev	e80bf802a9	Merge "Renaming txfm_size to tx_size."	2013-08-29 12:30:18 -07:00
Yaowu Xu	aaa7b44460	Fixed potential overflows The two arrays are typically initialized to INT64_MAX, if they are not filled with valid values before the addition, the values can overflow and lead to wrong results. Change-Id: I515de22cf3e8f55af4b74bdb2c8eb821a02d3059	2013-08-29 10:26:52 -07:00
Dmitry Kovalev	b62ddd5f8b	General code cleanup. Switching from mi_{width, height}_log2 and b_{width, height}_log2 to num_8x8_blocks_{wide, high} and num_4x4_blocks_{wide, high}. Removing redundant code, adding const. Change-Id: Iaab2207590fd24d0b76999071778d1395dc5cd5d	2013-08-28 12:22:37 -07:00
Dmitry Kovalev	851a2fd72c	Renaming txfm_size to tx_size. Change-Id: I752e374867d459960995b24d197301d65ad535e3	2013-08-27 19:47:53 -07:00
Jingning Han	eb7acb5524	Merge "Fix buf alignment in sub8x8 comp inter-inter pred"	2013-08-27 19:03:12 -07:00
Dmitry Kovalev	7b95f9bf39	Renaming BLOCK_SIZE_TYPE to BLOCK_SIZE in the encoder. Change-Id: I62bb07c377f947cb72fac68add7a6b199e42c6b9	2013-08-27 11:05:08 -07:00
Dmitry Kovalev	f389ca2acc	Merge "Cleaning up model_rd_for_sb_y_tx."	2013-08-27 10:17:10 -07:00
Dmitry Kovalev	78e670fcf8	Merge "Renaming D27 to D207."	2013-08-27 10:03:57 -07:00
Jingning Han	2d6aadd7e2	Fix buf alignment in sub8x8 comp inter-inter pred This commit resolved a mis-alignment issue in compound inter-inter prediction of sub8x8. This patch follows solution from dkovalev@. Change-Id: I3cc0cf7e55b84110e0c42ef4b2e6ca7ac3f8f932	2013-08-27 09:28:05 -07:00
Dmitry Kovalev	657ee2d719	Cleaning up model_rd_for_sb_y_tx. Removing references to plane_block_width and plane_block_height (we are going to delete the latter ones). Change-Id: I7982da4d373aebb54d2209dc8886f6192df4d287	2013-08-26 16:18:28 -07:00
Paul Wilkins	aa823f8667	Merge "Changes to adaptive inter rd thresholds."	2013-08-26 12:48:11 -07:00
Paul Wilkins	642696b678	Merge "Limit Key frame Intra modes checks."	2013-08-26 12:34:56 -07:00
James Zern	c8ba8c513c	cosmetics: strip 'VP9_' from defines in vp9 only code Change-Id: I481d9bb2fa3ec72b6a83d5f04d545ad8013f295c	2013-08-23 19:16:49 -07:00
Dmitry Kovalev	50ee61db4c	Renaming D27 to D207. I've already renamed d27_predictor to d207_predictor but forgot about the corresponding constant. Change-Id: Id312aa80fc5b5a1ab8a709a33418a029552a6857	2013-08-23 17:33:48 -07:00
Dmitry Kovalev	21d8e8590b	Cleanup in mvref_common.{h, c}. Making code more compact, adding consts, removing redundant arguments, adding do/while(0) for macros. Change-Id: Ic9ec0bc58cee0910a5450b7fb8cfbf35fa9d0d16	2013-08-23 12:00:30 -07:00
Paul Wilkins	aa5b67add0	Changes to adaptive inter rd thresholds. Values now carried over frame to frame. Change to algorithm for decreasing threshold after a hit and to max threshold (now based on speed) Removed some old commented out code relating to VP8 adaptive thresholds. The impact of these changes tested on Akiyo (50 frames) and measured in terms of unit rd hits is as follows: Speed 0 84.36 -> 84.67 Speed 1 29.48 -> 22.22 Speed 2 11.76 -> 8.21 Speed 3 12.32 -> 7.21 Encode speed impact is broadly in line with these. Change-Id: I5b886efee3077a11553fa950d796fd6d00c8cb19	2013-08-23 16:18:45 +01:00
Paul Wilkins	f76f52df61	Limit Key frame Intra modes checks. Most of the focus so far has been on inter frames. At high speed settings the key frame is now taking a high % of the cycles. This patch puts in some masking to reduce the number of INTRA modes searched during key frame coding (as already happens for inter frames) at higher speed settings TODO: Develop this further with either adaptive rd thresholds when choosing which intra modes to consider or some other heuristic. Impact. At high speed settings on some clips the key frame was starting to dominate. In a coding of the first 50 frames of AKIYO at speed 2 limiting the key frame intra modes to DC or TM_PRED resulted in ~30% overall speedup. For Bus the number was lower at ~4-5%. Change-Id: I7bde68aee04995f9d9beb13a1902143112e341e2	2013-08-23 16:10:30 +01:00
Dmitry Kovalev	640dea4d9d	Adding vp9_is_scaled function. Change-Id: Ieb7077ca3586b9491912027eed450a4f6fd38d30	2013-08-22 14:04:59 -07:00
Jingning Han	fcb890d751	Merge "Enable zero coeff check in sub8x8 UV rd loop"	2013-08-21 22:07:00 -07:00
Dmitry Kovalev	2f1a0a0e2c	Removing PLANE_TYPE argument from cost_coeffs function. We can determine plane_type from the other function arguments. Change-Id: I85331877aedb357632ae916a37b5b15f22c0bb1f	2013-08-21 13:02:28 -07:00
Adrian Grange	ce28d0ca89	Fix typos and minor stylistic cleanup Change-Id: I32e43474e8651ef2eb181d24860a8f118cfea7bf	2013-08-21 08:45:42 -07:00
Dmitry Kovalev	7f814c6bf8	Merge "Passing plane_bsize to foreach_transformed_block_visitor."	2013-08-20 14:25:01 -07:00
Jingning Han	1bf1428654	Enable zero coeff check in sub8x8 UV rd loop Check the minimum rate-distortion cost of regular quantization and all zero coeffs cases in the sub8x8 inter prediction rd loop for luma components. Use this as the cumulative rdcost sent to UV rd estimation. Change-Id: Ia4bc7700437d5e13d7cdad4cf9ae57ab036d3e97	2013-08-20 10:33:42 -07:00
Deb Mukherjee	2ffe64ad5c	Cleanup/enhancements of switchable filter search Cleans up the switchable filter search logic. Also adds a speed feature - a variance threshold - to disable filter search if source variance is lower than this value. Results: derfraw300 threshold = 16, psnr -0.238%, 4-5% speedup (tested on football) threshold = 32, psnr -0.381%, 8-9% speedup (tested on football) threshold = 64, psnr -0.611%, 12-13% speedup (tested on football) threshold = 96, psnr -0.804%, 16-17% speedup (tested on football) Based on these results, the threshold is chosen as 16 for speed 1, 32 for speed 2, 64 for speed 3 and 96 for speed 4. Change-Id: Ib630d39192773b1983d3d349b97973768e170c04	2013-08-20 09:47:04 -07:00
Jingning Han	3275ad701a	Enable early termination in uv rd loop This commit enables early termination in the rate-distortion optimization search loop for chroma components. When the cumulative rd cost is above the current best value, skip the rest per-block transform/quantization/coeff_cost and continue to the next prediction mode. For bus_cif at 2000 kbps, the average run-time goes down from 168546ms -> 164678ms, (2% speed-up) at speed 0 36197ms -> 34465ms, (4% speed-up) at speed 1 Change-Id: I9d3043864126e62bd0166250d66b3170d520b3c0	2013-08-19 16:31:19 -07:00
Dmitry Kovalev	82d4d9a008	Passing plane_bsize to foreach_transformed_block_visitor. Updating all foreach_transformed_block_visitor functions to work with plane block size instead of general block. Removing a lot of duplicated code. Change-Id: I6a9069e27528c611f5a648e1da0c5a5fd17f1bb4	2013-08-19 15:47:24 -07:00
Jingning Han	31c97c2bdf	Merge "Fix potential use of uninitialized value"	2013-08-19 15:15:58 -07:00
Jingning Han	5dc0b309ab	Merge "Fix the returned distortion value in rd_pick_intra"	2013-08-19 14:34:19 -07:00
Dmitry Kovalev	2e3478a593	Using plane_bsize instead of bsize. This change set is intermediate. The next one will remove all repetitive plane_bsize calculations, because it will be passed as argument to foreach_transformed_block_visitor. Change-Id: Ifc12e0b330e017c6851a28746b3a5460b9bf7f0b	2013-08-19 13:20:21 -07:00
Jingning Han	b34ce04378	Fix potential use of uninitialized value Initialize the best mode and tx_size values in the rate-distortion optimization search loop. Change-Id: Ibfb5c0895691f172abcd4265c23aef4cb99fa8af	2013-08-19 11:15:53 -07:00
Jingning Han	f67919ae86	Fix the returned distortion value in rd_pick_intra Return the distortion value in vp9_rd_pick_intra_mode_sb as sum of dist_y and dist_uv. Remove the right shift operation on dist_uv, and make it consistent with that of vp9_rd_pick_inter_mode_sb. Change-Id: I9d564e242d9add38e32595d33b0e0dddb1d55e5b	2013-08-16 21:23:22 -07:00
Dmitry Kovalev	26e5b5e25d	Removing unused or redundant arguments from *_args structures. Redundant dst, pre[2] from build_inter_predictors_args, unused cm from encode_b_args. Change-Id: I2c476cd328c5c0cca4c78ba451ca6ba2a2c37e2d	2013-08-16 12:51:20 -07:00
Dmitry Kovalev	367cb10fcf	Merge "Moving from ss_txfrm_size to tx_size."	2013-08-16 12:46:45 -07:00
Adrian Grange	79f4c1b9a4	Fixed typos and formatting Change-Id: I3814984a624bc64147c57efa74fbdda8eda47262	2013-08-16 09:15:26 -07:00
Dmitry Kovalev	afd9bd3e3c	Moving from ss_txfrm_size to tx_size. Updating foreach_transformed_block_visitor and corresponding functions to accept tx_size instead of ss_txfrm_size. List of functions per file: vp9_decodframe.c decode_block decode_block_intra vp9_detokenize.c decode_block vp9_encodemb.c optimize_block vp9_xform_quant vp9_encode_block_intra vp9_rdopt.c dist_block rate_block block_yrd_txfm vp9_tokenize.c set_entropy_context_b tokenize_b is_skippable Change-Id: I351bf563eb36cf34db71c3f06b9bbc9a61b55b73	2013-08-15 17:03:03 -07:00
Jingning Han	5e80a49307	Merge "Refactor rd loop for chroma components"	2013-08-15 16:02:12 -07:00
Dmitry Kovalev	9451e8d37e	Merge "Converting code from using ss_txfrm_size to tx_size."	2013-08-15 15:21:09 -07:00
Dmitry Kovalev	939b1e4a8c	Merge "Moving segmentation struct from MACROBLOCKD to VP9_COMMON."	2013-08-15 15:14:32 -07:00
Jingning Han	68369ca897	Refactor rd loop for chroma components This commit makes the rate-distortion optimization search of chroma components consistent across all block sizes. It removes redundant code. Change-Id: I7e76f54d045e8efdd41d84a164c71f55b484471b	2013-08-15 14:54:48 -07:00
Jingning Han	ca983f34f7	Merge "Unify luma and chroma rd-cost estimation"	2013-08-15 13:48:15 -07:00
Dmitry Kovalev	bb3b817c1e	Converting code from using ss_txfrm_size to tx_size. Updated function signatures: txfrm_block_to_raster_block txfrm_block_to_raster_xy extend_for_intra vp9_optimize_b Change-Id: I7213f4c4b1b9ec802f90621d5ba61d5e4dac5e0a	2013-08-15 11:44:57 -07:00
Dmitry Kovalev	6f4fa44c42	Using { 0 } for initialization instead of memset. Change-Id: I4fad357465022d14bfc7e13b348c6da267587314	2013-08-15 11:37:56 -07:00
Dmitry Kovalev	b7616e387e	Moving segmentation struct from MACROBLOCKD to VP9_COMMON. VP9_COMMON is the right place for the segmentation struct because it has global segmentation parameters, not something specific to macroblock processing. Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708	2013-08-15 10:47:48 -07:00
Jingning Han	ec01f52ffa	Unify luma and chroma rd-cost estimation This commit unifies the rate-distortion cost calculation process of luma and chroma components. It allows early termination to be enabled later in the rd search loop of chroma components, consistent with luma pixels. Change-Id: I2e52a7c6496176bf2a5e3ef338d34ceb8aad9b3d	2013-08-15 09:41:33 -07:00
Paul Wilkins	26fead7ecf	Renaming in MB_MODE_INFO The macro block mode info context originally contained an entry for each 16x16 macroblock. In VP9 each entry refers to an 8x8 region not a macro block, so the naming is misleading. This first stage clean up changes the names of 3 entries in the structure to remove the mb_ prefix. TODO clean up the nomenclature more widely in respect of mbmi and bmi. Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6	2013-08-14 12:47:52 +01:00
Jingning Han	7e0f88b6be	Use lookup table to find largest txfm size Refactor choose_largest_txfm_size_ and make it find the largest transform size via lookup table. Change-Id: I685e0396d71111b599d5367ab1b9c934bd5490c8	2013-08-13 10:32:14 -07:00
Jingning Han	dc70fbe42d	Merge "Refactor model based tx search in super_block_yrd"	2013-08-13 08:48:49 -07:00
Jingning Han	78136edcdc	SSE2 high precision 32x32 forward DCT Enable SSE2 implementation of high precision 32x32 forward DCT. The intermediate stacks are of 32-bits. The run-time goes down from 32126 cycles to 13442 cycles. Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56	2013-08-12 16:52:53 -07:00
Jingning Han	14cc7b319f	Refactor model based tx search in super_block_yrd Remove unnecessary conditional branches in model-based transform size search. Change-Id: Ic862dc33ed6710a186f6248239dd5f09b5c19981	2013-08-12 16:34:48 -07:00
Dmitry Kovalev	1aedfc992a	Using MV* instead of int_mv* as argument of vp9_clamp_mv_min_max. Change-Id: I3c45916a9059f11b41e9d798e34ffee052969a44	2013-08-12 13:56:04 -07:00
Dmitry Kovalev	3c43ec206c	Renaming BLOCK_SIZE_TYPES constant to BLOCK_SIZES. There will be another change set to rename BLOCK_SIZE_TYPE enum to BLOCK_SIZE. Change-Id: I8d1dfc873d6186fa5e554262f5169e929978085e	2013-08-09 17:47:32 -07:00
Dmitry Kovalev	e7c5ca8983	Merge "Inlining 16 as a stride for BLOCK_OFFSET macro."	2013-08-09 17:22:46 -07:00
James Zern	ef101af8ae	Merge "vp9_rd_pick_inter_mode_sb: fix uninitialized value"	2013-08-09 17:13:32 -07:00
Dmitry Kovalev	f1559bdeaf	Inlining 16 as a stride for BLOCK_OFFSET macro. Change-Id: I7f23d174eb089e5500f268a10db09648634c1b82	2013-08-09 16:40:05 -07:00
James Zern	f295774d43	vp9_rd_pick_inter_mode_sb: fix uninitialized value 'skippable' can remain unset and negatively affect later decisions. Addresses one aspect of issue #599. Change-Id: Iffdf0ac2e49ac481c27dc27c87fa546d4167bb28	2013-08-09 16:26:22 -07:00
Deb Mukherjee	2158909fc3	Merge "Adds a new subpel motion function"	2013-08-08 12:26:55 -07:00
Deb Mukherjee	1ba91a84ad	Adds a new subpel motion function Adds a new subpel motion estimation function that uses a 2-level tree-structured decision tree to eliminate redundant computations. It searches fewer points than iterative search (which can search the same point multiple times) but has the same quality roughly. This is made the default setting at speeds 0 and 1, while at speed 2 and above only a 1-level search is used. Also includes various cleanups for consistency and redundancy removal. Results: derf: +0.012% psnr stdhd: +0.09% psnr Speedup of about 2-3% Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9	2013-08-08 11:41:49 -07:00
Dmitry Kovalev	8db2675b97	Adding ss_size_lookup table. Removing the old one bsize_from_dim_lookup. Now we have a way to determine block size for plane using its subsampling values (ss_size_lookup). And then we can find the number of pixels in the block (num_pels_log2_lookup). Change-Id: I6fc981da2ae093de81741d3d78eaefed11015db9	2013-08-07 15:33:17 -07:00
Deb Mukherjee	71b43b0ff0	Clean ups of the subpel search functions Removes some unused code and speed features, and organizes the interfaces for fractional mv step functions for use in new speed features to come. In the process a new speed feature - number of iterations per step during the subpel search - is exposed. No change when this parameter is set as the original value of 3. Results: subpel_iters_per_step = 3: baseline subpel_iters_per_step = 2: psnr -0.067%, 1% speedup subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8	2013-08-06 17:23:50 -07:00
Deb Mukherjee	fac7c8c9f9	Merge "Flexible support for various pattern searches"	2013-08-06 14:03:27 -07:00
Deb Mukherjee	15b5a6a2c7	Flexible support for various pattern searches Adds a few pattern searches to achieve various tradeoffs between motion estimation complexity and performance. The search framework is unified across these searches so that a common pattern search function is used for all. Besides, it will be easier to experiment with various patterns or combinations thereof at different scales in the future. The new pattern search is multi-scale and is capable of using different patterns at different scales. The new hex search uses 8 points at the smallest scale and 6 points at other scales. Two other pattern searches - big-diamond and square are also added. Big diamond uses 4 points at the smallest scale and 8 points in diamond shape at the larger scales. Square is very similar conceptually to the default n-step search but is somewhat faster since it keeps only one survivor across all scales. Psnr/speed-up results on derf300: hex: -1.6% psnr, 6-8% speed-up big-diamond: -0.96% psnr, 4-5% speedup square: -0.93% psnr, 4-5% speedup Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2	2013-08-06 11:56:39 -07:00
Dmitry Kovalev	0c80065694	Inlining vp9_get_pred_probs_switchable_interp function. There was no benefit having this function. For example, inside read_switchable_filter_type switchable filter context was calculated twice. Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1	2013-08-06 11:04:31 -07:00
Dmitry Kovalev	3e51acafec	Merge "Finally removing all old block size constants."	2013-08-06 10:30:37 -07:00
Dmitry Kovalev	4a692e4168	Merge "Changing the order switchable filter enum constants."	2013-08-06 10:30:26 -07:00
Dmitry Kovalev	25b7dc08cd	Merge "Removing unused functions."	2013-08-06 10:29:57 -07:00
Deb Mukherjee	33afddadb9	Merge "Add variance based mode/skipping"	2013-08-06 10:19:15 -07:00
Dmitry Kovalev	b9c7d04e95	Finally removing all old block size constants. Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d	2013-08-05 15:23:49 -07:00
Deb Mukherjee	8b3faccb9e	Add variance based mode/skipping Adds a speed feature to skip all intra modes other than DC_PRED if the source variance is small. This feature is made part of speed 1 and up. Results on derf300: psnr -0.07%, speedup about 1-2% Also uses the source variance to fine-tune the early termination criteria when FLAG_EARLY_TERMINATE is on. This feature is made part of speed 2 and up. Results on derf300: psnr -0.52%, speedup about 5-7% Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232	2013-08-05 14:14:01 -07:00
Jim Bankoski	9f988a2edf	Merge "cleanups after bw bh code"	2013-08-05 14:02:02 -07:00
Dmitry Kovalev	3f611555d7	Changing the order switchable filter enum constants. This changeset allows us to remove the vp9_switchable_interp and vp9_switchable_interp_map arrays and makes the code much clearer. We still have to use these mappings, but only inside the read_interp_filter_type and write_interp_filter_type functions. Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50	2013-08-05 12:26:15 -07:00
Jim Bankoski	5d2cb7ead0	cleanups after bw bh code Cons bw/bh parms that should have been const. Additional formatting. Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16	2013-08-05 12:15:52 -07:00
Dmitry Kovalev	d007446b3f	Replacing long block size enum values with shorter ones (2). Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b	2013-08-05 10:51:02 -07:00
Dmitry Kovalev	fe2a201eb1	Replacing "txfm" with "tx" in identifiers. Consistent names with TX_SIZE, TX_MODE, and TX_TYPE. Change-Id: I79592218bf5a40ace89197a34a06ee7de581ed8d	2013-08-02 17:28:23 -07:00
Dmitry Kovalev	fec4ec4edd	Removing unused functions. Removed functions: model_rd_for_sb_y, block_error_sby, get_sb_variance Change-Id: Iec458df180caf6f8eac3605773841a4121dd3a8f	2013-08-02 16:41:09 -07:00
Dmitry Kovalev	25b77e2569	Changing function arg type from int_mv* to MV*. Change-Id: Ic878d31df2ce783a2c9a8c4bc9ed301ec8ffe25e	2013-08-02 15:26:32 -07:00
Adrian Grange	60ff123536	Merge "Fixed typos and added a few explanatory comments"	2013-08-02 11:37:47 -07:00
Adrian Grange	075b11f004	Merge "Changed name of rd_pick_intra4x4mby_modes"	2013-08-02 11:36:46 -07:00
Dmitry Kovalev	741537f3ce	Cleanup: replacing xd->seg with seg, and xd->lf with lf. Change-Id: I73b59d7699a8e7e7acd3bf8041cb6c98ce9ba4bf	2013-08-01 15:38:16 -07:00
Dmitry Kovalev	ce8dedc353	Cleanup: removing unused function arguments. Change-Id: I27471768980fc631916069f24bc7c482a5c9ca17	2013-08-01 13:41:38 -07:00
Dmitry Kovalev	b621e2d72e	Nice looking motion vector clamping functions. Removing assign_and_clamp_mv function, making implementation of clamp_mv and clamp_mv2 more clear and consistent. Change-Id: Iecd08e1c1bf0379f8314ebe01811f8253f4ade58	2013-08-01 13:40:26 -07:00
Adrian Grange	89e73c63c0	Fixed typos and added a few explanatory comments Change-Id: Ib4e4b41094b54874ee34343dd77c0c131ceed9d2	2013-08-01 09:23:49 -07:00
Adrian Grange	5271d47892	Changed name of rd_pick_intra4x4mby_modes The function name rd_pick_intra4x4mby_modes is confusing, so I changed it to rd_pick_intra_sub_8x8_y_modes to better reflect what the function does. Also added const qualifiers to some of the input parameters and removed camel-case. Change-Id: I23d53d4c7af5d79ed8a471acd59a09bbb47add39	2013-08-01 09:23:49 -07:00
Dmitry Kovalev	9239e96536	Removing get_mi_{row, col} functions. Passing mi_row and mi_col parameters to functions explicitly. Removing unused xd argument from scale_mv function. Change-Id: Icb4c495ec72d26fb066c14470d3ae0b741fbf18a	2013-07-31 14:06:55 -07:00
Dmitry Kovalev	500ade243a	Removing unused "ishp" arguments. Using different variable names "allow_hp" and "use_hp" instead of "usehp". Change-Id: I0cd5996ddeb46bd754473b680a993c0aaf8eb879	2013-07-31 11:27:53 -07:00
Adrian Grange	fbd73648dd	Merge "Cleanup typos, remove unnecessary lines, replace switch"	2013-07-30 12:59:46 -07:00
Adrian Grange	b30a06b930	Cleanup typos, remove unnecessary lines, replace switch Removed unnecessary code lines, replaced switch with an if, fixed spelling errors and formatting. Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6	2013-07-30 12:10:32 -07:00
Dmitry Kovalev	730a34416f	Renaming NB_TXFM_MODES constant to TX_MODES. Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039	2013-07-29 13:38:40 -07:00
Dmitry Kovalev	23391ea835	Renaming TX_SIZE_MAX_SB to TX_SIZES. Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107	2013-07-29 12:25:34 -07:00
Ronald S. Bultje	118ccdcd30	Inverse dimension order in token_cost array. This allows us to increment the position at the band-level only as we go from one band to the next; more importantly, that allows us to use an add instead of multiply instruction, and omit the instruction altogether if the band doesn't change from one coef to the next, thus being slightly faster (probably more noticeable on systems where a multiply is expensive, like arm). Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381	2013-07-26 17:30:04 -07:00
Ronald S. Bultje	dcacce6dd9	Merge "Save pixels instead of coefficients in intra4x4 RD loop."	2013-07-26 17:20:58 -07:00
Ronald S. Bultje	d30c8f41ef	Merge "Add best_rd breakout in intra4x4 RD loop."	2013-07-26 17:20:51 -07:00
Dmitry Kovalev	c09b81719f	Merge "General cleanups."	2013-07-26 13:59:39 -07:00
Yunqing Wang	52256cdbca	Modify static threshold calculation Used 3 * standard_deviation in internal threshold calculation instead of fit curve. This actually approached the algorithm better. For comparison, similar tests were done: The overall psnr loss is less than before. 1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%; 2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%; Similar speedup is achieved. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.952 5.077s(50f) akiyo 500 500 48.866 4.169s(50f) parkjoy(1080p) 4000 0 30.388 78.20s(30f) parkjoy 4000 500 30.367 70.85s(30f) sunflower(1080p) 4000 0 44.402 74.55s(30f) sunflower 4000 500 44.414 68.69s(30f) Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3	2013-07-25 19:59:33 -07:00
Yunqing Wang	845fd5011c	Merge "Add encoding option --static-thresh"	2013-07-25 14:58:00 -07:00
Yunqing Wang	d36852b702	Add encoding option --static-thresh This option exists in VP8, and it was rewritten in VP9 to support skipping on different partition levels. After prediction is done, we can check if the residuals in the partition block will be all quantized to 0. If this is true, the skip flag is set, and only prediction data are needed in reconstruction. Based on DCT's energy conservation property, the skipping check can be estimated in spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values, and also adjusted by partition sizes. To be precise, the DC and AC parts for Y, U, and V planes are checked to decide skipping or not. Tests showed that 1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%; 2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%; For different clips, the encoding speedup ranges from several percent to 20+% when static-thresh <= 500. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.923 5.635s(50f) akiyo 500 500 48.863 4.402s(50f) parkjoy(1080p) 4000 0 30.380 77.54s(30f) parkjoy 4000 500 30.384 69.59s(30f) sunflower(1080p) 4000 0 44.461 85.2s(30f) sunflower 4000 500 44.418 78.1s(30f) Higher static-thresh values give larger speedup with larger quality loss. Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53	2013-07-25 14:28:05 -07:00
Dmitry Kovalev	7131cb0e3d	General cleanups. Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from .h to .c. Merging for loops for motion vectors. Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13	2013-07-25 14:13:48 -07:00
Adrian Grange	e862c6f9eb	Merge "Simplify handling of sub-partition motion vectors"	2013-07-25 12:58:38 -07:00
Adrian Grange	6f0f0e4907	Merge "Use local variables rather than structure members"	2013-07-25 12:57:52 -07:00
Adrian Grange	be700e140a	Simplify handling of sub-partition motion vectors Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition. Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9	2013-07-25 11:51:51 -07:00
Dmitry Kovalev	fcc34796d2	Removing CONFIG_BALANCED_COEFTREE experiment. Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4	2013-07-24 15:53:42 -07:00
Dmitry Kovalev	9139ee0908	Adding condition inside get_tx_type_{4x4, 8x8, 16x16}. Adding plane type check condition because it was always used outside of get_tx_type_{4x4, 8x8, 16x16}. Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230	2013-07-24 12:55:45 -07:00
Adrian Grange	4cfd36d8fd	Use local variables rather than structure members Although local copies of the mode member variables (mode, ref_frame) were made, they were not used in all places. Also, made a local copy of the second_ref_frame member. Change-Id: I84d8c822e5cb3d8a02fc3de8a4037ca3fea8bfad	2013-07-24 11:17:44 -07:00
Ronald S. Bultje	7817d3221f	Save pixels instead of coefficients in intra4x4 RD loop. Prevents doing duplicate IDCTs; encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.0 to 1min3.5, i.e. 0.87% faster overall. Change-Id: I2df39e29ed9d5ea5e7d2704a34940ba622832ddd	2013-07-24 09:03:20 -07:00
Ronald S. Bultje	b72ecbb1b9	Add best_rd breakout in intra4x4 RD loop. Encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min5.4 to 1min4.0, i.e. 2.2% faster overall. Change-Id: I8c32f2aff9a649ce7dd49d910dc5ba16b99c3bc6	2013-07-24 09:02:05 -07:00
Ronald S. Bultje	47336afd8d	Merge "More optimizations for cost_coeffs()."	2013-07-23 21:36:12 -07:00
Dmitry Kovalev	db7f5d28b9	Removing vp9_is_interpolating_filter array. All filters are interpolating now, so we don't need this array, all values from this array are evaluated to true. Change-Id: I9af6d8219ae0eb984063cd15e4e2296374ae4961	2013-07-23 14:24:39 -07:00
Dmitry Kovalev	2855d8aea1	Merge "Adding update_tx_counts function."	2013-07-23 13:57:59 -07:00
James Zern	8dede954c7	Merge "vp9: make some static tables const"	2013-07-23 11:37:01 -07:00
Jim Bankoski	86a9dec73c	clean up bw, bh many structures use bw and bh and they have different meanings. This cl attempts to start this clean up and remove unnecessary 2 step look up log and then shift operations... also removed partition type multiple operation code in bitstream.c. Change-Id: I7e03e552bdfc0939738e430862e3073d30fdd5db	2013-07-23 06:51:44 -07:00
Paul Wilkins	7c134bc0cd	Merge "Reworked the auto_mv_step_size speed feature"	2013-07-23 04:49:55 -07:00
James Zern	3c8cce353f	vp9: make some static tables const Change-Id: I8bcae51271673da8755c66a51aea005dfe6a3739	2013-07-22 19:19:13 -07:00
Ronald S. Bultje	e20fcd9585	More optimizations for cost_coeffs(). 4x4: 163 -> 123 cycles (33% faster) 8x8: 491 -> 399 cycles (23% faster) 16x16: 1889 -> 1763 cycles (7% faster) 32x32: 8311 -> 8180 cycles (1.6% faster) Overall encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.33 to 1min3.00, i.e. 2.11% faster. Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f	2013-07-22 16:09:09 -07:00
Dmitry Kovalev	b2fc6fa969	Adding update_tx_counts function. Moving common encoder/decoder code to update_tx_counts. Also renaming vp9_get_pred_probs_tx_size to get_tx_probs2 and adding get_tx_probs to call vp9_get_pred_context_tx_size inside read_selected_tx_size only once (twice before). Change-Id: Ia50247f3893de88ef8e9041b0d44be44a40aaa4d	2013-07-22 14:57:43 -07:00
Yaowu Xu	fc186dcad6	fix a build error Change-Id: I3b05687f439ff6a7c426d2c97a6c58c831fa51ac	2013-07-22 12:37:30 -07:00
Jingning Han	416f315e82	Merge "Skip buffer update in sub8x8 rd loop"	2013-07-22 12:08:22 -07:00
Jingning Han	a5a9f5f7f3	Merge "Optimize operation flow in sub8x8 rd loop"	2013-07-22 12:08:15 -07:00
Jingning Han	409e77f2d4	Optimize operation flow in sub8x8 rd loop Stack the rate-distortion statistics in the sub8x8 rd loop. This allows the encoder to skip the forward transform, quantization, and coeff cost estimation, in the sub8x8 rd optimization search, if the motion vector(s) are of integer pixel value, and have been tested in the previous prediction filter type rd loops of the same block. This gives about 2% speed-up for bus_cif at 2000 kbps, for speed 0. Its efficacy depends on how frequently the motion search will select an integer motion vector. Change-Id: Iee15d4283ad4adea05522c1d40b198b127e6dd97	2013-07-22 10:40:33 -07:00
Paul Wilkins	1d189d6464	Re-order mode search in rd. Mode search order in rd loop changed to better reflect observed hit counts. Also some adjustment of the baseline mode rd thresholds to reflect the order change and observed frequencies. Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c	2013-07-22 17:21:12 +01:00
Dmitry Kovalev	39342db138	Merge "Consistent names for inter mode probabilities and encodings."	2013-07-20 22:40:51 -07:00
Dmitry Kovalev	f66821afbb	Merge "Removing frame_type field from MACROBLOCKD struct."	2013-07-20 22:40:06 -07:00
Jingning Han	c725502bf3	Skip buffer update in sub8x8 rd loop This commit allows the encoder to skip a few buffer update steps in rd_pick_best_mbsegmentation, when early breakout has been triggered in the rd_check_segment_txsize. It provides about 1% speed-up for bus_cif at 2000 kbps, in the settings of speed 0. Change-Id: Ica034f10a24dec572b397d8389a2b81020ebc0b9	2013-07-20 21:38:12 -07:00
Deb Mukherjee	302698fb12	Reworked the auto_mv_step_size speed feature This patch modifies the auto_mv_step_size speed feature to use a combination of the maximum magnitude mv from the last inter frame, and the maximum magnitude mv for the two reference mvs with the same reference. For arf frames, the max mv step for the resolution is used. The bounds therefore are slightly tighter. The feature is made a speed 1 feature. Rebased. Results (when this feature is turned on over speed 0): derfraw300: -0.046% psnr, about 5+% speedup (tested on football: goes from 4m30.760s to 4m17.410s). Change-Id: If492797a61b0b4b3e58c0b8f86afb880165fc9f6	2013-07-19 15:12:56 -07:00
Dmitry Kovalev	e71a4a77bb	Merge "Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE)."	2013-07-19 12:14:32 -07:00
Dmitry Kovalev	97e96bc4e9	Removing frame_type field from MACROBLOCKD struct. Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119	2013-07-19 11:55:36 -07:00
Dmitry Kovalev	c0eb57406c	Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE). Moving TX_MODE enum to vp9_enums.h. Renaming txfm_mode variables to tx_mode. Change-Id: I459d1af6dd928ce7fccdf8ce30b6f1ca057bef92	2013-07-19 11:37:13 -07:00
Dmitry Kovalev	afe43d4089	Removing redundant VP9_COMMON* from function signatures. Functions: vp9_get_pred_context_switchable_interp, vp9_get_pred_context_intra_inter, vp9_get_pred_context_single_ref_p1, vp9_get_pred_context_single_ref_p2. Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96	2013-07-19 11:20:49 -07:00
Dmitry Kovalev	bc7acb134b	Consistent names for inter mode probabilities and encodings. Renaming vp9_sb_mv_ref_tree to vp9_inter_mode_tree, and vp9_sb_mv_ref_encoding_array to vp9_inter_mode_encodings. Change-Id: I0e91fbf81350d3ec5a2599064c74089b5d06133a	2013-07-19 10:40:04 -07:00
Yaowu Xu	37d901a47a	Merge "Add best_rd breakout to keyframe partition selection also."	2013-07-18 17:50:39 -07:00
Yaowu Xu	67fb0679ee	Merge "Merge scale_factors and scale_factors_uv."	2013-07-18 17:50:34 -07:00
Yaowu Xu	55b52e32da	Merge "Do in-place UV intra mode selection."	2013-07-18 17:50:07 -07:00
Yaowu Xu	51972d1279	Merge "Change break statement in a 2d loop to a return statement."	2013-07-18 17:49:58 -07:00
Dmitry Kovalev	92f4198d52	Merge "Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT)."	2013-07-18 17:29:05 -07:00
Dmitry Kovalev	0b562b2d3d	Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT). Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34	2013-07-18 15:12:46 -07:00
Ronald S. Bultje	96e4db2660	Add best_rd breakout to keyframe partition selection also. Change-Id: I96b8058f6dfecf8aa3e152cdcbfd7e10071fbbc9	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	5ebe503f04	Merge scale_factors and scale_factors_uv. This prevents a duplicate memcpy of a 128-byte struct every time set_scale_factors() is called (which is a lot), thus leading to a decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block RD/partition loop. Overall, this decreases encoding time of the first 50 frames of bus @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5% overall speedup. We can likely get more gains by removing the copy of the other struct (and replacing it with an indexing) as well. Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	df4b4fab26	Do in-place UV intra mode selection. This means we only do UV intra mode selection if we find any intra mode to actually be useful at all; in addition, we only do UV intra mode selection for the transform sizes that were selected, rather than all sizes available in this partition. First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this change. Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	e54a5782b9	Change break statement in a 2d loop to a return statement. The break statement only breaks out of the nested loop, not the top-level loop, so it doesn't always work as intended. Changing it to a return statement does what's intended. Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	2d4929e340	Remove motion vectors from PARTITION_INFO. The same information already exists in union b_mode_info. Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d	2013-07-18 14:10:52 -07:00
Ronald S. Bultje	9da67da04a	Merge "Fix bug where we don't choose any mode in RD selection."	2013-07-18 12:47:50 -07:00
Ronald S. Bultje	247197d57b	Fix bug where we don't choose any mode in RD selection. This could happen during golden overlay frame coding from a previous alt-ref frame if the special overlay code was triggered. Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868	2013-07-18 12:13:15 -07:00
Ronald S. Bultje	4f5815290c	Merge "Fix bug which skips zeromv even if near/nearest is not 0,0."	2013-07-18 10:06:51 -07:00
Ronald S. Bultje	deb7456058	Fix bug which skips zeromv even if near/nearest is not 0,0. Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d	2013-07-18 09:35:19 -07:00
Jingning Han	ced3c20165	Use mv_check_bounds in sub8x8 rd loop Make the use of mv_check_bounds consistent for mvs of both ref_frame[0] and ref_frame[1]. Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211	2013-07-17 17:13:51 -07:00
Ronald S. Bultje	facecd80da	Merge "Add a best_yrd shortcut in splitmv mode search."	2013-07-17 16:11:13 -07:00
Ronald S. Bultje	056111c822	Merge "Skip redundant nearest/near/zero encodes in splitmv."	2013-07-17 16:10:51 -07:00
Ronald S. Bultje	0b1eba25b2	Merge "Skip nearest/near/zero redundant encodes."	2013-07-17 16:10:41 -07:00
Ronald S. Bultje	607424449c	Merge "Best_rd breakout in rd partition search."	2013-07-17 16:10:22 -07:00
Yaowu Xu	6ac5b7db2c	Merge "changed mode checking order"	2013-07-17 14:44:40 -07:00
Dmitry Kovalev	a7a1e96136	Merge changes Ieffea49e,Idf610746 * changes: Removing two unused arguments from vp9_inc_mv signature. Changing signature of vp9_get_pred_probs_tx_size.	2013-07-17 14:44:20 -07:00
Ronald S. Bultje	c6917528a5	Add a best_yrd shortcut in splitmv mode search. Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min6.2 to 1min5.9, i.e. 0.5% faster overall. Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683	2013-07-17 14:21:57 -07:00
Ronald S. Bultje	161c995658	Skip redundant nearest/near/zero encodes in splitmv. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min7.3 to 1min6.2, i.e. 1.7% faster overall. Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987	2013-07-17 13:53:48 -07:00
Yaowu Xu	42facc292d	changed mode checking order Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2	2013-07-17 13:43:50 -07:00
Ronald S. Bultje	8fea880b6f	Skip nearest/near/zero redundant encodes. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8 to 1min7.3, i.e. 8% faster. Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62	2013-07-17 11:33:15 -07:00
Ronald S. Bultje	9f427bfe98	Best_rd breakout in rd partition search. About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which goes from 1min36 to 1min24. Results become slightly better (+0.2% on derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in super_block_yrd(). Overall speed change (on derfraw300) is roughly -13%. This can probably be improved further by caching best_yrd between partition searches. Also, we might be able to get more speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not just at the sb8x8 level. Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b	2013-07-17 09:56:46 -07:00
Ronald S. Bultje	83c7e13a6b	Do a skip-block check for sub8x8 partitions also. +0.2% SSIM and glbPSNR on derfraw300. Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d	2013-07-17 09:46:47 -07:00
Yunqing Wang	df90d58f4f	Speed up motion estimation using small partitions' result(experiment) Current partition checking starts from small sizes, and then goes up to large sizes. This experiment uses the small partitions' motion estimation result, which is already available, to speed up the large partition's motion estimation. We can decide to skip some partition checks if they are unlikely choices. We could use the motion vector (MV) result as the current partition's prediction MV, and limit the search range and reference frame. Current result at speed 1: psnr loss: 1.19% for stdhd, 0.287% for derf. speed gain: 14% for sunflower(hd), 11% for akiyo. Further improvement will be done later. Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab	2013-07-17 09:11:47 -07:00
Paul Wilkins	d66eab15dd	Merge "Move uv intra mode selection in rd loop."	2013-07-17 05:19:26 -07:00
Paul Wilkins	154c34a3ee	Merge "Limit transform sizes searched for uv intra."	2013-07-17 03:40:11 -07:00
Paul Wilkins	2ee338ce3b	Move uv intra mode selection in rd loop. Use an estimate based on DC_PRED for intra uv cost within the rd loop then only do a full uv mode analysis if an intra mode is chosen. Significant speed gains in some cases. Currently only enabled for speed 2 pending speed/quality tests. Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0	2013-07-17 11:11:21 +01:00
Paul Wilkins	6c667f0ffe	Limit transform sizes searched for uv intra. Apply limit if search_method == USE_LARGESTALL to the range of UV tx sizes searched. Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c	2013-07-17 11:08:55 +01:00
Paul Wilkins	5f4722c75f	Merge "Minor cleanup in code to fine uv tx_size."	2013-07-17 02:50:09 -07:00
Jingning Han	a142d6fc93	Skip redundant motion search in 4x4 level rd loop This commit makes the encoder perform motion search only once per reference frame type for each 4x4/4x8/8x4 block. For bus_cif at 2000 kbps, the runtime goes from 253812ms -> 217817ms (14% speed-up) for speed 0. Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92	2013-07-16 17:21:11 -07:00
Dmitry Kovalev	5b65a71cdc	Changing signature of vp9_get_pred_probs_tx_size. Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c	2013-07-16 16:34:54 -07:00
Paul Wilkins	30d2ea45ce	Minor cleanup in code to fine uv tx_size. Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e	2013-07-16 18:27:33 +01:00
Dmitry Kovalev	ca75f1255f	Removing and moving around constant definitions. Removing unused and duplicated constants, moving them from .h to .c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f	2013-07-15 19:26:30 -07:00
Jingning Han	faff6ed0fb	Skip duplicate block encoding in the rd loop This speed feature allows the encoder to largely remove the spatial dependency between blocks inside a 64x64 superblock, thereby removing the need to repeatedly encode superblocks per partition type in the rate-distortion optimization loop. A major challenge lies in the intra modes tested in the rate-distortion optimization loop. The subsequent blocks do not have access to the reconstructed boundary pixels without the intermediate coding steps. This was resolved by using the original pixels for intra prediction in the rd loop, followed by an appropriately designed distortion modeling on the quantization parameters. Experiments also suggested that the performance impact is more discernible at lower bit-rate/psnr settings. Hence a quantizer dependent threshold is applied to deactivate skip of block coding. For bus_cif at 2000 kbps, speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB performance loss. speed 1: runtime 65312ms -> 61536ms, (7% speed-up) at 0.04dB performance loss. This operation is currently turned on in settings of speed 1. Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec	2013-07-15 11:08:58 -07:00
Yaowu Xu	fb754b182f	Fix a build issue Change-Id: I23a75c495ed7ea917d7f312bef0990e20a6b53d9	2013-07-12 11:38:44 -07:00
Deb Mukherjee	94c481f9f1	Some minor cleanups for efficiency Implements some of the helper functions more efficiently with lookups rather than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup. Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f	2013-07-12 10:22:56 -07:00
Ronald S. Bultje	ee09dd9949	Remove unused function block_error(). Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4	2013-07-11 17:14:03 -07:00
Dmitry Kovalev	8c05e59065	Calling is_inter_mode() instead of custom code. Change-Id: Iccd4ab95ea51a6d57ed43947f2fd7ad92e8979cf	2013-07-11 14:14:47 -07:00
Dmitry Kovalev	c4ad3273c7	Moving segmentation related vars into separate struct. Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03	2013-07-11 11:57:57 -07:00
Jingning Han	18803f9cc4	Fix tx_type bug in intra4x4 rd loop This commit fixed the mis-use of the tx_type for inverse transform in intra4x4 rate-distortion optimization loop. It improves the overall coding performance. Change-Id: I7fe9953175b74890357dbcee33c138573766e980	2013-07-10 15:49:49 -07:00
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Yaowu Xu	e52eec490c	Merge "Add a feature to reduce chrome intra mode search"	2013-07-10 11:35:47 -07:00
Ronald S. Bultje	b1df674a99	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83	2013-07-10 09:27:56 -07:00
Yaowu Xu	bed27a960a	Add a feature to reduce chrome intra mode search Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8	2013-07-10 08:59:18 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
Yaowu Xu	205efbc153	Revert "Remove memcpy() in handle_inter_mode() filter selection." This reverts commit `fcf7998a47`. Change-Id: Ic6532223faec9f1483b78adb2e37b79c7b1a0efb	2013-07-09 17:42:10 -07:00
Ronald S. Bultje	204d1b7058	Merge "Unbreak lossless."	2013-07-09 09:54:48 -07:00
Ronald S. Bultje	059c0ba5d4	Unbreak lossless. Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1	2013-07-09 09:46:37 -07:00
Dmitry Kovalev	1c65c580d6	Merge "Refactoring setup_pre_planes function."	2013-07-08 20:08:05 -07:00
Ronald S. Bultje	8fde07a3ae	Don't recalculate mv_ref costs for each block/partition. Changes cost_mv_ref() into doing a LUT into pre-calculated cost arrays instead. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall. Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	fcf7998a47	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	ed995afba1	Make frame-wide filter-type decision fully RD-based. Overall, on all test sets, this gains about +0.2% on all metrics. City is a clip where this really hurts (-1.0% on all metrics), I'm not quite sure why yet. Maybe interesting to look into in the future. Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78	2013-07-08 16:22:37 -07:00
Deb Mukherjee	d9b62160a0	Implements several heuristics to prune mode search Skips mode searches for intra and compound inter modes depending on the best mode so far and the reference frames. The various heuristics to be used are selected by bits from a flag. The previous direction based intra mode search pruning is also absorbed in this framework. Specifically the flags and their impact are: 1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique directional modes and TM_PRED if the best so far is an inter mode) derfraw300: -0.15%, 10% speedup 2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153 mode search if the best so far is not one of the closest hor/vert/diagonal directions) derfraw300: -0.05%, about 9% speedup 3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode search if the best so far is an intra mode) derfraw300: -0.06%, about 7-8% speedup 4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search if the best single ref inter mode does not have the same ref as one of the two references being tested in the compound mode) derfraw300: -0.56%, about 10% speedup Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495	2013-07-08 12:17:12 -07:00
Paul Wilkins	ef0ca2deaa	Merge "Fix to comp_inter_joint_search_thresh feature."	2013-07-04 03:27:00 -07:00
Dmitry Kovalev	f72e072555	Refactoring setup_pre_planes function. Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63	2013-07-03 17:42:01 -07:00
Jingning Han	68172dbede	Merge "Enable early termination in rd search"	2013-07-03 14:20:41 -07:00
Jingning Han	2bd6fe08f8	Enable early termination in rd search This commit allows the encoder to detect the cumulative rate-distortion cost per transformed block inside a partition. If the cumulative rd cost is already above the best rd value, it terminates the remaining operations and continues to the next prediction mode test. It reduces the runtime of bus at target bit-rate 2000 from 308 seconds to 266 seconds, i.e., about 13% speed-up at no performance penalty. Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a	2013-07-03 12:54:18 -07:00
Paul Wilkins	f58b44ad62	Fix to comp_inter_joint_search_thresh feature. When this is 0 (BLOCK_SIZE_AB4X4) we want to do the inter joint search for all sizes. Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88	2013-07-03 16:58:34 +01:00
Paul Wilkins	72c5778ec5	Added two new skip experiments. sf->unused_mode_skip_lvl. Tests modes as normal for all sizes at or below the given level. At larger sizes it skips all modes that were not chosen at any smaller size. Hence setting BLOCK_SIZE_SB64X64 is in effect off. Setting BLOCK_SIZE_AB4X4 will only consider modes that were chosen for one or more 4x4 blocks at larger sizes. sf->reference_masking. Do a test encode of the NONE partition at one size and create a reference frame mask based on the best rd choice. In the full search only allow this reference frame. Currently it is testing 64x64 and repeats this in the full search. This does not work well with Jim's Partition code just now and is disabled by default. Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd	2013-07-03 16:56:06 +01:00
Dmitry Kovalev	be77f6bbbf	Removing redundant struct from union b_mode_info. Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83	2013-07-02 16:51:57 -07:00
Deb Mukherjee	37501d687c	Speed feature to binary search dir intramodes This speed feature will skip searching the directional intra prediction modes D63, D117, D27, D153 if the best intra mode so far is not one of the diagonal, horizontal or vertical directions closest to the respective directions being tested. In other words, this implements a sort of binary search in the angular domain. Speedup: about 9-10% Results: -0.05% only on derfraw300. Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2	2013-07-02 14:07:19 -07:00
Deb Mukherjee	8d3d2b76f3	Tx size selection enhancements (1) Refines the modeling function and uses that to add some speed features. Specifically, instead of using a flag use_largest_txfm as a speed feature, an enum tx_size_search_method is used, of which two of the types are USE_FULL_RD and USE_LARGESTALL. Two other new types are added: USE_LARGESTINTRA (use largest only for intra) USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for inter) (2) Another change is that the framework for deciding transform type is simplified to use a heuristic count based method rather than an rd based method using txfm_cache. In practice the new method is found to work just as well - with derf only -0.01 down. The new method is more compatible with the new framework where certain rd costs are based on full rd and certain others are based on modeled rd or are not computed. In this patch the existing rd based method is still kept for use in the USE_FULL_RD mode. In the other modes, the count based method is used. However the recommendation is to remove it eventually since the benefit is limited, and will remove a lot of complications in the code. (3) Finally a bug is fixed with the existing use_largest_txfm speed feature that causes mismatches when the lossless mode and 4x4 WH transform is forced. Results on derf: USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a pretty good compromise) USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction (currently the benefit of modeling is limited for txfm size selection, but keeping this enum as a placeholder). USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing use_largest_txfm speed feature). Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936	2013-07-02 13:54:00 -07:00
Ronald S. Bultje	3cc6eb7c00	Merge "Make get_coef_context() branchless."	2013-07-02 11:48:15 -07:00
Jingning Han	b91a1586a3	Calculate rd cost per transformed block Compute the rate-distortion cost per transformed block, and accumulate the cost through all blocks inside a partition. This allows the encoder to detect if the cumulative rd cost is already above the best rd cost, thereby enabling early termination in the rate-distortion optimization search. Change-Id: I0a856367a9a7b6dd0b466e7b767f54d5018d09ac	2013-07-02 09:58:46 -07:00
Paul Wilkins	b7cd01ed73	Revert "New motion threshold factor - speed feature." This reverts commit `1377278180`. Also fixes a spelling mistake. Change-Id: I5be8aa4d8d3c0323d4a6f41968a7b2c048949c3f	2013-07-02 15:06:40 +01:00
Ronald S. Bultje	26b6318de8	Make get_coef_context() branchless. This should significantly speed up cost_coeffs(). Basically what the patch does is to make the neighbour arrays padded by one item to prevent an eob check in get_coef_context(), then it populates each col/row scan and left/top edge coefficient with two times the same neighbour - this prevents a single/double context branch in get_coef_context(). Lastly, it populates neighbour arrays in pixel order (rather than scan order), so we don't have to dereference the scantable to get the correct neighbours. Total encoding time of first 50 frames of bus (speed 0) at 1500kbps goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase. Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56	2013-07-01 16:34:10 -07:00
Yaowu Xu	ba3b2604f0	Merge "Quantize (64-bit only, for now) SSSE3 SIMD."	2013-07-01 15:58:57 -07:00
Ronald S. Bultje	7353ceab9d	Quantize (64-bit only, for now) SSSE3 SIMD. Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only; it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904	2013-07-01 11:36:07 -07:00
Paul Wilkins	1377278180	New motion threshold factor - speed feature. Added a speed feature that focuses only on thresholds for new motion modes. Moved sf->comp_inter_joint_search_thresh into speed 1. This has ~+0.4% impact on quality at speed 0 as our quality reference baseline. Slight adjustment to baseline thresholds. Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5	2013-07-01 12:11:21 +01:00
Ronald S. Bultje	bc70c60b25	Merge "fixed a bug where sse is not populated"	2013-06-29 07:42:41 -07:00
Yaowu Xu	f853e662b7	fixed a bug where sse is not populated Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc	2013-06-28 17:10:22 -07:00
Ronald S. Bultje	d00b8e5f82	Inline vp9_get_coef_context() (and remove vp9_ prefix). Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896	2013-06-28 10:40:21 -07:00
Ronald S. Bultje	e3ce2b2ab3	Minor change to prevent one level of dereference in cost_coeffs(). 4x4: 234 -> 236 cycles 8x8: 878 -> 888 cycles 16x16: 3664 -> 3550 cycles 32x32: 18134 -> 17392 cycles Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78	2013-06-28 10:29:07 -07:00
Ronald S. Bultje	91d223bd5c	Some minor optimizations for cost_coeffs(). Cycle timings for first 3 frames of bus (speed 0) at 1500kbps: 4x4: 298 -> 234 cycles 8x8: 1227 -> 878 cycles 16x16: 4906 -> 3664 cycles 32x32: 23426 -> 18134 cycles Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster. Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95	2013-06-28 10:29:02 -07:00
Ronald S. Bultje	af660715c0	Make coefficient skip condition an explicit RD choice. This commit replaces zrun_zbin_boost, a method of biasing non-zero coefficients following runs of zero-coefficients to be rounded towards zero, with an explicit skip-block choice in the RD loop. The logic is basically that if individual coefficients should be rounded towards zero (from a RD point of view), the trellis/optimize loop should take care of it. If whole blocks should be zero (from a RD point of view), a single RD check is much more efficient than a complete serialization of the quantization loop. Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim. SIMD for quantize will follow in a separate patch. Results for other test sets pending. Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4	2013-06-28 10:28:49 -07:00
Yaowu Xu	8b9eea0a34	Minor cleanups Change-Id: I379617c1c731a686b3f7e032b8805860c1055b12	2013-06-28 09:19:50 -07:00
Paul Wilkins	05ffdf2625	Merge "Auto adapt step size feature."	2013-06-27 02:28:41 -07:00
Paul Wilkins	59af9049d3	Merge "Start adaptive threshold for each mode at max."	2013-06-27 02:28:36 -07:00
Paul Wilkins	5bcf069c6b	Merge "Change meaning of cpi->sf.first_step and rename."	2013-06-27 02:28:21 -07:00
Jingning Han	fc1cfd8e32	Merge "Make intra predictor reference buffer configurable"	2013-06-26 19:02:02 -07:00
Jingning Han	861cb06c67	Make intra predictor reference buffer configurable This commit enables configurable reference buffer pointer for intra predictor. This allows later removal of spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1	2013-06-26 17:17:21 -07:00
Paul Wilkins	9f3ab83486	Auto adapt step size feature. Also tweaks to other features and experiments with what is on and off at different speed settings. Change-Id: I3e1d0be0d195216bf17c2ac5df67f34ce0b306b2	2013-06-26 19:48:39 +01:00
Dmitry Kovalev	49dee16879	Merge "Using get_plane_block_{width, height} instead of custom code."	2013-06-26 10:23:27 -07:00
Paul Wilkins	689957e3ad	Start adaptive threshold for each mode at max. Each frame we reset all adaptive thresholds to MAX rather than base. As modes are picked their thresholds drop down. Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8	2013-06-26 17:04:47 +01:00
Paul Wilkins	e606cac046	Change meaning of cpi->sf.first_step and rename. Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size and changed its meaning such that it is a delta applied to reduce the default first step size (>> x) in the motion search rather than an absolute value. The default first step size is already changed according to the image dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size now applies a further correction from the default. Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d	2013-06-26 17:04:06 +01:00
Jingning Han	d19ea3861d	Refactor intra predictor block Remove vp9_intra4x4_predict(). Use the common intra prediction function for all block sizes. Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560	2013-06-25 16:33:13 -07:00
Dmitry Kovalev	dc0f457c94	Using get_plane_block_{width, height} instead of custom code. Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9	2013-06-25 14:11:18 -07:00
Dmitry Kovalev	87ee34aacb	Removing unused code. Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1	2013-06-25 10:17:19 -07:00
Dmitry Kovalev	f27f76dfb3	Transforming scale_mv_component_q4 into scale_mv_q4 function. Using MV instead of int_mv for function arguments. Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6	2013-06-21 15:34:29 -07:00
Ronald S. Bultje	54b2a59623	Implement SSE2 block_error. Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68	2013-06-21 12:54:52 -07:00
Yaowu Xu	ee07a261a0	rename variables to avoid build error in MSVC Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34	2013-06-20 18:31:48 -07:00
Deb Mukherjee	7947a33d72	Improving model rd with variance and quant step Improves the rd modeling function and implements them using interpolation from a table which is a little faster. Also uses sse as input to the modeling function rather than var - since there is no dc prediction used and as a result the sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff	2013-06-20 10:06:28 -07:00
Jim Bankoski	1f94b97694	convert all speed things to speed features Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a	2013-06-20 09:42:44 -07:00
Yaowu Xu	12180c8329	Remove unnecessary copying of probs. Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c	2013-06-18 23:02:27 -07:00
Deb Mukherjee	4ad96115cd	Some cleanups in rd motion search No bitstream or output change - only cosmetics. Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7	2013-06-13 17:25:23 -07:00
Deb Mukherjee	f18328cbf1	Adds a zero check in model_rd function Avoids divide-by-zero when variance is 0. Change-Id: I3c7f526979046ff7d17714ce960fe81d6e1442a0	2013-06-10 17:04:47 -07:00
John Koleszar	717d744a01	Fix use of get_uv_tx_size in loopfilter Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98	2013-06-10 11:40:57 -07:00
Paul Wilkins	de6ec27d1a	Rd check on segment level reference mode. Do not allow the rd code to check compound modes if a segment level reference frame is selected. Change-Id: I95f0c57789e0eaceed7caf227e94b4ba3130a06c	2013-06-10 11:03:15 -07:00
Ronald S. Bultje	b12a8dac98	Allow non-zeromv if ref_frame=intra with segmentation skip/ref enabled. Change-Id: Ib5a95bb6ab643b276df3faa9bf99595e4a69ff18	2013-06-10 10:55:10 -07:00
Tero Rintaluoma	86bb6df005	Fixed point reference picture scaling Fixed point scaling factors are calculated once for each reference frame by using integer division. Otherwise fixed point scaling routines are used in all scaling calculations. This makes it possible to calculate fixed point scaling factors in device driver software and pass them to hardware, and thus avoid division in hardware. TODO: - Missing check for maximum frame dimensions (currently scaling uses 14 bits) - Missing check for maximum scaling ratio (upscaling 16:1, downscaling 2:1) Problems: - Straightforward fixed point implementation can cause an error of ±1 compared to integer division (i.e. in x_step_q4). Should only be an issue for frames larger than 16k. Change-Id: I3cf4dabd610a4dc18da3bdb31ae244ebaf5d579c	2013-06-10 08:07:55 -07:00
Deb Mukherjee	21401942b0	Coding tx-size selection by use of spatial context Adds coding of transform size within a frame by use of context of transform sizes selected in left and above blocks. Also incorporates code for generating stats. TODO: generate and incorporate new default stats Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6	2013-06-07 16:07:58 -07:00
Paul Wilkins	340c7a48e6	Change to segment ref frame feature. Simplify feature to only support a single reference frame instead of a mask. Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb	2013-06-07 21:42:22 +01:00
Deb Mukherjee	3ee1a21a42	Coding updates for tx-size selection Changes to the coding of transform sizes, along with forward and backward probability updates. Results: derf300: +0.241% Context based coding of transform sizes will be in a separate patch. Change-Id: I97241d60a926f014fee2de21fa4446ca56495756	2013-06-07 08:54:00 -07:00
Ronald S. Bultje	6ef805eb9d	Change ref frame coding. Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95	2013-06-06 17:28:09 -07:00
Ronald S. Bultje	ad34368786	New intra mode and partitioning probabilities. Split partition probabilities between keyframes and non-keyframes, since they are fairly different. Also have per-blocksize interframe y intramode probabilities, since these vary heavily between different blocksizes. Lastly, replace default probabilities for partitioning and intra modes with new ones generated from current codec. Replace counts with actual probabilities also. Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f	2013-06-06 10:45:30 -07:00
Jingning Han	d03e974fbd	Bug fix in rd_pick_inter_mode_sb_ Fix the calculation of step size in height. Change-Id: I0e0c0175f141f5a41214ae51cef233d13942d3c5	2013-06-06 10:04:26 -07:00
Paul Wilkins	26e24b1dd7	Merge "Rd thresholds change with block size." into experimental	2013-06-06 09:27:44 -07:00
Paul Wilkins	02590a5b1b	Merge "Turn off compound inter search refinement for good quality." into experimental	2013-06-06 09:27:31 -07:00
Jim Bankoski	b4c4f64862	signs reverted Change-Id: Ieface458c83eb6e7ee95595d9fc662f372117c9a	2013-06-06 08:59:22 -07:00
Paul Wilkins	c3316c2bc5	Rd thresholds change with block size. Added structures to support independent rd thresholds for different block sizes (and set experimental block size correction factors). Added structure to allow dynamic adaptation of thresholds per mode and per block size basis depending on how often the mode/block size combination is seen (currently fixed factor). Removed some unused variables. TODO - Adaptation of thresholds based on how often each mode is chosen. - The baseline mode values could also be adjusted based on the block size (e.g. for a particular intra mode use a low threshold for 4x4 prediction blocks but a relatively high value for 64x64). Change-Id: Iddee65ff3324ee309815ae7c1c5a8584720e7568	2013-06-06 15:45:53 +01:00
Paul Wilkins	c880e02f97	Turn off compound inter search refinement for good quality. Turn this feature off for some modes in "good" quality. Change-Id: I3f262d62cca8f01736b977af1465291e8be29f0a	2013-06-06 15:44:25 +01:00
Jim Bankoski	5a88271b09	don't tokenize & encode tokens for blocks in UMV This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9	2013-06-06 06:10:25 -07:00
Jingning Han	61e6586230	Merge "Fix UV intra coding rd loop" into experimental	2013-06-05 21:47:00 -07:00
Jingning Han	f04b15486a	Fix UV intra coding rd loop This commit makes the coding/reconstruction operations of intra coding rate-distortion loop for UV components consistent with those of the encoding process. key frame coding gains: derf: 0.11% stdhd: 0.42% Change-Id: I8d49f83924a320e3689ef2d60096c49d7f0c7a40	2013-06-05 21:18:02 -07:00
Deb Mukherjee	30226a658f	Cosmetic renaming VP9_MVREFS to VP9_INTER_MODES NO bitstream change Change-Id: I79f6146dac5fdd157051b6f8dc611c0b7b5e5f7f	2013-06-05 11:24:01 -07:00
Jingning Han	513d326d75	Merge "Make sb intra rd search consistent with encoding" into experimental	2013-06-04 14:59:05 -07:00
Jingning Han	51b6e73a68	Make sb intra rd search consistent with encoding This commit makes operations of the superblock intra coding rate distortion optimization consistent with those used in the encoding process. Given the test prediction mode and transform size, the rd optimizer encodes and reconstructs each transformed block of the superblock consecutively, then computes the total rate-distortion costs associated with the current superblock to select the coding decisions. It achieves coding performance gains: derf 0.353% yt 1.111% Change-Id: I0da2eb7a71361dfb8c1384927fc536b0c2790d07	2013-06-04 13:54:48 -07:00
Dmitry Kovalev	6a961e7dc8	Merge "Replacing memcpy with struct assignment." into experimental	2013-06-03 14:32:05 -07:00
Jingning Han	9068bce4e7	Put iterative motion search under speed control Enable iterative motion search for compound inter-inter prediction of block sizes 4x4/4x8/8x4 only when best coding quality is selected. The iterative motion search provides about 0.1% gains for derf and stdhd at this point, at the expense of longer runtime. Change-Id: Idc03e7f827e51f1bb8d269bc3752ee297a6bbfe5	2013-06-03 09:18:57 -07:00
Dmitry Kovalev	3b9ec31eaf	Replacing memcpy with struct assignment. Change-Id: Ib557cc6351404b9e178e95a545883eb3666f11f0	2013-05-31 16:00:32 -07:00
Dmitry Kovalev	317d832d38	Merge "Adding plane_block_width and plane_block_height functions." into experimental	2013-05-31 15:28:45 -07:00
Deb Mukherjee	0048ec2329	Costing fixes related to trellis optimization Migrates costing changes/fixes from the rebalance expt to the head without the expt on. Rebased. Change-Id: I51677d62f77ed08aca8d21a4c9a13103eb8de93f Results: derfraw300: +0.126%	2013-05-31 13:56:32 -07:00
Dmitry Kovalev	120a878199	Adding plane_block_width and plane_block_height functions. Change-Id: I02c17fb733c0f3c22dc3167c3d3182797415f1ae	2013-05-31 12:31:49 -07:00
Ronald S. Bultje	a288cb3b10	Merge "Merge all various transform size data trackers into single variables." into experimental	2013-05-31 09:59:24 -07:00
Scott LaVarnway	1e025dbfd1	Merge "Moved use_prev_in_find_mv_refs check to frame level" into experimental	2013-05-31 09:35:51 -07:00
Ronald S. Bultje	e9d68a5e36	Merge all various transform size data trackers into single variables. Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db	2013-05-31 09:18:59 -07:00
Jim Bankoski	21595f8e38	Merge "Creates a new speed 1:" into experimental	2013-05-30 20:36:05 -07:00
Jim Bankoski	ced21bd6a6	Creates a new speed 1: This speed 1 - uses variance threshold stolen from static-thresh to determine split. Any superblock with greater than the variance set by static thresh * quantizer index squared is split. In addition transform size is set to largest size less than or equal to partition size, sub pixel filter is set to normal, and only 12 modes are used at all. Change-Id: If7a2858ee70f96d1eb989c04fd87a332b147abef	2013-05-30 19:53:00 -07:00
Ronald S. Bultje	16482bddf7	Merge "Remove splitmv." into experimental	2013-05-30 19:07:12 -07:00
Ronald S. Bultje	d2205f92c3	Merge changes I98c18fe5,I80c37cff into experimental * changes: Remove i4x4_pred. Remove unused table.	2013-05-30 19:06:44 -07:00
Ronald S. Bultje	e6485581fe	Remove splitmv. We leave it in rdopt.c as a local define for now - this can be removed later. In all other places, we remove it, thereby slightly decreasing the size of some arrays in the bitstream. Change-Id: Ic2a9beb97a4eda0b086f62c039d994b192f99ca5	2013-05-30 17:21:01 -07:00
Ronald S. Bultje	1efa79d32f	Remove i4x4_pred. It remains as a local define in rdopt.c so we can distinguish between split and non-split modes in the RD loop, but disappears outside that scope in the codec. Change-Id: I98c18fe5ab7e4fbd1d6620ec5695e2ea20513ce9	2013-05-30 16:44:58 -07:00
Ronald S. Bultje	f5827699bf	Merge "Merge all intra mode coding trees into a single one." into experimental	2013-05-30 11:27:51 -07:00
Jingning Han	5e97862a71	Merge "Enable iterative motion search for 4x4 inter pred" into experimental	2013-05-30 11:02:10 -07:00
Ronald S. Bultje	98c192ae83	Merge all intra mode coding trees into a single one. Also merge all counters. This removes a few unused probability updates from the bitstream. Change-Id: I20f58853e9dac84d8c0d9703ae012c55917516eb	2013-05-30 09:58:53 -07:00
Jim Bankoski	e987f03acd	Merge "valgrind - txfm_thresh not set" into experimental	2013-05-30 09:34:48 -07:00
Deb Mukherjee	c98bfcfbbb	Merge "Balancing coef-tree to reduce bool decodes" into experimental	2013-05-30 08:10:47 -07:00
Jim Bankoski	ecf023f6e4	Merge "fix valgrind warning" into experimental	2013-05-30 08:04:49 -07:00
Jingning Han	87626a8f6e	Enable iterative motion search for 4x4 inter pred This commit enables iterative motion search for 4x4/4x8/8x4 block size compound inter-inter prediction. WIP: borg run testing Change-Id: I2b318db4a03cdca5a8002b3fa6c0fa89b129288b	2013-05-30 10:49:35 +01:00
Ronald S. Bultje	17544d1478	Merge "Remove some unused code related to macroblock/splitmv coding." into experimental	2013-05-29 17:35:05 -07:00
Jingning Han	5c05fbf6bb	Merge "Refactor 4x4 block level rd loop" into experimental	2013-05-29 16:35:02 -07:00
Deb Mukherjee	b8b3f1a46d	Balancing coef-tree to reduce bool decodes This patch changes the coefficient tree to move the EOB to below the ZERO node in order to save number of bool decodes. The advantages of moving EOB one step down as opposed to two steps down in the other parallel patch are: 1. The coef modeling based on the One-node becomes independent of the tree structure above it, and 2. Fewer context/counter increases are needed. The drawback is that the potential savings in bool decodes will be less, but assuming that 0s are much more predominant than 1's the potential savings is still likely to be substantial. Results on derf300: -0.237% Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242	2013-05-29 16:25:52 -07:00
Jim Bankoski	aae78c8ac7	valgrind - txfm_thresh not set For 4x4 blocks valgrind points out the cache was uninitialized. This resolves the issue by setting it. Change-Id: I22733000da048643762813a84fbda66d8e4040d2	2013-05-29 13:56:08 -07:00
Jingning Han	d0a3872019	Refactor 4x4 block level rd loop This commit makes clean-ups in the rate-distortion loop for 4x4, 4x8, and 8x4 block sizes for the use of iterative motion search. Removed unnecessary use of bmi in handle_inter_mode. Deprecated loop over labels in the 4x4/4x8/8x4 block rd search. Change-Id: I71203dbb68b65e66f073b37abd90d82ef5ae6826	2013-05-29 13:44:52 -07:00
Scott LaVarnway	353642bc53	Moved use_prev_in_find_mv_refs check to frame level This patch checks at the frame level to see if the previous mode info context can be used. This patch eliminates the flag check that was done for every mode and removes another check that was done prior to every vp9_find_mv_refs(). Change-Id: I9da5e18b7e7e28f8b1f90d527cad087073df2d73	2013-05-29 16:42:23 -04:00
Jim Bankoski	5e5470b254	fix valgrind warning scales for second reference frame vars are uninitialized if the second ref frame is one of those disallowed by refframeflags Change-Id: I4ce42de391178c1699dcaede18c5f12c84993c61	2013-05-29 12:34:10 -07:00
Jingning Han	84deeddbaf	Merge "Refactor rd loop for inter modes" into experimental	2013-05-29 10:55:23 -07:00
Jingning Han	6c97bba403	Merge "further clean-ups on intra4x4 coding" into experimental	2013-05-29 10:55:14 -07:00
Sami Pietila	88a4d4c510	Residual coding to cache energy class of tokens. Proposal for tuning the residual coding by changing how the context from previous tokens is calculated. Storing the energy class of previous tokens instead of the token itself eases the critical path of HW implementations. Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab	2013-05-29 15:21:01 +01:00
Ronald S. Bultje	4487f5a690	Remove some unused code related to macroblock/splitmv coding. Change-Id: Ic40d56fb162f4e201547dfae33e62ccd9e865889	2013-05-29 06:29:56 -07:00
Jingning Han	94d700e763	Refactor rd loop for inter modes This commit pulls the iterative motion search for compound inter- inter out from handle_inter_mode_ as a separate function. Hence, it is applicable to 4x4/4x8/8x4 level compound inter search to be enabled later. Also edit the rd loop for 4x4 inter block sizes for cosmetic purpose. Change-Id: Ibc71a11cbe5a26cd52faba01026cf8446cf4d2b4	2013-05-28 16:31:33 -07:00
Jingning Han	4729a6f389	further clean-ups on intra4x4 coding Removed one 4x4 prediction step that was unnecessary in the rd loop. Removed an unused modecosts estimate from encoder side. Change-Id: I65221a52719d6876492996955ef04142d2752d86	2013-05-28 11:19:05 -07:00
Yaowu Xu	601bab4fde	Merge "a few clean-ups" into experimental	2013-05-27 15:16:21 -07:00
Ronald S. Bultje	cba8e16e93	Decrease scope of frame_mv argument to handle_inter_mode(). Change-Id: I81c637c61ecc33cb66beb59a2a33166d66b9a0a2	2013-05-27 14:16:45 -07:00
Yaowu Xu	2b96ffe025	a few clean-ups 1. remove prediction mode conversion 2. unified bmode, same for key and non-key frame 3. set I4X4_PRED count for pdf to 0, as I4X4_PRED is no longer coded ever. It is determined by ref_frame and block partition Change-Id: If5b282957c24339b241acdb9f2afef85658fe47d	2013-05-27 13:53:56 -07:00
Ronald S. Bultje	f188bf1c3d	Remove unused mode_index argument from handle_inter_mode(). Change-Id: I07b8c15f33e6e7c63dd0033c18c4ac5c0303cf32	2013-05-27 08:49:17 -07:00
Ronald S. Bultje	5cac66078e	Remove splitmv. Also do per-partition motion vector referencing in <sb8x8 partitions, and adjust mvref finding for sub8x8 partitions. Change-Id: Id3ed1ed4d2a8910d11d327db6cc63b8eb79f941f	2013-05-26 14:40:49 -07:00
Jingning Han	826efc838c	Fix a bug in intra4x4 level rd loop This commit fixed an uninitialized value use in the intra 4x4/8x4/4x8 rate-distortion loop. Change-Id: I5c25b3536b59e4f5fbb23cf85baf93b2ccec7d72	2013-05-23 17:44:33 -07:00
Jingning Han	ae10319520	Make comp_inter_inter support 4x4 partition coding This commit refactors the iterative motion search for compound inter-inter mode, to make it support all partition types including 4x4/4x8/8x4 block sizes. Change-Id: I5f1212b0f307377291763e45c6bdc9693b5f04c8	2013-05-23 13:13:42 +01:00
Paul Wilkins	33ecd6ad54	Merge Scatter Scan experiment. Removal from under configure flag. A bit renaming Change-Id: I2213229dfe852001dfec16b149f47c52ce88f3aa	2013-05-23 13:09:27 +01:00
Jingning Han	7ac5ac52f9	Merge 4x4 block level partition into codebase Move 4x4/4x8/8x4 partition coding out of experimental list. This commit fixed the unit test failure issues. It also resolved the merge conflicts between 4x4 block level partition and iterative motion search for comp_inter_inter. Change-Id: I898671f0631f5ddc4f5cc68d4c62ead7de9c5a58	2013-05-23 11:58:50 +01:00
Deb Mukherjee	ddb2309568	Merge "Using 128 entry look up table for coef models" into experimental	2013-05-22 10:38:35 -07:00
Jingning Han	d2cacdc530	Merge "Make the intra rd search support 8x4/4x8" into experimental	2013-05-22 10:00:15 -07:00
Deb Mukherjee	de4d682ca4	Using 128 entry look up table for coef models Reverts to using 128 bit LUT for the coef models rather than 48 to ease hardware implementation. Also incorporates some cleanups including removing various hooks to support different lookup tables based on block_type and ref_type. Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6	2013-05-22 08:44:31 -07:00
Paul Wilkins	0b713f8c18	Merge CONFIG_COMP_INTER_JOINT_SEARCH. Merge this experiment so that it is under a speed feature flag not a configuration flag. Change-Id: I536f7f125a4ff5149bb3a64f791e835c324535fd	2013-05-22 11:23:31 +01:00
Jingning Han	f153a5d063	Make the intra rd search support 8x4/4x8 This commit allows the rate-distortion optimization of intra coding capable of supporting 8x4 and 4x8 partition settings. It enables the entropy coding of intra modes in key frame using a unified contextual probability model conditioned on its above/left prediction modes. Coding performance: derf 0.464% Change-Id: Ieed055084e11fcb64d5d5faeb0e706d30268ba18	2013-05-21 21:03:00 -07:00
John Koleszar	ddf13be8ef	Merge "Initial version of alpha channel support" into experimental	2013-05-21 17:29:51 -07:00
Deb Mukherjee	7a645e4e12	Merging the model coef prob experiment Merges the experiment. Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9	2013-05-21 14:44:38 -07:00
Scott LaVarnway	1db6373267	Merge "WIP: 4x4 idct/recon merge" into experimental	2013-05-21 10:45:53 -07:00
Dmitry Kovalev	4ac70bd7d3	Adding get_ref_frame_idx function. Change-Id: I4f1a4eca6794cda78d00512196caacd5567e2dcc	2013-05-20 16:09:00 -07:00
Deb Mukherjee	39a90bc8e8	Updating the model coef experiment Cleans up the experiment. Actually uses reduced counts for backward updates, and reduced number of probabilities in the context. No change in bitstream when the experiment is on. Between expt on and off: derfraw300 is down only -0.062% (which is better than when expts were run previously). Change-Id: I55285a049a0c22810bdb42914212ab5a4f8521b5	2013-05-20 12:46:36 -07:00
Scott LaVarnway	ba48a11130	WIP: 4x4 idct/recon merge This patch eliminates the intermediate diff buffer usage by combining the short idct and the add residual into one function. The encoder can use the same code as well. Change-Id: I296604bf73579c45105de0dd1adbcc91bcc53c22	2013-05-20 13:03:17 -04:00
Jingning Han	810b612c23	Enable bit-stream support to 8x4 and 4x8 partition The recursive partition type search is enabled down to 4x4, 4x8 and 8x4, followed by the corresponding rate-distortion optimization for the per-partition encoding mode decisions. The bit-stream writing/reading synchronized in supporting the rectangular partition of 8x8 block. This provides above 1% coding performance gains on derf. To do next: 1. re-design the rate-distortion loop for inter prediction below 8x8. 2. re-design the rate-distortion loop for intra prediction below 4x4. 3. make the loop-filter aware of rectangular partition of 8x8 block. 4. clean the unused probability models. 5. update default probability values. Change-Id: Idd41a315b16879db08f045a322241f46f1d53f20	2013-05-19 14:59:04 -07:00
John Koleszar	679e4abdd5	Initial version of alpha channel support This is a mostly-working implementation of an extra channel in the bitstream. Configure with --enable-alpha to test. Notable TODOs: - Add extra channel to all mismatch tests, PSNR, SSIM, etc - Configurable subsampling - Variable number of planes (currently always uses all 4) - Loop filtering - Per-plane lossless quantizer - ARNR support This implementation just uses the same contents as the Y channel for the A channel, due to lack of content and general pain in playing back 4 channel content. A later patch will use the actual alpha channel passed in from outside the codec. Change-Id: Ibf81f023b1c570bd84b3064e9b4b8ae52e087592	2013-05-16 22:21:09 -07:00
Jingning Han	8e3d0e4d7d	Add building blocks for 4x8/8x4 rd search These building blocks enable rate-distortion optimization search over block sizes of 8x4 and 4x8. Need to convert them into mmx/sse forms. Change-Id: I570ea2d22d14ceec3fe3575128d7dfa172a577de	2013-05-16 10:41:29 -07:00
Jingning Han	8468a5c1a0	Fix the transform type selection in 4x4 partition This commit allows proper transform type (DCT/ADST) selection in the settings of partition 4x4 level. Change-Id: Iec6f922a46480d777e7ca9142a99e8c131f0077b	2013-05-15 16:09:58 -07:00
Jingning Han	1f26840fbf	Enable recursive partition down to 4x4 This commit allows the rate-distortion optimization recursion at encoder to go down to 4x4 block size. It deprecates the use of I4X4_PRED and SPLITMV syntax elements from bit-stream writing/reading. Will remove the unused probability models in the next patch. The partition type search and bit-stream are now capable of supporting the rectangular partition of 8x8 block, i.e., 8x4 and 4x8. Need to revise the rate-distortion parts to get these two partition tested in the rd loop. Change-Id: I0dfe3b90a1507ad6138db10cc58e6e237a06a9d6	2013-05-14 12:39:56 -07:00
Yunqing Wang	dee12bdf8f	Merge "Do joint motion search iteratively" into experimental	2013-05-14 10:18:11 -07:00
Yunqing Wang	60456083e9	Do joint motion search iteratively Allow motion search multiple times iteratively, and break out the loop if this search couldn't find better motion vectors. Limit the maximum number of search to 2. Tests results: 1. stdhd set: 0.311%(overall psnr); 0.346%(ssim). positive gain on 10 out of 16 clips(best: 2.746% on sunflower; worst: -0.434% on old_town_cross). 2. derf set: 0.016%(overall psnr); 0.062%(ssim). positive gain on half of the clips(best: 0.499% on bowing; worst: -0.387 on city). Change-Id: Ibf0a51776d4caf7707be0586346db08128117559	2013-05-13 12:14:09 -07:00
Jingning Han	e996c9c5f1	Merge "Force bsize for UV in I4X4 and SPLITMV" into experimental	2013-05-13 10:51:39 -07:00
Paul Wilkins	e5f715201a	Change to band calculation. Change band calculation back to simpler model based on the order in which coefficients are coded in scan order not the absolute coefficient positions. With the scatter scan experiment enabled the results appear broadly neutral on derf (-0.028) but up a little on std-hd (+0.134). Without the scatterscan experiment on the results were up on derf as well. Change-Id: Ie9ef03ce42a6b24b849a4bebe950d4a5dffa6791	2013-05-13 17:21:49 +01:00
Jingning Han	4c2c350309	Force bsize for UV in I4X4 and SPLITMV Use 4x4 block coding for UV components arbitrarily in I4X4_PRED and SPLITMV coding modes. This is a temporary solution to enable bit-stream support for recursive partition down to 4x4 block size. Will separate the functionalities of 4x4 block coding rate-distortion out from those of superblocks. Change-Id: I03dc15d5897014f175f3f2c91e9b266091d56797	2013-05-11 13:39:16 -07:00
Yunqing Wang	9755d9fda2	Remove unused mdcounts mdcounts seems no longer used. Change-Id: Idd8162e8acfa3f5be7a18767156cc79ccbc2bdee	2013-05-10 11:02:22 -07:00
Yunqing Wang	9f5811c2da	Add joint motion search in comp_inter_inter mode(experiment) In current code, motion vectors got from single prediction mode are used in compound prediction mode directly. These motion vectors may not give accurate prediction since they are searched independently. In this patch, we took Pascal's suggestion, and did joint motion search in compound prediction mode to find better motion vectors in this situation. Test results: Overall PSNR: 0.570%(derf), 0.918%(stdhd); SSIM: 0.572%(derf), 1.009%(stdhd); The encoder is a little slower. This can be improved since some c code is used in motion search. Change-Id: Ib30c9240f6c56c9b070867b4ca89412a76d9f3c6	2013-05-10 10:15:43 -07:00
Dmitry Kovalev	f0911886f3	Merge "Renaming 'Speed' to 'speed' inside VP9_COMP struct." into experimental	2013-05-08 16:35:35 -07:00
Dmitry Kovalev	8f4e9ac8bc	Removing y_to_uv_block_size and y_bsizet_to_block_size functions. Change-Id: I49527ff8dd8bef1074c18a964fed2a575f0b118a	2013-05-08 15:23:42 -07:00
Dmitry Kovalev	4be190d9d0	Renaming 'Speed' to 'speed' inside VP9_COMP struct. Change-Id: I4374b5af40ee9082ddf7956a9756a15ad9ad5436	2013-05-08 14:35:42 -07:00
John Koleszar	14a5c7285b	Make switchable filter search subsampling-aware Makes the temporary storage of the filtered data agnostic to the number of planes and how they're subsampled. Change-Id: I12f352cd69a47ebe1ac622af30db29b49becb7f4	2013-05-07 21:57:00 -07:00
John Koleszar	7465f52f81	Merge "Make setup_pred_block subsampling-aware." into experimental	2013-05-07 21:53:31 -07:00
Dmitry Kovalev	80997b3aa2	Merge "Adding get_switchable_rate function." into experimental	2013-05-07 17:10:48 -07:00
Paul Wilkins	a14ae84749	Deprecate code_zerogroup experiment. Delete code under the CONFIG_CODE_ZEROGROUP flag. Change-Id: I5fe6c7b42a5da9b73118e33594301da4129f320a	2013-05-07 16:52:55 -07:00
Dmitry Kovalev	455816231e	Adding get_switchable_rate function. Change-Id: I71311a14f8d7f48508b250f25d1d0914c6a1ac72	2013-05-07 16:52:04 -07:00
Paul Wilkins	1ed57a6a62	Deprecate comp_interintra_pred experiment. Delete code under the CONFIG_COMP_INTERINTRA_PRED flag. Change-Id: I3d1079cf46305c08f7e11d738596ea112e7b547f	2013-05-07 16:24:08 -07:00
Paul Wilkins	8c1b516d10	Deprecate the newbintramode experiment. Clean out code relating to newbintramode. Change-Id: Ie91f4f156cdf60ce0da8ca407c1c9cb00c7d0705	2013-05-07 16:00:59 -07:00
Jingning Han	cf8b5a09ed	Add building blocks for partition down to 4x4 Macro ab4x4 contains experiments for recursive partition down to 4x4 block size. Change-Id: Ic727842fa98a4df9fd51e0025a545dc76a5c76c1	2013-05-07 12:11:51 -07:00
John Koleszar	e559e14fa6	Make setup_pred_block subsampling-aware. Code previously set up the pointers by scaling by MI_UV_SIZE, which is 4:2:0 only. Change-Id: Ic13a92895cff018ec1345736746ed84cb31e6e31	2013-05-07 11:47:45 -07:00
Jingning Han	776c1482a3	Merge SB8X8 into the codebase Pull sb8x8 out of experimental list. verified via borg run tests. Fixed unit test failures. Change-Id: I12a4bbd17395930580c048ab68becad1ffe46e76	2013-05-07 09:08:25 -07:00
Dmitry Kovalev	2e5f0084f3	Adding model_rd_for_sb function. Iterating over all planes in the loop instead of custom y,uv code inside handle_inter_mode function. Change-Id: I301f9276d6d544c2fd7203d84f1318ac80ea625d	2013-05-06 12:42:53 -07:00
Jingning Han	8e1c97cf73	Fix a unit test failure of sb8x8 on scaling ref Disable the use of scaled reference frame for motion search in SPLITMV mode. This fixes the unit test failure issue triggered when merging sb8x8 from experimental list. Change-Id: I02ac25fd8db8d5762f8fee29513b947189875fa0	2013-05-06 10:28:18 -07:00
Ronald S. Bultje	f7fa367094	Fix first-pass intra4x4 for sb8x8 experiment. Change-Id: I1df17f45721c690d157800daa6a0b377e3d32bc2	2013-05-04 15:49:41 -07:00
Ronald S. Bultje	842c573e04	Merge "Fix overflow in RD error calculation code." into experimental	2013-05-03 18:03:06 -07:00
John Koleszar	6c622e2783	Merge "Separate transform and quant from vp9_encode_sb" into experimental	2013-05-03 17:19:01 -07:00
John Koleszar	4529c68b3b	Separate transform and quant from vp9_encode_sb This allows removing a large number of transform size specific functions, as well as supporting 444/alpha by routing all code through the subsampling-aware path. Change-Id: Ieb085cebe9f37f24fc24de179898b22abfda08a4	2013-05-03 12:14:50 -07:00
Ronald S. Bultje	ee808e52bd	Fix overflow in RD error calculation code. Change-Id: I61ef1f198c876f9f79787ea7d7385a862cfbae19	2013-05-03 10:33:07 -07:00
Dmitry Kovalev	7ab2d7bf55	Removing MAXF macro and using MAX instead. Change-Id: I51c53692b1150005645bf362c5e5a8275178a8fd	2013-05-02 11:57:16 -07:00
Ronald S. Bultje	f37d8400db	Store splitmv modes in context after 8x8 rd loop. Change-Id: I07aa89a67e0ac5f99ef0c448553dbc46b0ed27f2	2013-05-01 17:13:23 -07:00
Ronald S. Bultje	b6c2d872f0	Fix some crashes in sb8x8 experiment. Change-Id: I390bb1cedc835f439fd5dd6cda6572b29cbb139c	2013-05-01 14:45:27 -07:00
Dmitry Kovalev	79590f186c	Merge "Cleaning up encoder segmentation code." into experimental	2013-04-30 17:49:55 -07:00
Ronald S. Bultje	d068d869b9	sb8x8 integration in rd loop. Work-in-progress, not yet ready for review. TODO items: - bitstream writing (encoder) and reading (decoder) - decoder reconstruction Change-Id: I5afb7284e7e0480847b47cd0097cb469433c9081	2013-04-30 16:13:20 -07:00
Dmitry Kovalev	51a73fbba2	Merge "Consistent names for quant-related functions and variables." into experimental	2013-04-30 10:19:48 -07:00
Dmitry Kovalev	ee97da2c03	Cleaning up encoder segmentation code. Moving code from vp9_pack_bitstream to new function encode_segmentation. Change-Id: I1f1e59a1f038618ad95162b7db4b6f8164850ea8	2013-04-29 16:07:17 -07:00
Ronald S. Bultje	2dbaa4f4f4	Change above/left_context to use an 8x8 basis. Output changes slightly because of a minor bug in (at least) the sb32x16 block2above tx16x16 tables that previously existed in vp9_blockd.c. Change-Id: I624af28ac200a8322d64454cf05c79e9502968cc	2013-04-29 10:37:25 -07:00
Dmitry Kovalev	5a5a1f25a8	Consistent names for quant-related functions and variables. Change-Id: I3a6d601e90e8740b9c26dd0afbfe9d467b75d367	2013-04-26 12:30:20 -07:00
Ronald S. Bultje	1a46b30ebe	Grow MODE_INFO array to use an 8x8 basis. Change-Id: I087e08e7909a406b71715b8525c104208daa6889	2013-04-26 11:57:17 -07:00
John Koleszar	bb41ab4a0c	Remove BLOCKD structure All members can be referenced from their per-plane counterparts, and removes assumptions about 24 blocks per macroblock. Change-Id: I7ff2fa72d22c29163eb558981c8193765a8113d9	2013-04-26 10:35:54 -07:00
John Koleszar	4f55c5618a	Remove destination pointers from BLOCKD Access these members from MACROBLOCKD instead. Change-Id: I7907230dd473ff12ebe182b9280d8b7f12a888c4	2013-04-26 10:14:07 -07:00
John Koleszar	4b27eb1f18	Merge "quantize: make 4x4, 8x8 common with larger transforms" into experimental	2013-04-26 09:08:49 -07:00
Scott LaVarnway	57f180b388	Removed bmi from blockd This originally was "Removed update_blockd_bmi()". Now, this patch removed bmi from blockd and uses the bmi found in mode_info_context. Eliminates unnecessary bmi copies between blockd and mode_info_context. Change-Id: I287a4972974bb363f49e528daa9b2a2293f4bc76	2013-04-26 10:19:43 -04:00
John Koleszar	a672351af9	quantize: make 4x4, 8x8 common with larger transforms There were 4 variants of the quantize loop in vp9_quantize.c, now there is 1. Change-Id: Ic853393411214b32d46a6ba53769413bd14e1cac	2013-04-25 14:44:54 -07:00
Ronald S. Bultje	18f29ff581	Remove duplicate code in RD handle_inter_mode() function. Change-Id: I552d53f7e7331e9246d8a32d6c6dcc0cfa0cbeb0	2013-04-25 14:21:21 -07:00
Ronald S. Bultje	c849eaca59	Use b_width/height_log2 instead of mb_ where appropriate. Basic assumption: when talking about transform units, use b_; when talking about macroblock indices, use mb_. Change-Id: Ifd163f595d4924ff892de4eb0401ccd56dc81884	2013-04-25 14:20:59 -07:00
John Koleszar	a99e1aa8ca	Remove predictor pointers from BLOCKD Access these members from MACROBLOCKD instead. Change-Id: I2574622e577bb9feede47f6b7ccbb11f3e928ca8	2013-04-25 12:04:07 -07:00
John Koleszar	6c0c6b86c1	Remove diff from BLOCKD The underlying storage for these buffers is in the per-plane MACROBLOCKD area, so read it from there directly. Change-Id: Id6bd835117fdd9dea07db95ad06eff9f12afaaf7	2013-04-25 11:57:22 -07:00
John Koleszar	15255eef82	Move dequant from BLOCKD to per-plane MACROBLOCKD This data can vary per-plane, but not per-block. Change-Id: I1971b0b2c2e697d2118e38b54ef446e52f63c65a	2013-04-25 11:57:20 -07:00
John Koleszar	4bd0f4f646	Remove BLOCK structure All members can be referenced from their per-plane counterparts, and removes assumptions about 24 blocks per macroblock. Change-Id: I593fb0715e74cd84b48facd1c9b18c3ae1185d4b	2013-04-25 11:33:17 -07:00
Dmitry Kovalev	61a47da869	Adding is_inter_mode function. Change-Id: I2d32d46002cb92c63050c2b8328865c406103621	2013-04-25 10:23:00 -07:00
Jingning Han	b0e3b3df18	Move sbsegment out of experimental list Move rectangular superblock coding out of experimental list. Change-Id: I96c37547d122330d666a67b4bf577ae54547857f	2013-04-24 15:19:17 -07:00
Jingning Han	ff2b8aa2c9	Contextual entropy coding of partition syntax This commit enables selecting probability models for recursive block partition information syntax, depending on its above/left partition information, as well as the current block size. These conditional probability models are reasonably stationary and consistent across frames, hence the backward adaptive approach is used to maintain and update the contextual models. It achieves coding performance gains (on top of enabling rectangular block sizes): derf: 0.242% yt: 0.391% hd: 0.376% stdhd: 0.645% Change-Id: Ie513d9673337f0d27abd65fb566b711d0844ec2e	2013-04-24 14:23:14 -07:00
John Koleszar	bc30736f9b	Merge "Remove coeff from BLOCK" into experimental	2013-04-23 17:42:12 -07:00
John Koleszar	aa6a36b062	Merge "Convert coeff to per-plane MACROBLOCK data" into experimental	2013-04-23 17:41:59 -07:00
John Koleszar	48f3e66e16	Remove coeff from BLOCK Lookup the data per-plane from the MACROBLOCK struct. Change-Id: I9253c4d3cf886aa9ab4aeab23a2156bfcf994ede	2013-04-23 16:39:21 -07:00
John Koleszar	138ec38cab	Convert coeff to per-plane MACROBLOCK data This commit moves the coeff storage from the MACROBLOCK struct to its per-plane part. The next commit will remove the coeff member from the BLOCK structure so that it is consistently accessed per-plane. Also refactors vp9_sb_block_error_c and vp9_sb_uv_block_error_c to be variable subsampling aware. Change-Id: I18c30f87f27c3a012119b6c1970d5fa499804455	2013-04-23 16:28:17 -07:00
John Koleszar	4f35e3e1c1	Merge "Move src_diff to per-plane MACROBLOCK data" into experimental	2013-04-23 16:24:08 -07:00
Dmitry Kovalev	d0d1094a05	Merge "Adding get_scan_{4x4, 8x8, 16x16} functions." into experimental	2013-04-23 12:44:51 -07:00
John Koleszar	cbd1315ac4	Move src_diff to per-plane MACROBLOCK data First in a series of commits making certain MACROBLOCK members addressable per-plane. This commit also refactors the block subtraction functions vp9_subtract_b, vp9_subtract_sby_c, etc to be loops-over-planes and variable subsampling aware. Change-Id: I371d092b914ae0a495dfd852ea1a3d2467be6ec3	2013-04-23 12:18:51 -07:00
Deb Mukherjee	611b26bbe0	Merge "Removing the implicit compound inter experiment" into experimental	2013-04-22 23:22:28 -07:00
Deb Mukherjee	735febf1ce	Removing the implicit compound inter experiment Removing this experiment for now, since it has been broken with the latest code changes. Change-Id: I1be2181b56de490fcb577f5905b5e147a8ed82d8	2013-04-22 16:46:54 -07:00
Jim Bankoski	366ff224ef	Merge "new version of speed 1" into experimental	2013-04-22 16:42:33 -07:00
Jim Bankoski	e7bddba149	new version of speed 1 This version of speed 1 only disables modes at higher resolution that had distortions >2x the best mode we found... The hope is that this could be a replacement for speed 0 ... Change-Id: I7421f1016b8958314469da84c4dccddf25390720	2013-04-22 15:42:41 -07:00
Dmitry Kovalev	5de7e16ca2	Adding get_scan_{4x4, 8x8, 16x16} functions. Change-Id: Id4306ef6d65d4a3984aed50b775bdf48d4f6c438	2013-04-22 14:08:41 -07:00
John Koleszar	a443447b8b	Move pre, second_pre to per-plane MACROBLOCKD data Continue moving framebuffers to per-plane data. Change-Id: I237e5a998b364c4ec20316e7249206c0bff8631a	2013-04-22 12:05:24 -07:00
Deb Mukherjee	f12509f640	Merge "Removes the code_nonzerocount experiment" into experimental	2013-04-22 11:53:14 -07:00
Deb Mukherjee	0aa79be7d5	Removes the code_nonzerocount experiment This patch does not seem to give any benefits. Change-Id: I9d2b4091d6af3dfc0875f24db86c01e2de57f8db	2013-04-22 10:58:49 -07:00
Deb Mukherjee	6ce718eb18	Merge "End of orientation zero group experiment" into experimental	2013-04-22 10:33:12 -07:00
Deb Mukherjee	70d9f116fd	End of orientation zero group experiment Adds an experiment that codes an end-of-orientation symbol for every eligible zero encountered in scan order. This cleans out various other sub-experiments that were part of the original patch, which will be later included if found useful. Results are slightly positive on all sets (0.1 - 0.2% range). Change-Id: I57765c605fefc7fb9d1b57f1b356843602abefaf	2013-04-22 09:27:59 -07:00
John Koleszar	6d5ac8f2e1	reconinter: remove unnecessary functions, params Removes the redundant dst pointers from vp9_build_inter_predictors_sb{y,uv} and the remaining mb specific functions. Change-Id: I7b6bf439d9394b85ea79b4fe61a3ffc1025720da	2013-04-22 08:20:54 -07:00
John Koleszar	fa8ddbd2a6	Merge "Move dst to per-plane MACROBLOCKD data" into experimental	2013-04-19 16:33:45 -07:00
John Koleszar	d12376aa2c	Move dst to per-plane MACROBLOCKD data First in a series of commits moving the framebuffers pointers to per-plane data, so that they can be indexed numerically rather than by name. Change-Id: I6e0d60fd4d51e6375c384eb7321776564df21775	2013-04-19 16:16:10 -07:00
Yunqing Wang	25edb68100	Merge "Remove unused parameters in handle_inter_mode" into experimental	2013-04-19 14:12:43 -07:00
Paul Wilkins	fb754fd37e	Merge "Mv ref candidates cut to 2." into experimental	2013-04-19 14:09:44 -07:00
Dmitry Kovalev	3689122b1c	Merge "Fixing member names inside TOKENVALUE and TOKENEXTRA structs." into experimental	2013-04-19 10:09:04 -07:00
Jim Bankoski	35b1d2e38f	Merge "catch all for new block sizes" into experimental	2013-04-19 09:57:38 -07:00
Jim Bankoski	afb04eb211	catch all for new block sizes Just make sure we don't stop them from testing in speed 1. Change-Id: Iec9b3dba0a32616ff7a451207e0f54b81bb72575	2013-04-19 09:48:56 -07:00
Jim Bankoski	6d82fe219d	Merge "set up a new speed 1" into experimental	2013-04-19 08:28:35 -07:00
Paul Wilkins	de80da39dc	Mv ref candidates cut to 2. Further simplification of mvref search to return only the top two candidates. Distance weights removed as the test order reflects distance anyway. Change-Id: I0518cab7280258fec2058670add4f853fab7b855	2013-04-19 16:13:53 +01:00
Jim Bankoski	b6ef0823c5	set up a new speed 1 slightly worse results for faster encodes Change-Id: I25ea82a18ce20635dbcd328808c1d05ac1f58fd7	2013-04-19 08:04:57 -07:00
Paul Wilkins	92e8a3f514	Simplification of MVref search. As we are no longer able to sort the candidate mvrefs in both encoder and decoder, and given that the cost of explicit signalling has proved prohibitive, it no longer makes sense to find more than 2 candidates. This patch: modifies and simplifies add_candidate_mv(); removes the forced addition of a 0 vector in the MAX_MV_REF_CANDIDATES-1 position (in preparation for reducing MAX_MV_REF_CANDIDATES to 2); re-orders the addition of candidates slightly. This actually gives small gains (circa 0.2% on std-hd). A subsequent patch will remove the NEW_MVREF experiment, reduce MAX_MV_REF_CANDIDATES to 2 and remove distance weights, as these are now implicit in the order. Change-Id: I3dbe1a6f8a1a18b3c108257069c22a1141a207a4	2013-04-19 11:19:59 +01:00
Dmitry Kovalev	77f4697a13	Fixing member names inside TOKENVALUE and TOKENEXTRA structs. Change-Id: I183ec5819d4d80966c92db36db75b8c3be0d381d	2013-04-18 16:18:08 -07:00
Jingning Han	f0b065e946	Merge "Make the use of pred buffers consistent in MB/SB" into experimental	2013-04-18 15:24:55 -07:00
Jingning Han	6f43ff5824	Make the use of pred buffers consistent in MB/SB Use in-place buffers (dst of MACROBLOCKD) for macroblock prediction. This makes the macroblock buffer handling consistent with those of superblock. Remove predictor buffer MACROBLOCKD. Change-Id: Id1bcd898961097b1e6230c10f0130753a59fc6df	2013-04-18 14:59:36 -07:00
Dmitry Kovalev	a8d903e539	Merge "Replacing VP9_COMBINEENTROPYCONTEXTS macro with function." into experimental	2013-04-18 14:26:34 -07:00
Dmitry Kovalev	8b20aa2337	Merge "Renaming y1dc_delta_q, uvdc_delta_q, uvac_delta_q fields from VP9Common." into experimental	2013-04-18 14:26:06 -07:00
Yunqing Wang	e304160885	Remove unused parameters in handle_inter_mode Removed 2 unused parameters. Change-Id: Ic2862569313c404047072b268c3d2be3f635492c	2013-04-18 11:55:46 -07:00
Ronald S. Bultje	e693472236	Fairly basic integration of rectangular blocks in encoding RD loop. Adds RD integration for 32x16, 16x32, 64x32 and 32x64 rectangular blocks. Derf almost +0.6%, HD a little over +1.0%, STDHD +1.3%. Change-Id: Id651fdb6a655fdbb5c47009757e63317acfb88a5	2013-04-17 09:25:06 -07:00
Dmitry Kovalev	9087d6d470	Replacing VP9_COMBINEENTROPYCONTEXTS macro with function. Change-Id: I3bbc31840af69481e1d9bb4427c9ee25abf82946	2013-04-16 15:30:28 -07:00
Dmitry Kovalev	1ad7c1f250	Renaming y1dc_delta_q, uvdc_delta_q, uvac_delta_q fields from VP9Common. New names are y_dc_delta_q, uv_dc_delta_q, uv_ac_delta_q. Change-Id: I4acae1fc23a4697ce2c5a5becb8dc28ef0a4b552	2013-04-16 15:05:52 -07:00
John Koleszar	e3cfe4e89e	Remove the mb_no_coeff_skip flag This flag was added to VP8 to allow a mode where MB-level skipping was not allowed, saving a bit per mb. It was never used in practice, and hasn't been tested in VP9, so remove it. Change-Id: Id450ec6904c6d06c1919508e7efc52d05cde5631	2013-04-16 12:36:16 -07:00
Dmitry Kovalev	a0d9309eab	Removing TRUE and FALSE macro definitions. Using regular 0 and 1 constants now. Change-Id: Ie763503cbb727847cc8f1d6506cd6f2ee607f056	2013-04-15 15:24:39 -07:00
Ronald S. Bultje	33a8df085d	Fix lingering x->skip settings if static_threshold is used. Keyframes don't set this variable, so it would use the last set values from inter frames. Change-Id: Ie1ef45ece2c44b21b5d55f6cea9f7d6e7a445692	2013-04-15 13:39:07 -07:00
Jingning Han	aaf33d7df5	Add rectangular block size variance/sad functions. With this, the RD loop properly supports rectangular blocks. Change-Id: Iece79048fb4e84741ee1ada982da129a7bf00470	2013-04-15 13:39:07 -07:00
Ronald S. Bultje	15eac18c4e	Make filter RD code and encode breakout variance size-independent. Static threshold results slightly up (+0.1% on derf), probably b/c we now take the filter (sharp/lowpass) into account for the breakout decision. Change-Id: I9f597601da434205142afd05f32690e7ba8fd690	2013-04-15 13:38:35 -07:00
Jingning Han	3ba9dd4165	Enable inter predictor for rectangular block size Combine superblock inter predictors into a unified function that allows configurable block width and height. The inter predictions of block sizes smaller than 16x16 are handled differently. To be continued on merging them later. Change-Id: I14075959dd5e221f00c205c99ca35c1c31ef728e	2013-04-12 11:51:58 -07:00
Yaowu Xu	7de5edd14a	Rename B_PRED to I4X4_PRED So it is consistent with I8x8_PRED. Change-Id: Iefa65124b2419690d83e526c611129c0ede29d11	2013-04-12 09:23:58 -07:00
Jingning Han	815e95fbeb	Make intra predictor support rectangular blocks The intra predictor supports configurable block sizes. It can handle intra prediction down to 4x4 sizes, when enabled in BLOCK_SIZE_TYPE. Change-Id: I7399ec2512393aa98aadda9813ca0c83e19af854	2013-04-11 16:45:57 -07:00
Scott LaVarnway	cff266bbef	Merge "WIP: removing predictor buffer usage from decoder" into experimental	2013-04-11 15:24:33 -07:00
Ronald S. Bultje	69902c6bf0	Merge "Merge pick_sb_modes and pick_sb64_modes." into experimental	2013-04-11 15:06:37 -07:00
Scott LaVarnway	6189f2bcb1	WIP: removing predictor buffer usage from decoder This patch will use the dest buffer instead of the predictor buffer. This will allow us in future commits to remove the extra mem copy that occurs in the dequant functions when eob == 0. We should also be able to remove extra params that are passed into the dequant functions. Change-Id: I7241bc1ab797a430418b1f3a95b5476db7455f6a	2013-04-11 13:55:18 -07:00
John Koleszar	c2bd46bf45	tokenize: convert skippable functions Use the common block walker to calculate skippability. Change-Id: I6721e42f065df237426c91c1d871ec226ba7cdcb	2013-04-11 12:27:37 -07:00
Ronald S. Bultje	605ff051f7	Merge pick_sb_modes and pick_sb64_modes. Change-Id: Iad69e7a3b7e470acf6094f6a52e7da69066fd552	2013-04-11 09:33:49 -07:00
Ronald S. Bultje	33d94a843f	Remove copying of coefficients and predictor in i8x8 RD loop. The resulting values are never used. Change-Id: I688caf30da9aab87aa280cce913eda4f33172293	2013-04-10 17:39:03 -07:00
Ronald S. Bultje	8fb5be48a6	Make usage of sb_type independent of literal values. Change-Id: I0d12f9ef9d960df0172a1377f8e5236eb6d90492	2013-04-10 17:38:57 -07:00
Ronald S. Bultje	b4f6098ef7	Make RD superblock mode search size-agnostic. Merge various super_block_yrd and super_block_uvrd versions into one common function that works for all sizes. Make transform size selection size-agnostic also. This fixes a slight bug in the intra UV superblock code where it used the wrong transform size for txsz > 8x8, and stores the txsz selection for superblocks properly (instead of forgetting it). Lastly, it removes the trellis search that was done for 16x16 intra predictors, since trellis is relatively expensive and should thus only be done after RD mode selection. Gives basically identical results on derf (+0.009%). Change-Id: If4485c6f0a0fe4038b3172f7a238477c35a6f8d3	2013-04-10 16:50:30 -07:00
Ronald S. Bultje	1932828d19	Merge "Make SB coding size-independent." into experimental	2013-04-10 08:51:58 -07:00
Ronald S. Bultje	a3874850dd	Make SB coding size-independent. Merge sb32x32 and sb64x64 functions; allow for rectangular sizes. Code gives identical encoder results before and after. There are a few macros for rectangular block sizes under the sbsegment experiment; this experiment is not yet functional and should not yet be used. Change-Id: I71f93b5d2a1596e99a6f01f29c3f0a456694d728	2013-04-09 21:28:27 -07:00
Jingning Han	12bf0796e6	Clamp inferred motion vectors only Clamp only the motion vectors inferred from neighboring reference macroblocks. The motion vectors obtained through motion search in NEWMV mode are constrained during the search process, which allows a relatively larger referencing region than the inferred mvs. Hence further clamping the best mv provided by the motion search may affect the efficacy of NEWMV mode. Synchronized the decoding process. The decoded mvs in NEWMV modes should be guaranteed to fit in the effective range. Put a mv range clamping function there for security purposes. This improves the coding performance of high motion sequences, e.g., derf set: foreman 0.233% husky 0.175% icd 0.135% mother_daughter 0.337% pamphlet 0.561% stdhd set: blue_sky 0.408% city 0.455% Also saw sunflower change by -0.469%. Change-Id: I3fcbba669e56dab779857a8126a91b926e899cb5	2013-04-08 11:37:03 -07:00
John Koleszar	fa135d7b9e	Merge changes Ibbfa68d6,Idb76a0e2 into experimental * changes: Move EOB to per-plane data Move qcoeff, dqcoeff from BLOCKD to per-plane data	2013-04-05 15:56:50 -07:00
Ronald S. Bultje	36c3a67c20	Remove full-pixel-related code. This is a VP8-only feature (part of profile 3) that is unsupported in VP9. Change-Id: I78016eede8d9c834d44d4c517f3e8b8fc2a378b1	2013-04-05 12:50:19 -07:00
John Koleszar	05a79f2fbf	Move EOB to per-plane data Continue migrating data from BLOCKD/MACROBLOCKD to the per-plane structures. Change-Id: Ibbfa68d6da438d32dcbe8df68245ee28b0a2fa2c	2013-04-04 21:30:23 -07:00
John Koleszar	4c05a051ab	Move qcoeff, dqcoeff from BLOCKD to per-plane data Start grouping data per-plane, as part of refactoring to support additional planes, and chroma planes with other-than 4:2:0 subsampling. Change-Id: Idb76a0e23ab239180c818025bae1f36f1608bb23	2013-04-04 16:30:57 -07:00
Deb Mukherjee	73031aaa7d	Bugfix in encode_inter_mb_segment_8x8 Fixes an indexing bug. Looks like the bug has been there for a while. Change-Id: I9fc04b0c30754bcb47366ad94a08112925600c4d	2013-04-04 11:07:19 -07:00
John Koleszar	a417a6e32c	Merge "Removing redundant function arguments." into experimental	2013-04-01 21:09:48 -07:00

1200 Commits