generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	f1559bdeaf	Inlining 16 as a stride for BLOCK_OFFSET macro. Change-Id: I7f23d174eb089e5500f268a10db09648634c1b82	2013-08-09 16:40:05 -07:00
James Zern	f295774d43	vp9_rd_pick_inter_mode_sb: fix uninitialized value 'skippable' can remain unset and negatively affect later decisions address one aspect of issue #599 Change-Id: Iffdf0ac2e49ac481c27dc27c87fa546d4167bb28	2013-08-09 16:26:22 -07:00
Deb Mukherjee	2158909fc3	Merge "Adds a new subpel motion function"	2013-08-08 12:26:55 -07:00
Deb Mukherjee	1ba91a84ad	Adds a new subpel motion function Adds a new subpel motion estimation function that uses a 2-level tree-structured decision tree to eliminate redundant computations. It searches fewer points than iterative search (which can search the same point multiple times) but has roughly the same quality. This is made the default setting at speeds 0 and 1, while at speed 2 and above only a 1-level search is used. Also includes various cleanups for consistency and redundancy removal. Results: derf: +0.012% psnr; stdhd: +0.09% psnr; speedup of about 2-3%. Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9	2013-08-08 11:41:49 -07:00
Dmitry Kovalev	8db2675b97	Adding ss_size_lookup table. Removing the old one bsize_from_dim_lookup. Now we have a way to determine block size for plane using its subsampling values (ss_size_lookup). And then we can find the number of pixels in the block (num_pels_log2_lookup). Change-Id: I6fc981da2ae093de81741d3d78eaefed11015db9	2013-08-07 15:33:17 -07:00
Deb Mukherjee	71b43b0ff0	Clean ups of the subpel search functions Removes some unused code and speed features, and organizes the interfaces for fractional mv step functions for use in new speed features to come. In the process a new speed feature - number of iterations per step during the subpel search - is exposed. No change when this parameter is set as the original value of 3. Results: subpel_iters_per_step = 3: baseline; subpel_iters_per_step = 2: psnr -0.067%, 1% speedup; subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup. Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8	2013-08-06 17:23:50 -07:00
Deb Mukherjee	fac7c8c9f9	Merge "Flexible support for various pattern searches"	2013-08-06 14:03:27 -07:00
Deb Mukherjee	15b5a6a2c7	Flexible support for various pattern searches Adds a few pattern searches to achieve various tradeoffs between motion estimation complexity and performance. The search framework is unified across these searches so that a common pattern search function is used for all. Besides, it will be easier to experiment with various patterns or combinations thereof at different scales in the future. The new pattern search is multi-scale and is capable of using different patterns at different scales. The new hex search uses 8 points at the smallest scale and 6 points at other scales. Two other pattern searches - big-diamond and square - are also added. Big diamond uses 4 points at the smallest scale and 8 points in diamond shape at the larger scales. Square is very similar conceptually to the default n-step search but is somewhat faster since it keeps only one survivor across all scales. Psnr/speed-up results on derf300: hex: -1.6% psnr, 6-8% speedup; big-diamond: -0.96% psnr, 4-5% speedup; square: -0.93% psnr, 4-5% speedup. Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2	2013-08-06 11:56:39 -07:00
Dmitry Kovalev	0c80065694	Inlining vp9_get_pred_probs_switchable_interp function. There was no benefit having this function. For example, inside read_switchable_filter_type switchable filter context was calculated twice. Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1	2013-08-06 11:04:31 -07:00
Dmitry Kovalev	3e51acafec	Merge "Finally removing all old block size constants."	2013-08-06 10:30:37 -07:00
Dmitry Kovalev	4a692e4168	Merge "Changing the order switchable filter enum constants."	2013-08-06 10:30:26 -07:00
Dmitry Kovalev	25b7dc08cd	Merge "Removing unused functions."	2013-08-06 10:29:57 -07:00
Deb Mukherjee	33afddadb9	Merge "Add variance based mode/skipping"	2013-08-06 10:19:15 -07:00
Dmitry Kovalev	b9c7d04e95	Finally removing all old block size constants. Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d	2013-08-05 15:23:49 -07:00
Deb Mukherjee	8b3faccb9e	Add variance based mode/skipping Adds a speed feature to skip all intra modes other than DC_PRED if the source variance is small. This feature is made part of speed 1 and up. Results on derf300: psnr -0.07%, speedup about 1-2% Also uses the source variance to fine-tune the early termination criteria when FLAG_EARLY_TERMINATE is on. This feature is made part of speed 2 and up. Results on derf300: psnr -0.52%, speedup about 5-7% Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232	2013-08-05 14:14:01 -07:00
Jim Bankoski	9f988a2edf	Merge "cleanups after bw bh code"	2013-08-05 14:02:02 -07:00
Dmitry Kovalev	3f611555d7	Changing the order switchable filter enum constants. This changeset allows removing the vp9_switchable_interp and vp9_switchable_interp_map arrays and makes the code much clearer. We still have to use these mappings, but only inside the read_interp_filter_type and write_interp_filter_type functions. Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50	2013-08-05 12:26:15 -07:00
Jim Bankoski	5d2cb7ead0	cleanups after bw bh code Const-qualified bw/bh params that should have been const. Additional formatting. Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16	2013-08-05 12:15:52 -07:00
Dmitry Kovalev	d007446b3f	Replacing long block size enum values with shorter ones (2). Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b	2013-08-05 10:51:02 -07:00
Dmitry Kovalev	fe2a201eb1	Replacing "txfm" with "tx" in identifiers. Consistent names with TX_SIZE, TX_TYPE, and TX_MODE. Change-Id: I79592218bf5a40ace89197a34a06ee7de581ed8d	2013-08-02 17:28:23 -07:00
Dmitry Kovalev	fec4ec4edd	Removing unused functions. Removed functions: model_rd_for_sb_y, block_error_sby, get_sb_variance Change-Id: Iec458df180caf6f8eac3605773841a4121dd3a8f	2013-08-02 16:41:09 -07:00
Dmitry Kovalev	25b77e2569	Changing function arg type from int_mv* to MV*. Change-Id: Ic878d31df2ce783a2c9a8c4bc9ed301ec8ffe25e	2013-08-02 15:26:32 -07:00
Adrian Grange	60ff123536	Merge "Fixed typos and added a few explanatory comments"	2013-08-02 11:37:47 -07:00
Adrian Grange	075b11f004	Merge "Changed name of rd_pick_intra4x4mby_modes"	2013-08-02 11:36:46 -07:00
Dmitry Kovalev	741537f3ce	Cleanup: replacing xd->seg with seg, and xd->lf with lf. Change-Id: I73b59d7699a8e7e7acd3bf8041cb6c98ce9ba4bf	2013-08-01 15:38:16 -07:00
Dmitry Kovalev	ce8dedc353	Cleanup: removing unused function arguments. Change-Id: I27471768980fc631916069f24bc7c482a5c9ca17	2013-08-01 13:41:38 -07:00
Dmitry Kovalev	b621e2d72e	Nice looking motion vector clamping functions. Removing assign_and_clamp_mv function, making implementation of clamp_mv and clamp_mv2 more clear and consistent. Change-Id: Iecd08e1c1bf0379f8314ebe01811f8253f4ade58	2013-08-01 13:40:26 -07:00
Adrian Grange	89e73c63c0	Fixed typos and added a few explanatory comments Change-Id: Ib4e4b41094b54874ee34343dd77c0c131ceed9d2	2013-08-01 09:23:49 -07:00
Adrian Grange	5271d47892	Changed name of rd_pick_intra4x4mby_modes The function name rd_pick_intra4x4mby_modes is confusing, so I changed it to rd_pick_intra_sub_8x8_y_modes to better reflect what the function does. Also added const qualifiers to some of the input parameters and removed camel-case. Change-Id: I23d53d4c7af5d79ed8a471acd59a09bbb47add39	2013-08-01 09:23:49 -07:00
Dmitry Kovalev	9239e96536	Removing get_mi_{row, col} functions. Passing mi_row and mi_col parameters to functions explicitly. Removing unused xd argument from scale_mv function. Change-Id: Icb4c495ec72d26fb066c14470d3ae0b741fbf18a	2013-07-31 14:06:55 -07:00
Dmitry Kovalev	500ade243a	Removing unused "ishp" arguments. Using different variable names "allow_hp" and "use_hp" instead of "usehp". Change-Id: I0cd5996ddeb46bd754473b680a993c0aaf8eb879	2013-07-31 11:27:53 -07:00
Adrian Grange	fbd73648dd	Merge "Cleanup typos, remove unnecessary lines, replace switch"	2013-07-30 12:59:46 -07:00
Adrian Grange	b30a06b930	Cleanup typos, remove unnecessary lines, replace switch Removed unnecessary code lines, replaced switch with an if, fixed spelling errors and formatting. Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6	2013-07-30 12:10:32 -07:00
Dmitry Kovalev	730a34416f	Renaming NB_TXFM_MODES constant to TX_MODES. Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039	2013-07-29 13:38:40 -07:00
Dmitry Kovalev	23391ea835	Renaming TX_SIZE_MAX_SB to TX_SIZES. Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107	2013-07-29 12:25:34 -07:00
Ronald S. Bultje	118ccdcd30	Inverse dimension order in token_cost array. This allows us to increment the position at the band-level only as we go from one band to the next; more importantly, that allows us to use an add instead of multiply instruction, and omit the instruction altogether if the band doesn't change from one coef to the next, thus being slightly faster (probably more noticeable on systems where a multiply is expensive, like arm). Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381	2013-07-26 17:30:04 -07:00
Ronald S. Bultje	dcacce6dd9	Merge "Save pixels instead of coefficients in intra4x4 RD loop."	2013-07-26 17:20:58 -07:00
Ronald S. Bultje	d30c8f41ef	Merge "Add best_rd breakout in intra4x4 RD loop."	2013-07-26 17:20:51 -07:00
Dmitry Kovalev	c09b81719f	Merge "General cleanups."	2013-07-26 13:59:39 -07:00
Yunqing Wang	52256cdbca	Modify static threshold calculation Used 3 * standard_deviation in internal threshold calculation instead of fit curve. This actually approached the algorithm better. For comparison, similar tests were done: The overall psnr loss is less than before. 1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%; 2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%; Similar speedup is achieved. For example (clip | bitrate | static-thresh | psnr | time): akiyo(cif) | 500 | 0 | 48.952 | 5.077s(50f); akiyo | 500 | 500 | 48.866 | 4.169s(50f); parkjoy(1080p) | 4000 | 0 | 30.388 | 78.20s(30f); parkjoy | 4000 | 500 | 30.367 | 70.85s(30f); sunflower(1080p) | 4000 | 0 | 44.402 | 74.55s(30f); sunflower | 4000 | 500 | 44.414 | 68.69s(30f). Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3	2013-07-25 19:59:33 -07:00
Yunqing Wang	845fd5011c	Merge "Add encoding option --static-thresh"	2013-07-25 14:58:00 -07:00
Yunqing Wang	d36852b702	Add encoding option --static-thresh This option exists in VP8, and it was rewritten in VP9 to support skipping on different partition levels. After prediction is done, we can check if the residuals in the partition block will be all quantized to 0. If this is true, the skip flag is set, and only prediction data are needed in reconstruction. Based on DCT's energy conservation property, the skipping check can be estimated in spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values, and also adjusted by partition sizes. To be precise, the DC and AC parts for Y, U, and V planes are checked to decide skipping or not. Tests showed that 1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%; 2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%; For different clips, the encoding speedup ranges from a few percent to 20+% when static-thresh <= 500. For example (clip | bitrate | static-thresh | psnr | time): akiyo(cif) | 500 | 0 | 48.923 | 5.635s(50f); akiyo | 500 | 500 | 48.863 | 4.402s(50f); parkjoy(1080p) | 4000 | 0 | 30.380 | 77.54s(30f); parkjoy | 4000 | 500 | 30.384 | 69.59s(30f); sunflower(1080p) | 4000 | 0 | 44.461 | 85.2s(30f); sunflower | 4000 | 500 | 44.418 | 78.1s(30f). Higher static-thresh values give larger speedup with larger quality loss. Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53	2013-07-25 14:28:05 -07:00
Dmitry Kovalev	7131cb0e3d	General cleanups. Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from .h to .c. Merging for loops for motion vectors. Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13	2013-07-25 14:13:48 -07:00
Adrian Grange	e862c6f9eb	Merge "Simplify handling of sub-partition motion vectors"	2013-07-25 12:58:38 -07:00
Adrian Grange	6f0f0e4907	Merge "Use local variables rather than structure members"	2013-07-25 12:57:52 -07:00
Adrian Grange	be700e140a	Simplify handling of sub-partition motion vectors Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition. Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9	2013-07-25 11:51:51 -07:00
Dmitry Kovalev	fcc34796d2	Removing CONFIG_BALANCED_COEFTREE experiment. Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4	2013-07-24 15:53:42 -07:00
Dmitry Kovalev	9139ee0908	Adding condition inside get_tx_type_{4x4, 8x8, 16x16}. Adding plane type check condition because it was always used outside of get_tx_type_{4x4, 8x8, 16x16}. Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230	2013-07-24 12:55:45 -07:00
Adrian Grange	4cfd36d8fd	Use local variables rather than structure members Although local copies of the mode member variables (mode, ref_frame) were made, they were not used in all places. Also, made a local copy of the second_ref_frame member. Change-Id: I84d8c822e5cb3d8a02fc3de8a4037ca3fea8bfad	2013-07-24 11:17:44 -07:00
Ronald S. Bultje	7817d3221f	Save pixels instead of coefficients in intra4x4 RD loop. Prevents doing duplicate IDCTs; encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.0 to 1min3.5, i.e. 0.87% faster overall. Change-Id: I2df39e29ed9d5ea5e7d2704a34940ba622832ddd	2013-07-24 09:03:20 -07:00
Ronald S. Bultje	b72ecbb1b9	Add best_rd breakout in intra4x4 RD loop. Encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min5.4 to 1min4.0, i.e. 2.2% faster overall. Change-Id: I8c32f2aff9a649ce7dd49d910dc5ba16b99c3bc6	2013-07-24 09:02:05 -07:00
Ronald S. Bultje	47336afd8d	Merge "More optimizations for cost_coeffs()."	2013-07-23 21:36:12 -07:00
Dmitry Kovalev	db7f5d28b9	Removing vp9_is_interpolating_filter array. All filters are interpolating now, so we don't need this array, all values from this array are evaluated to true. Change-Id: I9af6d8219ae0eb984063cd15e4e2296374ae4961	2013-07-23 14:24:39 -07:00
Dmitry Kovalev	2855d8aea1	Merge "Adding update_tx_counts function."	2013-07-23 13:57:59 -07:00
James Zern	8dede954c7	Merge "vp9: make some static tables const"	2013-07-23 11:37:01 -07:00
Jim Bankoski	86a9dec73c	clean up bw, bh many structures use bw and bh and they have different meanings. This cl attempts to start this clean up and remove unnecessary 2 step look up log and then shift operations... also removed partition type multiple operation code in bitstream.c. Change-Id: I7e03e552bdfc0939738e430862e3073d30fdd5db	2013-07-23 06:51:44 -07:00
Paul Wilkins	7c134bc0cd	Merge "Reworked the auto_mv_step_size speed feature"	2013-07-23 04:49:55 -07:00
James Zern	3c8cce353f	vp9: make some static tables const Change-Id: I8bcae51271673da8755c66a51aea005dfe6a3739	2013-07-22 19:19:13 -07:00
Ronald S. Bultje	e20fcd9585	More optimizations for cost_coeffs(). 4x4: 163 -> 123 cycles (33% faster); 8x8: 491 -> 399 cycles (23% faster); 16x16: 1889 -> 1763 cycles (7% faster); 32x32: 8311 -> 8180 cycles (1.6% faster). Overall encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.33 to 1min3.00, i.e. 2.11% faster. Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f	2013-07-22 16:09:09 -07:00
Dmitry Kovalev	b2fc6fa969	Adding update_tx_counts function. Moving common encoder/decoder code to update_tx_counts. Also renaming vp9_get_pred_probs_tx_size to get_tx_probs2 and adding get_tx_probs to call vp9_get_pred_context_tx_size inside read_selected_tx_size only once (twice before). Change-Id: Ia50247f3893de88ef8e9041b0d44be44a40aaa4d	2013-07-22 14:57:43 -07:00
Yaowu Xu	fc186dcad6	fix a build error Change-Id: I3b05687f439ff6a7c426d2c97a6c58c831fa51ac	2013-07-22 12:37:30 -07:00
Jingning Han	416f315e82	Merge "Skip buffer update in sub8x8 rd loop"	2013-07-22 12:08:22 -07:00
Jingning Han	a5a9f5f7f3	Merge "Optimize operation flow in sub8x8 rd loop"	2013-07-22 12:08:15 -07:00
Jingning Han	409e77f2d4	Optimize operation flow in sub8x8 rd loop Stack the rate-distortion statistics in the sub8x8 rd loop. This allows the encoder to skip the forward transform, quantization, and coeff cost estimation, in the sub8x8 rd optimization search, if the motion vector(s) are of integer pixel value, and have been tested in the previous prediction filter type rd loops of the same block. This gives about 2% speed-up for bus_cif at 2000 kbps, for speed 0. Its efficacy depends on how frequently the motion search will select an integer motion vector. Change-Id: Iee15d4283ad4adea05522c1d40b198b127e6dd97	2013-07-22 10:40:33 -07:00
Paul Wilkins	1d189d6464	Re-order mode search in rd. Mode search order in rd loop changed to better reflect observed hit counts. Also some adjustment of the baseline mode rd thresholds to reflect the order change and observed frequencies. Change-Id: I47a131cc83e11551df8add6d6d8d413d78d3a63c	2013-07-22 17:21:12 +01:00
Dmitry Kovalev	39342db138	Merge "Consistent names for inter mode probabilities and encodings."	2013-07-20 22:40:51 -07:00
Dmitry Kovalev	f66821afbb	Merge "Removing frame_type field from MACROBLOCKD struct."	2013-07-20 22:40:06 -07:00
Jingning Han	c725502bf3	Skip buffer update in sub8x8 rd loop This commit allows the encoder to skip a few buffer update steps in rd_pick_best_mbsegmentation, when early breakout has been triggered in the rd_check_segment_txsize. It provides about 1% speed-up for bus_cif at 2000 kbps, in the settings of speed 0. Change-Id: Ica034f10a24dec572b397d8389a2b81020ebc0b9	2013-07-20 21:38:12 -07:00
Deb Mukherjee	302698fb12	Reworked the auto_mv_step_size speed feature This patch modifies the auto_mv_step_size speed feature to use a combination of the maximum magnitude mv from the last inter frame, and the maximum magnitude mv for the two reference mvs with the same reference. For arf frames, the max mv step for the resolution is used. The bounds therefore are slightly tighter. The feature is made a speed 1 feature. Rebased. Results (when this feature is turned on over speed 0): derfraw300: -0.046% psnr, about 5+% speedup (tested on football: goes from 4m30.760s to 4m17.410s). Change-Id: If492797a61b0b4b3e58c0b8f86afb880165fc9f6	2013-07-19 15:12:56 -07:00
Dmitry Kovalev	e71a4a77bb	Merge "Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE)."	2013-07-19 12:14:32 -07:00
Dmitry Kovalev	97e96bc4e9	Removing frame_type field from MACROBLOCKD struct. Change-Id: Ia4e83913251c1cdc7aa2abd64bf01ecb1a962119	2013-07-19 11:55:36 -07:00
Dmitry Kovalev	c0eb57406c	Renaming TXFM_MODE to TX_MODE (like TX_SIZE, TX_TYPE). Moving TX_MODE enum to vp9_enums.h. Renaming txfm_mode variables to tx_mode. Change-Id: I459d1af6dd928ce7fccdf8ce30b6f1ca057bef92	2013-07-19 11:37:13 -07:00
Dmitry Kovalev	afe43d4089	Removing redundant VP9_COMMON* from function signatures. Functions: vp9_get_pred_context_switchable_interp, vp9_get_pred_context_intra_inter, vp9_get_pred_context_single_ref_p1, vp9_get_pred_context_single_ref_p2. Change-Id: I3d6fb8aee23c9062270768e1e6da416dd9bb8f96	2013-07-19 11:20:49 -07:00
Dmitry Kovalev	bc7acb134b	Consistent names for inter mode probabilities and encodings. Renaming vp9_sb_mv_ref_tree to vp9_inter_mode_tree, and vp9_sb_mv_ref_encoding_array to vp9_inter_mode_encodings. Change-Id: I0e91fbf81350d3ec5a2599064c74089b5d06133a	2013-07-19 10:40:04 -07:00
Yaowu Xu	37d901a47a	Merge "Add best_rd breakout to keyframe partition selection also."	2013-07-18 17:50:39 -07:00
Yaowu Xu	67fb0679ee	Merge "Merge scale_factors and scale_factors_uv."	2013-07-18 17:50:34 -07:00
Yaowu Xu	55b52e32da	Merge "Do in-place UV intra mode selection."	2013-07-18 17:50:07 -07:00
Yaowu Xu	51972d1279	Merge "Change break statement in a 2d loop to a return statement."	2013-07-18 17:49:58 -07:00
Dmitry Kovalev	92f4198d52	Merge "Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT)."	2013-07-18 17:29:05 -07:00
Dmitry Kovalev	0b562b2d3d	Using VP9_REF_NO_SCALE instead of (1 << VP9_REF_SCALE_SHIFT). Change-Id: Ide58a74d31ff948319445a6337d2c05e98720e34	2013-07-18 15:12:46 -07:00
Ronald S. Bultje	96e4db2660	Add best_rd breakout to keyframe partition selection also. Change-Id: I96b8058f6dfecf8aa3e152cdcbfd7e10071fbbc9	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	5ebe503f04	Merge scale_factors and scale_factors_uv. This prevents a duplicate memcpy of a 128-byte struct every time set_scale_factors() is called (which is a lot), thus leading to a decrease from 3.7 MB to 1.85 MB of struct copying per 64x64 block RD/partition loop. Overall, this decreases encoding time of the first 50 frames of bus @ 1500kbps (speed 0) from 1min5.9 to 1min4.9, i.e. about a 1.5% overall speedup. We can likely get more gains by removing the copy of the other struct (and replacing it with an indexing) as well. Change-Id: I3dceb7e79f71e6fe911b11cc994cf89a869dde7a	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	df4b4fab26	Do in-place UV intra mode selection. This means we only do UV intra mode selection if we find any intra mode to actually be useful at all; in addition, we only do UV intra mode selection for the transform sizes that were selected, rather than all sizes available in this partition. First 50 frames of bus @ 1500kbps (speed 0) gains about 5% with this change. Change-Id: I7b461eb8b803247f57896c5a9505f745b55502b3	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	e54a5782b9	Change break statement in a 2d loop to a return statement. The break statement only breaks out of the nested loop, not the top-level loop, so it doesn't always work as intended. Changing it to a return statement does what's intended. Change-Id: I585419823b39a04ec8826b1c8a216099b1728ba7	2013-07-18 14:10:56 -07:00
Ronald S. Bultje	2d4929e340	Remove motion vectors from PARTITION_INFO. The same information already exists in union b_mode_info. Change-Id: Iac5086b99a3c3cc270380138062bb693e58f9e6d	2013-07-18 14:10:52 -07:00
Ronald S. Bultje	9da67da04a	Merge "Fix bug where we don't choose any mode in RD selection."	2013-07-18 12:47:50 -07:00
Ronald S. Bultje	247197d57b	Fix bug where we don't choose any mode in RD selection. This could happen during golden overlay frame coding from a previous alt-ref frame if the special overlay code was triggered. Change-Id: I3056d0c547cd26903b260ef93c94026e96bd9868	2013-07-18 12:13:15 -07:00
Ronald S. Bultje	4f5815290c	Merge "Fix bug which skips zeromv even if near/nearest is not 0,0."	2013-07-18 10:06:51 -07:00
Ronald S. Bultje	deb7456058	Fix bug which skips zeromv even if near/nearest is not 0,0. Change-Id: Id4f454831f3f11099f39c30246adeaa52857d08d	2013-07-18 09:35:19 -07:00
Jingning Han	ced3c20165	Use mv_check_bounds in sub8x8 rd loop Make the use of mv_check_bounds consistent for mvs of both ref_frame[0] and ref_frame[1]. Change-Id: I1ca24865cc7232ca9cbe5db566c53abad1592211	2013-07-17 17:13:51 -07:00
Ronald S. Bultje	facecd80da	Merge "Add a best_yrd shortcut in splitmv mode search."	2013-07-17 16:11:13 -07:00
Ronald S. Bultje	056111c822	Merge "Skip redundant nearest/near/zero encodes in splitmv."	2013-07-17 16:10:51 -07:00
Ronald S. Bultje	0b1eba25b2	Merge "Skip nearest/near/zero redundant encodes."	2013-07-17 16:10:41 -07:00
Ronald S. Bultje	607424449c	Merge "Best_rd breakout in rd partition search."	2013-07-17 16:10:22 -07:00
Yaowu Xu	6ac5b7db2c	Merge "changed mode checking order"	2013-07-17 14:44:40 -07:00
Dmitry Kovalev	a7a1e96136	Merge changes Ieffea49e,Idf610746 * changes: Removing two unused arguments from vp9_inc_mv signature. Changing signature of vp9_get_pred_probs_tx_size.	2013-07-17 14:44:20 -07:00
Ronald S. Bultje	c6917528a5	Add a best_yrd shortcut in splitmv mode search. Encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min6.2 to 1min5.9, i.e. 0.5% faster overall. Change-Id: I59d8a3b2f0a75010fa041d5e2646c8caac5bd683	2013-07-17 14:21:57 -07:00
Ronald S. Bultje	161c995658	Skip redundant nearest/near/zero encodes in splitmv. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min7.3 to 1min6.2, i.e. 1.7% faster overall. Change-Id: I19d2deacfbffadd61d32551cee9586757ab4a987	2013-07-17 13:53:48 -07:00
Yaowu Xu	42facc292d	changed mode checking order Change-Id: Ic4c4b363ed840935e42f495f13ea5e601a56f1b2	2013-07-17 13:43:50 -07:00
Ronald S. Bultje	8fea880b6f	Skip nearest/near/zero redundant encodes. Encode of first 50 frames of bus @ 1500kbps (speed 0) goes from 1min12.8 to 1min7.3, i.e. 8% faster. Change-Id: Ia22d1c7b687316c553cc60eacae988b24e175b62	2013-07-17 11:33:15 -07:00
Ronald S. Bultje	9f427bfe98	Best_rd breakout in rd partition search. About 15% faster for bus (speed 0) first 50 frames @ 1500kbps, which goes from 1min36 to 1min24. Results become slightly better (+0.2% on derf/yt, +0.4% on hd), probably because of a bugfix for skipmode in super_block_yrd(). Overall speed change (on derfraw300) is roughly -13%. This can probably be improved further by caching best_yrd between partition searches. Also, we might be able to get more speedups by always doing PARTITION_NONE before PARTITIONS_SPLIT, not just at the sb8x8 level. Change-Id: I83736949ebd5b4a3b400ee688d7661913fefc98b	2013-07-17 09:56:46 -07:00
Ronald S. Bultje	83c7e13a6b	Do a skip-block check for sub8x8 partitions also. +0.2% SSIM and glbPSNR on derfraw300. Change-Id: I9cba0bca55e606a22f557c7732b064f738efe84d	2013-07-17 09:46:47 -07:00
Yunqing Wang	df90d58f4f	Speed up motion estimation using small partitions' result (experiment) Current partition checking starts from small sizes, and then goes up to large sizes. This experiment uses the small partitions' motion estimation result, which is already available, to speed up the large partition's motion estimation. We can decide to skip some partition checks if they are unlikely choices. We could use the motion vector (MV) result as current partition's prediction MV, limit the search range and reference frame. Current result at speed 1: psnr loss: 1.19% for stdhd, 0.287% for derf. speed gain: 14% for sunflower(hd), 11% for akiyo. Further improvement will be done later. Change-Id: I5abfd070e9cace2e91e2a0247d1325df313887ab	2013-07-17 09:11:47 -07:00
Paul Wilkins	d66eab15dd	Merge "Move uv intra mode selection in rd loop."	2013-07-17 05:19:26 -07:00
Paul Wilkins	154c34a3ee	Merge "Limit transform sizes searched for uv intra."	2013-07-17 03:40:11 -07:00
Paul Wilkins	2ee338ce3b	Move uv intra mode selection in rd loop. Use an estimate based on DC_PRED for intra uv cost within the rd loop then only do a full uv mode analysis if an intra mode is chosen. Significant speed gains in some cases. Currently only enabled for speed 2 pending speed/quality tests. Change-Id: Ie851a12400d5483bce47ec0e3ccb8516041e91c0	2013-07-17 11:11:21 +01:00
Paul Wilkins	6c667f0ffe	Limit transform sizes searched for uv intra. Apply limit if search_method == USE_LARGESTALL to the range of UV tx sizes searched. Change-Id: I6db29f0dd237285ffc50d75a37e8b68151ad821c	2013-07-17 11:08:55 +01:00
Paul Wilkins	5f4722c75f	Merge "Minor cleanup in code to find uv tx_size."	2013-07-17 02:50:09 -07:00
Jingning Han	a142d6fc93	Skip redundant motion search in 4x4 level rd loop This commit makes the encoder perform motion search only once per reference frame type for each 4x4/4x8/8x4 block. For bus_cif at 2000 kbps, the runtime goes from 253812ms -> 217817ms (14% speed-up) for speed 0. Change-Id: I5f17599ccc8cfaf93ccb4f98fcb6008af6d79e92	2013-07-16 17:21:11 -07:00
Dmitry Kovalev	5b65a71cdc	Changing signature of vp9_get_pred_probs_tx_size. Removing VP9_COMMON* argument and adding struct tx_probs* instead of MACROBLOCKD*. Change-Id: Idf61074631a90ec51eac22c8dcd977f44ac0757c	2013-07-16 16:34:54 -07:00
Paul Wilkins	30d2ea45ce	Minor cleanup in code to find uv tx_size. Change-Id: I94b97a966b5efbc9a243048f1f5ddbbdc4b1846e	2013-07-16 18:27:33 +01:00
Dmitry Kovalev	ca75f1255f	Removing and moving around constant definitions. Removing unused and duplicated constants, moving them from .h to .c if possible. Change-Id: Ief4d6b984a3ca2e9b38504f0d855ed072cf7133f	2013-07-15 19:26:30 -07:00
Jingning Han	faff6ed0fb	Skip duplicate block encoding in the rd loop This speed feature allows the encoder to largely remove the spatial dependency between blocks inside a 64x64 superblock, thereby removing the need to repeatedly encode superblocks per partition type in the rate-distortion optimization loop. A major challenge lies in the intra modes tested in the rate-distortion optimization loop. The subsequent blocks do not have access to the reconstructed boundary pixels without the intermediate coding steps. This was resolved by using the original pixels for intra prediction in the rd loop, followed by an appropriately designed distortion modeling on the quantization parameters. Experiments also suggested that the performance impact is more discernible at lower bit-rate/psnr settings. Hence a quantizer dependent threshold is applied to deactivate skip of block coding. For bus_cif at 2000 kbps, speed 0: runtime 269854ms -> 237774ms (12% speed-up) at 0.05dB performance loss. speed 1: runtime 65312ms -> 61536ms, (7% speed-up) at 0.04dB performance loss. This operation is currently turned on in settings of speed 1. Change-Id: Ib689741dfff8dd38365d8c1b92860a3e176f56ec	2013-07-15 11:08:58 -07:00
Yaowu Xu	fb754b182f	Fix a build issue Change-Id: I23a75c495ed7ea917d7f312bef0990e20a6b53d9	2013-07-12 11:38:44 -07:00
Deb Mukherjee	94c481f9f1	Some minor cleanups for efficiency Implements some of the helper functions more efficiently with lookups rather than branches. Modeling function is consolidated to reduce some computations. Also merged the two enums BLOCK_SIZE_TYPES and BlockSize into one because there is no need to keep them separate (even though the semantics are a little different). No bitstream or output change. About 0.5% speedup Change-Id: I7d71a66e8031ddb340744dc493f22976052b8f9f	2013-07-12 10:22:56 -07:00
Ronald S. Bultje	ee09dd9949	Remove unused function block_error(). Change-Id: I78a79fc51c2d7cc3c261f35b569155397f3dc0c4	2013-07-11 17:14:03 -07:00
Dmitry Kovalev	8c05e59065	Calling is_inter_mode() instead of custom code. Change-Id: Iccd4ab95ea51a6d57ed43947f2fd7ad92e8979cf	2013-07-11 14:14:47 -07:00
Dmitry Kovalev	c4ad3273c7	Moving segmentation related vars into separate struct. Adding segmentation struct to vp9_seg_common.h. Struct members are from macroblockd and VP9Common structs. Moving segmentation related constants and enums to vp9_seg_common.h. Change-Id: I23fabc33f11a359249f5f80d161daf569d02ec03	2013-07-11 11:57:57 -07:00
Jingning Han	18803f9cc4	Fix tx_type bug in intra4x4 rd loop This commit fixed the mis-use of the tx_type for inverse transform in intra4x4 rate-distortion optimization loop. It improves the overall coding performance. Change-Id: I7fe9953175b74890357dbcee33c138573766e980	2013-07-10 15:49:49 -07:00
Deb Mukherjee	7494bba66b	Merge "Prunes out full-rd computation based on modeled rd"	2013-07-10 15:37:11 -07:00
Jim Bankoski	865ca76604	Merge "remove warnings when NDEBUG is set"	2013-07-10 14:39:39 -07:00
Jim Bankoski	6591cf2f7e	remove warnings when NDEBUG is set Change-Id: Ie0cb732fdcb98616a422c4463bff80642248d136	2013-07-10 14:27:20 -07:00
Deb Mukherjee	53ff43adc3	Prunes out full-rd computation based on modeled rd Adds a speed feature to eliminate full-rd computation if the modeled rd or rd based on a different parameter in the same mode is already a lot larger than the best rd yet. Specifically, only search the sharp and smooth filters if the modeled rd cost based on the regular filter is within a certain factor of the best rd cost so far. Also, skip full-rd computation of non splitmv inter modes if the modeled rd cost based on pred error is within the same factor of the best rd cost so far. Also adds some enhancements in the rd search for splitmv mode to speed things up by early breakouts. Negligible impact on performance. Results on derfraw300: psnr: -0.013% with the splitmv enhancements, -0.24% with the rd breakout feature on. speedup: 6% with splitmv enhancements, 20% with also residual breakout (tested on football sequence at 600 Kbps) Change-Id: I37abc308ea9f110c1679ce649b6a7e73ab1ad5fc	2013-07-10 13:49:49 -07:00
Yaowu Xu	e52eec490c	Merge "Add a feature to reduce chroma intra mode search"	2013-07-10 11:35:47 -07:00
Ronald S. Bultje	b1df674a99	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: I9b25e87974430cb942caa276410bb2eda815bd83	2013-07-10 09:27:56 -07:00
Yaowu Xu	bed27a960a	Add a feature to reduce chroma intra mode search Change-Id: I721ebdeef2b53ce3e5c3eba3f7462ae2103c95a8	2013-07-10 08:59:18 -07:00
Jim Bankoski	fb027a7658	removing case statements around prediction entropy coding Removes SEG_ID Removes MBSKIP Removes SWITCHABLE_INTERP Removes INTRA_INTER Removes COMP_INTER_INTER Removes COMP_REF_P Removes SINGLE_REF_P1 Removes SINGLE_REF_P2 Removes TX_SIZE Change-Id: Ie4520ae1f65c8cac312432c0616cc80dea5bf34b	2013-07-09 20:10:16 -07:00
Yaowu Xu	205efbc153	Revert "Remove memcpy() in handle_inter_mode() filter selection." This reverts commit `fcf7998a47`. Change-Id: Ic6532223faec9f1483b78adb2e37b79c7b1a0efb	2013-07-09 17:42:10 -07:00
Ronald S. Bultje	204d1b7058	Merge "Unbreak lossless."	2013-07-09 09:54:48 -07:00
Ronald S. Bultje	059c0ba5d4	Unbreak lossless. Change-Id: I8130ec9b5371c65e885f245a5ac73840c23cb4a1	2013-07-09 09:46:37 -07:00
Dmitry Kovalev	1c65c580d6	Merge "Refactoring setup_pre_planes function."	2013-07-08 20:08:05 -07:00
Ronald S. Bultje	8fde07a3ae	Don't recalculate mv_ref costs for each block/partition. Changes cost_mv_ref() into doing a LUT into pre-calculated cost arrays instead. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min11.6 to 2min10.9, i.e. 0.5% faster overall. Change-Id: If186e92c34c201b29cbbc058785a15c9c09e433a	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	fcf7998a47	Remove memcpy() in handle_inter_mode() filter selection. Encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min4.9 to 2min3.1, i.e. a 1.4% speedup overall. Change-Id: Ibe8b08d159797504c5d0c5122de1b6da3b6595e0	2013-07-08 16:22:39 -07:00
Ronald S. Bultje	ed995afba1	Make frame-wide filter-type decision fully RD-based. Overall, on all test sets, this gains about +0.2% on all metrics. City is a clip where this really hurts (-1.0% on all metrics), I'm not quite sure why yet. Maybe interesting to look into in the future. Change-Id: I6f0eecb20e72f0194633270d30bf00d76d9eae78	2013-07-08 16:22:37 -07:00
Deb Mukherjee	d9b62160a0	Implements several heuristics to prune mode search Skips mode searches for intra and compound inter modes depending on the best mode so far and the reference frames. The various heuristics to be used are selected by bits from a flag. The previous direction based intra mode search pruning is also absorbed in this framework. Specifically the flags and their impact are: 1) FLAG_SKIP_INTRA_BESTINTER (skip intra mode search for oblique directional modes and TM_PRED if the best so far is an inter mode) derfraw300: -0.15%, 10% speedup 2) FLAG_SKIP_INTRA_DIRMISMATCH (skip D27, D63, D117 and D153 mode search if the best so far is not one of the closest hor/vert/diagonal directions) derfraw300: -0.05%, about 9% speedup 3) FLAG_SKIP_COMP_BESTINTRA (skip compound prediction mode search if the best so far is an intra mode) derfraw300: -0.06%, about 7-8% speedup 4) FLAG_SKIP_COMP_REFMISMATCH (skip compound prediction search if the best single ref inter mode does not have the same ref as one of the two references being tested in the compound mode) derfraw300: -0.56%, about 10% speedup Change-Id: I1a736cd29b36325489e7af9f32698d6394b2c495	2013-07-08 12:17:12 -07:00
Paul Wilkins	ef0ca2deaa	Merge "Fix to comp_inter_joint_search_thresh feature."	2013-07-04 03:27:00 -07:00
Dmitry Kovalev	f72e072555	Refactoring setup_pre_planes function. Removing set_refs, adding set_ref function. Change-Id: I5635c478b106ae4e57d317f1c83d929644307e63	2013-07-03 17:42:01 -07:00
Jingning Han	68172dbede	Merge "Enable early termination in rd search"	2013-07-03 14:20:41 -07:00
Jingning Han	2bd6fe08f8	Enable early termination in rd search This commit allows the encoder to detect the cumulative rate-distortion cost per transformed block inside a partition. If the cumulative rd cost is already above the best rd value, it terminates the remaining operations and continues to the next prediction mode test. It reduces the runtime of bus at target bit-rate 2000 from 308 seconds to 266 seconds, i.e., about 13% speed-up at no performance penalty. Change-Id: I5f15a3d8955d97031d5653006027866a00654e7a	2013-07-03 12:54:18 -07:00
Paul Wilkins	f58b44ad62	Fix to comp_inter_joint_search_thresh feature. When this is 0 (BLOCK_SIZE_AB4X4) we want to do the inter joint search for all sizes. Change-Id: Id40cd6fe7790e7e1165352b9cef5e12fa8c0bc88	2013-07-03 16:58:34 +01:00
Paul Wilkins	72c5778ec5	Added two new skip experiments. sf->unused_mode_skip_lvl. Tests modes as normal for all sizes at or below the given level. At larger sizes it skips all modes that were not chosen at any smaller size. Hence setting BLOCK_SIZE_SB64X64 is in effect off. Setting BLOCK_SIZE_AB4X4 will only consider modes that were chosen for one or more 4x4 blocks at larger sizes. sf->reference_masking. Do a test encode of the NONE partition at one size and create a reference frame mask based on the best rd choice. In the full search only allow this reference frame. Currently it is testing 64x64 and repeats this in the full search. This does not work well with Jim's Partition code just now and is disabled by default. Change-Id: I8f8c52d2ef4a0c08100150b0ea4155d1aaab93dd	2013-07-03 16:56:06 +01:00
Dmitry Kovalev	be77f6bbbf	Removing redundant struct from union b_mode_info. Change-Id: I08fc6e474ff2c12cfa065bae4989c724276e2c83	2013-07-02 16:51:57 -07:00
Deb Mukherjee	37501d687c	Speed feature to binary search dir intramodes This speed feature will skip searching the directional intra prediction modes D63, D117, D27, D153 if the best intra mode so far is not one of the diagonal, horizontal or vertical directions closest to the respective directions being tested. In other words, this implements a sort of binary search in the angular domain. Speedup: about 9-10% Results: -0.05% only on derfraw300. Change-Id: I413584c41f2a3e8dabfbdeb40718c8fc4b1d63a2	2013-07-02 14:07:19 -07:00
Deb Mukherjee	8d3d2b76f3	Tx size selection enhancements (1) Refines the modeling function and uses that to add some speed features. Specifically, instead of using a flag use_largest_txfm as a speed feature, an enum tx_size_search_method is used, of which two of the types are USE_FULL_RD and USE_LARGESTALL. Two other new types are added: USE_LARGESTINTRA (use largest only for intra) USE_LARGESTINTRA_MODELINTER (use largest for intra, and model for inter) (2) Another change is that the framework for deciding transform type is simplified to use a heuristic count based method rather than an rd based method using txfm_cache. In practice the new method is found to work just as well - with derf only -0.01 down. The new method is more compatible with the new framework where certain rd costs are based on full rd and certain others are based on modeled rd or are not computed. In this patch the existing rd based method is still kept for use in the USE_FULL_RD mode. In the other modes, the count based method is used. However the recommendation is to remove it eventually since the benefit is limited, and will remove a lot of complications in the code (3) Finally a bug is fixed with the existing use_largest_txfm speed feature that causes mismatches when the lossless mode and 4x4 WH transform is forced. Results on derf: USE_FULL_RD: +0.03% (due to change in the tables), 0% encode time reduction USE_LARGESTINTRA: -0.21%, 15% encode time reduction (this one is a pretty good compromise) USE_LARGESTINTRA_MODELINTER: -0.98%, 22% encode time reduction (currently the benefit of modeling is limited for txfm size selection, but keeping this enum as a placeholder). USE_LARGESTALL: -1.05%, 27% encode-time reduction (same as existing use_largest_txfm speed feature). Change-Id: I4d60a5f9ce78fbc90cddf2f97ed91d8bc0d4f936	2013-07-02 13:54:00 -07:00
Ronald S. Bultje	3cc6eb7c00	Merge "Make get_coef_context() branchless."	2013-07-02 11:48:15 -07:00
Jingning Han	b91a1586a3	Calculate rd cost per transformed block Compute the rate-distortion cost per transformed block, and cumulate the cost through all blocks inside a partition. This allows encoder to detect if the cumulative rd cost is already above the best rd cost, thereby enabling early termination in the rate-distortion optimization search. Change-Id: I0a856367a9a7b6dd0b466e7b767f54d5018d09ac	2013-07-02 09:58:46 -07:00
Paul Wilkins	b7cd01ed73	Revert "New motion threshold factor - speed feature." This reverts commit `1377278180`. Also fixes a spelling mistake. Change-Id: I5be8aa4d8d3c0323d4a6f41968a7b2c048949c3f	2013-07-02 15:06:40 +01:00
Ronald S. Bultje	26b6318de8	Make get_coef_context() branchless. This should significantly speedup cost_coeffs(). Basically what the patch does is to make the neighbour arrays padded by one item to prevent an eob check in get_coef_context(), then it populates each col/row scan and left/top edge coefficient with two times the same neighbour - this prevents a single/double context branch in get_coef_context(). Lastly, it populates neighbour arrays in pixel order (rather than scan order), so we don't have to dereference the scantable to get the correct neighbours. Total encoding time of first 50 frames of bus (speed 0) at 1500kbps goes from 2min10.1 to 2min5.3, i.e. a 2.6% overall speed increase. Change-Id: I42bcd2210fd7bec03767ef0e2945a665b851df56	2013-07-01 16:34:10 -07:00
Yaowu Xu	ba3b2604f0	Merge "Quantize (64-bit only, for now) SSSE3 SIMD."	2013-07-01 15:58:57 -07:00
Ronald S. Bultje	7353ceab9d	Quantize (64-bit only, for now) SSSE3 SIMD. Total encoding time for first 50 frames of bus (speed 0) @ 1500kbps goes from 2min34.8 to 2min14.4, i.e. a 10.4% overall speedup. The code is x86-64 only, it needs some minor modifications to be 32bit compatible, because it uses 15 xmm registers, whereas 32bit only has 8. Change-Id: I2df53770c2e850813ffa713e1a91b45b0082b904	2013-07-01 11:36:07 -07:00
Paul Wilkins	1377278180	New motion threshold factor - speed feature. Added a speed feature that focuses only on thresholds for new motion modes. Moved sf->comp_inter_joint_search_thresh into speed 1. This has ~+0.4% impact on quality at speed 0 as our quality reference baseline. Slight adjustment to baseline thresholds. Change-Id: I7ebf104f1fe29af77ed4837b2e84be065621bbe5	2013-07-01 12:11:21 +01:00
Ronald S. Bultje	bc70c60b25	Merge "fixed a bug where sse is not populated"	2013-06-29 07:42:41 -07:00
Yaowu Xu	f853e662b7	fixed a bug where sse is not populated Change-Id: I692d800af1f976c84a76f8bd66864c4b39540abc	2013-06-28 17:10:22 -07:00
Ronald S. Bultje	d00b8e5f82	Inline vp9_get_coef_context() (and remove vp9_ prefix). Makes cost_coeffs() a lot faster: 4x4: 236 -> 181 cycles 8x8: 888 -> 588 cycles 16x16: 3550 -> 2483 cycles 32x32: 17392 -> 12010 cycles Total encode time of first 50 frames of bus (speed 0) @ 1500kbps goes from 2min51.6 to 2min43.9, i.e. 4.7% overall speedup. Change-Id: I16b8d595946393c8dc661599550b3f37f5718896	2013-06-28 10:40:21 -07:00
Ronald S. Bultje	e3ce2b2ab3	Minor change to prevent one level of dereference in cost_coeffs(). 4x4: 234 -> 236 cycles 8x8: 878 -> 888 cycles 16x16: 3664 -> 3550 cycles 32x32: 18134 -> 17392 cycles Change-Id: I37a51bfbb0060a3a54f09c6045c14a989811ed78	2013-06-28 10:29:07 -07:00
Ronald S. Bultje	91d223bd5c	Some minor optimizations for cost_coeffs(). Cycle timings for first 3 frames of bus (speed 0) at 1500kbps: 4x4: 298 -> 234 cycles 8x8: 1227 -> 878 cycles 16x16: 4906 -> 3664 cycles 32x32: 23426 -> 18134 cycles Total encode time of first 50 frames of bus @ 1500kbps (speed 0) goes from 3min0.7 to 2min51.6 seconds, i.e. 5.3% faster. Change-Id: I68a0e1b530b0563b84a67342cca4b45146077e95	2013-06-28 10:29:02 -07:00
Ronald S. Bultje	af660715c0	Make coefficient skip condition an explicit RD choice. This commit replaces zrun_zbin_boost, a method of biasing non-zero coefficients following runs of zero-coefficients to be rounded towards zero, with an explicit skip-block choice in the RD loop. The logic is basically that if individual coefficients should be rounded towards zero (from a RD point of view), the trellis/optimize loop should take care of it. If whole blocks should be zero (from a RD point of view), a single RD check is much more efficient than a complete serialization of the quantization loop. Quality change: derf +0.5% psnr, +1.6% ssim; yt +0.6% psnr, +1.1% ssim. SIMD for quantize will follow in a separate patch. Results for other test sets pending. Change-Id: Ife5fa641163ac5150ac428011e87188f1937c1f4	2013-06-28 10:28:49 -07:00
Yaowu Xu	8b9eea0a34	Minor cleanups Change-Id: I379617c1c731a686b3f7e032b8805860c1055b12	2013-06-28 09:19:50 -07:00
Paul Wilkins	05ffdf2625	Merge "Auto adapt step size feature."	2013-06-27 02:28:41 -07:00
Paul Wilkins	59af9049d3	Merge "Start adaptive threshold for each mode at max."	2013-06-27 02:28:36 -07:00
Paul Wilkins	5bcf069c6b	Merge "Change meaning of cpi->sf.first_step and rename."	2013-06-27 02:28:21 -07:00
Jingning Han	fc1cfd8e32	Merge "Make intra predictor reference buffer configurable"	2013-06-26 19:02:02 -07:00
Jingning Han	861cb06c67	Make intra predictor reference buffer configurable This commit enables configurable reference buffer pointer for intra predictor. This allows later removal of spatial dependency between blocks inside a 64x64 superblock in the rate-distortion optimization loop. Change-Id: I02418c2077efe19adc86e046a6b49364a980f5b1	2013-06-26 17:17:21 -07:00
Paul Wilkins	9f3ab83486	Auto adapt step size feature. Also tweaks to other features and experiments with what is on and off at different speed settings. Change-Id: I3e1d0be0d195216bf17c2ac5df67f34ce0b306b2	2013-06-26 19:48:39 +01:00
Dmitry Kovalev	49dee16879	Merge "Using get_plane_block_{width, height} instead of custom code."	2013-06-26 10:23:27 -07:00
Paul Wilkins	689957e3ad	Start adaptive threshold for each mode at max. Each frame we reset all adaptive thresholds to MAX rather than base. As modes are picked their thresholds drop down. Change-Id: Ia37f03a73003c2d9bfcda57edea07205e9a0e5e8	2013-06-26 17:04:47 +01:00
Paul Wilkins	e606cac046	Change meaning of cpi->sf.first_step and rename. Renamed cpi->sf.first_step to cpi->sf.reduce_first_step_size and changed its meaning such that it is a delta applied to reduce the default first step size (>> x) in the motion search rather than an absolute value. The default first step size is already changed according to the image dimensions (smaller for smaller images). cpi->sf.reduce_first_step_size now applies a further correction from the default. Change-Id: Ia94e08bc24c67b604831f980909af7e982fcd16d	2013-06-26 17:04:06 +01:00
Jingning Han	d19ea3861d	Refactor intra predictor block Remove vp9_intra4x4_predict(). Use the common intra prediction function for all block sizes. Change-Id: Ibd19d51dfa3da8bbdfb79ddeb81530b2e2089560	2013-06-25 16:33:13 -07:00
Dmitry Kovalev	dc0f457c94	Using get_plane_block_{width, height} instead of custom code. Change-Id: I453ed11b965e857a14c18ea5c0f4a0a48e7dc0d9	2013-06-25 14:11:18 -07:00
Dmitry Kovalev	87ee34aacb	Removing unused code. Removing block index (ib) parameter from get_tx_type_{8x8, 16x16} functions. Change-Id: Ia213335aae7a7cb027f97b9cc9b04519840250f1	2013-06-25 10:17:19 -07:00
Dmitry Kovalev	f27f76dfb3	Transforming scale_mv_component_q4 into scale_mv_q4 function. Using MV instead of int_mv for function arguments. Change-Id: Ic25e13dccbc98fac1fa1b3255127e00cca2a57f6	2013-06-21 15:34:29 -07:00
Ronald S. Bultje	54b2a59623	Implement SSE2 block_error. Change vp9_block_error() to return a 64bit error variable, change all callers to expect a 64bit return value (this will prevent overflows, which we basically don't check for at all right now). Remove duplicate block_error() function, which fixed that through truncation. Remove old (incompatible) mmx/sse2 block_error SIMD versions and replace with a new one that returns a 64bit value. Encoding time of first 50 frames of bus @ 1500kbps goes from 3min29 to 3min23, i.e. a 3% overall speedup. Change-Id: Ib71ac5508b5ee8a80f1753cd85d72df1629abe68	2013-06-21 12:54:52 -07:00
Yaowu Xu	ee07a261a0	rename variables to avoid build error in MSVC Change-Id: I7960178c95c54d5c4497e44cfc8c493566294b34	2013-06-20 18:31:48 -07:00
Deb Mukherjee	7947a33d72	Improving model rd with variance and quant step Improves the rd modeling function and implements them using interpolation from a table which is a little faster. Also uses sse as input to the modeling function rather than var - since there is no dc prediction used and as a result the sse works a little better. derfraw300: +0.05% Speedup: ~1% Change-Id: I151353c6451e0e8fe3ae18ab9842f8f67e5151ff	2013-06-20 10:06:28 -07:00
Jim Bankoski	1f94b97694	convert all speed things to speed features Change-Id: Ie24489a4d39f3e53e816eeebf75a1c9c7d94515a	2013-06-20 09:42:44 -07:00
Yaowu Xu	12180c8329	Remove unnecessary copying of probs. Change-Id: Ic924f07c6ab0c929c6cdf11880d3c625806e272c	2013-06-18 23:02:27 -07:00
Deb Mukherjee	4ad96115cd	Some cleanups in rd motion search No bitstream or output change - only cosmetics. Change-Id: Ic8c1d7ad010a87dcf27d12a38cd7dd5adba683a7	2013-06-13 17:25:23 -07:00
Deb Mukherjee	f18328cbf1	Adds a zero check in model_rd function Avoids divide-by-zero when variance is 0. Change-Id: I3c7f526979046ff7d17714ce960fe81d6e1442a0	2013-06-10 17:04:47 -07:00
John Koleszar	717d744a01	Fix use of get_uv_tx_size in loopfilter Change the argument of get_uv_tx_size() to be an MBMI pointer, so that the correct column's MBMI can be passed to the function. Change-Id: Ied6b8ec33b77cdd353119e8fd2d157811815fc98	2013-06-10 11:40:57 -07:00
Paul Wilkins	de6ec27d1a	Rd check on segment level reference mode. Do not allow the rd code to check compound modes if a segment level reference frame is selected. Change-Id: I95f0c57789e0eaceed7caf227e94b4ba3130a06c	2013-06-10 11:03:15 -07:00
Ronald S. Bultje	b12a8dac98	Allow non-zeromv if ref_frame=intra with segmentation skip/ref enabled. Change-Id: Ib5a95bb6ab643b276df3faa9bf99595e4a69ff18	2013-06-10 10:55:10 -07:00
Tero Rintaluoma	86bb6df005	Fixed point reference picture scaling Fixed point scaling factors are calculated once for each reference frame by using integer division. Otherwise fixed point scaling routines are used in all scaling calculations. This makes it possible to calculate fixed point scaling factors on device driver software and pass them to hardware and thus avoid division on hardware. TODO: - Missing check for maximum frame dimensions (currently scaling uses 14 bits) - Missing check for maximum scaling ratio (upscaling 16:1, downscaling 2:1) Problems: - Straightforward fixed point implementation can cause error +-1 compared to integer division (i.e. in x_step_q4). Should only be an issue for frames larger than 16k. Change-Id: I3cf4dabd610a4dc18da3bdb31ae244ebaf5d579c	2013-06-10 08:07:55 -07:00
Deb Mukherjee	21401942b0	Coding tx-size selection by use of spatial context Adds coding of transform size within a frame by use of context of transform sizes selected in left and above blocks. Also incorporates code for generating stats. TODO: generate and incorporate new default stats Change-Id: I6a7af099f6ad61d448521d9a51167aedaf638ed6	2013-06-07 16:07:58 -07:00
Paul Wilkins	340c7a48e6	Change to segment ref frame feature. Simplify feature to only support a single reference frame instead of a mask. Change-Id: I5dd3a98c7a224aafb35708850ab82e2f220e68fb	2013-06-07 21:42:22 +01:00
Deb Mukherjee	3ee1a21a42	Coding updates for tx-size selection Changes to the coding of transform sizes, along with forward and backward probability updates. Results: derf300: +0.241% Context based coding of transform sizes will be in a separate patch. Change-Id: I97241d60a926f014fee2de21fa4446ca56495756	2013-06-07 08:54:00 -07:00
Ronald S. Bultje	6ef805eb9d	Change ref frame coding. Code intra/inter, then comp/single, then the ref frame selection. Use contextualization for all steps. Don't code two past frames in comp pred mode. Change-Id: I4639a78cd5cccb283023265dbcc07898c3e7cf95	2013-06-06 17:28:09 -07:00
Ronald S. Bultje	ad34368786	New intra mode and partitioning probabilities. Split partition probabilities between keyframes and non-keyframes, since they are fairly different. Also have per-blocksize interframe y intramode probabilities, since these vary heavily between different blocksizes. Lastly, replace default probabilities for partitioning and intra modes with new ones generated from current codec. Replace counts with actual probabilities also. Change-Id: I77ca996e25e4a28e03bdbc542f27a3e64ca1234f	2013-06-06 10:45:30 -07:00
Jingning Han	d03e974fbd	Bug fix in rd_pick_inter_mode_sb_ Fix the calculation of step size in height. Change-Id: I0e0c0175f141f5a41214ae51cef233d13942d3c5	2013-06-06 10:04:26 -07:00
Paul Wilkins	26e24b1dd7	Merge "Rd thresholds change with block size." into experimental	2013-06-06 09:27:44 -07:00
Paul Wilkins	02590a5b1b	Merge "Turn off compound inter search refinement for good quality." into experimental	2013-06-06 09:27:31 -07:00
Jim Bankoski	b4c4f64862	signs reverted Change-Id: Ieface458c83eb6e7ee95595d9fc662f372117c9a	2013-06-06 08:59:22 -07:00
Paul Wilkins	c3316c2bc5	Rd thresholds change with block size. Added structures to support independent rd thresholds for different block sizes (and set experimental block size correction factors). Added structure to allow dynamic adaptation of thresholds per mode and per block size basis depending on how often the mode/block size combination is seen (currently fixed factor). Removed some unused variables. TODO - Adaptation of thresholds based on how often each mode is chosen. - The baseline mode values could also be adjusted based on the block size (e.g. for a particular intra mode use a low threshold for 4x4 prediction blocks but a relatively high value for 64x64). Change-Id: Iddee65ff3324ee309815ae7c1c5a8584720e7568	2013-06-06 15:45:53 +01:00
Paul Wilkins	c880e02f97	Turn off compound inter search refinement for good quality. Turn this feature off for some modes in "good" quality. Change-Id: I3f262d62cca8f01736b977af1465291e8be29f0a	2013-06-06 15:44:25 +01:00
Jim Bankoski	5a88271b09	don't tokenize & encode tokens for blocks in UMV This avoids encoding tokens for blocks that are entirely in the UMV border. This changes the bitstream. Change-Id: I32b4df46ac8a990d0c37cee92fd34f8ddd4fb6c9	2013-06-06 06:10:25 -07:00
Jingning Han	61e6586230	Merge "Fix UV intra coding rd loop" into experimental	2013-06-05 21:47:00 -07:00
Jingning Han	f04b15486a	Fix UV intra coding rd loop This commit makes the coding/reconstruction operations of intra coding rate-distortion loop for UV components consistent with those of the encoding process. key frame coding gains: derf: 0.11% stdhd: 0.42% Change-Id: I8d49f83924a320e3689ef2d60096c49d7f0c7a40	2013-06-05 21:18:02 -07:00
Deb Mukherjee	30226a658f	Cosmetic renaming VP9_MVREFS to VP9_INTER_MODES NO bitstream change Change-Id: I79f6146dac5fdd157051b6f8dc611c0b7b5e5f7f	2013-06-05 11:24:01 -07:00
Jingning Han	513d326d75	Merge "Make sb intra rd search consistent with encoding" into experimental	2013-06-04 14:59:05 -07:00
Jingning Han	51b6e73a68	Make sb intra rd search consistent with encoding This commit makes operations of the superblock intra coding rate distortion optimization consistent with those used in the encoding process. Given the test prediction mode and transform size, the rd optimizer encodes and reconstructs each transformed block of the superblock consecutively, then computes the total rate-distortion costs associated with the current superblock to select the coding decisions. It achieves coding performance gains: derf 0.353% yt 1.111% Change-Id: I0da2eb7a71361dfb8c1384927fc536b0c2790d07	2013-06-04 13:54:48 -07:00
Dmitry Kovalev	6a961e7dc8	Merge "Replacing memcpy with struct assignment." into experimental	2013-06-03 14:32:05 -07:00
Jingning Han	9068bce4e7	Put iterative motion search under speed control Enable iterative motion search for compound inter-inter prediction of block sizes 4x4/4x8/8x4 only when best coding quality is selected. The iterative motion search provides about 0.1% gains for derf and stdhd at this point, at the expense of longer runtime. Change-Id: Idc03e7f827e51f1bb8d269bc3752ee297a6bbfe5	2013-06-03 09:18:57 -07:00
Dmitry Kovalev	3b9ec31eaf	Replacing memcpy with struct assignment. Change-Id: Ib557cc6351404b9e178e95a545883eb3666f11f0	2013-05-31 16:00:32 -07:00
Dmitry Kovalev	317d832d38	Merge "Adding plane_block_width and plane_block_height functions." into experimental	2013-05-31 15:28:45 -07:00
Deb Mukherjee	0048ec2329	Costing fixes related to trellis optimization Migrates costing changes/fixes from the rebalance expt to the head without the expt on. Rebased. Change-Id: I51677d62f77ed08aca8d21a4c9a13103eb8de93f Results: derfraw300: +0.126%	2013-05-31 13:56:32 -07:00
Dmitry Kovalev	120a878199	Adding plane_block_width and plane_block_height functions. Change-Id: I02c17fb733c0f3c22dc3167c3d3182797415f1ae	2013-05-31 12:31:49 -07:00
Ronald S. Bultje	a288cb3b10	Merge "Merge all various transform size data trackers into single variables." into experimental	2013-05-31 09:59:24 -07:00
Scott LaVarnway	1e025dbfd1	Merge "Moved use_prev_in_find_mv_refs check to frame level" into experimental	2013-05-31 09:35:51 -07:00
Ronald S. Bultje	e9d68a5e36	Merge all various transform size data trackers into single variables. Change-Id: I2dfc569106b29fbe4da20585a0e85e5e9ea6a4db	2013-05-31 09:18:59 -07:00
Jim Bankoski	21595f8e38	Merge "Creates a new speed 1:" into experimental	2013-05-30 20:36:05 -07:00
Jim Bankoski	ced21bd6a6	Creates a new speed 1: This speed 1 - uses variance threshold stolen from static-thresh to determine split. Any superblock with greater than the variance set by static thresh * quantizer index squared is split. In addition transform size is set to largest size less than or equal to partition size, sub pixel filter is set to normal, and only 12 modes are used at all. Change-Id: If7a2858ee70f96d1eb989c04fd87a332b147abef	2013-05-30 19:53:00 -07:00
Ronald S. Bultje	16482bddf7	Merge "Remove splitmv." into experimental	2013-05-30 19:07:12 -07:00
Ronald S. Bultje	d2205f92c3	Merge changes I98c18fe5,I80c37cff into experimental * changes: Remove i4x4_pred. Remove unused table.	2013-05-30 19:06:44 -07:00
Ronald S. Bultje	e6485581fe	Remove splitmv. We leave it in rdopt.c as a local define for now - this can be removed later. In all other places, we remove it, thereby slightly decreasing the size of some arrays in the bitstream. Change-Id: Ic2a9beb97a4eda0b086f62c039d994b192f99ca5	2013-05-30 17:21:01 -07:00
Ronald S. Bultje	1efa79d32f	Remove i4x4_pred. It remains as a local define in rdopt.c so we can distinguish between split and non-split modes in the RD loop, but disappears outside that scope in the codec. Change-Id: I98c18fe5ab7e4fbd1d6620ec5695e2ea20513ce9	2013-05-30 16:44:58 -07:00
Ronald S. Bultje	f5827699bf	Merge "Merge all intra mode coding trees into a single one." into experimental	2013-05-30 11:27:51 -07:00
Jingning Han	5e97862a71	Merge "Enable iterative motion search for 4x4 inter pred" into experimental	2013-05-30 11:02:10 -07:00
Ronald S. Bultje	98c192ae83	Merge all intra mode coding trees into a single one. Also merge all counters. This removes a few unused probability updates from the bitstream. Change-Id: I20f58853e9dac84d8c0d9703ae012c55917516eb	2013-05-30 09:58:53 -07:00
Jim Bankoski	e987f03acd	Merge "valgrind - txfm_thresh not set" into experimental	2013-05-30 09:34:48 -07:00
Deb Mukherjee	c98bfcfbbb	Merge "Balancing coef-tree to reduce bool decodes" into experimental	2013-05-30 08:10:47 -07:00
Jim Bankoski	ecf023f6e4	Merge "fix valgrind warning" into experimental	2013-05-30 08:04:49 -07:00
Jingning Han	87626a8f6e	Enable iterative motion search for 4x4 inter pred This commit enables iterative motion search for 4x4/4x8/8x4 block size compound inter-inter prediction. WIP: borg run testing Change-Id: I2b318db4a03cdca5a8002b3fa6c0fa89b129288b	2013-05-30 10:49:35 +01:00
Ronald S. Bultje	17544d1478	Merge "Remove some unused code related to macroblock/splitmv coding." into experimental	2013-05-29 17:35:05 -07:00
Jingning Han	5c05fbf6bb	Merge "Refactor 4x4 block level rd loop" into experimental	2013-05-29 16:35:02 -07:00
Deb Mukherjee	b8b3f1a46d	Balancing coef-tree to reduce bool decodes This patch changes the coefficient tree to move the EOB to below the ZERO node in order to save number of bool decodes. The advantages of moving EOB one step down as opposed to two steps down in the other parallel patch are: 1. The coef modeling based on the One-node becomes independent of the tree structure above it, and 2. Fewer context/counter increases are needed. The drawback is that the potential savings in bool decodes will be less, but assuming that 0s are much more predominant than 1's the potential savings is still likely to be substantial. Results on derf300: -0.237% Change-Id: Ie784be13dc98291306b338e8228703a4c2ea2242	2013-05-29 16:25:52 -07:00
Jim Bankoski	aae78c8ac7	valgrind - txfm_thresh not set For 4x4 blocks valgrind points out the cache was uninitialized. This resolves the issue by setting it. Change-Id: I22733000da048643762813a84fbda66d8e4040d2	2013-05-29 13:56:08 -07:00
Jingning Han	d0a3872019	Refactor 4x4 block level rd loop This commit makes clean-ups in the rate-distortion loop for 4x4, 4x8, and 8x4 block sizes for the use of iterative motion search. Removed unnecessary use of bmi in handle_inter_mode. Deprecated loop over labels in the 4x4/4x8/8x4 block rd search. Change-Id: I71203dbb68b65e66f073b37abd90d82ef5ae6826	2013-05-29 13:44:52 -07:00
Scott LaVarnway	353642bc53	Moved use_prev_in_find_mv_refs check to frame level This patch checks at the frame level to see if the previous mode info context can be used. This patch eliminates the flag check that was done for every mode and removes another check that was done prior to every vp9_find_mv_refs(). Change-Id: I9da5e18b7e7e28f8b1f90d527cad087073df2d73	2013-05-29 16:42:23 -04:00
Jim Bankoski	5e5470b254	fix valgrind warning scales for second reference frame vars are uninitialized if the second ref frame is one of those disallowed by refframeflags Change-Id: I4ce42de391178c1699dcaede18c5f12c84993c61	2013-05-29 12:34:10 -07:00
Jingning Han	84deeddbaf	Merge "Refactor rd loop for inter modes" into experimental	2013-05-29 10:55:23 -07:00
Jingning Han	6c97bba403	Merge "further clean-ups on intra4x4 coding" into experimental	2013-05-29 10:55:14 -07:00
Sami Pietila	88a4d4c510	Residual coding to cache energy class of tokens. Proposal for tuning the residual coding by changing how the context from previous tokens is calculated. Storing the energy class of previous tokens instead of the token itself eases the critical path of HW implementations. Change-Id: I6d71d856b84518f6c88de771ddd818436f794bab	2013-05-29 15:21:01 +01:00
Ronald S. Bultje	4487f5a690	Remove some unused code related to macroblock/splitmv coding. Change-Id: Ic40d56fb162f4e201547dfae33e62ccd9e865889	2013-05-29 06:29:56 -07:00
Jingning Han	94d700e763	Refactor rd loop for inter modes This commit pulls the iterative motion search for compound inter-inter out from handle_inter_mode_ as a separate function. Hence, it is applicable to 4x4/4x8/8x4 level compound inter search to be enabled later. Also edit the rd loop for 4x4 inter block sizes for cosmetic purpose. Change-Id: Ibc71a11cbe5a26cd52faba01026cf8446cf4d2b4	2013-05-28 16:31:33 -07:00
Jingning Han	4729a6f389	further clean-ups on intra4x4 coding Removed one 4x4 prediction step that was unnecessary in the rd loop. Removed an unused modecosts estimate from encoder side. Change-Id: I65221a52719d6876492996955ef04142d2752d86	2013-05-28 11:19:05 -07:00
Yaowu Xu	601bab4fde	Merge "a few clean-ups" into experimental	2013-05-27 15:16:21 -07:00
Ronald S. Bultje	cba8e16e93	Decrease scope of frame_mv argument to handle_inter_mode(). Change-Id: I81c637c61ecc33cb66beb59a2a33166d66b9a0a2	2013-05-27 14:16:45 -07:00
Yaowu Xu	2b96ffe025	a few clean-ups 1. remove prediction mode conversion 2. unified bmode, same for key and non-key frame 3. set I4X4_PRED count for pdf to 0, as I4X4_PRED is no longer coded ever. It is determined by ref_frame and block partition Change-Id: If5b282957c24339b241acdb9f2afef85658fe47d	2013-05-27 13:53:56 -07:00
Ronald S. Bultje	f188bf1c3d	Remove unused mode_index argument from handle_inter_mode(). Change-Id: I07b8c15f33e6e7c63dd0033c18c4ac5c0303cf32	2013-05-27 08:49:17 -07:00
Ronald S. Bultje	5cac66078e	Remove splitmv. Also do per-partition motion vector referencing in <sb8x8 partitions, and adjust mvref finding for sub8x8 partitions. Change-Id: Id3ed1ed4d2a8910d11d327db6cc63b8eb79f941f	2013-05-26 14:40:49 -07:00
Jingning Han	826efc838c	Fix a bug in intra4x4 level rd loop This commit fixed an uninitialized value use in the intra 4x4/8x4/4x8 rate-distortion loop. Change-Id: I5c25b3536b59e4f5fbb23cf85baf93b2ccec7d72	2013-05-23 17:44:33 -07:00
Jingning Han	ae10319520	Make comp_inter_inter support 4x4 partition coding This commit refactors the iterative motion search for compound inter-inter mode, to make it support all partition types including 4x4/4x8/8x4 block sizes. Change-Id: I5f1212b0f307377291763e45c6bdc9693b5f04c8	2013-05-23 13:13:42 +01:00
Paul Wilkins	33ecd6ad54	Merge Scatter Scan experiment. Removal from under configure flag. A bit of renaming Change-Id: I2213229dfe852001dfec16b149f47c52ce88f3aa	2013-05-23 13:09:27 +01:00
Jingning Han	7ac5ac52f9	Merge 4x4 block level partition into codebase Move 4x4/4x8/8x4 partition coding out of experimental list. This commit fixed the unit test failure issues. It also resolved the merge conflicts between 4x4 block level partition and iterative motion search for comp_inter_inter. Change-Id: I898671f0631f5ddc4f5cc68d4c62ead7de9c5a58	2013-05-23 11:58:50 +01:00
Deb Mukherjee	ddb2309568	Merge "Using 128 entry look up table for coef models" into experimental	2013-05-22 10:38:35 -07:00
Jingning Han	d2cacdc530	Merge "Make the intra rd search support 8x4/4x8" into experimental	2013-05-22 10:00:15 -07:00
Deb Mukherjee	de4d682ca4	Using 128 entry look up table for coef models Reverts to using 128 entry LUT for the coef models rather than 48 to ease hardware implementation. Also incorporates some cleanups including removing various hooks to support different lookup tables based on block_type and ref_type. Change-Id: I54100c120cca07a2ebd3a7776bc4630fa6a153f6	2013-05-22 08:44:31 -07:00
Paul Wilkins	0b713f8c18	Merge CONFIG_COMP_INTER_JOINT_SEARCH. Merge this experiment so that it is under a speed feature flag not a configuration flag. Change-Id: I536f7f125a4ff5149bb3a64f791e835c324535fd	2013-05-22 11:23:31 +01:00
Jingning Han	f153a5d063	Make the intra rd search support 8x4/4x8 This commit makes the rate-distortion optimization of intra coding support 8x4 and 4x8 partition settings. It enables the entropy coding of intra modes in key frame using a unified contextual probability model conditioned on its above/left prediction modes. Coding performance: derf 0.464% Change-Id: Ieed055084e11fcb64d5d5faeb0e706d30268ba18	2013-05-21 21:03:00 -07:00
John Koleszar	ddf13be8ef	Merge "Initial version of alpha channel support" into experimental	2013-05-21 17:29:51 -07:00
Deb Mukherjee	7a645e4e12	Merging the model coef prob experiment Merges the experiment. Change-Id: I4eb19af6de6df6aa3a96a2e82f231d47ed9b3ae9	2013-05-21 14:44:38 -07:00

723 Commits