generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	0e0a6f840b	Merge "Consistent update for inter_mode probabilities."	2013-07-31 12:02:35 -07:00
Jingning Han	ac7bab7575	Merge "Make the use of ref_frame index consistent"	2013-07-31 09:11:37 -07:00
Jingning Han	86c384d398	Make the use of ref_frame index consistent Refactor the frame buffer referencing in choose_partition and make it consistent with other places. This is meant to prevent potential issues when we extend the reference frame buffer. Change-Id: I5ff33ed5f671e1f4cc7049622212769a9b4578d9	2013-07-30 19:49:36 -07:00
Dmitry Kovalev	8701bc11df	Consistent update for inter_mode probabilities. Using inter-mode counts instead of inter-mode-tree branch counts inside FRAME_COUNTS structure. Change-Id: I60dde13af37d06146d7d15543311c1b5044e9e04	2013-07-30 18:06:34 -07:00
Adrian Grange	fbd73648dd	Merge "Cleanup typos, remove unnecessary lines, replace switch"	2013-07-30 12:59:46 -07:00
Adrian Grange	b30a06b930	Cleanup typos, remove unnecessary lines, replace switch Removed unnecessary code lines, replaced switch with an if, fixed spelling errors and formatting. Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6	2013-07-30 12:10:32 -07:00
Yaowu Xu	88e48444da	Merge "removed duplication"	2013-07-30 09:38:02 -07:00
Yaowu Xu	a15d1f3134	removed duplication Change-Id: Ica23b66f6664e5a5b168499584f0afffbc54794f	2013-07-30 09:09:14 -07:00
Jingning Han	525745b17a	Remove a redundant branching in tokenize_b The tokenize_b function is only called when output flag is on. Hence removing the conditional branch on it therein. Change-Id: Ib709f47f23f39ca05a695faf86fa3377f11f2dd0	2013-07-29 17:08:13 -07:00
Jingning Han	455f2de20b	Tune tokenization/detokenization flow for speed-up This commit optimizes the tokenization and detokenization operational flow for speed-up. It makes the coding process about 0.3% faster at speed 0. Change-Id: I28008df7482874e4b5f237f2d418ff82a249dd56	2013-07-29 16:15:30 -07:00
Jingning Han	b5323ed89a	Skip redundant tokenization in rd loop This commit makes the encoder skip the redundant tokenization process in the rate-distortion optimization search loop, while updating the entropy contexts accordingly. It makes the speed 0 encoding process about 0.5% faster at no performance change. Change-Id: I34a4155a0b5332afeb45c93a51c7f35a294d685c	2013-07-29 16:09:16 -07:00
Jingning Han	5875d7a4a4	Merge "16x16 inverse 2D-DCT with DC only"	2013-07-29 15:29:25 -07:00
Jingning Han	a7c4de22e1	16x16 inverse 2D-DCT with DC only This commit provides special handle on 16x16 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero value. Change-Id: I7bf71be7fa13384fab453dc8742b5b50e77a277c	2013-07-29 14:45:53 -07:00
Dmitry Kovalev	828119d6ab	Renaming txfm to tx for consistency in some places. Change-Id: I2a6a646570e2af66315e7c658d00d99f80c4b127	2013-07-29 14:35:55 -07:00
Dmitry Kovalev	730a34416f	Renaming NB_TXFM_MODES constant to TX_MODES. Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039	2013-07-29 13:38:40 -07:00
Dmitry Kovalev	23391ea835	Renaming TX_SIZE_MAX_SB to TX_SIZES. Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107	2013-07-29 12:25:34 -07:00
Jingning Han	decb1b94de	Merge "Shortcut 8x8/16x16 inverse 2D-DCT"	2013-07-29 11:04:07 -07:00
Ronald S. Bultje	118ccdcd30	Inverse dimension order in token_cost array. This allows us to increment the position at the band-level only as we go from one band to the next; more importantly, that allows us to use an add instead of multiply instruction, and omit the instruction altogether if the band doesn't change from one coef to the next, thus being slightly faster (probably more noticeable on systems where a multiply is expensive, like arm). Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381	2013-07-26 17:30:04 -07:00
Ronald S. Bultje	dcacce6dd9	Merge "Save pixels instead of coefficients in intra4x4 RD loop."	2013-07-26 17:20:58 -07:00
Ronald S. Bultje	d30c8f41ef	Merge "Add best_rd breakout in intra4x4 RD loop."	2013-07-26 17:20:51 -07:00
Jingning Han	38fa487164	Shortcut 8x8/16x16 inverse 2D-DCT This commit brought back the shortcut implementation of 8x8/16x16 inverse 2D-DCT. When the eob <= 10, it skips the inverse transform operations on row 4:7/4:15 in the first round. For bus_cif at 1000 kbps, this provides about 2% speed-up at speed 0. Change-Id: I453e2d72956467d75be4ad8c04b4482ab889d572	2013-07-26 17:19:14 -07:00
Jingning Han	b9c3dd481a	Merge "Special handle on DC only inverse 8x8 2D-DCT"	2013-07-26 16:04:14 -07:00
Jingning Han	325e0aa650	Special handle on DC only inverse 8x8 2D-DCT This commit enables a special handle for the 8x8 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero. For bus_cif at 2000 kbps, it provides about 1% speed-up at speed 0. Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011	2013-07-26 14:16:51 -07:00
Dmitry Kovalev	c09b81719f	Merge "General cleanups."	2013-07-26 13:59:39 -07:00
Yaowu Xu	4f75a1f4ed	Merge "Auto min and max partition size experiment."	2013-07-26 12:10:27 -07:00
Paul Wilkins	fe5e2a91bb	Auto min and max partition size experiment. Speed feature experiment to set an upper and lower partition size limit based on what has been seen in spatial neighbors. This seems to give quite reasonable speed gains in local (10-15%) and when used with speed 0 the losses are small (0.25% derf, 0.35% stdhd). However, for now I am only enabling it on speed 1 as there may be clashes with the existing temporal partition selection in speed 2. Using a tighter min / max around the range derived from the neighbors increases speed further but at the cost of a bigger quality loss. However, I think this spatial method could be combined with data from either the last frame or a variance method (or both) to refine the range of minimum and maximum partition size. I.e. consider the min and max from spatial and temporal neighbors and the variance recommendation. Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f	2013-07-26 18:30:49 +01:00
Yunqing Wang	52256cdbca	Modify static threshold calculation Used 3 * standard_deviation in internal threshold calculation instead of fit curve. This actually approached the algorithm better. For comparison, similar tests were done: The overall psnr loss is less than before. 1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%; 2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%; Similar speedup is achieved. For example (clip, bitrate, static-thresh, psnr, time): akiyo(cif), 500, 0, 48.952, 5.077s(50f); akiyo, 500, 500, 48.866, 4.169s(50f); parkjoy(1080p), 4000, 0, 30.388, 78.20s(30f); parkjoy, 4000, 500, 30.367, 70.85s(30f); sunflower(1080p), 4000, 0, 44.402, 74.55s(30f); sunflower, 4000, 500, 44.414, 68.69s(30f). Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3	2013-07-25 19:59:33 -07:00
Yunqing Wang	845fd5011c	Merge "Add encoding option --static-thresh"	2013-07-25 14:58:00 -07:00
Yunqing Wang	d36852b702	Add encoding option --static-thresh This option exists in VP8, and it was rewritten in VP9 to support skipping on different partition levels. After prediction is done, we can check if the residuals in the partition block will be all quantized to 0. If this is true, the skip flag is set, and only prediction data are needed in reconstruction. Based on DCT's energy conservation property, the skipping check can be estimated in spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values, and also adjusted by partition sizes. To be precise, the DC and AC parts for Y, U, and V planes are checked to decide skipping or not. Test showed that 1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%; 2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%; For different clips, encoding speedup ranges from several percent to 20+% when static-thresh <= 500. For example (clip, bitrate, static-thresh, psnr, time): akiyo(cif), 500, 0, 48.923, 5.635s(50f); akiyo, 500, 500, 48.863, 4.402s(50f); parkjoy(1080p), 4000, 0, 30.380, 77.54s(30f); parkjoy, 4000, 500, 30.384, 69.59s(30f); sunflower(1080p), 4000, 0, 44.461, 85.2s(30f); sunflower, 4000, 500, 44.418, 78.1s(30f). Higher static-thresh values give larger speedup with larger quality loss. Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53	2013-07-25 14:28:05 -07:00
Dmitry Kovalev	7131cb0e3d	General cleanups. Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from .h to .c. Merging for loops for motion vectors. Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13	2013-07-25 14:13:48 -07:00
Dmitry Kovalev	d53fc9ee4e	Merge "Adding lookup table for size group."	2013-07-25 13:57:28 -07:00
Dmitry Kovalev	08fd41ccd7	Adding lookup table for size group. Change-Id: Ia6144d77ebed66e0739b62e4d673e26a95aa9550	2013-07-25 12:58:54 -07:00
Adrian Grange	e862c6f9eb	Merge "Simplify handling of sub-partition motion vectors"	2013-07-25 12:58:38 -07:00
Adrian Grange	6f0f0e4907	Merge "Use local variables rather than structure members"	2013-07-25 12:57:52 -07:00
Dmitry Kovalev	d604914f09	Merge "Removing vp9_adapt_mode_context function."	2013-07-25 12:46:31 -07:00
Jingning Han	d571af76d3	Merge "Make coeff_optimize initialized per-plane"	2013-07-25 12:46:14 -07:00
Yaowu Xu	51a8458822	Merge "fix a bug where flags are not reset"	2013-07-25 12:18:51 -07:00
Adrian Grange	be700e140a	Simplify handling of sub-partition motion vectors Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition. Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9	2013-07-25 11:51:51 -07:00
Jingning Han	2f58faffa4	Make coeff_optimize initialized per-plane This commit makes the initialization of trellis coeff optimization a per-plane operation, thereby eliminating the redundant steps in encode_sby and encode_sbuv. It makes the encoder at speed 0 slightly faster. Change-Id: Iffe9faca6a109dafc0dd69dc7273cbdec19b17cd	2013-07-25 11:44:29 -07:00
Dmitry Kovalev	47d61f008f	Removing vp9_adapt_mode_context function. Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs. Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9	2013-07-25 10:48:45 -07:00
Yaowu Xu	3e386aefc2	fix a bug where flags are not reset The feature that uses small partition results as a measure to skip mode evaluation at larger partition requires the flags to be reset. The reset was missing in the code path that calls rd_use_partition(). Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50	2013-07-25 10:28:38 -07:00
Scott LaVarnway	a0e8b45fee	Merge "pack_inter_mode_mvs cleanup"	2013-07-25 04:47:56 -07:00
Dmitry Kovalev	fcc34796d2	Removing CONFIG_BALANCED_COEFTREE experiment. Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4	2013-07-24 15:53:42 -07:00
Dmitry Kovalev	9139ee0908	Adding condition inside get_tx_type_{4x4, 8x8, 16x16}. Adding plane type check condition because it was always used outside of get_tx_type_{4x4, 8x8, 16x16}. Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230	2013-07-24 12:55:45 -07:00
Adrian Grange	4cfd36d8fd	Use local variables rather than structure members Although local copies of the mode member variables (mode, ref_frame) were made, they were not used in all places. Also, made a local copy of the second_ref_frame member. Change-Id: I84d8c822e5cb3d8a02fc3de8a4037ca3fea8bfad	2013-07-24 11:17:44 -07:00
Adrian Grange	a183f17d33	Merge "Correct spelling mistakes"	2013-07-24 09:48:57 -07:00
Ronald S. Bultje	7817d3221f	Save pixels instead of coefficients in intra4x4 RD loop. Prevents doing duplicate IDCTs; encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.0 to 1min3.5, i.e. 0.87% faster overall. Change-Id: I2df39e29ed9d5ea5e7d2704a34940ba622832ddd	2013-07-24 09:03:20 -07:00
Ronald S. Bultje	b72ecbb1b9	Add best_rd breakout in intra4x4 RD loop. Encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min5.4 to 1min4.0, i.e. 2.2% faster overall. Change-Id: I8c32f2aff9a649ce7dd49d910dc5ba16b99c3bc6	2013-07-24 09:02:05 -07:00
Adrian Grange	bc8b0529db	Correct spelling mistakes Change-Id: Id4138293efeac4503b2e01ce7a6c150a5abeef77	2013-07-24 07:58:26 -07:00
Ronald S. Bultje	47336afd8d	Merge "More optimizations for cost_coeffs()."	2013-07-23 21:36:12 -07:00
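
Note on the DC-only inverse 2D-DCT commits above (a7c4de22e1, 325e0aa650): the idea they describe can be illustrated with a minimal sketch. It is not the code from those commits. It assumes an orthonormal n x n DCT, in which a lone DC coefficient reconstructs to the same residual value, dc / n, in every sample, so the full row and column passes can be skipped. The function names, the rounding rule, and the scaling are all hypothetical and will differ from the fixed-point arithmetic the encoder actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an integer to the 8-bit pixel range (hypothetical helper). */
static uint8_t clip_pixel_sketch(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical DC-only shortcut for an n x n inverse 2D-DCT.
 * Assumption: the transform is orthonormal, so a block whose only
 * non-zero coefficient is the DC value `dc` reconstructs to the
 * constant residual dc / n. That constant is added to the prediction
 * already stored in `dest`, with clipping to [0, 255]. */
static void idct_dc_only_add_sketch(int dc, uint8_t *dest, int stride, int n) {
  int r, c, residual;
  assert(n > 0 && stride >= n);
  /* Round to nearest, symmetric around zero. */
  residual = (dc >= 0) ? (dc + n / 2) / n : -((-dc + n / 2) / n);
  for (r = 0; r < n; ++r) {
    for (c = 0; c < n; ++c) {
      dest[r * stride + c] = clip_pixel_sketch(dest[r * stride + c] + residual);
    }
  }
}
```

A caller would take this path only when the end-of-block position shows that no AC coefficient is non-zero, and fall back to the full inverse transform otherwise.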

1406 Commits