generic-library/vpx

Author	SHA1	Message	Date
Dmitry Kovalev	939b1e4a8c	Merge "Moving segmentation struct from MACROBLOCKD to VP9_COMMON."	2013-08-15 15:14:32 -07:00
Johann	a9aa7d07d0	Merge "vp9: neon: add vp9_convolve_avg_neon"	2013-08-15 14:55:15 -07:00
Johann	63e140eaa7	Merge "vp9: neon: add vp9_convolve_copy_neon"	2013-08-15 14:55:08 -07:00
Jingning Han	68369ca897	Refactor rd loop for chroma components This commit makes the rate-distortion optimization search of chroma components consistent across all block sizes. It removes redundant codes. Change-Id: I7e76f54d045e8efdd41d84a164c71f55b484471b	2013-08-15 14:54:48 -07:00
Jingning Han	c2ff1882ff	Merge "Remove unused RDCOST_8X8 macro"	2013-08-15 13:48:25 -07:00
Jingning Han	ca983f34f7	Merge "Unify luma and chroma rd-cost estimation"	2013-08-15 13:48:15 -07:00
Dmitry Kovalev	bb3b817c1e	Converting code from using ss_txfrm_size to tx_size. Updated function signatures: txfrm_block_to_raster_block txfrm_block_to_raster_xy extend_for_intra vp9_optimize_b Change-Id: I7213f4c4b1b9ec802f90621d5ba61d5e4dac5e0a	2013-08-15 11:44:57 -07:00
Dmitry Kovalev	6f4fa44c42	Using { 0 } for initialization instead of memset. Change-Id: I4fad357465022d14bfc7e13b348c6da267587314	2013-08-15 11:37:56 -07:00
Dmitry Kovalev	81d7bd50f5	Renaming d27 predictor to d207. 27 degrees intra predictor is actually 207 degrees, so renaming it. Change-Id: Ife96a910437eb80ccdc0b7a5b7a62c77542ae5be	2013-08-15 11:09:49 -07:00
Mans Rullgard	67e53716e0	vp9: neon: optimise vp9_wide_mbfilter_neon Break up long dependency chains to improve instruction scheduling. Change-Id: I0e0cb66943df24af920767bb4167b25c38af9630	2013-08-15 19:07:22 +01:00
James Zern	89a1fcf884	Merge "vp9_dx_iface: check for NULL/0-size input"	2013-08-15 10:59:22 -07:00
Dmitry Kovalev	b7616e387e	Moving segmentation struct from MACROBLOCKD to VP9_COMMON. VP9_COMMON is the right place for the segmentation struct because it has global segmentation parameters, not something specific to macroblock processing. Change-Id: Ib9ada0c06c253996eb3b5f6cccf6a323fbbba708	2013-08-15 10:47:48 -07:00
Jingning Han	b0646f9e98	Remove unused RDCOST_8X8 macro Change-Id: I17c7d7eaa60fe69c543403c340f7c1078bfd339f	2013-08-15 10:40:44 -07:00
Dmitry Kovalev	4d73416099	Merge "Quantization code cleanup."	2013-08-15 10:23:01 -07:00
Deb Mukherjee	24856b6abc	Speed feature to skip split partition based on var Adds a speed feature to disable split partition search based on a given threshold on the source variance. A tighter threshold derived from the threshold provided is used to also disable horizontal and vertical partitions. Results on derfraw300: threshold = 16, psnr = -0.057%, speedup ~1% (football) threshold = 32, psnr = -0.150%, speedup ~4-5% (football) threshold = 64, psnr = -0.570%, speedup ~10-12% (football) Results on stdhdraw250: threshold = 32, psnr = -0.18%, speedup is somewhat more than derf because of a larger number of smoother blocks at higher resolution. Based on these results, a threshold of 32 is chosen for speed 1, and a threshold of 64 is chosen for speeds 2 and above. Change-Id: If08912fb6c67fd4242d12a0d094783a99f52f6c6	2013-08-15 10:01:45 -07:00
Jingning Han	ec01f52ffa	Unify luma and chroma rd-cost estimation This commit unifies the rate-distortion cost calculation process of luma and chroma components. It allows early termination to be enabled later in the rd search loop of chroma components, in consistent with luma pixels. Change-Id: I2e52a7c6496176bf2a5e3ef338d34ceb8aad9b3d	2013-08-15 09:41:33 -07:00
Paul Wilkins	1a3641d91b	Merge "Renaming in MB_MODE_INFO"	2013-08-15 02:12:48 -07:00
James Zern	20395189cd	vp9_dx_iface: check for NULL/0-size input avoids a crash caused by issue #585 Change-Id: I301595ee0227699b0da6f0dad6d870dd546e94ef	2013-08-14 18:35:22 -07:00
hkuang	39f42c8713	Merge "Add neon optimize vp9_short_idct16x16_add."	2013-08-14 14:16:20 -07:00
hkuang	cf6beea661	Add neon optimize vp9_short_idct16x16_add. Change-Id: I27134b9a5cace2bdad53534562c91d829b48838d	2013-08-14 13:52:16 -07:00
Dmitry Kovalev	bb072000e8	foreach_transformed_block_in_plane cleanup, explicit tx_size var. Making foreach_transformed_block_in_plane more clear (it's not finished yet). Using explicit tx_size variable consistently instead of (ss_txfrm_size / 2) or (ss_txfrm_size >> 1) expression. Change-Id: I1b9bba2c0a9f817fca72c88324bbe6004766fb7d	2013-08-14 11:39:31 -07:00
Dmitry Kovalev	f2c073efaa	Adding const to arguments of intra prediction functions. Adding const to above and left pointers. Cleanup. Change-Id: I51e195fa2e2923048043fe68b4e38a47ee82cda1	2013-08-14 10:35:56 -07:00
Mans Rullgard	0f1deccf86	vp9: neon: add vp9_convolve_avg_neon Change-Id: I33cff9ac4f2234558f6f87729f9b2e88a33fbf58	2013-08-14 16:27:55 +01:00
Mans Rullgard	635ba269be	vp9: neon: add vp9_convolve_copy_neon Change-Id: I15adbbda15d1842e9f15f21878a5ffbb75c3c0c9	2013-08-14 16:27:55 +01:00
Paul Wilkins	26fead7ecf	Renaming in MB_MODE_INFO The macro block mode info context originally contained an entry for each 16x16 macroblock. In VP9 each entry refers to an 8x8 region not a macro block, so the naming is misleading. This first stage clean up changes the names of 3 entries in the structure to remove the mb_ prefix. TODO clean up the nomenclature more widely in respect of mbmi and bmi. Change-Id: Ia7305c6d0cb805dfe8cdc98dad21338f502e49c6	2013-08-14 12:47:52 +01:00
Paul Wilkins	54979b4350	Merge "Honor min_partition_size properly for non-square splits"	2013-08-14 04:45:18 -07:00
Guillaume Martres	fc50477082	Honor min_partition_size properly for non-square splits Don't do vertical or horizontal splits if subsize < min_partition_size, except for edge blocks where it makes sense. Change-Id: I479aa66ba1838d227b5de8312d46be184a8d6401	2013-08-13 15:24:03 -07:00
Dmitry Kovalev	bcc8e9d9c6	Merge "Little cleanup inside decode_tile() function."	2013-08-13 14:43:10 -07:00
Guillaume Martres	ecb78b3e0c	Merge "Trivial clean up."	2013-08-13 12:40:37 -07:00
Jingning Han	7e0f88b6be	Use lookup table to find largest txfm size Refactor choose_largest_txfm_size_ and make it find the largest transform size via lookup table. Change-Id: I685e0396d71111b599d5367ab1b9c934bd5490c8	2013-08-13 10:32:14 -07:00
Dmitry Kovalev	8105ce6dce	Merge "Using is_inter_block() instead of repetitive code."	2013-08-13 10:00:01 -07:00
Jingning Han	dc70fbe42d	Merge "Refactor model based tx search in super_block_yrd"	2013-08-13 08:48:49 -07:00
Paul Wilkins	5459f68d71	Trivial clean up. Delete unused / commented out variable references. Change-Id: Iaf20c0c3744f89adb296d153b516b5ea41b4f3b4	2013-08-13 13:26:18 +01:00
Paul Wilkins	8e35263bed	Merge "Honor min_partition_size properly"	2013-08-13 05:19:51 -07:00
Jingning Han	39fe235032	Merge "SSE2 high precision 32x32 forward DCT"	2013-08-12 23:03:47 -07:00
Dmitry Kovalev	2c7ae8c29a	Little cleanup inside decode_tile() function. Change-Id: I3ed4beb59371fe21ca3e82253aa98e0cbd5e0630	2013-08-12 18:28:13 -07:00
Johann	4417c04531	Merge "vp9: neon: optimise convolve8_vert functions"	2013-08-12 17:54:47 -07:00
Johann	4cabbca4ce	Merge "vp9: neon: optimise convolve8_horiz functions"	2013-08-12 17:54:42 -07:00
Dmitry Kovalev	32006aadd8	Using is_inter_block() instead of repetitive code. Change-Id: If0b04c476c34fb8c102c9f750d7fe5669a86a532	2013-08-12 17:42:14 -07:00
Jingning Han	78136edcdc	SSE2 high precision 32x32 forward DCT Enable SSE2 implementation of high precision 32x32 forward DCT. The intermediate stacks are of 32-bits. The run-time goes down from 32126 cycles to 13442 cycles. Change-Id: Ib5ccafe3176c65bd6f2dbdef790bd47bbc880e56	2013-08-12 16:52:53 -07:00
Jingning Han	14cc7b319f	Refactor model based tx search in super_block_yrd Remove unnecessary conditional branches in model-based transform size search. Change-Id: Ic862dc33ed6710a186f6248239dd5f09b5c19981	2013-08-12 16:34:48 -07:00
Dmitry Kovalev	b89eef8f82	Merge "Simplifying vp9_mvref_common.c."	2013-08-12 16:24:22 -07:00
Dmitry Kovalev	b214cd0dab	Merge "Removing foreach_predicted_block_uv function."	2013-08-12 15:54:01 -07:00
Dmitry Kovalev	98e3d73e16	Merge "Using MV* instead of int_mv* as argument of vp9_clamp_mv_min_max."	2013-08-12 15:53:25 -07:00
Dmitry Kovalev	1a5e6ffb02	Simplifying vp9_mvref_common.c. Change-Id: I272df2e33fa05310466acf06c179728514dd7494	2013-08-12 15:52:08 -07:00
Dmitry Kovalev	9d5885b0ab	Quantization code cleanup. Change-Id: I77b42418b852093f79260cbd880533a0bd86678f	2013-08-12 15:23:47 -07:00
Dmitry Kovalev	c66320b3e4	Merge "Entropy context related cleanups."	2013-08-12 15:18:24 -07:00
Dmitry Kovalev	bd1bc1d303	Merge "Making scaling code more clear."	2013-08-12 15:17:26 -07:00
Dmitry Kovalev	9a31d05e24	Removing unused convolve_avg_c function + cleanup. Change-Id: Id2b126c6456627c25e4041a82e304d0151d951ba	2013-08-12 14:28:00 -07:00
Dmitry Kovalev	1aedfc992a	Using MV* instead of int_mv* as argument of vp9_clamp_mv_min_max. Change-Id: I3c45916a9059f11b41e9d798e34ffee052969a44	2013-08-12 13:56:04 -07:00
Dmitry Kovalev	76d166e413	Removing foreach_predicted_block_uv function. Adding function build_inter_predictors_for_planes to build inter predictors for specified planes. This function allows to remove condition "#if CONFIG_ALPHA" and use MAX_MB_PLANE for general case. Renaming 'which_mv' local var to 'ref', and 'weight' argument to 'ref'. Change-Id: I1a97160c9263006929d38953f266bc68e9c56c7d	2013-08-12 13:54:13 -07:00
Dmitry Kovalev	a72e269318	Making scaling code more clear. Reusing existing functions, using constants instead of magic numbers. Change-Id: Idc689ffba52c9a8b203fcf26bd67110ecb5635f9	2013-08-12 13:30:26 -07:00
Jingning Han	3984b41c87	Fix a compile failure in vp9_get_compressed_data The lf struct is now with VP9_COMMON, instead of MACROBLOCKD. Change-Id: Idfdd4f91f78f486078a138322d58bb61e93e1bc9	2013-08-12 11:42:17 -07:00
Dmitry Kovalev	8b0e6035a2	Entropy context related cleanups. Adding set_skip_context() function used from both encoder and decoder. Change-Id: Ia22cfad3211a00a63eb294f64f857b78f4aa9b85	2013-08-12 11:24:24 -07:00
Mans Rullgard	ad7021dd6c	vp9: neon: optimise convolve8_vert functions Invert loops to operate vertically in the inner loop. This allows removing redundant loads. Also add preloading of data. Change-Id: I4fa85c0ab1735bcb1dd6ea58937efac949172bdc	2013-08-12 15:37:48 +01:00
Dmitry Kovalev	097046ae28	Merge "Removing redundant code and function arguments."	2013-08-11 12:20:58 -07:00
Mans Rullgard	b84dc949c8	vp9: neon: optimise convolve8_horiz functions Each iteration of the horizontal loop reuses 7 of the 11 source values. Loading only the 4 new values saves some time. Also add preload for source data. Overall 4% faster on Chromebook. Change-Id: I8f69e749f2b7f79e9734620dcee51dbfcd716b44	2013-08-11 16:21:55 +01:00
Dmitry Kovalev	3c43ec206c	Renaming BLOCK_SIZE_TYPES constant to BLOCK_SIZES. There will be another change set to rename BLOCK_SIZE_TYPE enum to BLOCK_SIZE. Change-Id: I8d1dfc873d6186fa5e554262f5169e929978085e	2013-08-09 17:47:32 -07:00
Guillaume Martres	58b07a6f9d	Honor min_partition_size properly It represents the minimum partition size, so don't split if bsize == min_partition_size . Change-Id: Id77c32d6afef7d2ddec0368eaae18fb13227d30e	2013-08-09 17:28:33 -07:00
Dmitry Kovalev	67fe9d17cb	Removing redundant code and function arguments. Change-Id: Ia5cdda0f755befcd1e64397452c42cb7031ca574	2013-08-09 17:24:40 -07:00
Dmitry Kovalev	e7c5ca8983	Merge "Inlining 16 as a stride for BLOCK_OFFSET macro."	2013-08-09 17:22:46 -07:00
James Zern	ef101af8ae	Merge "vp9_rd_pick_inter_mode_sb: fix uninitialized value"	2013-08-09 17:13:32 -07:00
Dmitry Kovalev	f1559bdeaf	Inlining 16 as a stride for BLOCK_OFFSET macro. Change-Id: I7f23d174eb089e5500f268a10db09648634c1b82	2013-08-09 16:40:05 -07:00
James Zern	f295774d43	vp9_rd_pick_inter_mode_sb: fix uninitialized value 'skippable' can remain unset and negatively affect later decisions address one aspect of issue #599 Change-Id: Iffdf0ac2e49ac481c27dc27c87fa546d4167bb28	2013-08-09 16:26:22 -07:00
Dmitry Kovalev	125146034e	Merge "Using MV struct instead of int[2] array."	2013-08-09 15:33:08 -07:00
Dmitry Kovalev	cd0629fe68	Merge "Removing plane_block_{width, height}_log2by4 functions."	2013-08-09 15:26:51 -07:00
Dmitry Kovalev	ff7df102d9	Merge "Moving loopfilter struct to VP9_COMMON."	2013-08-09 15:23:00 -07:00
Dmitry Kovalev	816d6c989c	Moving loopfilter struct to VP9_COMMON. Loop filter configuration doesn't belong to macroblock, so moving it from MACROBLOCKD to VP9_COMMON. Also moving the declaration of loopfilter struct from vp9_blockd.h to vp9_loopfilter.h. Change-Id: I4b3e34be9623b47cda35f9b1f9951f8c5b1d5d28	2013-08-09 14:41:51 -07:00
Dmitry Kovalev	8ffe85ad00	Moving scale_factors and related code to separate files. Change-Id: I531829e5aee2a4a7a112d528ecccbddf052d0e74	2013-08-09 14:07:09 -07:00
Scott LaVarnway	ace93a175d	Merge "Bug fix: call set_offsets before rd_auto_partition_range"	2013-08-09 12:30:52 -07:00
Dmitry Kovalev	fa0cd61087	Merge "Using buf_2d struct instead of separate buffer and stride vars."	2013-08-09 11:50:58 -07:00
Scott LaVarnway	41251ae558	Bug fix: call set_offsets before rd_auto_partition_range The set_offsets call is necessary in order to set the mode_info_context ptr correctly. Change-Id: I644910cc5bacc50ee9cd78458843274ad8ee636d	2013-08-09 14:09:49 -04:00
Adrian Grange	0eef1acbef	Merge "Correct bug in loopfilter initialization"	2013-08-09 09:51:58 -07:00
Adrian Grange	12eb2d0267	Correct bug in loopfilter initialization The memset sets 16 bytes rather than the correct size of the final array dimension (MAX_MODE_LF_DELTAS). (In response to bug posted by Manjit Hota to codec-devel and webm-discuss lists) Change-Id: I8980f5aa71ddc9d7ef57c5b4700bc28ddf8651c7	2013-08-09 09:21:15 -07:00
Yaowu Xu	6ec2b85bad	Added lpf level picking using partial frame Change-Id: I599ab1bd22b5f3f10d5962c609952abdef8ff67a	2013-08-09 07:37:08 -07:00
Yaowu Xu	6a7a4ba753	renamed vp8_yv12_copy_y to vpx_yv12_copy_y Because the routine is used by both vp8 and vp9 Change-Id: I2d35b287b5bc2394865d931a27da61f4ce7edeeb	2013-08-09 07:37:08 -07:00
Yaowu Xu	c7c9901845	added a speed feature on lpf level picking Change-Id: Id578f8afdeab3702fc8386969f2d832d8f1b5420	2013-08-09 07:36:32 -07:00
Dmitry Kovalev	6fd2407035	Using buf_2d struct instead of separate buffer and stride vars. Change-Id: Id5cc3566cc16d1e3030ddb4d1c58459320321dca	2013-08-08 21:25:48 -07:00
Dmitry Kovalev	6a8ec3eac2	General code cleanup. Removing redundant parenthesis and curly braces. Combining declarations with initializations. Adding useful intermediate variables instead of recalculating expressions every time. Change-Id: I00106f404afd60bfc189905b0fded881684f941a	2013-08-08 21:12:34 -07:00
Dmitry Kovalev	ee40e1a637	Merge "Cleanup inside vp9_reconinter.c."	2013-08-08 14:59:38 -07:00
Deb Mukherjee	2158909fc3	Merge "Adds a new subpel motion function"	2013-08-08 12:26:55 -07:00
Dmitry Kovalev	9e3bcdd135	Merge "Removing unneeded intermediate entropy_nodes_adapt var."	2013-08-08 12:16:57 -07:00
Dmitry Kovalev	47fad4c2d7	Using MV struct instead of int[2] array. Change-Id: Iab951c555037e36b154f319f351c5e67f9abb931	2013-08-08 12:01:56 -07:00
Dmitry Kovalev	ac008f0030	Removing unneeded intermediate entropy_nodes_adapt var. Change-Id: I541a178d997b4541e0e2d4d5b854e2ed6b113c3a	2013-08-08 11:52:02 -07:00
Deb Mukherjee	1ba91a84ad	Adds a new subpel motion function Adds a new subpel motion estimation function that uses a 2-level tree-structured decision tree to eliminate redundant computations. It searches fewer points than iterative search (which can search the same point multiple times) but has the same quality roughly. This is made the default setting at speeds 0 and 1, while at speed 2 and above only a 1-level search is used. Also includes various cleanups for consistency and redundancy removal. Results: derf: +0.012% psnr stdhd: +0.09% psnr Speedup of about 2-3% Change-Id: Iedde4866f5475586dea0f0ba4cb7428fba24eee9	2013-08-08 11:41:49 -07:00
Adrian Grange	83ee80c045	Moved fast motion search level decision to function Moving this block of code into a function makes the code easier to read and change. Change-Id: If4ede570cce1eab1982b188c4d3e4fd3d4db236e	2013-08-08 11:01:44 -07:00
Adrian Grange	aae6a4c895	Simplify & fix potential bug in rd_pick_partition Different partitionings were not being evaluated against best_rd and there were unnecessary calls to RDCOST. This could have resulted in a non-optimal partitioning being selected. I simplified the variables used to track the rate, distortion and RD values throughout the function. Change-Id: Ifa7085ee80d824e86791432a5bc6d8fea5a3e313	2013-08-08 09:55:45 -07:00
Jingning Han	6bfcce8c7a	Merge "Use low precision 32x32fdct for encodemb in speed1"	2013-08-07 19:05:14 -07:00
Dmitry Kovalev	61c33d0ad5	Removing plane_block_{width, height}_log2by4 functions. Change-Id: I040b82b8e32aee272d10cbb021c7ba1c76343d7a	2013-08-07 17:06:33 -07:00
Dmitry Kovalev	a766d8918e	Cleanup inside vp9_reconinter.c. Using block width and block height instead of their logarithms. Using SUBPEL_BITS and SUBPEL_SHIFTS constants instead of magic numbers. Change-Id: I4e10e93c907c8a5e1cb27dfe74d1fcdcc4995448	2013-08-07 17:02:28 -07:00
Dmitry Kovalev	82d7c6fb3c	Merge "Using only one scale function in scale_factors struct."	2013-08-07 16:32:09 -07:00
Dmitry Kovalev	1492698ed3	Merge "Adding ss_size_lookup table."	2013-08-07 16:08:24 -07:00
Jingning Han	debb9c68c8	Use low precision 32x32fdct for encodemb in speed1 The low precision 32x32 fdct has all the intermediate steps within 16-bit depth, hence allowing faster SSE2 implementation, at the expense of larger round-trip error. It was used in the rate-distortion optimization search loop only. Using the low precision version, in place of the high precision one, affects the compression performance by about 0.7% (derf, stdhd) at speed 0. For speed 1, it makes derf set down by only 0.017%. Change-Id: I4e7d18fac5bea5317b91c8e7dabae143bc6b5c8b	2013-08-07 15:34:12 -07:00
Dmitry Kovalev	8db2675b97	Adding ss_size_lookup table. Removing the old one bsize_from_dim_lookup. Now we have a way to determine block size for plane using its subsampling values (ss_size_lookup). And then we can find the number of pixels in the block (num_pels_log2_lookup). Change-Id: I6fc981da2ae093de81741d3d78eaefed11015db9	2013-08-07 15:33:17 -07:00
Dmitry Kovalev	ea2348ca29	Merge "Removing NMS_STATS defines."	2013-08-07 15:28:30 -07:00
Christian Duvivier	78182538d6	Neon version of vp9_short_idct4x4_add. Change-Id: Idec4cae0cb9b3a29835fd2750d354c1393d47aa4	2013-08-06 18:41:27 -07:00
Deb Mukherjee	296931c817	Merge "Clean ups of the subpel search functions"	2013-08-06 17:28:48 -07:00
Deb Mukherjee	71b43b0ff0	Clean ups of the subpel search functions Removes some unused code and speed features, and organizes the interfaces for fractional mv step functions for use in new speed features to come. In the process a new speed feature - number of iterations per step during the subpel search - is exposed. No change when this parameter is set as the original value of 3. Results: subpel_iters_per_step = 3: baseline subpel_iters_per_step = 2: psnr -0.067%, 1% speedup subpel_iters_per_step = 1: psnr -0.331%, 3-4% speedup Change-Id: I2eba8a21f6461be8caf56af04a5337257a5693a8	2013-08-06 17:23:50 -07:00
Dmitry Kovalev	63ec0587c1	Merge "Motion vector code cleanup."	2013-08-06 16:00:01 -07:00
Dmitry Kovalev	1c552e79bd	Using only one scale function in scale_factors struct. Functions scale_mv_q4 and scale_mv_q3_to_q4 were almost identical except q3->q4 conversion in scale_mv_q3_to_q4. Now q3->q4 conversion happens directly in vp9_build_inter_predictor. Also adding useful constants: SUBPEL_BITS and SUBPEL_MASK. Change-Id: Ia0a6ad2ac07c45fdf95a5139ece6286c035e9639	2013-08-06 15:43:56 -07:00
Jingning Han	2c091f9768	Merge "Place holder for high-precision 32x32 fdct"	2013-08-06 14:47:30 -07:00
Jim Bankoski	5b307886fb	variance x86inc guards also fixed bug in sad calcs Change-Id: I6571fcbe37556c16ae32be66dc0fd879852aac1d	2013-08-06 14:17:13 -07:00
Jim Bankoski	6eb1254b88	sse3 intrapred x86inc protected Change-Id: I4a3c83119cdf8a205920034c8019d855d5504605	2013-08-06 14:17:13 -07:00
Deb Mukherjee	fac7c8c9f9	Merge "Flexible support for various pattern searches"	2013-08-06 14:03:27 -07:00
Jim Bankoski	c9126e0b30	sad + miscellaneous updates Enable use_x86inc as a commandline option. Fix Bug with sse2 when x86inc is disabled. Adds Sad asm protection to x86inc protection Change-Id: Iee0f9dd235ea10e8ace512eb362ba9bebe8c9df6	2013-08-06 12:16:04 -07:00
Dmitry Kovalev	8725ca2ed2	Merge "Inlining vp9_get_pred_probs_switchable_interp function."	2013-08-06 11:57:45 -07:00
Deb Mukherjee	15b5a6a2c7	Flexible support for various pattern searches Adds a few pattern searches to achieve various tradeoffs between motion estimation complexity and performance. The search framework is unified across these searches so that a common pattern search function is used for all. Besides it will be easier to experiment with various patterns or combinations thereof at different scales in the future. The new pattern search is multi-scale and is capable of using different patterns at different scales. The new hex search uses 8 points at the smallest scale and 6 points at other scales. Two other pattern searches - big-diamond and square are also added. Big diamond uses 4 points at the smallest scale and 8 points in diamond shape at the larger scales. Square is very similar conceptually to the default n-step search but is somewhat faster since it keeps only one survivor across all scales. Psnr/speed-up results on derf300: hex: -1.6% psnr%, 6-8% speed-up big-diamond: -0.96% psnr, 4-5% speedup square: -0.93% psnr, 4-5% speedup Change-Id: I02a7ef5193f762601e0994e2c99399a3535a43d2	2013-08-06 11:56:39 -07:00
Jingning Han	28566a6cd5	Place holder for high-precision 32x32 fdct Resolve compile warnings on re-define FDCT32x32_2D template. Change-Id: Idb3a54ef8d2710ce7245b726379a0e5c875f5cad	2013-08-06 11:44:08 -07:00
Dmitry Kovalev	0c80065694	Inlining vp9_get_pred_probs_switchable_interp function. There was no benefit having this function. For example, inside read_switchable_filter_type switchable filter context was calculated twice. Change-Id: I79cd5bf95cbc0f6d8bf91a2e32289e01b18dcff1	2013-08-06 11:04:31 -07:00
Jingning Han	7d61f8fe53	Merge "Move fdct32x32 SSE2 implementation in separate file."	2013-08-06 10:46:41 -07:00
Jim Bankoski	efc94102f0	Merge "intrapred x86inc guards"	2013-08-06 10:39:19 -07:00
Dmitry Kovalev	a39abe2627	Motion vector code cleanup. Converting arguments of two functions (clamp_mv_ref, lower_mv_precision) from int_mv* to MV*. Rewriting is_inside function to make it much shorter. Change-Id: Ie4c4cf3eccd46707c7df099ec21fb1b61c72fc7a	2013-08-06 10:31:11 -07:00
Dmitry Kovalev	3e51acafec	Merge "Finally removing all old block size constants."	2013-08-06 10:30:37 -07:00
Dmitry Kovalev	4a692e4168	Merge "Changing the order switchable filter enum constants."	2013-08-06 10:30:26 -07:00
Dmitry Kovalev	25b7dc08cd	Merge "Removing unused functions."	2013-08-06 10:29:57 -07:00
Deb Mukherjee	33afddadb9	Merge "Add variance based mode/skipping"	2013-08-06 10:19:15 -07:00
Christian Duvivier	3d98205fce	Move fdct32x32 SSE2 implementation in separate file. This is in preparation for the SSE2 version of the high-precision 32x32 forward DCT which will share a lot of code with the existing low precision version used for rate-distortion search. Change-Id: I7084b6bdfb480b1fabb8493fb14e3f7fcc7888c0	2013-08-06 10:17:11 -07:00
Jim Bankoski	25ec1375c9	intrapred x86inc guards Change-Id: If0399d8e11f4ebe75a5c91abb8d6a52a7709065b	2013-08-06 09:39:30 -07:00
Jim Bankoski	62c6aa884d	block error / x86inc mods Change-Id: Icb607745634e10b9bac5019d06661ece09fcdb40	2013-08-06 06:23:38 -07:00
Jim Bankoski	a93b115cd6	reworked config for use_x86_inc Support enabling it or disabling it. Moved read out to configure.sh so that it's done once instead of in make and in config. Change-Id: I73a9190cf31de9f03e8a577f478fa522f8c01c8b	2013-08-05 17:35:25 -07:00
James Zern	d115cd8b12	Merge changes I082959ab,Ib6932640 * changes: vp9/decoder: threaded row-based loop filter vp9/decoder: add thread worker	2013-08-05 16:07:09 -07:00
Dmitry Kovalev	b9c7d04e95	Finally removing all old block size constants. Change-Id: I3aae21e88b876d53ecc955260479980ffe04ad8d	2013-08-05 15:23:49 -07:00
Jim Bankoski	f4837579d1	fixed script problem with config_force_x86_inc Change-Id: I226e5094d216b09dc47fa5511a66e2d314608000	2013-08-05 14:48:20 -07:00
Jim Bankoski	a5a7322459	Merge "Begin to restrict x86inc.asm usage"	2013-08-05 14:17:49 -07:00
Deb Mukherjee	8b3faccb9e	Add variance based mode/skipping Adds a speed feature to skip all intra modes other than DC_PRED if the source variance is small. This feature is made part of speed 1 and up. Results on derf300: psnr -0.07%, speedup about 1-2% Also uses the source variance to fine-tune the early termination criteria when FLAG_EARLY_TERMINATE is on. This feature is made part of speed 2 and up. Results on derf300: psnr -0.52%, speedup about 5-7% Change-Id: I59e38aa836557cfa5405ae706fc64815cbfe4232	2013-08-05 14:14:01 -07:00
Jim Bankoski	9f988a2edf	Merge "cleanups after bw bh code"	2013-08-05 14:02:02 -07:00
James Zern	a0ffa2794b	vp9/decoder: threaded row-based loop filter Currently the only threaded option for vp9 decode. Enabled when the decoder config thread count is > 1. Change-Id: I082959abac9e31aa4a38ed9fd68b94680e57f4df	2013-08-05 13:22:04 -07:00
James Zern	183b77d5ab	vp9/decoder: add thread worker vp9/decoder/vp9_thread.[hc] Original source: http://git.chromium.org/webm/libwebp.git 100644 blob b1615d0fb8d311666b2fa4561076c62d72c2e3ff src/utils/thread.c 100644 blob 13a61a4c84194c3374080cbf03d881d3cd6af40d src/utils/thread.h Local modifications: - s/WebP/VP9/g - camelcase functions -> lower with _'s Change-Id: Ib6932640ee34f8b4782c6fbd15864a59d5d4c5fe	2013-08-05 13:21:13 -07:00
Dmitry Kovalev	3f611555d7	Changing the order switchable filter enum constants. This changeset allows to remove vp9_switchable_interp and vp9_switchable_interp_map arrays and make code much clear. Actually we still have to use these mapping but only inside read_interp_filter_type and write_interp_filter_type functions. Change-Id: I4026c6f8c4acefba6c81421b7bacbaa52cc45f50	2013-08-05 12:26:15 -07:00
Jim Bankoski	5d2cb7ead0	cleanups after bw bh code Cons bw/bh parms that should have been const. Additional formatting. Change-Id: Icd36a5c9dc17dadd7284315ac0d6fef1a565ca16	2013-08-05 12:15:52 -07:00
Jim Bankoski	c3809f3de5	Begin to restrict x86inc.asm usage Chromium does not support 32bit builds for Mac which use x86inc.asm. Make the files which include it work if 64bit or not PIC enabled starting with vp9_copy_sse2.asm Consolidate these targets in vp9_rtcd_defs.sh Change-Id: If18f0b957a611efd085a3ee7d245cf1eb91e8248	2013-08-05 12:07:30 -07:00
Dmitry Kovalev	d007446b3f	Replacing long block size enum values with shorter ones (2). Change-Id: I428c4d42212b757112e3acfe5b81314cfbb5fd6b	2013-08-05 10:51:02 -07:00
Dmitry Kovalev	319867d71c	Merge "Cleaning up vp9_build_inter_predictor function."	2013-08-05 01:52:11 -07:00
Dmitry Kovalev	78671e2eff	Merge "Replacing "txfm" with "tx" in identifiers."	2013-08-04 02:52:22 -07:00
Jim Bankoski	f703f98757	reworked find_mv_ref This is an attempt at rewriting vp9_find_mv_refs_idx. I believe that it gains about 1-2% decode speed Change-Id: Ia5359c94ce9bb43b32652890e605e9a385485c1b	2013-08-03 20:25:55 -07:00
Dmitry Kovalev	fe2a201eb1	Replacing "txfm" with "tx" in identifiers. Consistent names with TX_SIZE, TX_MODE, and TX_MODE. Change-Id: I79592218bf5a40ace89197a34a06ee7de581ed8d	2013-08-02 17:28:23 -07:00
Dmitry Kovalev	5edc65d00d	Removing NMS_STATS defines. Change-Id: Iabab0e59042a33456df1d449c0d0f01debc00c7c	2013-08-02 17:10:15 -07:00
Dmitry Kovalev	7b50333e8f	Merge "Adding is_inter_block function."	2013-08-02 16:54:32 -07:00
Dmitry Kovalev	5f0a52faaf	Cleaning up vp9_build_inter_predictor function. Change-Id: I94f6b4272b95ac101de6d10f048116ba065788b0	2013-08-02 16:53:18 -07:00
Dmitry Kovalev	fec4ec4edd	Removing unused functions. Removed functions: model_rd_for_sb_y, block_error_sby, get_sb_variance Change-Id: Iec458df180caf6f8eac3605773841a4121dd3a8f	2013-08-02 16:41:09 -07:00
Dmitry Kovalev	603931e291	Merge "Changing function arg type from int_mv* to MV*."	2013-08-02 16:30:06 -07:00
Dmitry Kovalev	a6adc82e78	Merge "Cleanups around allow_high_precision_mv flag."	2013-08-02 16:27:05 -07:00
Dmitry Kovalev	680ec32d18	Adding is_inter_block function. Using it instead of long unclear verbose check "mbmi->ref_frame[0] != INTRA_FRAME". Change-Id: I9c7b4b3797942fa962bf3ba7460fff3084beabe9	2013-08-02 16:25:33 -07:00
Dmitry Kovalev	d4e020c4b1	Merge "Cleaning up set_contexts_on_border function."	2013-08-02 16:22:50 -07:00
Yunqing Wang	d340c114fb	Merge "Add more checking to using_small_partition_info"	2013-08-02 15:55:09 -07:00
Dmitry Kovalev	769bcab3f5	Cleaning up set_contexts_on_border function. Change-Id: I8f21c18b29f54b277fb1c167f278f109d9f3b996	2013-08-02 15:52:26 -07:00
Dmitry Kovalev	25b77e2569	Changing function arg type from int_mv* to MV*. Change-Id: Ic878d31df2ce783a2c9a8c4bc9ed301ec8ffe25e	2013-08-02 15:26:32 -07:00
Dmitry Kovalev	5d86f3886d	Moving struct loop_filter_info from .h to .c file. Change-Id: I3fe90eb40088a5b07bdc7d66d93ffe6ef99943d5	2013-08-02 11:53:49 -07:00
Adrian Grange	60ff123536	Merge "Fixed typos and added a few explanatory comments"	2013-08-02 11:37:47 -07:00
Adrian Grange	075b11f004	Merge "Changed name of rd_pick_intra4x4mby_modes"	2013-08-02 11:36:46 -07:00
Johann	8ff58093f0	Merge "vp9: neon: convolve: replace some insns with simpler equivalents"	2013-08-02 11:28:31 -07:00
Johann	8bebfbf7c5	Merge "vp9: neon: convolve: simplify branching to C fallbacks"	2013-08-02 11:28:25 -07:00
Johann	7d14ce8ba5	Merge "vp9: neon: optimise loads in horiz convolve functions"	2013-08-02 11:28:04 -07:00
Johann	319b7dc283	Merge "vp9: neon: add vp9_mb_lpf_* functions"	2013-08-02 11:27:52 -07:00
Dmitry Kovalev	86053d3ae2	Cleanups around allow_high_precision_mv flag. Change-Id: Ic07f5f8ffeaedd5b7513b464871f83afc82dcd5c	2013-08-02 11:21:16 -07:00
Dmitry Kovalev	b47153deed	Replacing long block size enum values with shorter ones. Change-Id: I0e9329490828684a4fd46f540d89114cc68e8407	2013-08-02 10:48:27 -07:00
Yunqing Wang	0d68080445	Merge "Comment out 2 unused speed features"	2013-08-02 09:58:46 -07:00
Mans Rullgard	355cb14dc7	vp9: neon: convolve: replace some insns with simpler equivalents Change-Id: I5d6906772e6e6adf68d7f0fd5b8b5207a64a3a37	2013-08-02 08:11:28 -07:00
Mans Rullgard	2003468df8	vp9: neon: convolve: simplify branching to C fallbacks Change-Id: Ic7cacd02d6dc9243ad8fc85082c5618a9d1e66dc	2013-08-02 08:11:25 -07:00
Mans Rullgard	5e2e78d024	vp9: neon: optimise loads in horiz convolve functions Loading to single lanes in multiple registers is expensive since it requires a read and write of each register which saturates the register file access. Loading to single registers followed by a separate transpose reduces this pressure. Change-Id: I4cc35887ddbca80e5e635b50d2b1d158de9668ee	2013-08-02 08:11:08 -07:00
Mans Rullgard	d85ae87183	vp9: neon: add vp9_mb_lpf_* functions Change-Id: I13e0880df234f15abc4cc7c57fe84488d5d46a75	2013-08-02 08:10:50 -07:00
Dmitry Kovalev	d91e9f4e36	Merge "Cleanup: replacing xd->seg with seg, and xd->lf with lf."	2013-08-01 23:17:17 -07:00
Dmitry Kovalev	4144fee9e9	Merge "Cleanup: reusing clamp_mv function."	2013-08-01 23:16:56 -07:00
Jingning Han	555bbd68c7	Merge "Remove unused vp9_short_idct10_32x32_add"	2013-08-01 15:41:35 -07:00
Dmitry Kovalev	741537f3ce	Cleanup: replacing xd->seg with seg, and xd->lf with lf. Change-Id: I73b59d7699a8e7e7acd3bf8041cb6c98ce9ba4bf	2013-08-01 15:38:16 -07:00
Dmitry Kovalev	9f4f001ba5	Merge "Cleanup: removing unused function arguments."	2013-08-01 15:07:12 -07:00
Dmitry Kovalev	422d38bca1	Cleanup: reusing clamp_mv function. Change-Id: I8715f08a3554bdb557c5f935f1dfbd671f18e766	2013-08-01 15:06:34 -07:00
Dmitry Kovalev	ddf02e323a	Merge "Nice looking motion vector clamping functions."	2013-08-01 14:50:14 -07:00
Deb Mukherjee	19d42de3ca	Merge "Adds a source variance computation function"	2013-08-01 14:18:43 -07:00
Dmitry Kovalev	0497b8d7cd	Merge "vp9_get_pred_context_intra_inter cleanup."	2013-08-01 14:15:53 -07:00
Dmitry Kovalev	ce8dedc353	Cleanup: removing unused function arguments. Change-Id: I27471768980fc631916069f24bc7c482a5c9ca17	2013-08-01 13:41:38 -07:00
Dmitry Kovalev	b621e2d72e	Nice looking motion vector clamping functions. Removing assign_and_clamp_mv function, making implementation of clamp_mv and clamp_mv2 more clear and consistent. Change-Id: Iecd08e1c1bf0379f8314ebe01811f8253f4ade58	2013-08-01 13:40:26 -07:00
Deb Mukherjee	dbea726daf	Adds a source variance computation function Adds a function to compute source variance for various sb_types to be used for pruning mode and partition searches. [The existing activity measure function is currently specialized for only 16x16 MBs and needs to be updated]. Change-Id: I22a41e6f1430184201487326fdbebb9b47e6fc24	2013-08-01 13:01:54 -07:00
Jingning Han	67719abde1	Remove unused vp9_short_idct10_32x32_add The inverse 32x32 transform detects all zero entries and skips the computations accordingly per 8 rows in the first 1-D operation. The function vp9_short_idct10_32x32_add performs differently and is not used anywhere, hence removed. Change-Id: Ic4fad422debbde7b6b6ffed47c69fbd4268a906c	2013-08-01 12:45:16 -07:00
Jingning Han	56df76bf1b	Merge "Optimize 32x32 2D inverse DCT for speed-up"	2013-08-01 11:53:39 -07:00
Yunqing Wang	215b010f4b	Add more checking to using_small_partition_info If the partition is out of partition size range, we don't need to process small partition information. Change-Id: Ice9bfbbdebe1f2ef79271a3aee17de0ed4608376	2013-08-01 11:37:41 -07:00
Yunqing Wang	7965a6ea34	Comment out 2 unused speed features use_min_partition_size and use_max_partition_size are not used currently, and could be added back if needed later. Change-Id: Ib22a9c06b064567a7c1d6d5445567ed77e0d3acc	2013-08-01 11:03:34 -07:00
Dmitry Kovalev	ff4bfa726b	Merge "Adding missing const to vp9_extra_bits array."	2013-08-01 10:19:51 -07:00
Adrian Grange	89e73c63c0	Fixed typos and added a few explanatory comments Change-Id: Ib4e4b41094b54874ee34343dd77c0c131ceed9d2	2013-08-01 09:23:49 -07:00
Adrian Grange	5271d47892	Changed name of rd_pick_intra4x4mby_modes The function name rd_pick_intra4x4mby_modes is confusing, so I changed it to rd_pick_intra_sub_8x8_y_modes to better reflect what the function does. Also added const qualifiers to some of the input parameters and removed camel-case. Change-Id: I23d53d4c7af5d79ed8a471acd59a09bbb47add39	2013-08-01 09:23:49 -07:00
Dmitry Kovalev	5b65246a71	Adding missing const to vp9_extra_bits array. Change-Id: Icd128ab58719e0b9066bdfa66a5d0d427a84d6df	2013-07-31 18:51:18 -07:00
Dmitry Kovalev	fb3e78a73a	vp9_get_pred_context_intra_inter cleanup. Change-Id: I8beeee4c020425175f7d5ec83be86afa7b95da1a	2013-07-31 18:33:04 -07:00
Jingning Han	9d67495f72	Optimize 32x32 2D inverse DCT for speed-up This commit exploits the sparsity of quantized coefficient matrix. It detects each 32x8 array and skips the corresponding inverse transformation if all entries are zero. For ped1080p at 8000 kbps, this on average reduces the runtime of 32x32 inverse 2D-DCT SSE2 function from 6256 cycles -> 5200 cycles. It makes the overall encoding process about 2% faster at speed 0. The speed-up is more pronounced for the decoding process. Change-Id: If20056c3566bd117642a76f8884c83e8bc8efbcf	2013-07-31 17:13:31 -07:00
Jingning Han	12f5762756	Remove unnecessary arguments in rd_pick_ref_frame This commit removes redundant arguments passing in the function of rd_pick_reference_frame. This resolves the clang warnings about potential use of uninitialized values. Change-Id: Ic68f949a9f8fcd0a583786b0c75321104ea44739	2013-07-31 17:04:13 -07:00
Dmitry Kovalev	8259cdf298	vp9_decodemv.c cleanup. Inlining VP9_NMV_UPDATE_PROB constant, consistent local variable names. Change-Id: I01692501982568fa535882d6b320e3c692f88abb	2013-07-31 15:03:36 -07:00
Dmitry Kovalev	9239e96536	Removing get_mi_{row, col} functions. Passing mi_row and mi_col parameters to functions explicitly. Removing unused xd argument from scale_mv function. Change-Id: Icb4c495ec72d26fb066c14470d3ae0b741fbf18a	2013-07-31 14:06:55 -07:00
Dmitry Kovalev	3be9fd9120	Merge "Removing unused "ishp" arguments."	2013-07-31 12:03:04 -07:00
Dmitry Kovalev	0e0a6f840b	Merge "Consistent update for inter_mode probabilities."	2013-07-31 12:02:35 -07:00
Dmitry Kovalev	500ade243a	Removing unused "ishp" arguments. Using different variable names "allow_hp" and "use_hp" instead of "usehp". Change-Id: I0cd5996ddeb46bd754473b680a993c0aaf8eb879	2013-07-31 11:27:53 -07:00
Jingning Han	ac7bab7575	Merge "Make the use of ref_frame index consistent"	2013-07-31 09:11:37 -07:00
Jingning Han	86c384d398	Make the use of ref_frame index consistent Refactor the frame buffer referencing in choose_partition and make it consistent with other places. This means to prevent potential issues when we extend reference frame buffer. Change-Id: I5ff33ed5f671e1f4cc7049622212769a9b4578d9	2013-07-30 19:49:36 -07:00
Dmitry Kovalev	8701bc11df	Consistent update for inter_mode probabilities. Using inter-mode counts instead of inter-mode-tree branch counts inside FRAME_COUNTS structure. Change-Id: I60dde13af37d06146d7d15543311c1b5044e9e04	2013-07-30 18:06:34 -07:00
Adrian Grange	fbd73648dd	Merge "Cleanup typos, remove unnecessary lines, replace switch"	2013-07-30 12:59:46 -07:00
Adrian Grange	b30a06b930	Cleanup typos, remove unnecessary lines, replace switch Removed unnecessary code lines, replaced switch with an if, fixed spelling errors and formatting. Change-Id: Ie48aa4604aa0ed48362ca359d792fb21b2ec1dc6	2013-07-30 12:10:32 -07:00
Yaowu Xu	88e48444da	Merge "removed duplication"	2013-07-30 09:38:02 -07:00
Yaowu Xu	a15d1f3134	removed duplication Change-Id: Ica23b66f6664e5a5b168499584f0afffbc54794f	2013-07-30 09:09:14 -07:00
Jingning Han	525745b17a	Remove a redundant branching in tokenize_b The tokenize_b function is only called when output flag is on. Hence removing the conditional branch on it therein. Change-Id: Ib709f47f23f39ca05a695faf86fa3377f11f2dd0	2013-07-29 17:08:13 -07:00
Jingning Han	455f2de20b	Tune tokenization/detokenization flow for speed-up This commit optimizes the tokenization and detokenization operational flow for speed-up. It makes the coding process about 0.3% faster at speed 0. Change-Id: I28008df7482874e4b5f237f2d418ff82a249dd56	2013-07-29 16:15:30 -07:00
Jingning Han	b5323ed89a	Skip redundant tokenization in rd loop This commit makes the encoder skip the redundant tokenization process in the rate-distortion optimization search loop, while updating the entropy contexts accordingly. It makes the speed 0 encoding process about 0.5% faster at no performance change. Change-Id: I34a4155a0b5332afeb45c93a51c7f35a294d685c	2013-07-29 16:09:16 -07:00
Jingning Han	5875d7a4a4	Merge "16x16 inverse 2D-DCT with DC only"	2013-07-29 15:29:25 -07:00
John Koleszar	9c6fafb25b	Merge "Remove unnecessary 64 byte alignment"	2013-07-29 15:09:15 -07:00
Jingning Han	a7c4de22e1	16x16 inverse 2D-DCT with DC only This commit provides special handle on 16x16 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero value. Change-Id: I7bf71be7fa13384fab453dc8742b5b50e77a277c	2013-07-29 14:45:53 -07:00
Dmitry Kovalev	828119d6ab	Renaming txfm to tx for consistency in some places. Change-Id: I2a6a646570e2af66315e7c658d00d99f80c4b127	2013-07-29 14:35:55 -07:00
John Koleszar	a31effca75	Remove unnecessary 64 byte alignment Fixes a warning on MSVS 2012 where the alignment of vp9_default_iscan_8x8 didn't match between its declaration and definition. Change-Id: I1466a15635f4b22594d705d570b7e399bfb6cf21	2013-07-29 14:02:02 -07:00
Dmitry Kovalev	730a34416f	Renaming NB_TXFM_MODES constant to TX_MODES. Change-Id: I10bf06e3a3d5271221ae6a42a36074d01d493039	2013-07-29 13:38:40 -07:00
Dmitry Kovalev	23391ea835	Renaming TX_SIZE_MAX_SB to TX_SIZES. Change-Id: I6aa4191935aa93461a07c41b59fdae1eb5f5f107	2013-07-29 12:25:34 -07:00
Jingning Han	decb1b94de	Merge "Shortcut 8x8/16x16 inverse 2D-DCT"	2013-07-29 11:04:07 -07:00
Dmitry Kovalev	cc0ff7ecfa	Cleanup: replacing xd->mode_info_context with temp variable. Change-Id: I5a3e83102784cabb918a5404405fcab99c5bb9b6	2013-07-26 19:05:37 -07:00
Ronald S. Bultje	118ccdcd30	Inverse dimension order in token_cost array. This allows us to increment the position at the band-level only as we go from one band to the next; more importantly, that allows us to use an add instead of multiply instruction, and omit the instruction altogether if the band doesn't change from one coef to the next, thus being slightly faster (probably more noticeable on systems where a multiply is expensive, like arm). Change-Id: I4343fe35b9f9a47fa00b217bdcbf5f91ff96c381	2013-07-26 17:30:04 -07:00
Dmitry Kovalev	35e7e7b614	Merge "vp9_decodemv.c cleanup."	2013-07-26 17:24:34 -07:00
Ronald S. Bultje	6f3054b65d	Merge "d45 intra prediction SSSE3 optimizations."	2013-07-26 17:21:09 -07:00
Ronald S. Bultje	dcacce6dd9	Merge "Save pixels instead of coefficients in intra4x4 RD loop."	2013-07-26 17:20:58 -07:00
Ronald S. Bultje	d30c8f41ef	Merge "Add best_rd breakout in intra4x4 RD loop."	2013-07-26 17:20:51 -07:00
Jingning Han	38fa487164	Shortcut 8x8/16x16 inverse 2D-DCT This commit brought back the shortcut implementation of 8x8/16x16 inverse 2D-DCT. When the eob <= 10, it skips the inverse transform operations on row 4:7/4:15 in the first round. For bus_cif at 1000 kbps, this provides about 2% speed-up at speed 0. Change-Id: I453e2d72956467d75be4ad8c04b4482ab889d572	2013-07-26 17:19:14 -07:00
Dmitry Kovalev	d42e60d2d8	vp9_decodemv.c cleanup. Renaming: read_intra_mode_info -> read_intra_frame_mode_info read_inter_mode_info -> read_inter_frame_mode_info read_intra_block_part -> read_intra_block_mode_info read_inter_block_part -> read_inter_block_mode_info read_ref_frame -> read_ref_frames read_reference_frame -> read_is_inter_block Using num_4x4_blocks_{wide, high}_lookup instead of bit shifts. Change-Id: I83c81573b4ef6f53f2f8d24683895014bebfba61	2013-07-26 16:49:49 -07:00
Jingning Han	b9c3dd481a	Merge "Special handle on DC only inverse 8x8 2D-DCT"	2013-07-26 16:04:14 -07:00
Dmitry Kovalev	620861dedc	Merge "Making read_inter_mode_info function more clear."	2013-07-26 15:47:40 -07:00
hkuang	aaa9755746	Merge "Fix some format error and code error in neon code."	2013-07-26 15:24:28 -07:00
Jingning Han	325e0aa650	Special handle on DC only inverse 8x8 2D-DCT This commit enables a special handle for the 8x8 inverse 2D-DCT, where only DC coefficient is quantized to be non-zero. For bus_cif at 2000 kbps, it provides about 1% speed-up at speed 0. Change-Id: I2523222359eec26b144cf8fd4c63a4ad63b1b011	2013-07-26 14:16:51 -07:00
hkuang	588b4daf54	Fix some format error and code error in neon code. Change-Id: I748dee8938dfb19f417f24eed005f3d216f83a82	2013-07-26 14:14:57 -07:00
Dmitry Kovalev	c09b81719f	Merge "General cleanups."	2013-07-26 13:59:39 -07:00
Ronald S. Bultje	94b0c6791d	d45 intra prediction SSSE3 optimizations. Change-Id: Ie48035ff4f93c41f8a9b3023e6444fd10432d8fb	2013-07-26 13:30:02 -07:00
Yaowu Xu	4f75a1f4ed	Merge "Auto min and max partition size experiment."	2013-07-26 12:10:27 -07:00
Paul Wilkins	fe5e2a91bb	Auto min and max partition size experiment. Speed feature experiment to set an upper and lower partition size limit based on what has been seen in spatial neighbors. This seems to give quite reasonable speed gains in local (10-15%) and when used with speed 0 the losses are small (0.25% derf, 0.35% stdhd). However, for now I am only enabling it on speed 1 as there may be clashes with the existing temporal partition selection in speed 2. Using a tighter min / max around the range derived from the neighbors increases speed further but at the cost of a bigger quality loss. However, I think this spatial method could be combined with data from either the last frame or a variance method (or both) to refine the range of minimum and maximum partition size. I.e. consider the min and max from spatial and temporal neighbors and the variance recommendation. Change-Id: I1b96bf8b84368d6aad0c7aa600fe141b4f07435f	2013-07-26 18:30:49 +01:00
Yunqing Wang	52256cdbca	Modify static threshold calculation Used 3 * standard_deviation in internal threshold calculation instead of fit curve. This actually approached the algorithm better. For comparison, similar tests were done: The overall psnr loss is less than before. 1. derf set: when static-thresh = 1, psnr loss is 0.329%; when static-thresh = 500, psnr loss is 0.970%; 2. stdhd set: when static-thresh = 1, psnr loss is 0.922%; when static-thresh = 500, psnr loss is 1.307%; Similar speedup is achieved. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.952 5.077s(50f) akiyo 500 500 48.866 4.169s(50f) parkjoy(1080p) 4000 0 30.388 78.20s(30f) parkjoy 4000 500 30.367 70.85s(30f) sunflower(1080p) 4000 0 44.402 74.55s(30f) sunflower 4000 500 44.414 68.69s(30f) Change-Id: Ic78833642ce1911dbbd1cb6c899a2d7e2dfcc1f3	2013-07-25 19:59:33 -07:00
Dmitry Kovalev	048e9c0991	Making read_inter_mode_info function more clear. Now read_inter_mode_info calls read_intra_block_part (renamed from read_intra_block_modes) or read_inter_block_part (just added). Change-Id: I541badea6b663e0ae692ec158665efb90ed20c03	2013-07-25 15:30:18 -07:00
Johann	67b07c520d	Merge "Add const to vp9_accum_mv_refs parameter"	2013-07-25 15:10:52 -07:00
Yunqing Wang	845fd5011c	Merge "Add encoding option --static-thresh"	2013-07-25 14:58:00 -07:00
Yunqing Wang	d36852b702	Add encoding option --static-thresh This option exists in VP8, and it was rewritten in VP9 to support skipping on different partition levels. After prediction is done, we can check if the residuals in the partition block will be all quantized to 0. If this is true, the skip flag is set, and only prediction data are needed in reconstruction. Based on DCT's energy conservation property, the skipping check can be estimated in spatial domain. The prediction error is calculated and compared to a threshold. The threshold is determined by the dequant values, and also adjusted by partition sizes. To be precise, the DC and AC parts for Y, U, and V planes are checked to decide skipping or not. Test showed that 1. derf set: when static-thresh = 1, psnr loss is 0.666%; when static-thresh = 500, psnr loss is 1.162%; 2. stdhd set: when static-thresh = 1, psnr loss is 1.249%; when static-thresh = 500, psnr loss is 1.668%; For different clips, encoding speedup range is between several percentage and 20+% when static-thresh <= 500. For example, clip bitrate static-thresh psnr time akiyo(cif) 500 0 48.923 5.635s(50f) akiyo 500 500 48.863 4.402s(50f) parkjoy(1080p) 4000 0 30.380 77.54s(30f) parkjoy 4000 500 30.384 69.59s(30f) sunflower(1080p) 4000 0 44.461 85.2s(30f) sunflower 4000 500 44.418 78.1s(30f) Higher static-thresh values give larger speedup with larger quality loss. Change-Id: I857031ceb466ff314ab580ac5ec5d18542203c53	2013-07-25 14:28:05 -07:00
Johann	6c8ef8d957	Add const to vp9_accum_mv_refs parameter Change-Id: I0625d8ffddf590dfecd1bb8b8d6f57ef64b8bf18	2013-07-25 14:25:33 -07:00
Dmitry Kovalev	7131cb0e3d	General cleanups. Removing unused constants, macros, and function declarations. Using ROUND_POWER_OF_TWO macro, vp9_zero, vp9_copy where possible. Moving #include from .h to .c. Merging for loops for motion vectors. Change-Id: Ic3bf841764a2bb177128bb3a6d7aa8f68229cd13	2013-07-25 14:13:48 -07:00
Dmitry Kovalev	d53fc9ee4e	Merge "Adding lookup table for size group."	2013-07-25 13:57:28 -07:00
Dmitry Kovalev	08fd41ccd7	Adding lookup table for size group. Change-Id: Ia6144d77ebed66e0739b62e4d673e26a95aa9550	2013-07-25 12:58:54 -07:00
Adrian Grange	e862c6f9eb	Merge "Simplify handling of sub-partition motion vectors"	2013-07-25 12:58:38 -07:00
Adrian Grange	6f0f0e4907	Merge "Use local variables rather than structure members"	2013-07-25 12:57:52 -07:00
Dmitry Kovalev	be00d3970d	Merge "Removing duplicated code for merging two probabilities."	2013-07-25 12:52:26 -07:00
Dmitry Kovalev	d604914f09	Merge "Removing vp9_adapt_mode_context function."	2013-07-25 12:46:31 -07:00
Jingning Han	d571af76d3	Merge "Make coeff_optimize initialized per-plane"	2013-07-25 12:46:14 -07:00
Dmitry Kovalev	f7ece83141	Merge "Inlining inc_mv_component_count function."	2013-07-25 12:45:23 -07:00
Dmitry Kovalev	9f8335d091	Merge "Removing duplicated PREDICTION_PROBS constant."	2013-07-25 12:45:03 -07:00
Yaowu Xu	51a8458822	Merge "fix a bug where flags are not reset"	2013-07-25 12:18:51 -07:00
Adrian Grange	be700e140a	Simplify handling of sub-partition motion vectors Simplified the code that extracts and uses the motion vectors for the 4 sub-partitions in rd_pick_partition. Change-Id: Iaf698ef7ee3aef9edd59015e1ae065dd359b17d9	2013-07-25 11:51:51 -07:00
Jingning Han	2f58faffa4	Make coeff_optimize initialized per-plane This commit makes the initialization of trellis coeff optimization a per-plane operation, thereby eliminating the redundant steps in encode_sby and encode_sbuv. It makes the encoder at speed 0 slightly faster. Change-Id: Iffe9faca6a109dafc0dd69dc7273cbdec19b17cd	2013-07-25 11:44:29 -07:00
Dmitry Kovalev	778989a097	Removing duplicated PREDICTION_PROBS constant. Already defined in vp9_seg_common.h. Change-Id: I5a0e3fa15966b1ebeb77ccd506b55fc231c22342	2013-07-25 11:08:21 -07:00
Dmitry Kovalev	47d61f008f	Removing vp9_adapt_mode_context function. Moving code from vp9_adapt_mode_context to vp9_adapt_mode_probs. Change-Id: I60829c30b28968cd813551ef3a206dfb98d323c9	2013-07-25 10:48:45 -07:00
Yaowu Xu	3e386aefc2	fix a bug where flags are not reset The feature that uses small partition results as a measure to skip mode evaluation at larger partition requires the flags to be reset. The reset was missing in the code path that calls rd_use_partition(). Change-Id: Ia0a3a0aee1a862b6e2333d596808db7c48033d50	2013-07-25 10:28:38 -07:00
Jingning Han	242157c756	Merge "SSE2 inverse 4x4 2D-DCT with DC only"	2013-07-25 08:49:37 -07:00
Scott LaVarnway	a0e8b45fee	Merge "pack_inter_mode_mvs cleanup"	2013-07-25 04:47:56 -07:00
Jingning Han	384e37e32b	SSE2 inverse 4x4 2D-DCT with DC only Add SSE2 implementation to handle the special case of inverse 2D-DCT where only DC coefficient is non-zero. Change-Id: I2c6a59e21e5e77b8cf39a4af5eecf4d5ade32e2f	2013-07-24 23:19:56 -07:00
Jingning Han	91fa12429c	Merge "Merge vp9_dc_only_idct_add and vp9_short_idct4x4_1"	2013-07-24 23:18:24 -07:00
Dmitry Kovalev	40358dc406	Removing duplicated code for merging two probabilities. Adding common merge_probs and merge_probs2 functions. Changing ints to unsigned ints in some places. Change-Id: Icf088ffdea7cf5b95284a128916409bdd53506b0	2013-07-24 17:44:04 -07:00
Dmitry Kovalev	4450fa4cd9	Inlining vp9_init_mode_contexts function. Change-Id: I21ee76bcae101cc9f6ef1d867622e50b7ae565fc	2013-07-24 17:03:03 -07:00
Jingning Han	d2de1ca37b	Merge vp9_dc_only_idct_add and vp9_short_idct4x4_1 They share the same functionality, so merging together. Change-Id: I98a0386fcee052cb854f9ff90c283c1b844bcb79	2013-07-24 16:51:15 -07:00
Dmitry Kovalev	fcc34796d2	Removing CONFIG_BALANCED_COEFTREE experiment. Change-Id: I61a8b0101eac3ee2e0621d56151b90c269fd4db4	2013-07-24 15:53:42 -07:00
Dmitry Kovalev	1787b00214	Merge "Adding condition inside get_tx_type_{4x4, 8x8, 16x16}."	2013-07-24 15:23:22 -07:00
Dmitry Kovalev	0064958c71	Inlining inc_mv_component_count function. Change-Id: Ic99d07a56b1752ec49fc5074b1dd6804b17609a0	2013-07-24 15:03:00 -07:00
Dmitry Kovalev	9139ee0908	Adding condition inside get_tx_type_{4x4, 8x8, 16x16}. Adding plane type check condition because it was always used outside of get_tx_type_{4x4, 8x8, 16x16}. Change-Id: I02f0bbfee8063474865bd903eb25b54d26e07230	2013-07-24 12:55:45 -07:00
James Zern	9e29b4cd54	Merge "vp9_find_mv_refs_idx: remove unused split_count"	2013-07-24 12:49:15 -07:00
James Zern	e6c0387edd	vp9_find_mv_refs_idx: remove unused split_count variable was write only Change-Id: I04b002178f66961836ee08fb60a05b91b54e91d8	2013-07-24 11:51:37 -07:00
Adrian Grange	4cfd36d8fd	Use local variables rather than structure members Although local copies of the mode member variables (mode, ref_frame) were made, they were not used in all places. Also, made a local copy of the second_ref_frame member. Change-Id: I84d8c822e5cb3d8a02fc3de8a4037ca3fea8bfad	2013-07-24 11:17:44 -07:00
Adrian Grange	a183f17d33	Merge "Correct spelling mistakes"	2013-07-24 09:48:57 -07:00
Ronald S. Bultje	7817d3221f	Save pixels instead of coefficients in intra4x4 RD loop. Prevents doing duplicate IDCTs; encoding of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.0 to 1min3.5, i.e. 0.87% faster overall. Change-Id: I2df39e29ed9d5ea5e7d2704a34940ba622832ddd	2013-07-24 09:03:20 -07:00
Ronald S. Bultje	b72ecbb1b9	Add best_rd breakout in intra4x4 RD loop. Encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min5.4 to 1min4.0, i.e. 2.2% faster overall. Change-Id: I8c32f2aff9a649ce7dd49d910dc5ba16b99c3bc6	2013-07-24 09:02:05 -07:00
Adrian Grange	bc8b0529db	Correct spelling mistakes Change-Id: Id4138293efeac4503b2e01ce7a6c150a5abeef77	2013-07-24 07:58:26 -07:00
Ronald S. Bultje	47336afd8d	Merge "More optimizations for cost_coeffs()."	2013-07-23 21:36:12 -07:00
Jingning Han	666c266623	Merge "Unify the use of encode_b_args/optimize_block_args"	2013-07-23 18:08:50 -07:00
Dmitry Kovalev	1099a436d3	Moving counts from FRAME_CONTEXT to new struct FRAME_COUNTS. Counts are separate from frame context. We have several frame contexts but need only one copy of all counts. Change-Id: I5279b0321cb450bbea7049adaa9275306a7cef7d	2013-07-23 17:02:08 -07:00
Jingning Han	ab77828b36	Unify the use of encode_b_args/optimize_block_args The struct optimize_block_args is defined same as encode_b_args. Remove this redundant definition, and use encode_b_args consistently. Change-Id: I1703aeeb3bacf92e98a34f4355202712110173d9	2013-07-23 16:04:02 -07:00
Dmitry Kovalev	8d13b0d1df	Removing LOW_PRECISION_MV_UPDATE define. Change-Id: I78d16ee758e1fae0200b746f00031f6d9c6d6ce7	2013-07-23 15:41:45 -07:00
Dmitry Kovalev	a9bbabd94b	Merge "Removing vp9_is_interpolating_filter array."	2013-07-23 15:01:19 -07:00
Adrian Grange	719cd35f3a	Merge "Rolled-up several for loops into one"	2013-07-23 15:00:06 -07:00
Adrian Grange	646edbc1b2	Rolled-up several for loops into one Several consecutive for loops executed over the same index range, so I rolled them into one. Change-Id: I5cfcc8c38c738478965768409cca9d09adf224e1	2013-07-23 14:32:21 -07:00
Dmitry Kovalev	db7f5d28b9	Removing vp9_is_interpolating_filter array. All filters are interpolating now, so we don't need this array, all values from this array are evaluated to true. Change-Id: I9af6d8219ae0eb984063cd15e4e2296374ae4961	2013-07-23 14:24:39 -07:00
Dmitry Kovalev	2855d8aea1	Merge "Adding update_tx_counts function."	2013-07-23 13:57:59 -07:00
Dmitry Kovalev	0d59d6efcd	Merge "Removing MODE_COUNT_TESTING from vp9_entropymode.c."	2013-07-23 13:57:05 -07:00
Jingning Han	825f676ceb	Merge "Make xform_quant operations tx_type independent"	2013-07-23 13:40:27 -07:00
Dmitry Kovalev	9c2c17dec7	Merge "Cleanup inside vp9_get_pred_context_tx_size."	2013-07-23 12:45:49 -07:00
Dmitry Kovalev	a97d4ab123	Removing MODE_COUNT_TESTING from vp9_entropymode.c. Change-Id: I5367bc1d9e660d86879d285a6f146d8a47e62464	2013-07-23 12:37:41 -07:00
Jingning Han	e9e2fe8ec3	Make xform_quant operations tx_type independent The xform_quant() module is only used by inter modes, hence removing the redundant switches therein conditioned on tx_type. Change-Id: Ib87ce5b2f2e4cbf3ceb133a1108afa173c933a3f	2013-07-23 12:37:25 -07:00
James Zern	8dede954c7	Merge "vp9: make some static tables const"	2013-07-23 11:37:01 -07:00
Jingning Han	4ef1d35abf	Merge "Skip inverse transform when eob is zero"	2013-07-23 10:31:19 -07:00
James Zern	c3871f8f70	Merge "VP9_COMMON: remove unused temp_scale_frame"	2013-07-23 10:30:55 -07:00
Deb Mukherjee	9360fd3dcf	Merge "Diamond search change to accelerate movement"	2013-07-23 10:14:10 -07:00
Jingning Han	0359ad7f9a	Skip inverse transform when eob is zero When all the transform coefficients were quantized to zero, skip the inverse transform operation. For bus_cif at 1000 kbps, the runtime goes from 154967ms -> 149842ms, i.e., about 3% speed-up, at speed 0. Change-Id: Ic0a813fff5e28972d4888ee42d8747846a6c3cc6	2013-07-23 10:06:41 -07:00
Paul Wilkins	cedd24ec61	Merge "Renaming of segment constants."	2013-07-23 08:16:12 -07:00
Scott LaVarnway	7bc294a3fe	pack_inter_mode_mvs cleanup xd->mode_info_context is set to m prior to this call. Change-Id: Ibc442529961750c29ccf0c6cae08cb2b0431415f	2013-07-23 10:08:28 -04:00
Jim Bankoski	256ee00093	Merge "clean up bw, bh"	2013-07-23 06:58:28 -07:00
Jim Bankoski	86a9dec73c	clean up bw, bh many structures use bw and bh and they have different meanings. This cl attempts to start this clean up and remove unnecessary 2 step look up log and then shift operations... also removed partition type multiple operation code in bitstream.c. Change-Id: I7e03e552bdfc0939738e430862e3073d30fdd5db	2013-07-23 06:51:44 -07:00
Scott LaVarnway	2fd20eb37d	Merge "Eliminated prev_mip memsets/memcpys in encoder"	2013-07-23 06:43:52 -07:00
Paul Wilkins	7c134bc0cd	Merge "Reworked the auto_mv_step_size speed feature"	2013-07-23 04:49:55 -07:00
Paul Wilkins	32042af14b	Renaming of segment constants. Renamed: MAX_MB_SEGMENTS to MAX_SEGMENTS MB_SEG_TREE_PROBS to SEG_TREE_PROBS The minimum unit for segmentation in the segment map is now 8x8 so it is misleading to use MB_ as macro-block traditionally refers to a 16x16 region. Change-Id: I0b55a6f0426bb46dd13435fcfa5bae0a30a7fa22	2013-07-23 12:09:04 +01:00
James Zern	3c8cce353f	vp9: make some static tables const Change-Id: I8bcae51271673da8755c66a51aea005dfe6a3739	2013-07-22 19:19:13 -07:00
Frank Galligan	e88db77892	Merge "Speedup loopfilter neon code."	2013-07-22 17:39:42 -07:00
Dmitry Kovalev	0ad079e583	Cleanup inside vp9_get_pred_context_tx_size. Using max_txsize_lookup to get max transform size. Change-Id: If4b39beba3c06a581effd8cab698ea90727dc2c9	2013-07-22 17:18:11 -07:00
James Zern	ab139094ed	Merge "VP9_COMMON: drop cur_tile_{row,col}_idx"	2013-07-22 17:12:39 -07:00
Frank Galligan	5af6bf6c43	Speedup loopfilter neon code. Try and cut down the cycle count by rearranging the instructions so there are fewer stalls. Change-Id: Ic1383335ee0f05e656477d9ee9c179ec231285d5	2013-07-22 17:00:01 -07:00
Ronald S. Bultje	e20fcd9585	More optimizations for cost_coeffs(). 4x4: 163 -> 123 cycles (33% faster) 8x8: 491 -> 399 cycles (23% faster) 16x16: 1889 -> 1763 cycles (7% faster) 32x32: 8311 -> 8180 cycles (1.6% faster) Overall encoding time of first 50 frames of bus (speed 0) @ 1500kbps goes from 1min4.33 to 1min3.00, i.e. 2.11% faster. Change-Id: Ib52d1dbb5649b14de769d3e7a74af67440b5284f	2013-07-22 16:09:09 -07:00
James Zern	38a4412e1b	vp9: apply loopfilter inline if possible excludes tiled content currently Change-Id: I44155253e8d6771e5e039d663be5f21cc9d0355d	2013-07-22 15:52:10 -07:00
Dmitry Kovalev	b2fc6fa969	Adding update_tx_counts function. Moving common encoder/decoder code to update_tx_counts. Also renaming vp9_get_pred_probs_tx_size to get_tx_probs2 and adding get_tx_probs to call vp9_get_pred_context_tx_size inside read_selected_tx_size only once (twice before). Change-Id: Ia50247f3893de88ef8e9041b0d44be44a40aaa4d	2013-07-22 14:57:43 -07:00
James Zern	746154d905	Merge "filter_block_plane: remove MACROBLOCKD param"	2013-07-22 13:43:34 -07:00

2659 Commits