generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	66d94ac13c	Improve 32x32 forward dct The commit improves the 32x32 forward dct implementation: 1. change to use same constants and rounding as other forward dcts 2. select rounding to specifically minimize the roundtrip error, which improved average 19/block to .77/block using 100000 random input. Test showed a small but consistent gain on all test sets, about .15% Change-Id: If0afd6a71880a522f60c1c234be0462092c2eb53	2013-02-26 09:23:01 -08:00
Dmitry Kovalev	9bf3f75168	Changing pitch value meaning for fht and iht transforms. Pitch now means the number of elements, not the number of bytes. Change-Id: Idb9f2f012e39b09d596a3cc1802305a80b7c13af	2013-02-25 18:19:55 -08:00
Yaowu Xu	ecb03e9a3f	make cost_coeffs to use combined context Change-Id: Ia15f4244595fab49bffda0c651a750a8a9481d28	2013-02-25 17:01:33 -08:00
Dmitry Kovalev	9770d564f4	Code cleanup. Removing switch statements for inverse hybrid transforms. Making code style consistent for all similar transform implementations. Renaming shortpitch and short_pitch variables to half_pitch. Change-Id: I875f7a82aae4e8063a58777bf1cc3f1e67b48582	2013-02-25 15:14:01 -08:00
Dmitry Kovalev	3171b69dee	Merge "Code cleanup." into experimental	2013-02-25 14:14:22 -08:00
Dmitry Kovalev	0287d20a05	Merge "Code cleanup." into experimental	2013-02-25 13:58:06 -08:00
Jingning Han	e7b67d33a9	Merge "Improving the forward 16x16 ADST/DCT accuracy" into experimental	2013-02-25 13:38:33 -08:00
Dmitry Kovalev	20b0cb599b	Code cleanup. Removing redundant parentheses, better code formatting, introducing ROUND_POWER_OF_TWO macro to replace repeated expression. Change-Id: I91aad7a53ed03482428b2419de4bb99fd92c6771	2013-02-25 13:38:18 -08:00
Dmitry Kovalev	ab196b7e9b	Code cleanup. Lower case names of variables. Removing redundant spaces, parentheses, casts, and variables. Change-Id: I55b80c55b7d5adca44c1e8adb40a124c0680f229	2013-02-25 13:33:56 -08:00
James Zern	b2fc3ca066	vp9: promote gf_group_bits calculation to 64-bit avoids signed integer overflow Change-Id: I9ffcdba90b21edb324d1b173fd11d613e0592931	2013-02-25 13:00:18 -08:00
Paul Wilkins	0e36158c70	Merge "Minor rate control refactoring and experiments." into experimental	2013-02-25 12:49:54 -08:00
Jingning Han	65821d6680	Improving the forward 16x16 ADST/DCT accuracy Increase the first stage dynamic range by 4 times, and reduce it back with proper rounding before applying the second stage. Hence it still fits in the given dynamic range and slightly improves the key frame coding performance. Change-Id: Ia4c5907446f20a95dc3de079c314b3ad1221d8aa	2013-02-25 12:13:37 -08:00
Jingning Han	77a3becf92	clean up forward and inverse hybrid transform Rebased. Remove the old matrix multiplication transform computation. The 16x16 ADST/DCT can be switched on/off and evaluated by setting ACTIVE_HT16 300/0 in vp9/common/vp9_blockd.h. Change-Id: Icab2dbd18538987e1dc4e88c45abfc4cfc6e133f	2013-02-25 09:16:12 -08:00
Paul Wilkins	97da8b8c33	Minor rate control refactoring and experiments. Some minor refactoring of code relating to estimates of bits per MB at a given Q and estimating the allowed Q range. Most of the changes here were included in a previous commit. This commit seeks to separate out the refactoring from the more material changes. Two #define control flags have been added for experimentation. ONE_SHOT_Q_ESTIMATE forces the two pass encoder to use its initial Q range estimate for the whole clip even if this results in a miss on the target data rate. In effect this tightens the Q range seen at the expense of rate control accuracy. DISABLE_RC_LONG_TERM_MEM is a related flag that disables the long term memory in the rate control. Local adjustments are still made to try and better hit the rate target on a per frame basis but the impact of rate control misses is not propagated to the remainder of the clip. This means that for example an overshoot early on will not cause frames later in the clip to be starved of bits. Again the result of this relaxation may be less rate control accuracy especially on short clips. The flags are disabled by default for now. Change-Id: I7482f980146d8ea033b5d50cc689f772e4bd119e	2013-02-25 17:07:45 +00:00
Yaowu Xu	499fe05dc0	optimize forward 16x16 DCT for accuracy This commit added pre/post scaling for first half of fDCT16x16 to reduce error, by simulation of 100,000 blocks for random inputs, the average sse reduced from 2.1/block to 0.0498/block. also enabled tests for 16x16 fDCT and iDCT Change-Id: Id2a95f0464c6dd4118797d456237ae90274c0f02	2013-02-25 07:47:27 -08:00
Ronald S. Bultje	0c9e2e9a1d	Split coefficient token tables intra vs. inter. Change-Id: I5416455f8f129ca0f450d00e48358d2012605072	2013-02-23 07:33:46 -08:00
Paul Wilkins	c17672a33d	Further changes to coefficient contexts. This patch alters the balance of context between the coefficient bands (reflecting the position of coefficients within a transform block) and the energy of the previous token (or tokens) within a block. In this case the number of coefficient bands is reduced but more previous token energy bands are supported. Some initial rebalancing of the default tables has been done by running multiple derf clips at multiple data rates using the ENTROPY_STATS macro. Further balancing needs to be done using larger image formats, especially in regard to the bigger transform sizes, which are not as well represented in encodings of smaller image formats. Change-Id: If9736e95c391e711b04aef6393d26f60f36e1f8a	2013-02-23 07:29:09 -08:00
Yaowu Xu	bf0570a7e6	Merge "optimize 8x8 fdct rounding for accuracy" into experimental	2013-02-22 22:20:57 -08:00
Yaowu Xu	22012ee994	optimize 8x8 fdct rounding for accuracy The commit added a final rounding choice for 8x8 forward dct to get rid of a sign bias at DC position and improve the accuracy in terms of round trip error for 8x8 fDCT/iDCT. This commit also enabled forward 8x8 dct test. Change-Id: Ib67f99b0a24d513e230c7812bc04569d472fdc50	2013-02-22 16:55:30 -08:00
James Zern	e5fb6321a1	give vp9 variance struct a unique name variance_vtable clashed with vp8/common/variance.h Change-Id: I09c1de44d5519f1bd13f58c01144c0de4706de6f	2013-02-22 16:25:13 -08:00
James Zern	c21226b638	Merge "vp8: make gf_group_bits 64-bit"	2013-02-22 15:31:28 -08:00
James Zern	5e0724abad	Merge "vp8_first_pass(): avoid floating point div by 0"	2013-02-22 15:30:14 -08:00
James Zern	4e00060d29	vp8: make gf_group_bits 64-bit avoids signed integer overflow; matches kf_group_bits Change-Id: I193145cdc4fa53e70fba0a1731a03eb1a574931d	2013-02-22 12:45:28 -08:00
James Zern	fba9772dd2	vp8_first_pass(): avoid floating point div by 0 Change-Id: Id1e6a12db6b0c1d3f64ead8fd8834aadc30fbed2	2013-02-22 12:41:59 -08:00
Jingning Han	936aa281b5	Fixed the buffer overflow issue The issue that potentially broke the encoding process was due to the fact that the length of token link is calculated from the total number of tokens coded, while it is possible, in high bit-rate setting, this length is greater than the buffer length initially assigned to the cpi->tok. This patch increases the initially allocated buffer length assigned to cpi->tok from (mb_rows * mb_cols * 24 * 16) to (mb_rows * mb_cols * (1 + 24 * 16)). It resolves the buffer overflow problem. Change-Id: I8661a8d39ea0a3c24303e3f71a170787a1d5b1df	2013-02-22 12:30:35 -08:00
John Koleszar	606a2561d6	Merge "Code cleanup." into experimental	2013-02-22 11:20:20 -08:00
Dmitry Kovalev	548b4dd5f2	Code cleanup. Removing redundant 'extern' keywords and parentheses, fixing indentation, making variable names lower case, using short expressions x += c instead of x = x + c, minor code simplifications. Change-Id: If6a25fcf306d1db26e90d27e3c24a32735c607de	2013-02-22 11:03:14 -08:00
Jingning Han	c67a20994f	Merge "Forward butterfly hybrid transform" into experimental	2013-02-22 09:20:26 -08:00
Paul Wilkins	b5f3cb6e37	Merge "Experimental removal of over quant code" into experimental	2013-02-22 08:44:40 -08:00
Paul Wilkins	dbf4942046	Experimental removal of over quant code The over quant code was added in VP8 post bitstream freeze to allow compression to lower data rates. In VP9 the real quantizer range has been greatly extended anyway. Change-Id: I5d384fa5e9a83ef75a3df34ee30627bd21901526	2013-02-22 14:00:51 +00:00
Jingning Han	babbd5d170	Forward butterfly hybrid transform This patch includes 4x4, 8x8, and 16x16 forward butterfly ADST/DCT hybrid transform. The kernel of 4x4 ADST is sin((2k+1)(n+1)/(2N+1)). The kernel of 8x8/16x16 ADST is of the form sin((2k+1)(2n+1)/4N). Change-Id: I8f1ab3843ce32eb287ab766f92e0611e1c5cb4c1	2013-02-21 18:24:28 -08:00
Dmitry Kovalev	5a18106fb7	Code cleanup. Removing redundant 'extern' keywords. Moving VP9DX_BOOL_DECODER from .h to .c file. Change-Id: I5a3056cb3d33db7ed3c3f4629675aa8e21014e66	2013-02-21 13:50:15 -08:00
Ronald S. Bultje	8c16dee4f2	Merge "Remove "eobs" array in MACROBLOCKD." into experimental	2013-02-21 11:30:29 -08:00
John Koleszar	4674312382	Merge "Code cleanup." into experimental	2013-02-21 10:56:17 -08:00
Dmitry Kovalev	5da8534963	Code cleanup. Removing redundant 'extern' keyword from function declarations and making function arguments lower case. Change-Id: Idae9a2183b067f2b6c85ad84738d275e8bbff9d9	2013-02-21 10:34:33 -08:00
Ronald S. Bultje	35524e2231	Remove "eobs" array in MACROBLOCKD. The information is a duplicate of "eob" in BLOCKD. Change-Id: Ia6416273bd004611da801e4bfa6e2d328d6f02a3	2013-02-21 10:07:36 -08:00
Deb Mukherjee	048f593703	Merge "Refactoring of switchable filter search for speed" into experimental	2013-02-21 09:23:50 -08:00
John Koleszar	138ffb6ea9	Merge "Avoid division in intra prediction" into experimental	2013-02-21 08:33:17 -08:00
Deb Mukherjee	28b1db9278	Refactoring of switchable filter search for speed Refactors the switchable filter search in the rd loop to improve encode speed. Uses a piecewise approximation to a closed form expression to estimate rd cost for a Laplacian source with a given variance and quantization step-size. About 40% encode time reduction is achieved. Results (on a feb 12 baseline) show a slight drop: derf: -0.019% yt: +0.010% std-hd: -0.162% hd: -0.050% Change-Id: Ie861badf5bba1e3b1052e29a0ef1b7e256edbcd0	2013-02-20 18:34:42 -08:00
Jingning Han	abfd2a4880	Merge "Fixed the buffer overflow issue" into experimental	2013-02-20 16:27:27 -08:00
Jingning Han	232ccc2fbe	Fixed the buffer overflow issue The issue that potentially broke the encoding process was due to the fact that the length of token link is calculated from the total number of tokens coded, while it is possible, in high bit-rate setting, this length is greater than the buffer length initially assigned to the cpi->tok. This patch increases the initially allocated buffer length assigned to cpi->tok from (mb_rows * mb_cols * 24 * 16) to (mb_rows * mb_cols * (1 + 24 * 16)). It resolves the buffer overflow problem. Change-Id: I8661a8d39ea0a3c24303e3f71a170787a1d5b1df	2013-02-20 15:41:48 -08:00
Dmitry Kovalev	e6c89a1f9b	Merge "Code cleanup." into experimental	2013-02-20 12:47:54 -08:00
Yaowu Xu	441f24de3d	Merge "Merge lossless experiment" into experimental	2013-02-20 12:27:26 -08:00
Dmitry Kovalev	eb6aee50a4	Code cleanup. Change-Id: I7c6e3bebd94856b24dbe2aded7f9e04ef8bb8c08	2013-02-20 11:36:31 -08:00
Yaowu Xu	d262e26cc7	Merge lossless experiment Change-Id: I7b7b8d4fda3a23699e0c920d727f8c15d37d43aa	2013-02-20 07:54:28 -08:00
Paul Wilkins	ef01b956d8	Entropy stats output code. Fixes to make Entropy stats code work again Change-Id: I62e380481a4eb4c170076ac6ab36f0c2b203e914	2013-02-20 14:33:19 +00:00
Tero Rintaluoma	56e6c66b49	Avoid division in intra prediction - Using multiplication and shifting instead of division in intra prediction. - Maximum absolute difference is 1 for division statements in d45, d27, d63 prediction modes. However, errors can accumulate for large block sizes when using already predicted values. - Maximum number of non-matching result values in loops using division: 4x4 0/16, 8x8 0/64, 16x16 10/256, 32x32 13/1024, 64x64 122/4096. Overall PSNR: derf: 0.005, yt: -0.022, std-hd: 0.021, hd: -0.006. Change-Id: I3979a02eb6351636442c1af1e23d6c4e6ec1d01d	2013-02-20 10:37:36 +02:00
Yaowu Xu	6b1b341774	Merge "fixed an enc/dec mis-match issue" into experimental	2013-02-19 16:53:30 -08:00
Yaowu Xu	b13f38d4b3	fixed an enc/dec mis-match issue The issue was caused by an out-of-order merge, which led to the wrong functions being called in lossless mode. Change-Id: If157729abab62954c729e0377e7f53edb7db22ca	2013-02-19 16:26:27 -08:00
Jingning Han	cd907b1601	16x16 butterfly inverse ADST/DCT hybrid transform rebased. This patch includes 16x16 butterfly inverse ADST/DCT hybrid transform. It uses the variant ADST of kernel sin((2k+1)*(2n+1)/4N), which allows a butterfly implementation. The coding gains as compared to DCT 16x16 are about 0.1% for both derf and std-hd. It is noteworthy that for std-hd sets many sequences gain about 0.5%, some 0.2%. There are also a few points that show -1% to -3% performance. Hence the average goes to about 0.1%. Change-Id: Ie80ac84cf403390f6e5d282caa58723739e5ec17	2013-02-19 09:07:00 -08:00
Ronald S. Bultje	ae81d3a03f	Merge "Minor cosmetic cleanups." into experimental	2013-02-19 08:54:44 -08:00
Ronald S. Bultje	0694ea0ed6	Merge "Prevent filling transform size cache with uninitialized values." into experimental	2013-02-19 08:54:35 -08:00
Yaowu Xu	93d6b86cfd	Use lossless for Q0 The commit changes the coding mode to lossless whenever the lowest quantizer is chosen. As expected, test results showed no difference for cif and std-hd set where Q0 is rarely used. For yt and yt-hd set, Q0 is used for a number of clips, where this commit helped a lot in the high end. Average over all clips in the sets: yt: 2.391% 1.017% 1.066% hd: 1.937% .764% .787% Change-Id: I9fa9df8646fd70cb09ffe9e4202b86b67da16765	2013-02-19 06:18:42 -08:00
Ronald S. Bultje	aa84c16da2	Minor cosmetic cleanups. Change-Id: I13d8ae754827368755575dd699a087b3b11f5b16	2013-02-15 17:21:16 -08:00
Ronald S. Bultje	ebfdaa0e0b	Prevent filling transform size cache with uninitialized values. The 32x32 value in case of splitmv was uninitialized. This leads to all kinds of erratic behaviour down the line. Also fill in dummy values for superblocks in keyframes (the values are currently unused, but we run into integer overflows anyway, which makes detecting bad cases harder). Lastly, in case we did not find any RD value at all, don't set tx_diff to INT_MIN, but instead set it to zero (since if we couldn't find a mode, it's unlikely that any particular transform would have made that worse or better; rather, it's likely equally bad for all tx_sizes). Change-Id: If236fd3aa2037e5b398d03f3b1978fbbc5ce740e	2013-02-15 17:21:16 -08:00
Ronald S. Bultje	4dfcb129fd	Merge "Remove some unused structs and members from the decoder." into experimental	2013-02-15 17:11:38 -08:00
Ronald S. Bultje	5bb103c486	Merge "Remove Y2 and Y-no-DC token types from the bitstream." into experimental	2013-02-15 17:11:20 -08:00
Jingning Han	e343732a92	Fixed a subtle issue that breaks encoding process This issue breaks the encoding process of the codebase. The effect emerges only in particular test sequence at certain bit-rates and frame limits. Change-Id: I02e080f2a49624eef9a21c424053dc2a1d902452	2013-02-15 14:49:30 -08:00
Ronald S. Bultje	6cde1c58d7	Remove some unused structs and members from the decoder. Change-Id: Ie309cb1f683a51c5dfac405fb32e8e2d6ee143ed	2013-02-15 14:06:30 -08:00
Ronald S. Bultje	3af36ea8cc	Remove Y2 and Y-no-DC token types from the bitstream. Change-Id: I7a5314daca993d46b8666ba1ec2ff3766c1e5042	2013-02-15 14:06:30 -08:00
Ronald S. Bultje	48598e30b1	Remove y2dc/ac Q delta values from the bitstream. Since there is no Y2, these values are always zero. This changes the bitstream results slightly, hence a separate commit. Change-Id: I2f838f184341868f35113ec77ca89da53c4644e0	2013-02-15 14:06:30 -08:00
Ronald S. Bultje	46dff5d233	Remove some Y2-related code. Change-Id: I4f46d142c2a8d1e8a880cfac63702dcbfb999b78	2013-02-15 14:06:25 -08:00
Scott LaVarnway	7755657ea7	Merge "WIP: ssse3 version of convolve avg functions" into experimental	2013-02-15 07:54:21 -08:00
John Koleszar	716db10f0d	Merge "Moved vp9_get_coef_band to header file" into experimental	2013-02-14 18:02:55 -08:00
Scott LaVarnway	ae886d6bff	Moved vp9_get_coef_band to header file allowing the compiler to inline. Change-Id: I66e5caf5e7fefa68a223ff0603aa3f9e11e35dbb	2013-02-14 12:27:25 -08:00
Yaowu Xu	03f28c0a12	Merge "Rewrote fdct16x16" into experimental	2013-02-14 09:06:37 -08:00
Paul Wilkins	45712dc8c8	Merge "Abstract selection of coef band." into experimental	2013-02-14 03:23:31 -08:00
Yunqing Wang	048b9d41a6	Rewrote fdct16x16 Used same algorithm as others. Change-Id: Ifdac560762aec9735cb4bb6f1dbf549e415c38a0	2013-02-13 16:19:10 -08:00
Ronald S. Bultje	51afedbe28	Merge "Remove 2nd-order transform for first-order DC coefficients." into experimental	2013-02-13 13:58:02 -08:00
Ronald S. Bultje	89a206ef2f	Add support for tile rows. These allow sending partial bitstream packets over the network before encoding a complete frame is completed, thus lowering end-to-end latency. The tile-rows are not independent. Change-Id: I99986595cbcbff9153e2a14f49b4aa7dee4768e2	2013-02-13 12:31:00 -08:00
Ronald S. Bultje	42d6be8080	Remove 2nd-order transform for first-order DC coefficients. Since addition of the larger-scale transforms (16x16, 32x32), these don't give a benefit at macroblock-sizes anymore. At superblock-sizes, 2nd-order transform was never used over the larger transforms. Future work should test whether there is a benefit for that use case. Change-Id: I90cadfc42befaf201de3eb0c4f7330c56e33330a	2013-02-13 12:28:19 -08:00
Paul Wilkins	9255ad107f	Abstract selection of coef band. This patch abstracts the selection of the coefficient band context into a function as a precursor to further experiments with the coefficient context. It also removes the large per TX size coefficient band structures and uses a single matrix for all block sizes within the test function. This may have an impact on quality (results to follow) but is only an intermediate step in the process of redefining the context. Also the quality impact will be larger initially because the default tables will be out of step with the new banding. In particular the 4x4 will in this case only use 7 bands. If needed we can add back block size dependency localized within the function, but this can follow on after the other changes to the definition of the context. Change-Id: Id7009c2f4f9bb1d02b861af85fd8223d4285bde5	2013-02-13 19:01:25 +00:00
Paul Wilkins	56049d9488	Fixed encoder decoder mismatch. Reverted part of change I19981d1ef0b33e4e5732739574f367fe82771a84 That gives rise to an enc/dec mismatch. As things stand the memsets are still needed. Change-Id: I9fa076a703909aa0c4da0059ac6ae19aa530db30	2013-02-13 18:56:56 +00:00
Paul Wilkins	0d284ffed1	Abstract the selection of coefficient context. This is an initial step to facilitate experimentation with changes to the prior token context used to code coefficients to take better account of the energy of preceding tokens. This patch merely abstracts the selection of context into two functions and does not alter the output. Change-Id: I117fff0b49c61da83aed641e36620442f86def86	2013-02-13 18:56:30 +00:00
Paul Wilkins	afa57bfc97	Merge "Remove NEWCOEFCONTEXT experiment." into experimental	2013-02-13 10:41:13 -08:00
Yaowu Xu	f01b08c96c	Merge "enable bitstream lossless support" into experimental	2013-02-13 10:26:58 -08:00
Yaowu Xu	d3de97794f	Merge "fix the lossless experiment" into experimental	2013-02-13 09:54:35 -08:00
Yaowu Xu	17db5d00be	enable bitstream lossless support 1. Added a bit in frame header to indicate if a frame is encoded in lossless mode, so decoder does not make the decision based on Q0 2. Minor changes to make sure that lossy coding works same as when the lossless experiment is not enabled. 3. Renamed function pointers for transforms to be consistent, using prefix fwd_txm and inv_txm for forward and inverse respectively. To encode in lossless mode, use "--lossless=1 --min-q=0 --max-q=0" with vpxenc. Change-Id: Ifae53b26d2ffbe378d707e29d96817b8a5e6c068	2013-02-13 09:24:39 -08:00
Yaowu Xu	16f25f9dc8	fix the lossless experiment Change-Id: I95acfc1417634b52d344586ab97f0abaa9a4b256	2013-02-13 09:20:26 -08:00
Scott LaVarnway	30f866f44b	WIP: ssse3 version of convolve avg functions Initial ssse3 convolve avg functions and is one step closer to using x86inc.asm. The decoder performance improved by 8% for the test clip used. This should be revisited later to see if averaging outside the loop is better than having many similar filter functions. Change-Id: Ice3fafb423b02710b0448ffca18b296bcac649e9	2013-02-13 09:15:38 -08:00
Paul Wilkins	6a9f0c61a4	Remove NEWCOEFCONTEXT experiment. Removal of the NEWCOEFCONTEXT experiment to reduce code clutter and make it easier to experiment with some other changes to the coefficient coding context. Change-Id: Icd17b421384c354df6117cc714747647c5eb7e98	2013-02-13 15:12:17 +00:00
Paul Wilkins	649be94cf0	Removal of Hybrid DWT/DCT experiment. Removal of experiment to simplify code base for other changes. Change-Id: If0a33952504558511926ad212bc311fc2bffb19a	2013-02-13 15:08:48 +00:00
Christian Duvivier	097f205289	Merge "Faster vp9_regular_quantize_b_8x8." into experimental	2013-02-12 17:08:00 -08:00
Christian Duvivier	0e4397f0cd	Faster vp9_regular_quantize_b_8x8. A couple of scalar optimizations speeding up quantization by about 1.6x. Overall encoder speedup is around 3%. Change-Id: I19981d1ef0b33e4e5732739574f367fe82771a84	2013-02-12 15:55:58 -08:00
Yunqing Wang	7630cf0c3f	Merge "Rewrote fdct8x8" into experimental	2013-02-12 15:52:31 -08:00
John Koleszar	1d60b6bcb5	Merge "Replace as_mv struct with array" into experimental	2013-02-12 13:59:04 -08:00
Ronald S. Bultje	f496f601fb	Add tile column size limits (256 pixels min, 4096 pixels max). This is after discussion with the hardware team. Update the unit test to take these sizes into account. Split out some duplicate code into a separate file so it can be shared. Change-Id: I8311d11b0191d8bb37e8eb4ac962beb217e1bff5	2013-02-12 10:33:34 -08:00
Ronald S. Bultje	cb00be1fa2	Merge "Clean up detokenize contextualization to be like tokenizer." into experimental	2013-02-12 09:47:29 -08:00
Scott LaVarnway	ff024f812b	Merge "Bug fix: ssse3 version of subpixel did not match C code" into experimental	2013-02-12 08:45:24 -08:00
Yunqing Wang	aa295918ed	Rewrote fdct8x8 Use consistent algorithm. Change-Id: Ib8484821ebc454b9d3380a3d6571798decd037f3	2013-02-11 22:28:05 -08:00
Ronald S. Bultje	491d095214	Clean up detokenize contextualization to be like tokenizer. Change-Id: I47174f797df2103da8913c6fb4f4e741817bae82	2013-02-11 17:21:37 -08:00
Christian Duvivier	094e2572df	Faster convolve8_avg. Implement convolve8_avg using common functions which are already optimized instead of using more obscure ones which have only C versions. Encoder overall speed-up of about 12%. Change-Id: I8c57aa76936c8a48f22b115f19f61d9f2ae1e4b6	2013-02-11 16:53:11 -08:00
Jingning Han	f1060e4cd8	Merge "butterfly inverse 4x4 ADST" into experimental	2013-02-11 14:46:06 -08:00
Yunqing Wang	ab2dc6ae57	Merge "Integerization of dct32x32" into experimental	2013-02-11 12:15:26 -08:00
Jingning Han	57e995ff9c	butterfly inverse 4x4 ADST fixed format issues. Implement the inverse 4x4 ADST using 9 multiplications. For this particular dimension, the original ADST transform can be factorized into simpler operations, hence is retained. Change-Id: Ie5d9749942468df299ab74e90d92cd899569e960	2013-02-11 10:42:39 -08:00
Ronald S. Bultje	5f2e8449b7	Merge "Port sadNxNx4d functions to x86inc.asm." into experimental	2013-02-11 08:20:12 -08:00
Paul Wilkins	aec5bed3db	Change rd thresholds and add speed trade off flags. Experimental tweaks to various thresholds to measure quality / speed trade off. Add flag that allows static segmentation to be turned off and disables it unless in the second pass of a two pass encode. Change-Id: I219702ffe858412a83db801cbbbd869924b8c61b	2013-02-11 11:54:36 +00:00
Scott LaVarnway	eda30b410e	Bug fix: ssse3 version of subpixel did not match C code A 16 bit overflow condition occurs when using the EIGHTTAP_SMOOTH filters. (vp9_sub_pel_filters_8lp) Changed the order of the adds to fix this problem. Also added ssse3 support for 4x4 subpixel filtering. Change-Id: I475eaadae920794c2de5e01e9735c059a856518e	2013-02-09 15:15:14 -08:00
Paul Wilkins	e4f949b55a	Merge "Nearest / Zero Mv default entropy tweak." into experimental	2013-02-09 04:21:08 -08:00
John Koleszar	7ca517f755	Replace as_mv struct with array Replace as_mv.{first, second} with a two element array, so that they can easily be processed with an index variable. Change-Id: I1e429155544d2a94a5b72a5b467c53d8b8728190	2013-02-08 20:23:35 -08:00
John Koleszar	dc836109e4	Merge "Pass macroblock index to pick inter functions" into experimental	2013-02-08 20:20:37 -08:00
Ronald S. Bultje	c0ce2ab349	Port sadNxNx4d functions to x86inc.asm. Change-Id: Ic639f5742f7a007753d7a3fa5c66235172eb31d8	2013-02-08 17:59:32 -08:00
Ronald S. Bultje	02ff360b33	Add sad64x64 and sad32x32 SSE2 versions. Also port the 4x4, 16x16, 8x16 and 16x8 versions to x86inc.asm; this makes them all slightly faster, particularly on x86-64. Remove SSE3 sad16x16 version, since the SSE2 version is now faster. About 1.5% overall encoding speedup. Change-Id: Id4011a78cce7839f554b301d0800d5ca021af797	2013-02-08 16:32:25 -08:00
Ronald S. Bultje	639b863d22	Make cost_coeffs() more efficient. Cache the constant offset in one variable to prevent re-loading that in each loop iteration, and mark the function as inline so we can use the fact that the transform size is always known in the caller. Almost 1% faster encoding overall. Change-Id: Id78325a60b025057d8f4ecd9003a74086ccbf85a	2013-02-08 16:32:24 -08:00
John Koleszar	6125a1ed81	Pass macroblock index to pick inter functions Pass the current mb row and column around rather than the recon_yoffset and recon_uvoffset, since those offsets will change from predictor to predictor, based on the reference frame selection. Change-Id: If3f9df059e00f5048ca729d3d083ff428e1859c1	2013-02-08 14:25:40 -08:00
John Koleszar	6dfc95fe63	Merge changes Icd1a2a5a,I204d17a1,I3ed92117 into experimental * changes: Initial support for resolution changes on P-frames Avoid allocating memory when resizing frames Adds a test for the VP8E_SET_SCALEMODE control	2013-02-08 14:20:05 -08:00
John Koleszar	3de8ee6ba1	Merge changes Ife0d8147,I7d469716,Ic9a5615f into experimental * changes: Restore SSSE3 subpixel filters in new convolve framework Convert subpixel filters to use convolve framework Add 8-tap generic convolver	2013-02-08 13:19:47 -08:00
John Koleszar	393b485627	Initial support for resolution changes on P-frames Allows inter-frames to change resolution. Currently these are almost equivalent to keyframes, as only intra prediction modes are allowed, but without the other context resets that occur on keyframes. Change-Id: Icd1a2a5af0d9462cc792588427b0a1f5b12e40d3	2013-02-08 12:20:30 -08:00
John Koleszar	c03d45def9	Avoid allocating memory when resizing frames As long as the new frame is smaller than the size that was originally allocated, we don't need to free and reallocate the memory allocated. Instead, do the allocation on the size of the first frame. We could make this passed in from the application instead, if we wanted to support external upscaling. Change-Id: I204d17a130728bbd91155bb4bd863a99bb99b038	2013-02-08 12:20:30 -08:00
John Koleszar	88f99f4ec2	Adds a test for the VP8E_SET_SCALEMODE control Tests that the external interface to set the internal codec scaling works as expected. Also updates the test to pull the height from the decoded frame size rather than parsing the keyframe header, in anticipation of allowing resolution changes on non-keyframes. Change-Id: I3ed92117d8e5288fbbd1e7b618f2f233d0fe2c17	2013-02-08 12:20:30 -08:00
John Koleszar	29d47ac80e	Restore SSSE3 subpixel filters in new convolve framework This commit adds the 8 tap SSSE3 subpixel filters back into the code underneath the convolve API. The C code is still called for 4x4 blocks, as well as compound prediction modes. This restores the encode performance to be within about 8% of the baseline. Change-Id: Ife0d81477075ae33c05b53c65003951efdc8b09c	2013-02-08 12:18:14 -08:00
Yunqing Wang	dbccffe299	Integerization of dct32x32 Test on derf set showed 0.047% overall psnr change. Change-Id: Id16c276c251a3943850ac9b95e9b09a56cf42b19	2013-02-08 08:50:47 -08:00
Paul Wilkins	bbede82f24	Nearest / Zero Mv default entropy tweak. Tweak to default mode context to account for the fact that when there are no non zero motion candidates Nearest is now the preferred mode for coding a 0,0 vector. Also resolve duplicate function name and typos. Change-Id: I76802788d46c84e3d1c771be216a537ab7b12817	2013-02-08 10:16:13 +00:00
Yaowu Xu	e6ad9ab02c	move dct/idct constants to a header file also removed some unused functions. Change-Id: Ie363bcc8d94441d054137d2ef7c4fe59f56027e5	2013-02-07 13:51:45 -08:00
Jingning Han	d15e1da494	Butterfly ADST based hybrid transform Refactor the 8x8 inverse hybrid transform. It is now consistent with the new inverse DCT. Overall performance loss (due to the use of this variant ADST, and the rounding errors in the butterfly implementation) for std-hd is -0.02. Fixed BUILD warning. Devise a variant of the original ADST, which allows butterfly computation structure. This new transform has kernel of the form: sin((2k+1)*(2n+1) / (4N)). One of its butterfly structures using floating-point multiplications was reported in Z. Wang, "Fast algorithms for the discrete W transform and for the discrete Fourier transform", IEEE Trans. on ASSP, 1984. This patch includes the butterfly implementation of the inverse ADST/DCT hybrid transform of dimension 8x8. Change-Id: I3533cb715f749343a80b9087ce34b3e776d1581d	2013-02-07 10:07:46 -08:00
Paul Wilkins	29731308c4	Added skip switches for SB32 and SB64 Added switches and code to skip/breakout from doing SB32 and SB64 tests based on whether the 16x16 MB tests used split modes. Also to optionally skip 64x64 if 16x16 was chosen over 32x32. Impact varies depending on clip from a few % up to almost 50% on encode speed. Only the split mode breakout is currently enabled. Change-Id: Ib5836140b064b350ffa3057778ed2cadcc495cf8	2013-02-07 10:45:41 +00:00
Ronald S. Bultje	5cfd82bcaf	Use fdct8x4 instead of fdct4x4 where the block size allows it. This allows for faster SIMD implementations in the future (currently there is no speed impact). Change-Id: I732647e9148b5dcb44e6bc8728138f0141218329	2013-02-06 16:13:02 -08:00
Ronald S. Bultje	aac73df1a7	Use configure checks for various inline keywords. Change-Id: I8508f1a3d3430f998bb9295f849e88e626a52a24	2013-02-06 16:12:56 -08:00
Ronald S. Bultje	a788e0fe63	Add sse2 versions of sub_pixel_variance{32x32,64x64}. 7.5% faster overall encoding. Change-Id: Ie9bb7f9fdf93659eda106404cb342525df1ba02f	2013-02-06 11:20:59 -08:00
Ronald S. Bultje	a001fe9708	Merge "Reindent segmentation code." into experimental	2013-02-06 10:07:30 -08:00
Ronald S. Bultje	55cafb6156	Reindent segmentation code. Indentation was off by 2 spaces for this particular block. Change-Id: I1e587b7ad3eff77ade5521252d20c7bb2daa0f6d	2013-02-06 09:18:25 -08:00
John Koleszar	31cbe2ed9a	Eliminate tautology Unreachable code that does nothing anyway removed forever. Change-Id: I14105d2dd9dbc9d558f36464055e350dbeb45488	2013-02-06 08:22:59 -08:00
Paul Wilkins	8b4e9c5925	Merge "Change definition of NearestMV." into experimental	2013-02-06 04:06:31 -08:00
Ronald S. Bultje	278df745d2	Fix mismatch after merge of the tiling patch. Change-Id: I8ecc178b4d4069e721c7fec6d7631c00e4a3e5d5	2013-02-05 17:15:04 -08:00
Ronald S. Bultje	1407bdc243	[WIP] Add column-based tiling. This patch adds column-based tiling. The idea is to make each tile independently decodable (after reading the common frame header) and also independently encodable (minus within-frame cost adjustments in the RD loop) to speed up hardware & software en/decoders if they use multi-threading. Column-based tiling has the added advantage (over other tiling methods) that it minimizes realtime use-case latency, since all threads can start encoding data as soon as the first SB-row worth of data is available to the encoder. There is some test code that does random tile ordering in the decoder, to confirm that each tile is indeed independently decodable from other tiles in the same frame. At tile edges, all contexts assume default values (i.e. 0, 0 motion vector, no coefficients, DC intra4x4 mode), and motion vector search and ordering do not cross tiles in the same frame. Tile independence is not maintained between frames ATM, i.e. tile 0 of frame 1 is free to use motion vectors that point into any tile of frame 0. We support 1 (i.e. no tiling), 2 or 4 column-tiles. The loopfilter crosses tile boundaries. I discussed this briefly with Aki and he says that's OK. An in-loop loopfilter would need to do some sync between tile threads, but that shouldn't be a big issue. Results: with tiling disabled, we go up slightly because of improved edge use in the intra4x4 prediction. With 2 tiles, we lose ~1% on derf, ~0.35% on HD and ~0.55% on STD/HD. With 4 tiles, we lose another ~1.5% on derf, ~0.77% on HD and ~0.85% on STD/HD. Most of this loss is concentrated in the low-bitrate end of clips, and most of it is because of the loss of edges at tile boundaries and the resulting loss of intra predictors. TODO: - more tiles (perhaps allow row-based tiling also, and max. 8 tiles)?
- maybe optionally (for EC purposes), motion vectors themselves should not cross tile edges, or we should emulate such borders as if they were off-frame, to limit error propagation to within one tile only. This doesn't have to be the default behaviour but could be an optional bitstream flag. Change-Id: I5951c3a0742a767b20bc9fb5af685d9892c2c96f	2013-02-05 15:43:03 -08:00
Ronald S. Bultje	822864131b	Merge "Add SSE3 versions for sad{32x32,64x64}x4d functions." into experimental	2013-02-05 15:40:46 -08:00
Yaowu Xu	c9ae73b251	Merge "rewrite 4x4 idct and fdct" into experimental	2013-02-05 15:26:36 -08:00
Ronald S. Bultje	58c983d109	Add SSE3 versions for sad{32x32,64x64}x4d functions. Overall encoding about 15% faster. Change-Id: I176a775c704317509e32eee83739721804120ff2	2013-02-05 15:21:47 -08:00
John Koleszar	7a07eea13f	Convert subpixel filters to use convolve framework Update the code to call the new convolution functions to do subpixel prediction rather than the existing functions. Remove the old C and assembly code, since it is unused. This causes a 50% performance reduction on the decoder, but that will be resolved when the asm for the new functions is available. There is no consensus for whether 6-tap or 2-tap predictors will be supported in the final codec, so these filters are implemented in terms of the 8-tap code, so that quality testing of these modes can continue. Implementing the lower complexity algorithms is a simple exercise, should it be necessary. This code produces slightly better results in the EIGHTTAP_SMOOTH case, since the filter is now applied in only one direction when the subpel motion is only in one direction. Like the previous code, the filtering is skipped entirely on full-pel MVs. This combination seems to give the best quality gains, but this may be indicative of a bug in the encoder's filter selection, since the encoder could achieve the result of skipping the filtering on full-pel by selecting one of the other filters. This should be revisited. Quality gains on derf positive on almost all clips. The only clip that seemed to be hurt at all datarates was football (-0.115% PSNR average, -0.587% min). Overall averages 0.375% PSNR, 0.347% SSIM. Change-Id: I7d469716091b1d89b4b08adde5863999319d69ff	2013-02-05 14:23:17 -08:00
John Koleszar	5ca6a3667f	Add 8-tap generic convolver This commit introduces a new convolution function which will be used to replace the existing subpixel interpolation functions. It is much the same as the existing functions, but allows for changing the filter kernel on a per-pixel basis, and doesn't bake in knowledge of the filter to be applied or the size of the resulting block into the function name. Replacing the existing subpel filters will come in a later commit. Change-Id: Ic9a5615f2f456cb77f96741856fc650d6d78bb91	2013-02-05 14:19:28 -08:00
Yaowu Xu	fa36981ec8	rewrite 4x4 idct and fdct This commit changes the 4x4 iDCT to use same algorithm & constants as other iDCTs. The 4x4 fDCT is also changed to be based on the new iDCT. Change-Id: Ib1a902693228af903862e1f5a08078c36f2089b0	2013-02-05 11:42:49 -08:00
Paul Wilkins	81043e8d62	Change definition of NearestMV. This commit makes the NearestMV match the chosen best reference MV. It can be a 0,0 or non zero vector which means the compound nearest mv mode can combine a 0,0 and a non zero vector. Change-Id: I2213d09996ae2916e53e6458d7d110350dcffd7a	2013-02-05 17:03:25 +00:00
Scott LaVarnway	77440d508b	Merge "Added vp9_short_idct1_32x32_c" into experimental	2013-02-05 08:56:05 -08:00
Scott LaVarnway	5780c4cbd5	Added vp9_short_idct1_32x32_c and called this function in vp9_dequant_idct_add_32x32_c when eob == 1. For the test clip used, the decoder performance improved by 21+%. Based on Yaowu's 16 point idct work. Change-Id: Ib579a90fed531d45777980e04bf0c9b23c093c43	2013-02-04 16:49:17 -08:00
Paul Wilkins	3ab538767c	Re-factor code for rd thresholds. Separate out code to set the main encode speed related rd thresholds. Some values changed from the initial defaults for various new modes. Quality test results pending but even the addition of some further non-zero defaults helps encode speed somewhat in limited testing on derf clips. Adjustment of thresholds for quality / speed tradeoff to follow. Change-Id: I117ee473157e151a1b93193d5f393449328de20d	2013-02-04 18:48:41 +00:00
Yaowu Xu	1eb79dc1dc	re-write 8 point idct to be consistent with idct16 and idct32. Change-Id: Ie89dbd32b65c33274b7fecb4b41160fcf1962204	2013-02-04 07:31:25 -08:00
Yaowu Xu	ccaaeb4b5a	a couple of minor fixes fixed function prototypes to prevent compiler warnings; removed a function not in use; un-capitalize "Refstride" to ref_stride Change-Id: Ib4472b6084f357d96328c6a06e795b6813a9edba	2013-02-04 07:19:32 -08:00
Yaowu Xu	af4c9d2f88	Merge "Changes 16 point idct" into experimental	2013-02-01 08:22:20 -08:00
Yaowu Xu	c1f611be74	Merge "fix a small bug in 16 point forward dct" into experimental	2013-02-01 05:57:41 -08:00
Yaowu Xu	91e0e80142	Changes 16 point idct This commit changes the inverse 16 point dct to use the same algorithm as the one for 32 point idct. In fact, now 16 point dct uses the exact version of the source code for even portion of the 32 point idct. Tests showed current implementation has significantly better accuracy than the previous version. With this implementation and the minor bug fix on forward 16 point dct, encoding tests showed about 0.2% better compression of CIF set, test results on std-hd setting pending. Change-Id: I68224b60c816ba03434e9f08bee147c7e344fb63	2013-01-31 19:52:18 -08:00
Frank Galligan	f67d740b34	Add support for x64 and win64 yasm flags. Some projects must define only win64 for Windows 64bit builds using yasm. Change-Id: I1d09590d66a7bfc8b4412e1cc8685978ac60b748	2013-01-31 16:25:37 -08:00
Yaowu Xu	ab1cad9bdd	fix a small bug in 16 point forward dct The commit fixes a minor error in 16 point fdct where a rotation can produce a result of -1 instead of 0. Change-Id: I45aac4a52bcd06225c6d04e643547a13e1c1aade	2013-01-31 15:39:41 -08:00
Yaowu Xu	c94e55add0	Merge "A fix point implementation of 32x32 idct" into experimental	2013-01-31 10:48:01 -08:00
Yaowu Xu	5149d7f7bd	A fix point implementation of 32x32 idct This commit changes the 32x32 idct to use integer only. The algorithm was taken directly from "A Fast Computational Algorithm for the Discrete Cosine Transform" by W. Chen, et al., which was published in IEEE Transactions on Communications Vol. COM-25 No. 9, 1977. The signal flow graph in the original paper is for a 32 point forward dct; the current implementation of the inverse DCT was done by following the graph in the reversed direction. With this implementation, the 32 point inverse dct contains a 16 point inverse dct in its even portion; similarly, the 16 point idct further contains 8 point and 4 point inverse dcts. As of patch 4, encoding tests showed there is no compression loss when compared against the floating point baseline. Numbers even showed very small positives. (cif: .01%, std-hd: .05%). Change-Id: I2d2d17a424b0b04b42422ef33ec53f5802b0f378	2013-01-31 09:45:49 -08:00
Deb Mukherjee	a53be60904	Merge "Adding a frame parallel decoding mode" into experimental	2013-01-30 12:03:45 -08:00
Ronald S. Bultje	b499c24c2f	Merge "don't code the branch for the predicted seg_id if that flag is false." into experimental	2013-01-30 10:02:51 -08:00
Ronald S. Bultje	3a4b18bc67	don't code the branch for the predicted seg_id if that flag is false. Change-Id: Icb6e21dc0c2d9918faa33c8bf70943660df7ad88	2013-01-30 09:30:46 -08:00
Ronald S. Bultje	4d53a95a34	Merge "Default superblock skip flag to 32x32 for skip-blocks." into experimental	2013-01-30 09:12:17 -08:00
Ronald S. Bultje	de6718a3b9	Merge "Reset skip flag in superblock RD loop." into experimental	2013-01-30 09:12:02 -08:00
Deb Mukherjee	d28750537e	Merge "Further improvement on compound inter-intra expt" into experimental	2013-01-30 08:38:17 -08:00
Ronald S. Bultje	3febf9707d	Default superblock skip flag to 32x32 for skip-blocks. This is identical to the later decisions made in encode_superblock(). This commit doesn't actually change anything, but makes the mbmi state more consistent between the RD loop and the final encode result. Change-Id: I9e735afb7c5a52e5b61728cb88c67ef9b9bf59be	2013-01-29 21:46:31 -08:00
Ronald S. Bultje	b90996c51b	Reset skip flag in superblock RD loop. This is the superblock equivalent of commit `290b83a`. Change-Id: Ib3945dd9e992fa9ec1fdea5a11e17a3cc0e37637	2013-01-29 21:42:56 -08:00
Ronald S. Bultje	2f6fce3e5a	Write only visible area (for better comparison with rec.yuv). Change-Id: I32bf4ee532a15af78619cbcd8a193224029fab50	2013-01-29 16:58:52 -08:00
Frank Galligan	0524f33108	libvpx: Fix warnings on windows. Warnings found when trying to build libvpx in Chromium. Change-Id: I5824d9e2c06351e0cf46e9f5fa102cc8b04cf963	2013-01-29 13:57:09 -08:00
Ronald S. Bultje	5a9da2d906	Merge "Fix block pointer corruption in intra8x8 prediction with 4x4 transform." into experimental	2013-01-29 12:49:42 -08:00
Ronald S. Bultje	64401f838f	Merge "Fix overread/write reported by valgrind if (mb_cols) & 3 != 0." into experimental	2013-01-29 12:49:22 -08:00
Paul Wilkins	d8e86af263	Merge "Remove eob_max_offset markers." into experimental	2013-01-29 09:29:45 -08:00
Paul Wilkins	5d1c62c639	Merge "Segment Skip Flag" into experimental	2013-01-29 09:29:26 -08:00
Scott LaVarnway	8b7eced6fe	Merge "Added eob == 0 check to vp9_dequant_idct_add_32x32_c" into experimental	2013-01-29 09:19:58 -08:00
Ronald S. Bultje	ffc2e4f4af	Fix block pointer corruption in intra8x8 prediction with 4x4 transform. The RD loop would change the pointer after the first mode (DC) was tested, leading to corrupt block objects being provided for the others. This would essentially render the i8x8 predictor useless. Change-Id: I16c5906ca64fb34878ac32ce59af8974e4582bb8	2013-01-29 09:18:47 -08:00
Paul Wilkins	93762ca9b2	Remove eob_max_offset markers. Remove eob_max_offset markers and replace with the generic skip_block flag to indicate to the quantizer that all coeffs to be set to 0 and eob position set to 0; Change-Id: Id477e8f8d4ec1a5562758904071013c24b76bfd7	2013-01-29 13:39:34 +00:00
Deb Mukherjee	3b04d467ac	Further improvement on compound inter-intra expt Adds a special combination mode specific to intra prediction mode D45. Current results with the compound inter/intra experiment: derf: 0.2% yt: 0.55% std-hd: 0.75% hd: 0.74% Change-Id: I8976bdf3b9b0b66ab8c5c628bbc62c14fc72ca86	2013-01-29 00:21:29 -08:00
Paul Wilkins	0ff9b033b0	Segment Skip Flag First step in simplifying the segment mode and segment EOB flags into a simpler segment skip flag that implies 0,0 mv and EOB at position 0. Change-Id: Ib750cac31a7a02dc21082580498efd9f7d8d72a5	2013-01-28 17:28:04 +00:00
Paul Wilkins	5f2429259f	Merge "Simplify Zero bin and zero bin run code." into experimental	2013-01-28 08:35:36 -08:00
Paul Wilkins	8e2c03fbfd	Simplify Zero bin and zero bin run code. Simplification to eliminate a number of very large data structures. All zero run, zbin boosts for different transform sizes are now limited to a maximum run length of 15 before they max out the boost. Some further work still needs to be done to refactor, rationalize and optimize the multiple quantizer functions. The simplification, coupled with tweaks to the 16 element array now used for all transform sizes, has minimal effect on quality. Change-Id: I6f3948b8ca0418b60d4db9030ff19026a34ed423	2013-01-28 13:21:10 +00:00
Ronald S. Bultje	9dc9f07fb8	Fix overread/write reported by valgrind if (mb_cols) & 3 != 0. We'd backup and restore all cols for a 64x64 SB, but the array wouldn't be big enough to hold all that data. Change-Id: Ic68ea721bf07e0b2f3937bd16b0b734bcc743ce1	2013-01-25 17:18:08 -08:00
Deb Mukherjee	dfd89f2eab	Adding a frame parallel decoding mode Adds a flag to disable features that would inhibit frame parallel decoding. This includes backward adaptation and MV sorting based on search in ref frame buffer. Also includes some minor clean-ups. Change-Id: I434846717a47b7bcb244b37ea670c5cdf776f14d	2013-01-25 17:16:19 -08:00
Ronald S. Bultje	3ca5b35ce5	Merge "Remove "update_context" variable from VP9_COMP context." into experimental	2013-01-25 09:43:42 -08:00
Scott LaVarnway	9d4c26531b	Added eob == 0 check to vp9_dequant_idct_add_32x32_c Added a quick eob == 0 check. Once the integer version of the dct32x32 is complete, we can check for other eob cases. For the 1080p clip used, the decoder performance improved by 4%. Change-Id: I9390b6ed3c8be0c0c0a0c44c578d9a031d6e026e	2013-01-24 17:09:56 -08:00
Ronald S. Bultje	0a7b3953f0	Remove "update_context" variable from VP9_COMP context. The variable is always zero. Change-Id: Id5cdbecad543bca465a5b1d471badaec7e112c8d	2013-01-24 16:28:53 -08:00
Paul Wilkins	fcb4a25cd5	Mvref speedup Quality / decode speed trade off changes. Simpler insert method without sort. Quality impact small. Change-Id: Id0c0941bc508d985405abd06a13ffe7489170b62	2013-01-24 17:26:37 +00:00
Scott LaVarnway	70019f6070	Merge "Intrinsic version of loopfilter now matches C code" into experimental	2013-01-24 08:45:22 -08:00
Deb Mukherjee	01cafaab1d	Adds an error-resilient mode with test Adds an error-resilient mode where frames can be continued to be decoded even when there are errors (due to network losses) on a prior frame. Specifically, backward updates are turned off and probabilities of various symbols are reset to defaults at the beginning of each frame. Further, the last frame's mvs are not used for the mv reference list, and the sorting of the initial list based on search on previous frames is turned off as well. Also adds a test where an arbitrary set of frames are skipped from decoding to simulate errors. The test verifies (1) that if the error frames are droppable - i.e. frame buffer updates have been turned off - there are no mismatch errors for the remaining frames after the error frames; and (2) if the error-frames are non droppable, there are not only no decoding errors but the mismatch PSNR between the decoder's version of the post-error frames and the encoder's version is at least 20 dB. Change-Id: Ie6e2bcd436b1e8643270356d3a930e8989ff52a5	2013-01-23 21:56:15 -08:00
Deb Mukherjee	ebb1157cde	Merge "Modifies the comp inter-intra expt" into experimental	2013-01-23 09:43:07 -08:00
Scott LaVarnway	6a997400ff	Intrinsic version of loopfilter now matches C code Updated the intrinsic code to match Yaowu's latest loopfilter change. (I584393906c4f5f948a581d6590959522572743bb) The decoder performance improved by ~30% for the test clip used. Change-Id: I026cfc75d5bcb7d8d58be6f0440ac9e126ef39d2	2013-01-23 09:31:40 -08:00
John Koleszar	bed59eb8de	Merge changes Ia82cef79,I7324a75a,I7b66daad,I73344451,I91dc210f,I5945b5ce into experimental * changes: Use alt-ref frame context for keyframes Preserve the previous golden frame on golden updates Generalize and increase frame coding contexts Start to anonymize reference frames Update encoder to use fb_idx_ref_cnt Remove buffer-to-buffer copy logic	2013-01-22 08:31:55 -08:00
John Koleszar	2f24ad9e85	Use alt-ref frame context for keyframes This matches the behavior prior to generalizing the frame context selection, and intuitively makes sense in that the first forward ref is immediately after the keyframe, so its quality is improved a bit by using the keyframe's entropy context rather than the default. Change-Id: Ia82cef79382b9d8cfafdc44ba0533d4dc3e44053	2013-01-18 14:40:39 -08:00
Yaowu Xu	b95ed6883a	a minor change to a portion of loop filtering The loop filtering used for MB edge or internal edge of a MB using 8x8 transform was reading 5 pixels each side and writing 3 pixels each side. With suggestion from Aki and Scott on hardware&software performance, this commit changed to read 4 pixels each side and write 3 pixels each side. Change-Id: I584393906c4f5f948a581d6590959522572743bb	2013-01-18 10:44:13 -08:00
Frank Galligan	9ca907b53e	libvpx: Fix some warnings. Change-Id: If8be8b9d28a29631f29c46daea8a226ab3580610	2013-01-18 09:51:57 -08:00
John Koleszar	26bd81b955	Preserve the previous golden frame on golden updates This commit restores the quality lost when the buffer-to-buffer copy logic was removed. Note that this is specific to the current use of golden frames and will need rework when RTC functionality is added. Change-Id: I7324a75acd96eafd9e0f9b8633d782e390d5dc21	2013-01-16 15:57:02 -08:00
John Koleszar	4b65837bc6	Generalize and increase frame coding contexts Previously there were two frame coding contexts tracked, one for normal frames and one for alt-ref frames. Generalize this by signalling the context to use in the bitstream, rather than tying it to the alt ref refresh bit. Also increase the number of contexts available to 4, which may be useful for temporal scalability. Change-Id: I7b66daaddd55c535c20cd16713541fab182b1662	2013-01-16 14:07:27 -08:00
John Koleszar	da832a80e4	Start to anonymize reference frames Remove lst_fb_idx, gld_fb_idx, alt_fb_idx, refresh_last_frame, refresh_golden_frame, refresh_alt_ref_frame from common. Gold/Alt are encode side conventions. From the decoder's perspective, we want to be dealing with numbered references. Updates to active_ref 2 signal mode context switches, vestigial from refresh_alt_ref_frame. This needs some clean up to make sense with increased numbers of reference frames, as well as reimplementing the swapping of alt/golden which was previously done using the buffer-to-buffer copy mechanism removed in an earlier commit. Change-Id: I7334445158b7666f9295d2a2dd22aa03f4485f58	2013-01-16 14:06:23 -08:00
John Koleszar	394b0a6a30	Update encoder to use fb_idx_ref_cnt Do reference counting the same way on the encoder as the decoder does, rather than maintaining the 'flags' member of YV12_BUFFER_CONFIG. Change-Id: I91dc210ffca081acaf9d5c09a06e7461b3c3139c	2013-01-15 17:36:39 -08:00
John Koleszar	b8e027989f	Remove buffer-to-buffer copy logic This is the first in a series of commits to add additional reference frames to the codec. Each frame will be able to update any of the available references, but copying between references is not supported. Change-Id: I5945b5ce6cc3582c495102b4e7eed4f08c44d5a1	2013-01-15 17:36:39 -08:00
Yaowu Xu	9bf73f46f9	fix a number of issues that cause failures during master Jenkins verification process Change-Id: I3722b8753eaf39f99b45979ce407a8ea0bea0b89	2013-01-14 18:32:32 -08:00
Deb Mukherjee	b34838bea5	Modifies the comp inter-intra expt Uses a single 1D table to implement the weighting of the predictors for the compound inter-intra experiment. Change-Id: I204ffbe4f9fc79d5d43b6c724ad253d800461012	2013-01-14 17:32:26 -08:00
John Koleszar	24bc1a7189	Use INT64_MAX instead of LLONG_MAX These variables have the type int64_t, not long long. long long could be a larger type than 64 bits. Emulate INT64_MAX for older versions of MSVC, and remove the unreferenced vpx_ports/vpxtypes.h Change-Id: Ideaca71838fcd3849d816d5ab17aa347c97d03b0	2013-01-14 15:57:21 -08:00
Ronald S. Bultje	c9071601a2	Remove compound intra-intra experiment. This experiment gives little gains and adds relatively much code complexity (and it hinders other experiments), so let's get rid of it. Change-Id: Id25e79a137a1b8a01138aa27a1fa0ba4a2df274a	2013-01-14 15:47:25 -08:00
Yaowu Xu	741fbe9656	Merge experiment "subpelrefmv" Change-Id: Iac7f3d108863552b850c92c727e00c95571c9e96	2013-01-14 15:18:47 -08:00
Yaowu Xu	f7dab60096	Merge experiment "widerlpf" Change-Id: I0c94475075e66e13cfe4c20fab7db6474441ae86	2013-01-14 15:17:35 -08:00
Yaowu Xu	d8c5bceee5	Merge "changed UV plane loop filtering for TX_8X8" into experimental	2013-01-14 14:47:31 -08:00
Yaowu Xu	8750414368	Merge "change to evaluate reference mvs using above only" into experimental	2013-01-14 14:40:38 -08:00
Yaowu Xu	ad9a16ed17	changed UV plane loop filtering for TX_8X8 In commit `9a1d73d`, loop filtering was added for UV 4x4 boundaries when TX_8X8 is used by a MB. This commit further refined the decision to be based on the actual transform used for the UV planes. When UV planes use 4x4 transform, i.e. when prediction mode used is either I8X8_PRED or SPLITMV, UV planes are filtered on 4x4 boundaries, and no filtering is applied on 4x4 block boundaries when UV planes use 8X8 transform. Change-Id: Ibb404face0a1d129b4b4abaf67c55d82e8df8bec	2013-01-14 14:28:20 -08:00
Paul Wilkins	e2c696a7aa	Merge "Fix compiler warnings" into experimental	2013-01-14 14:20:57 -08:00
Adrian Grange	c7576f97ff	Merge "Merge prediction filter" into experimental	2013-01-14 14:18:21 -08:00
Yaowu Xu	fdf8654189	change to evaluate reference mvs using above only Change-Id: Ibcc342efac0a9be7a21d9b2c09984d9e16bbb225	2013-01-14 14:01:40 -08:00
Yaowu Xu	113005b11d	Fix compiler warnings The warnings caused verify failure with gerrit for several commits Change-Id: I030df8638bd69b8783a3ac58e720ff9f0bfd546c	2013-01-14 13:56:52 -08:00
Adrian Grange	7bcaac3e64	Merge prediction filter Removed the experimental flag from around the prediction filter. Change-Id: Ic1dd2db8fe8ac17ed5129f83094d4c5cdd5527d2	2013-01-14 12:57:07 -08:00
Ronald S. Bultje	290b83ab62	Reset x->skip for each iteration in the RD loop. This prevents ill-defined behaviour, such as setting x->skip for a mode that is excluded because of frame-level flags (e.g. filter selection, compound prediction selection), then not breaking out of the RD loop because the mode is not allowed, but keeping the flag on. Whatever mode is iterated through next in the RD loop will then carry this flag, and all sort of bad stuff happens, such as x->skip being set on intra pred modes. Change-Id: I5bec46b36e38292174acb1c564b3caf00a9b4b9a	2013-01-14 12:44:32 -08:00
John Koleszar	76ac5b3937	Fix unused variable warnings Previous commit does not build cleanly on Jenkins with the DWT/DCT hybrid experiment enabled (--enable-dwtdcthybrid). Change-Id: Ia67e8f59d17ef2d5200ec6b90dfe6711ed6835a5	2013-01-14 12:12:43 -08:00
Deb Mukherjee	516db21c2c	Further enhancements/fixes on dct/dwt hybrid txfm Fixes some scaling issues. Adds an option to only compute the dct on the low-low subband for 32x32 and 64x64 blocks using only a single 16x16 dct after 1 and 2 wavelet decomposition levels respectively. Also adds an option to use a 8x8 dct as building block. Currently with the 2/6 filter and with a single 16x16 dct on the low low band, the results compared to full 32x32 dct are as follows: derf: -0.15% yt: -0.29% std-hd: -0.18% hd: -0.6% These are my current recommended settings, since the 2/6 filter is very simple. Results with 8x8 dct are about 0.3% worse. Change-Id: I00100cdc96e32deced591985785ef0d06f325e44	2013-01-12 16:00:53 -08:00
Jim Bankoski	e42b280e11	Merge "WIP: Added sse2 version of vp9_mb_lpf_horizontal_edge_w" into experimental	2013-01-11 17:15:41 -08:00
Scott LaVarnway	b20ce07d76	WIP: Added sse2 version of vp9_mb_lpf_horizontal_edge_w and vp9_mb_lpf_vertical_edge_w_sse2. This was quickly done so we can run some tests over the weekend. Future commits will optimize/refactor these functions further. The decoder performance improved by ~17% for the clip used. Change-Id: I612687cd5a7670ee840a0cbc3c68dc2b84d4af76	2013-01-11 17:11:04 -08:00
Jim Bankoski	385bea686b	Merge "Upstream changes from Chromium Android Clang build." into experimental	2013-01-11 17:06:26 -08:00
Yaowu Xu	bbe1c9257f	Merge "Add loop filtering for UV plane" into experimental	2013-01-11 16:56:39 -08:00
Yaowu Xu	9a1d73d036	Add loop filtering for UV plane On block boundary within a MB when 8x8 block boundary only is filtered for Y. Change-Id: Ie1c804c877d199e78e2fecd8c2d3f1e114ce9ec1	2013-01-11 16:32:06 -08:00
Frank Galligan	bc45f23192	Upstream changes from Chromium Android Clang build. See https://codereview.chromium.org/11875006/ Change-Id: Ied2a17df2b3222635f84aef120eaa9feb53750d2	2013-01-11 15:37:23 -08:00
Scott LaVarnway	9dc69dfb70	Merge "Initial sse2 version of the wide loopfilters" into experimental	2013-01-11 15:34:26 -08:00
Scott LaVarnway	4987c0f07e	Initial sse2 version of the wide loopfilters Updated the rtcd_defs and used the sse2 uv version of the loopfilter. The performance improved by ~8% for the test clip used. Change-Id: I5a0bca3b6674198d40ca4a77b8cc722ddde79c36	2013-01-11 14:54:14 -08:00
Paul Wilkins	d27ae620bc	Remove INT64_MAX references. Replace INT64_MAX references with LLONG_MAX for windows build. Change-Id: Ib8b45c1e9c15c043b2f54c27ed83b8682b2be34f	2013-01-11 19:45:26 +00:00
Yaowu Xu	d5a8b62d06	Merge "Reduce the usage of widerlpf" into experimental	2013-01-11 11:15:43 -08:00
Jim Bankoski	9431536045	rtcd for new wider loop filters Change-Id: I8826bcdcf72ba6d86bde31cd13902a710399805c	2013-01-11 09:45:45 -08:00
Yaowu Xu	6c9fb22e13	Reduce the usage of widerlpf The commit changed not to use wider lpf within a superblock when 32x32 transform is used for the block. The commit also changed to use the shorter version of loop filtering for UV planes. Change-Id: I344c1fb9a3be9d1200782a788bcb0b001fedcff8	2013-01-10 20:15:47 -08:00
Ronald S. Bultje	aa2effa954	Merge tx32x32 experiment. Change-Id: I615651e4c7b09e576a341ad425cf80c393637833	2013-01-10 08:23:59 -08:00
Ronald S. Bultje	460501fe84	Merge "Merge superblocks64 experiment." into experimental	2013-01-10 08:18:33 -08:00
Ronald S. Bultje	6884a83f06	Merge superblocks64 experiment. Change-Id: If6c88752dffdb566f8d4322f135145270716fb8e	2013-01-09 17:21:40 -08:00
Yaowu Xu	51bae955e6	experiment a wider loop filter for MB border when larger transforms are used Change-Id: I25251442b44bf251df4c25a1c1fcf71fb2ad913b	2013-01-09 16:39:05 -08:00
Adrian Grange	7d6b5425d7	New prediction filter This patch removes the old pred-filter experiment and replaces it with one that is implemented using the switchable filter framework. If the pred-filter experiment is enabled, three interpolation filters are tested during mode selection: the standard 8-tap interpolation filter, a sharp 8-tap filter and a (new) 8-tap smoothing filter. The 6-tap filter code has been preserved for now and if the enable-6tap experiment is enabled (in addition to the pred-filter experiment) the original 6-tap filter replaces the new 8-tap smooth filter in the switchable mode. The new experiment applies the prediction filter in cases of a fractional-pel motion vector. Future patches will apply the filter where the mv is pel-aligned and also to intra predicted blocks. Change-Id: I08e8cba978f2bbf3019f8413f376b8e2cd85eba4	2013-01-09 12:00:39 -08:00
Deb Mukherjee	4b7304ee68	Adds 64x64 hybrid dct/dwt transform This is to add to the 64x64 transform experiment as an alternative to a 64x64 DCT. Two levels of wavelet decomposition are used on a 64x64 block, followed by 16x16 DCT on the four lowest subbands. The highest three subbands are left untransformed after the first level DWT. Change-Id: I3d48d5800468d655191933894df6b46e15adca56	2013-01-08 14:05:58 -08:00
Ronald S. Bultje	cd0f36b24f	Merge "Merge superblocks (32x32) experiment." into experimental	2013-01-08 13:31:37 -08:00
Yunqing Wang	f1c56a8c8c	Merge "vp9_sub_pixel_variance16x2 SSE2 optimization" into experimental	2013-01-08 12:59:08 -08:00
Ronald S. Bultje	4455036cfc	Merge superblocks (32x32) experiment. Change-Id: I0df99742029834a85c4933652b0587cf5b6b2587	2013-01-08 12:54:45 -08:00
Yunqing Wang	8d568312a2	vp9_sub_pixel_variance16x2 SSE2 optimization About 5% decoder speedup. Change-Id: Ib6687d337af758a536a0e7e289f400990f1f9794	2013-01-08 12:01:55 -08:00
John Koleszar	879cb7d962	Merge vp9-preview changes into experimental branch Incorporate vp9-preview changes by merging master branch into experimental. Conflicts: test/test.mk vp9/common/vp9_filter.c vp9/common/vp9_idctllm.c vp9/common/vp9_invtrans.h vp9/common/vp9_mbpitch.c vp9/common/vp9_rtcd_defs.sh vp9/common/vp9_systemdependent.h vp9/common/vp9_type_aliases.h vp9/common/x86/vp9_asm_stubs.c vp9/common/x86/vp9_subpixel_mmx.asm vp9/decoder/vp9_decodframe.c vp9/decoder/vp9_dequantize.c vp9/decoder/vp9_dequantize.h vp9/decoder/vp9_onyxd_int.h vp9/encoder/vp9_bitstream.c vp9/encoder/vp9_encodeframe.c vp9/encoder/vp9_rdopt.c Change-Id: I17f51c3666d1b59cf1a699f87607cbc5d30a87c5	2013-01-08 10:19:59 -08:00
Yaowu Xu	c14439c3d3	reset segment map on key frame This is to fix a decoder crash when decoder skips a number of frames to continue decoding from a later key frame. Change-Id: I3ba116eba6c3440e0528a21f53745f694302e4ad	2013-01-08 08:54:45 -08:00
Yaowu Xu	08e207ad04	Merge "minor loop filter refactoring and cleanup" into experimental	2013-01-08 08:40:03 -08:00
Yaowu Xu	d278d01836	minor loop filter refactoring and cleanup This commit did a couple of minor cleanups/refactorings to prepare for further loop filter experiments. It merged y_only version of loop filter function into the regular one, which makes sure that same logic is used for functions for picking level and for actual loop filtering. Change-Id: Id10c94dccd45f58e5310bacfdf6ee63cbb60b86f	2013-01-07 16:23:58 -08:00
Ronald S. Bultje	3ed14846e1	Remove a few redundant function arguments in encodeframe.c. Also reindent a block of code that was misindented after addition of the tx32x32 experiment. Change-Id: Ic3e4aae3effd8a40136da68c9f382af03632ba08	2013-01-07 11:41:49 -08:00
Ronald S. Bultje	c13d9fef42	Re-enable support for static_threshold (encode_breakout). Change-Id: Ibd7380f478d3127f9db91d0a4fd2fd0dfde961ab	2013-01-07 11:02:14 -08:00
Ronald S. Bultje	e6216d163a	Don't use tx32x32 for macroblocks. Change-Id: Ib674e0153ca360867ab7a20ba291ac9171a01250	2013-01-07 09:40:19 -08:00
Ronald S. Bultje	c3941665e9	64x64 blocksize support. 3.2% gains on std/hd, 1.0% gains on hd. Change-Id: I481d5df23d8a4fc650a5bcba956554490b2bd200	2013-01-05 18:20:25 -08:00
Adrian Grange	81d1171fd4	Fix mode selection infinite loop bug Mode selection for SBs could enter an infinite loop because the interpolation filter mode index was not being reset correctly. Change-Id: I4bbe726f29ef5b6836e94884067c46084713cc11	2013-01-04 09:00:47 -08:00
Paul Wilkins	c6ba3a3d85	Further change to mv reference search. This experimental change reorders the search so that all possible references that match the target reference frame are tested first and these in order of distance from the current block. These will usually be the highest scoring candidates. If we do not find enough good candidates this way we try non matching cases. These will usually be lower scoring candidates. The change in order together with breakouts when we have found enough candidates should reduce the computational cost and especially reduce the number of sort operations. Quality Results: Std Hd +0.228%, Hd +0.074%, YT +0.046%, derf +0.137% This effect is probably due to the fact that more distant weak candidates are now less likely to get "promoted" over near candidates even if they are repeated. Change-Id: Iec37e77d88a48ad0ee1f315b14327a95d63f81f6	2013-01-04 15:18:10 +00:00
Yaowu Xu	df7ce5a711	Merge "make cost_coeffs() and tokenize_b() consistent" into experimental	2013-01-03 09:57:07 -08:00
Yaowu Xu	818f5698fb	Merge "Merge cost_coeffs_2x2() into cost_coeffs()" into experimental	2013-01-03 09:33:21 -08:00
Yaowu Xu	83664f457b	make cost_coeffs() and tokenize_b() consistent Change-Id: I7cdb5c32a1400f88ec36d08ea982e38b77731602	2013-01-03 09:31:47 -08:00
Adrian Grange	259b800832	New interpolation filter selection algorithm Old Scheme: When SWITCHABLE filter selection is enabled the encoder evaluates the use of each interpolation filter type and selects the best one to use at the MB level. A frame-level flag can be set to force the use of a particular filter type for all MBs in a frame if it is more efficient to encode that way. The logic here involved a Q dependent threshold that assumed that the second 8-tap filter was a high-pass filter. However, this requires a trip around the recode loop. If the frame-level flag indicates use of a particular filter, the other filters are not evaluated in the pick_mode loop. New Scheme: Each filter type is evaluated at the MB level and a record of the best filter is kept, irrespective of what filter is signaled at the frame-level. Once all MBs have been encoded, a decision is made as to what frame-level mode to set for the next frame. If one filter is used by 80% or more of the MBs, then this filter is forced since it is assumed that this will be more efficient if the next frame has similar characteristics. i.e. there is a one-frame lag between measuring the filter selection and setting the frame-level mode to use. Change-Id: I6a7e7ced8f27e120fafb99db2dc9c6293f8d20f7	2013-01-03 08:12:43 -08:00
Yaowu Xu	bd28510ef9	Merge cost_coeffs_2x2() into cost_coeffs() Remove special case function cost_coeffs_2x2() and change function cost_coeffs() to handle 2nd order haar block as it is handle all other block types already. Change-Id: I2aac6f81ee0ae9e03d6a8da4f8681d69b79ce41f	2013-01-03 08:00:00 -08:00
Yunqing Wang	37166d5c1e	Merge "Switch the order of calculating 2-D inverse transform" into experimental	2013-01-02 11:45:27 -08:00
Yunqing Wang	e9c69ab102	Merge "Skip finding best ref_mvs when the mode is ZEROMV" into experimental	2013-01-02 11:45:19 -08:00
Paul Wilkins	cad4a91429	Change INT64_MAX to LLONG_MAX This is needed to make the windows build work after the removal of vp9_type_aliases.h. Change-Id: I8addf38e9f3c8b864e0e30a8916a26e0264dd02c	2013-01-02 18:06:00 +00:00
Paul Wilkins	313d1100af	Added update-able mv-ref probabilities. Part of NEW_MVREF experiment. Added update-able probabilities. Change-Id: I5a4fcf4aaed1d0d1dac980f69d535639a3d59401	2013-01-02 14:22:11 +00:00
Yunqing Wang	0f4de1573a	Skip finding best ref_mvs when the mode is ZEROMV Read mode before calling vp9_find_best_ref_mvs(). If the mode is ZEROMV, the best ref_mvs are not needed. Then, we can skip calling vp9_find_best_ref_mvs(). Change-Id: I5baa3658dd3f1c7107211cbbbcf919b4584be2e2	2012-12-27 16:18:53 -08:00
Yunqing Wang	cc80247f16	Switch the order of calculating 2-D inverse transform The 2-D inverse transform X = M1*Z*Transposed_M2 was calculated in 2 steps from left to right: 1. Vertical transform: Y = M1*Z 2. Horizontal transform: X = Y*Transposed_M2 In SIMD, a transpose is needed in vertical transform. Here, switched the calculation order to do it from right to left. In this way, we could eliminate that transpose by writing the intermediate results out to their transposed positions. Change-Id: I34dfe5eb01292f6e363712420d99475e2e81e12c	2012-12-27 14:09:30 -08:00
John Koleszar	5ebe94f9f1	Build fixes to merge vp9-preview into master Various fixups to resolve issues when building vp9-preview under the more stringent checks placed on the experimental branch. Change-Id: I21749de83552e1e75c799003f849e6a0f1a35b07	2012-12-26 11:21:09 -08:00
Yunqing Wang	6ee08f3ccf	Fix a warning Fixed the warning: the size of array ‘intermediate_buffer’ can’t be evaluated [-Wvla]. Change-Id: Ibcffd6969bd71cee0c10f7cf18960e58cd0bd915	2012-12-21 15:26:56 -08:00
Scott LaVarnway	89ac94f8fb	Removed mmx versions of vp9_bilinear_predict filters These filters will not work with VP9. Change-Id: Ic26c77961084fcea6bfa97f4cd95afdea2282e85	2012-12-21 14:41:49 -08:00
John Koleszar	229273391f	Merge "add emmintrin_compat.h for builds with gcc < 4" into vp9-preview	2012-12-21 14:21:50 -08:00
Jim Bankoski	ad64ca4494	fixed sizes of global arrays Change-Id: Ibc077cf1c1da0c86063f88c6d3073c6876989119	2012-12-21 13:09:04 -08:00
John Koleszar	9a7023d2ad	Fix MSVS build for removed vp9/common/vp9_onyxd.h Change-Id: I75ad0b4ca5b53b5bf759cc26a484ec196d275279	2012-12-20 16:14:55 -08:00
James Zern	9dab3ce624	add emmintrin_compat.h for builds with gcc < 4 Change-Id: If7822e6fcd0d3568b934032322b19ba3e401df26	2012-12-20 14:56:13 -08:00
Jim Bankoski	1dffce7f96	add private to assembly files to ensure proper chromebuild Change-Id: I6e43ca73f35401a974ed8ee27738d4318f09fd37	2012-12-20 09:40:18 -08:00
Deb Mukherjee	08f0c7cc9c	New previous coef context experiment Adds an experiment to derive the previous context of a coefficient not just from the previous coefficient in the scan order but from a combination of several neighboring coefficients previously encountered in scan order. A precomputed table of neighbors for each location for each scan type and block size is used. Currently 5 neighbors are used. Results are about 0.2% positive using a strategy where the max coef magnitude from the 5 neighbors is used to derive the context. Change-Id: Ie708b54d8e1898af742846ce2d1e2b0d89fd4ad5	2012-12-19 18:49:39 -08:00
Scott LaVarnway	a6b2070d1a	Disabled x86inc style assembly functions.... part 2 Missed a file Change-Id: I33179de6755bc9eda9ad906e4fec6902ace435a5	2012-12-19 14:13:25 -08:00
John Koleszar	05ec800ea4	Use boolcoder API instead of inlining This patch changes the token packing to call the bool encoder API rather than inlining it into the token packing function, and similarly removes a special get_signed case from the detokenizer. This allows easier experimentation with changing the bool coder as a whole. Change-Id: I52c3625bbe4960b68cfb873b0e39ade0c82f9e91	2012-12-19 12:52:41 -08:00
Scott LaVarnway	08dabbcee1	Disabled x86inc style assembly functions Temporary fix for 32-bit mac build errors. Change-Id: I2038f033cac16ea796097d0edd0f1c3da03246d7	2012-12-19 11:53:43 -08:00
Ronald S. Bultje	4cca47b538	Use standard integer types for pixel values and coefficients. For coefficients, use int16_t (instead of short); for pixel values in 16-bit intermediates, use uint16_t (instead of unsigned short); for all others, use uint8_t (instead of unsigned char). Change-Id: I3619cd9abf106c3742eccc2e2f5e89a62774f7da	2012-12-18 15:31:19 -08:00
Yaowu Xu	b41c3583ac	Merge "correct logic in cnvcontext experiment for tx32x32" into experimental	2012-12-18 14:23:39 -08:00
Yaowu Xu	c29fb02903	Merge "Problem of over smoothing with intra modes." into vp9-preview	2012-12-18 14:22:19 -08:00
Ronald S. Bultje	5cab8b7a18	Merge "Give 4x4 scan and coef_band tables a _4x4 suffix." into experimental	2012-12-18 14:17:46 -08:00
Ronald S. Bultje	58961c74ea	Merge "Remove redundant "Prob" type (it's a duplicate of vp9_prob)." into experimental	2012-12-18 14:17:18 -08:00
Yaowu Xu	de269c8a62	correct logic in cnvcontext experiment for tx32x32 Change-Id: I004ded11983b7fda85793912ebc5c6f266dc5eb5	2012-12-18 13:53:17 -08:00
Yunqing Wang	779c5f28a8	Fix uninitialized warning Fixed uninitialized warning for txfm_size. Change-Id: I42b7e802c3e84825d49f34e632361502641b7cbf	2012-12-18 13:19:04 -08:00
Yunqing Wang	e8d610dda0	Fix a warning Fixed the warning: the size of array ‘intermediate_buffer’ can’t be evaluated [-Wvla]. Change-Id: Ibcffd6969bd71cee0c10f7cf18960e58cd0bd915	2012-12-18 12:09:46 -08:00
Ronald S. Bultje	8986eb5c26	Give 4x4 scan and coef_band tables a _4x4 suffix. This matches the names of tables for all other transform sizes. Change-Id: Ia7681b7f8d34c97c27b0eb0e34d490cd0f8d02c6	2012-12-18 10:49:10 -08:00
Ronald S. Bultje	ebb5f2f7bd	Remove redundant "Prob" type (it's a duplicate of vp9_prob). Change-Id: I9548891d7b8ff672a31579bcdce74e4cea529883	2012-12-18 10:38:12 -08:00
John Koleszar	1306ba7659	Remove vp9_type_aliases.h Prefer the standard fixed-size integer typedefs. Change-Id: Iad75582350669e49a8da3b7facb9c259e9514a5b	2012-12-17 11:32:37 -08:00
Yaowu Xu	0405cd8e9f	fixed a warning where variable is used without initialization Change-Id: Ic6b52623802641060cad4a72271050aeaf20ad5c	2012-12-17 11:11:07 -08:00
Paul Wilkins	d8f5d1b257	Problem of over smoothing with intra modes. In some cases intra modes in inter frames give an over smoothed appearance. Especially with noisy but flat content. Also in some cases there were problems with key frame sizing again with very flat but noisy content. These are temporary changes to help alleviate the visual problems but will almost certainly hurt metric results especially at the very low data rate end. Change-Id: I11549179a19277ffc283d9788bc70168f2a8bdc9	2012-12-17 11:54:17 +00:00
Yaowu Xu	6247b239bc	reset segment map on key frame This is to fix a decoder crash when decoder skips a number of frames to continue decoding from a later key frame. Change-Id: I3ba116eba6c3440e0528a21f53745f694302e4ad	2012-12-14 06:35:32 -08:00
Yaowu Xu	f8ff3e5d47	prevents redefine of INT64_MAX MSVC 2012 (_MSC_VER=1600) introduced the definition, this commit prevents the redefinition of the macro Change-Id: I7de92e7e9e865a342f2bcc4b071f8d3c9b2a508c	2012-12-13 16:09:52 -08:00
Yaowu Xu	fd6f492604	remove floating point inverse transforms Change-Id: I9c651bd7c161974bf5f929446361b00d85e57a3f	2012-12-13 16:02:25 -08:00
Yaowu Xu	2b9ec585d6	fixed an encoder/decoder mismatch The mismatch was caused by an improper merge of cleanup code around tokenize_b() and stuff_b() with TX32X32 experiment. Change-Id: I225ae62f015983751f017386548d9c988c30664c	2012-12-13 15:33:21 -08:00
Yaowu Xu	c681887652	fixed build issue with round() not defined in msvc Change-Id: I8fe8462a0c2f636d8b43c0243832ca67578f3665	2012-12-13 15:15:56 -08:00
Deb Mukherjee	7fa3deb1f5	Build fixes with the super blocks and 32x32 expts Change-Id: I3c751f8d57ac7d3b754476dc6ce144d162534e6d	2012-12-13 12:18:38 -08:00
Deb Mukherjee	9c318ee371	Merge "Further improvements on the hybrid dwt/dct expt" into experimental	2012-12-13 11:04:56 -08:00
Deb Mukherjee	210dc5b2db	Further improvements on the hybrid dwt/dct expt Modifies the scanning pattern and uses a floating point 16x16 dct implementation for now to handle scaling better. Also experiments are in progress with 2/6 and 9/7 wavelets. Results have improved to within ~0.25% of 32x32 dct for std-hd and about 0.03% for derf. This difference can probably be bridged by re-optimizing the entropy stats for these transforms. Currently the stats used are common between 32x32 dct and dwt/dct. Experiments are in progress with various scan pattern - wavelet combinations. Ideally the subbands should be tokenized separately, and an experiment will be conducted next on that. Change-Id: Ia9cbfc2d63cb7a47e562b2cd9341caf962bcc110	2012-12-13 10:37:49 -08:00
Ronald S. Bultje	f4608e3606	Merge "New default coefficient/band probabilities." into experimental	2012-12-13 09:56:50 -08:00
Ronald S. Bultje	5a5df19de3	New default coefficient/band probabilities. Gives 0.5-0.6% improvement on derf and stdhd, and 1.1% on hd. The old tables basically derive from times that we had only 4x4 or only 4x4 and 8x8 DCTs. Note that some values are filled with 128, because e.g. ADST ever only occurs as Y-with-DC, as does 32x32; 16x16 ever only occurs as Y-with-DC or as UV (as complement of 32x32 Y); and 8x8 Y2 ever only has 4 coefficients max. If preferred, I can add values of other tables in their place (e.g. use 4x4 2nd order high-frequency probabilities for 8x8 2nd order), so that they make at least some sense if we ever implement a larger 2nd order transform for the 8x8 DCT (etc.), please let me know Change-Id: I917db356f2aff8865f528eb873c56ef43aa5ce22	2012-12-12 16:23:57 -08:00
Scott LaVarnway	b575394e21	Improved vp9_ihtllm_c As suggested by Yaowu, we can use eob to reduce the complexity of the vp9_ihtllm_c function. For the 1080p test clip used, the decoder performance improved by 17%. Change-Id: I32486f2f06f9b8f60467d2a574209aa3a3daa435	2012-12-12 15:49:39 -08:00
Ronald S. Bultje	39de1e14ed	Merge "Consistently use get_prob(), clip_prob() and newly added clip_pixel()." into experimental	2012-12-12 10:34:14 -08:00
Ronald S. Bultje	4d0ec7aacd	Consistently use get_prob(), clip_prob() and newly added clip_pixel(). Add a function clip_pixel() to clip a pixel value to the [0,255] range of allowed values, and use this wherever appropriate (e.g. prediction, reconstruction). Likewise, consistently use the recently added function clip_prob(), which calculates a binary probability in the [1,255] range. If possible, try to use get_prob() or its sister get_binary_prob() to calculate binary probabilities, for consistency. Since in some places, this means that binary probability calculations are changed (we use {255,256}*count0/(total) in a range of places, and all of these are now changed to use 256*count0+(total>>1)/total), this changes the encoding result, so this patch warrants some extensive testing. Change-Id: Ibeeff8d886496839b8e0c0ace9ccc552351f7628	2012-12-12 10:01:19 -08:00
Yaowu Xu	0c35b27689	Merge "clean up tokenize_b() and stuff_b()" into experimental	2012-12-11 13:51:56 -08:00
Yaowu Xu	899f0fc126	clean up tokenize_b() and stuff_b() Change-Id: I0c1be01aae933243311ad321b6c456adaec1a0f5	2012-12-11 13:32:16 -08:00
Yaowu Xu	6b380c0cfa	Merge "experiment with CONTEXT conversion" into experimental	2012-12-11 09:46:36 -08:00
Deb Mukherjee	f09c4cde85	Merge "A bug fix related to switchable filters" into experimental	2012-12-10 12:28:06 -08:00
Deb Mukherjee	14a38a8735	A bug fix related to switchable filters The switchable count update was mistakenly inside a macro. Change-Id: Iec04c52ad57034b88312dbaf05eee1f47ce265b3	2012-12-10 12:10:36 -08:00
Paul Wilkins	d124465975	Further changes to mv reference code. Some further changes and refactoring of mv reference code and selection of center point for searches. Mainly relates to not passing so many different local copies of things around. Some place holder comments. Change-Id: I309f10ffe9a9cde7663e7eae19eb594371c8d055	2012-12-10 17:31:51 +00:00
John Koleszar	d1356faeb8	Merge remote-tracking branch 'origin/vp9-preview' into experimental	2012-12-07 17:26:31 -08:00
Yaowu Xu	ab480cede5	experiment with CONTEXT conversion This commit changed the ENTROPY_CONTEXT conversion between MBs that have different transform sizes. In addition, this commit also did a number of cleanups/bug fixes: 1. removed duplicate function vp9_fix_contexts() and changed to use vp8_reset_mb_token_contexts() for both encoder and decoder 2. fixed a bug in stuff_mb_16x16 where wrong context was used for the UV. 3. changed reset all context to 0 if a MB is skipped to simplify the logic. Change-Id: I7bc57a5fb6dbf1f85eac1543daaeb3a61633275c	2012-12-07 17:25:45 -08:00
Jim Bankoski	fccebcba57	Merge "Fix implicit cast." into vp9-preview	2012-12-07 17:16:01 -08:00
Jim Bankoski	26a4918282	Merge "Fix meaningless if." into vp9-preview	2012-12-07 17:15:52 -08:00
Ronald S. Bultje	fbf052df42	Clean up 4x4 coefficient decoding code. Don't use vp9_decode_coefs_4x4() for 2nd order DC or luma blocks. The code introduces some overhead which is unnecessary for these cases. Also, remove variable declarations that are only used once, remove magic offsets into the coefficient buffer (use xd->block[i].qcoeff instead of xd->qcoeff + magic_offset), and fix a few Google Style Guide violations. Change-Id: I0ae653fd80ca7f1e4bccd87ecef95ddfff8f28b4	2012-12-07 16:27:07 -08:00
Ronald S. Bultje	885cf816eb	Introduce vp9_coeff_probs/counts/stats/accum types. Use these, instead of the 4/5-dimensional arrays, to hold statistics, counts, accumulations and probabilities for coefficient tokens. This commit also re-allows ENTROPY_STATS to compile. Change-Id: If441ffac936f52a3af91d8f2922ea8a0ceabdaa5	2012-12-07 16:09:59 -08:00
Frank Galligan	1c0ee77589	Fix meaningless if. Change-Id: I0cb06d77805246fe39d39ad3bc5df3c3f52c7050	2012-12-07 15:44:39 -08:00
Frank Galligan	8d449ce0a9	Remove unused symbols from vp9 asm offsets C files. Change-Id: I366e6d175da3012f1c8607fd7fad99fbbb616091	2012-12-07 15:38:40 -08:00
Frank Galligan	eec0bc4f1e	Fix implicit cast. Change-Id: I1eb7433061a6c529471026e0ebdc6467942062eb	2012-12-07 15:25:44 -08:00
Ronald S. Bultje	c456b35fdf	32x32 transform for superblocks. This adds Debargha's DCT/DWT hybrid and a regular 32x32 DCT, and adds code all over the place to wrap that in the bitstream/encoder/decoder/RD. Some implementation notes (these probably need careful review): - token range is extended by 1 bit, since the value range out of this transform is [-16384,16383]. - the coefficients coming out of the FDCT are manually scaled back by 1 bit, or else they won't fit in int16_t (they are 17 bits). Because of this, the RD error scoring does not right-shift the MSE score by two (unlike for 4x4/8x8/16x16). - to compensate for this loss in precision, the quantizer is halved also. This is currently a little hacky. - FDCT and IDCT is double-only right now. Needs a fixed-point impl. - There are no default probabilities for the 32x32 transform yet; I'm simply using the 16x16 luma ones. A future commit will add newly generated probabilities for all transforms. - No ADST version. I don't think we'll add one for this level; if an ADST is desired, transform-size selection can scale back to 16x16 or lower, and use an ADST at that level. Additional notes specific to Debargha's DWT/DCT hybrid: - coefficient scale is different for the top/left 16x16 (DCT-over-DWT) block than for the rest (DWT pixel differences) of the block. Therefore, RD error scoring isn't easily scalable between coefficient and pixel domain. Thus, unfortunately, we need to compute the RD distortion in the pixel domain until we figure out how to scale these appropriately. Change-Id: I00386f20f35d7fabb19aba94c8162f8aee64ef2b	2012-12-07 14:45:05 -08:00
Johann	1009f76566	Use 'vpx_scale' consistently Change-Id: I178352813d2b8702d081caf405de9dbad9af2cc3	2012-12-05 16:05:44 -08:00
Paul Wilkins	7405040142	Merge "Change to MV reference search." into experimental	2012-12-05 09:14:46 -08:00
Johann	52d350febf	Begin to refactor vpx_scale usage in VP9 Only declare the functions in vpx_scale RTCD and include the relevant header. Remove unused files and functions in vpx_scale to avoid wasting time renaming. vpx_scale/win32/scaleopt.c contains functions which have not been called in a long time but are potentially optimized. The 'vp8' functions have not been renamed yet. That is for after the cleanup. Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c	2012-12-05 08:59:40 -08:00
Johann	a905672906	Remove ARM optimizations from VP9 Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b	2012-12-05 08:59:25 -08:00
John Koleszar	5d91a1e0ae	Merge remote-tracking branch 'origin/vp9-preview' into experimental	2012-12-05 08:41:35 -08:00
John Koleszar	4a4d2aa55c	vp9_bilinear_filters_mmx: add missing extern specifiers Change-Id: Ibabf18947f90cb4f45052763ebf44cfb8209bd8b	2012-12-05 08:27:48 -08:00
Paul Wilkins	4cc657ec6e	Change to MV reference search. This patch reduces the cpu cost of the MV ref search by only allowing insert for candidates that would be in the current top 4. This could alter the outcome and slightly favors near candidates which are tested first but also limits the worst case loop count to 4 and means in many cases it will drop out and not happen. Change-Id: Idd795a825f9fd681f30f4fcd550c34c38939e113	2012-12-05 14:03:45 +00:00
Johann	d138262ac0	Merge "Begin to refactor vpx_scale usage in VP9" into experimental	2012-12-04 15:23:42 -08:00
Yaowu Xu	6a5e6e0549	Fix the build with MSVC 1. remove the dependency on non existing "vp9_temporal_filter_x86.h" 2. prefix filenames with vp9_ in obj_int_extract.bat to reflect the change of the actual filenames. Change-Id: Ib1b4d96ac41788f76917764a6722d8461c857302	2012-12-04 09:12:49 -08:00
Frank Galligan	48556db7b2	Merge "vp9: Fix assert check." into vp9-preview	2012-12-03 17:29:46 -08:00
Yaowu Xu	806d05e1a8	merged optimize_b_16x16() into optimize_b() The commit changed the trellis quantization function optimize_b() to work for MBs using all transform sizes, and eliminated the function for MB using 16x16 transform only, optimize_b_16x16. Change-Id: I3fa650587ab5198ed16315b38754783a72b33ba2	2012-12-03 14:53:45 -08:00
Johann	57e72208b3	Merge "Remove ARM optimizations from VP9" into experimental	2012-12-03 13:54:38 -08:00
Johann	c6bd29e2f5	Begin to refactor vpx_scale usage in VP9 Only declare the functions in vpx_scale RTCD and include the relevant header. Remove unused files and functions in vpx_scale to avoid wasting time renaming. vpx_scale/win32/scaleopt.c contains functions which have not been called in a long time but are potentially optimized. The 'vp8' functions have not been renamed yet. That is for after the cleanup. Change-Id: I2c325a101d60fa9d27e7dfcd5b52a864b4a1e09c	2012-12-03 12:51:56 -08:00
Johann	34591b54dd	Remove ARM optimizations from VP9 Change-Id: I9f0ae635fb9a95c4aa1529c177ccb07e2b76970b	2012-12-03 12:50:15 -08:00
Jim Bankoski	b95338c7ab	Merge "fixes --disable-vp9-encoder" into vp9-preview	2012-12-03 12:41:31 -08:00
Jim Bankoski	d9038b3c60	fixes --disable-vp9-encoder Change-Id: I467bf0fdf3b35326bcce58d5459e6d2dbfd6c5e5	2012-12-03 12:21:16 -08:00
Frank Galligan	0d687ed22b	vp9: Fix assert check. Change-Id: If0cc1ab60dff6abd67dae7c7b3dc83a1afd7fe65	2012-12-03 12:18:59 -08:00
Frank Galligan	3e0ea7f6e1	vp9: Remove superfluous command. - vpx_calloc is called on arf_not_zz above. - Note The removed vpx_memset call had an issue with sizeof. Change-Id: I86fd7a167d0a042e581e613e2a6c0b5e63073fc6	2012-12-03 10:26:15 -08:00
Deb Mukherjee	8b92f1e023	Supports inter-intra prediction with superblocks Adds support for compound inter-intra prediction with superblocks. Also, fixes a bug that disabled intra modes for superblocks. Change-Id: I4d711317e1bc19df8c2f32dc645429f7fff31036	2012-12-01 15:19:55 -08:00
Deb Mukherjee	6632330702	Adds switchable filters with superblocks Allows switchable filters to be used without mismatch when the superblock experiment is on. Also removes a spurious clamping code in decodemv.c which causes rare encode/decode mismatches. Change-Id: I809d9ee0b2859552b613500b539a615515b863ae	2012-11-30 09:37:08 -08:00
Jim Bankoski	9f9370425b	warnings in various experiments Change-Id: Ib5106d4772450f8026f823dd743f162ab833b1d6	2012-11-30 07:31:37 -08:00
Jim Bankoski	2b8dc065d1	google style guide include guards Change-Id: I2c252f3ddcc99e96c1f5d3dab8bcb25a2a3637ea	2012-11-30 07:30:59 -08:00
Yunqing Wang	eebc0b49f1	Merge "Further improve macroblock loop filters" into experimental	2012-11-29 16:07:14 -08:00
Deb Mukherjee	d7489ea45e	Merge "Minor refactoring of superblock decoding" into experimental	2012-11-29 15:33:42 -08:00
Deb Mukherjee	be08b5af1a	Minor refactoring of superblock decoding Refactoring for improved readability - no bitstream or performance change. Change-Id: I4488ed4715f8dbe38c66431106478669041b8b33	2012-11-29 15:26:56 -08:00
Jim Bankoski	e3bdae1fc7	intrinsic warnings begone Change-Id: I6a224c590b6a2c5b91f9084ffb8083d18223a206	2012-11-29 14:14:26 -08:00
Jim Bankoski	d0a20fd22c	last remaining warning Change-Id: I1f49d96cdb5e342041c9a72ef31df361a1b609eb	2012-11-29 14:07:21 -08:00
Jim Bankoski	51e770deb1	fix implicit warnings idct etc Change-Id: I54a122cc8c0b6ed2dbc3c6ecfcd44736cd40b687	2012-11-29 11:23:02 -08:00
Jim Bankoski	ef3c01ed67	Additional warning message cleanup. Change-Id: I429a97ac57db3de0bf67ce3f3fe0c6b409f77a9e	2012-11-29 10:10:51 -08:00
Yaowu Xu	ff2f9de828	Merge changes Iaa67bcf1,Ibea3bc80 into experimental * changes: more warning cleanup unused variables & warnings	2012-11-29 09:34:10 -08:00
Yaowu Xu	b3055ec020	Merge "more unused variables." into experimental	2012-11-29 09:33:59 -08:00
Yaowu Xu	8422ef772d	Merge "unused variable" into experimental	2012-11-29 09:33:52 -08:00
Yaowu Xu	e007eb89cf	Merge "unused var removed" into experimental	2012-11-29 09:33:41 -08:00
Yaowu Xu	6431007df3	Merge "minor fix to eob check for setting CONTEXT" into experimental	2012-11-29 09:27:00 -08:00
Yaowu Xu	7ab1d3e49f	minor fix to eob check for setting CONTEXT Previously, the "!=" check is logically incorrect when eob is at 0 and effective coefficient starting position is 1. This commit should have no effect on bitstream. Change-Id: I6ce3a847c7e72bfbe4f7c74f88e3310c6b9b6d30	2012-11-29 09:10:15 -08:00
Jim Bankoski	00b27a3647	more warning cleanup Change-Id: Iaa67bcf1e866dfe255c4e458d4e51e9c708ffcf4	2012-11-29 09:07:12 -08:00
Jim Bankoski	a802f5e783	unused variables & warnings Change-Id: Ibea3bc80eb26a975faaa60268bbc93237f82bc57	2012-11-29 09:02:47 -08:00
Jim Bankoski	cf671e2756	more unused variables. Change-Id: Ibe11e9275949b26a77fa9c8ac2e7c356ae533d5d	2012-11-29 08:54:59 -08:00
Jim Bankoski	6e02947e29	unused variable Change-Id: I1302a6eaa840d419e8bb9ad0673e42ef139d3fee	2012-11-29 08:51:19 -08:00
Jim Bankoski	705220ee71	unused var removed Change-Id: I9d0efdff0c79ea4bdd660098106b64776bdd4483	2012-11-29 08:50:20 -08:00
Jim Bankoski	245fba74b7	signed mismatch mvrefcount Change-Id: Ie34820c1b6eaba9cf9316415a46f48af79c41646	2012-11-29 08:13:18 -08:00
Jim Bankoski	abd74ed594	warning error missing void Change-Id: I914bcc669297d3414261486bf1bfb716c2ecc804	2012-11-29 07:47:50 -08:00
Jim Bankoski	030e268a90	ihtllm moves to rtcd clears up some warnings Change-Id: I9899637497c6ad7519f098e055ab98580ae6d688	2012-11-29 07:19:38 -08:00
Jim Bankoski	e69b5258fd	fix vp9_vp8 files renamed Change-Id: I20c426e91ee49666db42e20eb074095ab6b8ec5d	2012-11-29 06:53:08 -08:00
Jim Bankoski	13dbf1fb17	more rtcd cleanup Change-Id: Ieefd76e164ca4aa87597da0412977614ddfbacb7	2012-11-28 17:27:15 -08:00
Deb Mukherjee	0de214260b	Merge "Fixing 8x8/4x4 ADST for intra modes with tx select" into experimental	2012-11-28 16:59:17 -08:00
Deb Mukherjee	0742b1e4ae	Fixing 8x8/4x4 ADST for intra modes with tx select This patch allows use of 8x8 and 4x4 ADST correctly for Intra 16x16 modes and Intra 8x8 modes when the block size selected is smaller than the prediction mode. Also includes some cleanups and refactoring. Rebase. Change-Id: Ie3257bdf07bdb9c6e9476915e3a80183c8fa005a	2012-11-28 16:21:12 -08:00
Yaowu Xu	b2f27d909a	Merge "remove the vp9_default_mode_contexts_a" into experimental	2012-11-28 13:56:42 -08:00
Yaowu Xu	1cc5739669	remove the vp9_default_mode_contexts_a Given the way mode_context is updated, the benefit of an additional default is not significant. Change-Id: I67489453e8781340b18e26a1cc2f04e9221004a2	2012-11-28 11:14:30 -08:00
Jim Bankoski	c67873989f	fixed includes to be fully specified Change-Id: Ia1cce221f8511561b9cbd8edb7726fbc286ff243	2012-11-28 10:53:17 -08:00
Jim Bankoski	926d95cd84	Merge "remove postproc invokes" into experimental	2012-11-28 10:30:42 -08:00
John Koleszar	00e2c6bf7a	Merge "Clamp decoded feature data" into experimental	2012-11-28 10:08:37 -08:00
John Koleszar	b07fcf5f6f	Merge "Revert "make: flatten object file directories"" into experimental	2012-11-28 10:08:22 -08:00
Jim Bankoski	85cba19e16	remove postproc invokes and some miscellaneous invoke left overs Change-Id: I63191b1bfd3bea4ce30cceaeb686ec850570fc43	2012-11-28 10:00:25 -08:00
Yaowu Xu	3e976bba21	Localize Y2 entropy coding context This commit makes sure Y2 entropy coding context is always updated on every macroblock even there is no Y2 block. Change-Id: Ie307cfc46526efe55613be39f9f178d2531b56ba	2012-11-28 09:27:36 -08:00
Yunqing Wang	d202138621	Further improve macroblock loop filters This change included: 1. Aligned reads in vp9_mbloop_filter_vertical_edge function. Since we actually read 16 bytes, we can align the reads to read starting at (s - 8) instead of (s - 5). 2. Combined u, v loop filters. 3. Added 8x16 transpose. This gave 2% decoder performance gain (tulip clip). Change-Id: Ib14c2f1645c4a3436df17fe2f24789506bf0bb58	2012-11-28 09:27:07 -08:00
Yaowu Xu	12da793d00	removed redundant mode_context data structures This commit removed a couple of redundant data structures in frame coding contexts, mode_context and mode_context_a, and changed to use vp9_mode_contexts only. The switch of the context for different frame type now relies on the switch of frame coding context between lfc and lfc_a. This commit also removed a number of memcpy among these redundant data structures. Change-Id: I42e8174bd60f466b0860afc44c1263896471b0f3	2012-11-28 09:24:30 -08:00
John Koleszar	a1f15814be	Clamp decoded feature data Not all segment feature data elements are full-range powers of two, so there are values that can be encoded that are invalid. Add a new function to clamp values to the maximum allowed. Change-Id: Ie47cb80ef2d54292e6b8db9f699c57214a915bc4	2012-11-27 16:38:31 -08:00
John Koleszar	1760c39bce	Revert "make: flatten object file directories" This reverts commit `b72373de79`. Change-Id: Ic1601160e11df1a018ef12da25967cfb5eebd5ba	2012-11-27 16:36:39 -08:00
John Koleszar	fcccbcbb39	Add vp9_ prefix to all vp9 files Support for gyp which doesn't support multiple objects in the same static library having the same basename. Change-Id: Ib947eefbaf68f8b177a796d23f875ccdfa6bc9dc	2012-11-27 14:12:30 -08:00
Yunqing Wang	3bf7b131c8	Merge "Improve sad3x16 SSE2 function" into experimental	2012-11-26 10:15:35 -08:00
Paul Wilkins	fbc8e8f9ae	Merge "Modified mv prediction." into experimental	2012-11-26 09:59:24 -08:00
Paul Wilkins	d22f3d9f42	Modified mv prediction. Modified the mv_pred() function that chooses a centre point from which to start step searches to use the top candidate vectors chosen previously. Some gains (mainly on HD and tested with SB off). Std_hd 0.874%, YT-hd 0.174%, YT 0.05%, Derf 0.036% Change-Id: Ie232284f561838b8ecee0e28dcbb07a9cd46cf56	2012-11-26 17:55:19 +00:00
Yunqing Wang	e7cd80718b	Improve sad3x16 SSE2 function Vp9_sad3x16_sse2() is heavily called in decoder, in which the unaligned reads consume lots of cpu cycles. When CONFIG_SUBPELREFMV is off, the unaligned offset is 1. In this situation, we can adjust the src_ptr to be 4-byte aligned, and then do the aligned reads. This reduced the reading time significantly. Tests on 1080p clip showed over 2% decoder performance gain with CONFIG_SUBPELREFMV off. Change-Id: I953afe3ac5406107933ef49d0b695eafba9a6507	2012-11-26 09:53:50 -08:00
Yaowu Xu	89d62e3b04	remove the dependency on idct.h Change-Id: Idcf827d8ae6429ee5b673c3398f838dbeacb4e74	2012-11-26 09:12:04 -08:00
Jim Bankoski	f42e41f2ef	Merge "removed the idct rtcd idct calls" into experimental	2012-11-24 21:38:36 -08:00
Ronald S. Bultje	25b609b62b	Move switch(tx_size) around txsize to detokenize.c. Add a new function vp9_decode_mb_tokens() that handles the switch between different per-tx-size detokenize functions. Make actual implementations (vp9_decode_mb_tokens_NxN()) static. Change-Id: I9e0c4ef410bfa90128a02b472c079a955776816d	2012-11-24 21:22:42 -08:00
Ronald S. Bultje	9dc7d4fb97	Fix crash in pick_inter_mode_sb(). It didn't handle rd_thresh == INT_MAX, which means the reference is unavailable. Change-Id: Ie6fa8b2577437411db81a8c24e8dcdfd856a0e8d	2012-11-24 21:20:32 -08:00
Jim Bankoski	510557e2eb	removed the idct rtcd idct calls More cleanup to do after this, but this is a good chunk of removing rtcd. Change-Id: I551db75e341a0a85c3ad650df1e9a60dc305681a	2012-11-24 19:33:58 -08:00
Ronald S. Bultje	9970d8b662	Restructure vp9_decode_mb_tokens_8x8() a bit. Don't declare variables if they only ever have a single value and are used only as argument to another function call; instead, just hardcode the value in the function call directly. Split out UV and Y coefficient loops for clarity. Use xd->block[].qcoeff instead of xd->qcoeff + magic to remove use of magic offset variables. Change-Id: I5b17eda1bb666c69c2b7ea957d5525cd78192e33	2012-11-23 09:43:13 -08:00
Ronald S. Bultje	f090b6b47b	Restructure vp9_decode_mb_tokens_16x16() a bit. Don't declare variables if they only ever have a single value and are used only as argument to another function call; instead, just hardcode the value in the function call directly. Also remove unneeded brackets around a code block, and remove the magic offsets 64 and 256 for chroma values in the coefficient memory block. Change-Id: I14fc14120a81ea1d6fb862674e8bf8cf6ba3d114	2012-11-23 09:11:12 -08:00
Ronald S. Bultje	0312c3d6d9	Make get_eob() function static. Change-Id: Idde3ab97960eda7022367c1f91a873a479bc9d7b	2012-11-23 08:17:06 -08:00
Ronald S. Bultje	4422847143	Rename "block_type" function argument to "txfm_size". Also fix the type (TX_SIZE instead of int). Change-Id: Ib9b3f33835e58a6e758ed5f37bb64543e62b6a86	2012-11-23 08:15:00 -08:00
Jim Bankoski	91d703b2b2	Merge "remove subpixel invoke functions" into experimental	2012-11-21 19:55:16 -08:00
Ronald S. Bultje	a5e542e74b	Fix enc/dec mismatch with b_context_pred experiment enabled. Change-Id: I1272ae3f0fdfb7ed8eb364ef0c6dd1818d3179d7	2012-11-21 12:39:55 -08:00
Jim Bankoski	3338af4109	remove subpixel invoke functions Removed the rtcd subpixel invoke functions. Change-Id: I8b7618bd5813333fac66b2817bdf807616e0fb33	2012-11-21 09:16:30 -08:00
Jim Bankoski	e25bd474ad	fixed const problem NEEDED FOR BUILD Change-Id: I56a3e68f15dff480b34de048e30231ba821b1ee2	2012-11-21 06:46:25 -08:00
Jim Bankoski	4ad2f08c72	Merge "clean out some of the rtcd code." into experimental	2012-11-21 06:41:37 -08:00
John Koleszar	414f68d266	Merge "Pack invisible frames without lengths" into experimental	2012-11-20 17:22:50 -08:00
Yunqing Wang	bbe5e032a4	Fix ref_stride in sad function Used ref_stride. Change-Id: I31f0a3bb935520f54d11a1d87315627f162ae845	2012-11-20 10:01:20 -08:00
Jim Bankoski	f4871b6a3f	clean out some of the rtcd code. This removes functions that are no longer needed and cleans up some warnings. Change-Id: I292a4c3694e9c1d68ce99cea390905b198434719	2012-11-18 12:33:18 -08:00
Ronald S. Bultje	4db08237e0	Merge "Assign above/left context in decode_coefs() instead of in caller." into experimental	2012-11-17 14:41:15 -08:00
Ronald S. Bultje	18e42dddf2	Merge "Remove unused argument from decode_coefs() function prototype." into experimental	2012-11-17 14:41:07 -08:00
Ronald S. Bultje	d0b525656b	Merge "Remove coef_bands_x[] array and related machinery in decode_coefs()." into experimental	2012-11-17 14:40:56 -08:00
Ronald S. Bultje	825b20b0ae	Merge "Inline count_tokens() in decode_coefs()." into experimental	2012-11-17 14:40:48 -08:00
Ronald S. Bultje	4db4f98b52	Merge "Merge various count_token() functions into a single one." into experimental	2012-11-17 14:40:41 -08:00
Ronald S. Bultje	5d7cb59035	Assign above/left context in decode_coefs() instead of in caller. this prevents duplicating the same line of code in each caller of decode_coefs(). Change-Id: Id7996ad394828bf77ef3d5e03002f577c9f79609	2012-11-17 11:22:38 -08:00
Ronald S. Bultje	3bdf302ce7	Remove unused argument from decode_coefs() function prototype. Change-Id: I8d2539ba1046012c948520ac23a1f1978be921c5	2012-11-17 11:11:06 -08:00
Ronald S. Bultje	a253b3791b	Remove coef_bands_x[] array and related machinery in decode_coefs(). Change-Id: I0a36d1efb3bb81a54005b10316550ec67100559e	2012-11-17 11:07:23 -08:00
Ronald S. Bultje	511ef2072c	Inline count_tokens() in decode_coefs(). This prevents the relatively expensive token-from-coefficient lookup function get_token(), plus a duplicate loop. Change-Id: Ibecd407b2a91d3593d439ec4646e43fa26d2ff91	2012-11-17 10:35:47 -08:00
Ronald S. Bultje	56352f189d	Merge various count_token() functions into a single one. Change-Id: I1970f43e2cb5f7d9744c7249099eed226f16f162	2012-11-17 10:18:41 -08:00
Jim Bankoski	b38b6abccc	Merge "removal of temporal invoke" into experimental	2012-11-17 09:53:02 -08:00
Ronald S. Bultje	166d24d07e	Remove unused function count_tokens() in detokenize.c. Change-Id: I178f250b1a4d41d5a9c1619091f5ae51cebffb10	2012-11-17 07:45:46 -08:00
Jim Bankoski	cb98b83239	removal of temporal invoke Change-Id: I18ca713b02a5241bdb20dddcde0216467b55b596	2012-11-17 06:11:01 -08:00
Ronald S. Bultje	f19a1cafed	Remove special-case inline detokenization in b_pred reconstruction. Just like for all other block modes, b_pred tokens can be read together before starting macroblock reconstruction. This removes special cases for b_pred in decode_macroblock() and allows to make decode_coefs_4x4() static in detokenize.c. While at it, remove the redundant handling and checking of plane_type and block_index (i) in decode_coefs_4x4(). Since the function is static, and is called only from decode_mb_tokens_4x4(), we don't need to worry that the arguments ever go out of sync. Change-Id: I2d415da0b51b89d0490a6b9e24cc86363c2090f7	2012-11-16 22:26:12 -08:00
Yunqing Wang	0eb5590425	Merge "Add const before the dequant(dq)" into experimental	2012-11-16 12:35:17 -08:00
Yunqing Wang	4c7c15ee69	Merge "Optimize 8x8 dequant and idct" into experimental	2012-11-16 12:23:06 -08:00
Yunqing Wang	47d9d48fa4	Add const before the dequant(dq) Modified code to use const before dq. Change-Id: I6fa59c2ed9743ded33ad08df70e15c2fe1ae7b99	2012-11-16 12:13:13 -08:00
Ronald S. Bultje	5b11052ac1	Support 32x32 intra modes in non-keyframe superblocks. Change-Id: Icf8ad313c543462e523bff89690e5daa8d49bcc0	2012-11-16 09:54:43 -08:00
Paul Wilkins	a57dbd957b	Further experimentation with the mode context Experiments with a larger set of contexts and some clean up to replace magic numbers regarding the number of contexts. The starting values and rate of backwards adaption are still suspect and based on a small set of tests. Added forwards adjustment of probabilities. The net result of adding the new context and forward update is small compared to the old context from the legacy find_near function. (down a little on derf but up by a similar amount for HD) HOWEVER.... with the new context and forward update the impact of disabling the reverse update (which may be necessary in some use cases to facilitate parallel decoding) is hugely reduced. For the old context without forward update, the impact of turning off reverse update (Experiment was with SB off) was Derf - 0.9, Yt -1.89, ythd -2.75 and sthd -8.35. The impact was mainly at low data rates. With the new context and forward update enabled the impact for all the test sets was no more than 0.5-1% (again most at the low end). Change-Id: Ic751b414c8ce7f7f3ebc6f19a741d774d2b4b556	2012-11-16 16:58:00 +00:00
John Koleszar	6bca6decbf	Merge "Don't write recon.yuv by default" into experimental	2012-11-16 08:41:40 -08:00
Deb Mukherjee	cb2d06ceac	Merge "Compound inter-intra experiment" into experimental	2012-11-16 08:30:34 -08:00
Yaowu Xu	170305dcd3	Merge "changed mv candidate search for superblocks" into experimental	2012-11-16 07:21:55 -08:00
Yaowu Xu	415e6bff4d	changed mv candidate search for superblocks added additional motion vectors at close neighborhood of a superblock to the list of candidate motion vectors, and removed a couple that are further away. The change helped std-hd set about .8% (all metrics) and smaller gain for derf set. Change-Id: Iaa69b98614db43420ed3fd4738d0ca5587b90045	2012-11-16 07:01:13 -08:00
Deb Mukherjee	0c917fc975	Compound inter-intra experiment A patch on compound inter-intra prediction. In compound inter-intra prediction, a new predictor for 16x16 inter coded MBs is obtained by combining a single inter predictor with a 16x16 intra predictor, in a manner that the weight varies with distance from the top/left boundary. The current search strategy is to combine the best inter mode with the best intra mode obtained independently. Results so far: derf +0.31% yt +0.32% std-hd +0.35% hd +0.42% It is conceivable that the results would improve somewhat with a more thorough search strategy where all intra modes are searched given the best mv, or even a joint search for the best mv and the best intra mode. Change-Id: I7951f1ed0d6eb31ca32ac24d120f1585bcd8d79b	2012-11-16 06:56:29 -08:00
Yaowu Xu	1c56946ec1	Merge "subpelrefmv for superblocks" into experimental	2012-11-16 05:49:32 -08:00
John Koleszar	64bcffc1ec	Pack invisible frames without lengths Modify the decoder to return the ending position of the bool decoder and use that as the starting position for the next frame. The constant-space algorithm for parsing the appended frame lengths is O(n^2), which is a potential DoS concern if n is unbounded. Revisit the appended lengths for use as partition lengths when multipartition support is added. In addition, this allows decoding of raw streams outside of a container without additional framing information, though it's insufficient to be able to remux said stream into a container. Change-Id: I71e801a9c3e37abe559a56a597635b0cbae1934b	2012-11-15 15:48:07 -08:00
Yaowu Xu	61416aedc2	subpelrefmv for superblocks duplicate code clean-up and variable name corrections Change-Id: Ibc4703228e652ec425125de5e7bc038fa46595c5	2012-11-15 13:46:52 -08:00
John Koleszar	a9c7597adc	support building vp8 and vp9 into a single lib Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d	2012-11-15 10:46:17 -08:00
John Koleszar	b72373de79	make: flatten object file directories Rather than building an object file directory hierarchy matching the source tree's layout, rename the object files so that the object file name contains the path in the source file tree. The intent here is to allow two files in different parts of the source tree to have the same name and still not collide when put into an ar archive. Change-Id: Id627737dc95ffc65b738501215f34a995148c5a2	2012-11-15 10:44:58 -08:00
John Koleszar	6becad426c	detokenize: use SEG_LVL_EOB feature consistently Update decode_coefs() to break when c >= eob, since it's possible that c starts the loop from 1 and eob is 0. The loop won't terminate in that case. Add new get_eob() function to consistently clamp the eob based on the segment level EOB and the block size. It's possible to code a segment level EOB that's greater than the block size, and that leads to an out of bounds access. Change-Id: I859563b30414615cf1b30dcc2aef8a1de358c42d	2012-11-15 11:44:29 +00:00
pascal massimino	5a955973d9	Merge changes I63348ae3,I658ea409 into experimental * changes: Segment mode coding bug. Silenced a few warnings.	2012-11-15 00:24:57 -08:00
Ronald S. Bultje	120690989b	Merge "fix costing bug in pick_uv_sb_mode." into experimental	2012-11-14 17:05:46 -08:00
Ronald S. Bultje	d7290d4974	Merge "Merge a few mostly-duplicate code fragments in SB/MB encoding." into experimental	2012-11-14 17:05:40 -08:00
Ronald S. Bultje	a77df0c473	Merge "Prevent overflow in variance32x32." into experimental	2012-11-14 15:43:19 -08:00
Ronald S. Bultje	a653c9d286	fix costing bug in pick_uv_sb_mode. Change-Id: Ia24e0fddcca9125f8e41e95dbb22444dc51767c7	2012-11-14 15:19:45 -08:00
Ronald S. Bultje	fa1b356e4e	Merge a few mostly-duplicate code fragments in SB/MB encoding. Change-Id: I8e12fbab7ec4732b6400ae3a6964749d818c90c9	2012-11-14 15:19:45 -08:00
Ronald S. Bultje	a099370370	Prevent overflow in variance32x32. Change-Id: I478878c78ef8a770186622d987d318176827ef5f	2012-11-14 15:18:21 -08:00
John Koleszar	16e2686682	Merge "SEG_LVL_MODE: don't code ref_frame if it's implicit" into experimental	2012-11-14 09:39:25 -08:00
Ronald S. Bultje	127836d11f	Merge "Don't use hybrid transform (ADST) for superblocks." into experimental	2012-11-14 09:18:34 -08:00
Ronald S. Bultje	1e3dd49fe3	Don't use hybrid transform (ADST) for superblocks. This is in line with other cases where we disable ADST if prediction size and transform size don't match. Before this patch, the RD loop will use ADST for superblocks, but frame encoding/decoding won't. Change-Id: I700368c632eb72b5e089c22ef25649d99d7697d0	2012-11-14 08:58:24 -08:00
Paul Wilkins	b527c4dbb7	Segment mode coding bug. There are now more than 16 possible modes so 5 bits are required for the segment mode feature. Note that it is likely that the mode feature and how it is coded will change but for now the 4 bits was a bug. Change-Id: I63348ae3a9cc31566a656c2dc78f09f5e1a9dcc9	2012-11-14 14:38:03 +00:00
Paul Wilkins	19a1ba1e91	Silenced a few warnings. Silenced a few VS compiler warnings. Change-Id: I658ea409c36c05cd11042675e2e42ccde0ef2420	2012-11-14 14:27:37 +00:00
John Koleszar	854e41f057	Don't write recon.yuv by default CONFIG_DEBUG was turning on some code to dump the reconstructed frame to a buffer from within the decoder. Move this code to a more specific debugging define. Change-Id: I3ca9ea634bdbd186f2470bd644d3695ee0ab3037	2012-11-13 15:22:35 -08:00
John Koleszar	6d482706ef	SEG_LVL_MODE: don't code ref_frame if it's implicit If the SEG_LVL_MODE is an intra mode, then the reference frame must be INTRA_FRAME. Change-Id: I2cdeeac3780c077c74b39ce89a528bc280674231	2012-11-13 15:22:09 -08:00
Yaowu Xu	3fa1348d5f	fix a few typos Change-Id: I7b6f27826052eb706fc6080d4e3a940dff7d3a58	2012-11-13 14:45:53 -08:00
Ronald S. Bultje	1761a6b55a	Merge "Use full 32-pixel edge for superblock bestrefmv motion vector ordering." into experimental	2012-11-13 14:12:58 -08:00
Ronald S. Bultje	b147c64c16	Merge "Fix edge MV handling in SBs." into experimental	2012-11-13 14:12:48 -08:00
Deb Mukherjee	7de64f35d3	A fix in MV_REF experiment This fix ensures that the forward prob update is not turned off for motion vectors. Change-Id: I0b63c9401155926763c6294df6cca68b32bac340	2012-11-13 08:27:04 -08:00
Yunqing Wang	e60478d46d	Optimize 8x8 dequant and idct Similar to 16x16 dequant and idct, based on the value of eobs, the 8x8 dequant and idct calculation was simplified to improve decoder performance. Combined vp9_dequant_idct_add_8x8 and vp9_dequant_dc_idct_add_8x8 to eliminate duplicate code. Change-Id: Ia58e50ab27f7012b7379c495837c9c0b5ba9cf7f	2012-11-12 17:41:53 -08:00
Ronald S. Bultje	c79ae1713c	Use full 32-pixel edge for superblock bestrefmv motion vector ordering. Change-Id: I417e39867c020a17d85370972446a8ce2bbe9a6d	2012-11-12 17:06:56 -08:00
Ronald S. Bultje	722972454c	Fix edge MV handling in SBs. Change-Id: Ia1eddb108ec463835e9de8769572d698e21bca49	2012-11-12 17:06:52 -08:00
Paul Wilkins	5d65614fdd	Merge "New inter mode context" into experimental	2012-11-12 09:24:14 -08:00
Paul Wilkins	2669f42b0d	New inter mode context This change is a fix / extension of the newbestrefmv experiment. As such it is presented without IFDEF. The change creates a new context for coding inter modes in vp9_find_mv_refs(). This replaces the context that was previously calculated in vp9_find_near_mvs(). The new context is unoptimized and not necessarily any better at this stage (results pending), but eliminates the need for a legacy call to vp9_find_near_mvs(). Based on numbers from Scott, this could help decode speed by several %. In a later patch I will add support for forward update of context (assuming this helps) and refine the context as necessary. Change-Id: I1cd991b82c8df86cc02237a34185e6d67510698a	2012-11-12 15:50:02 +00:00
Ronald S. Bultje	3a08b033b0	Merge "Fix data type for eobs[] array in SB 4x4 IDCT code." into experimental	2012-11-12 07:40:54 -08:00
Ronald S. Bultje	11fec1863d	Merge "Remove 'thismb' data pointer when superblock experiment is on." into experimental	2012-11-12 07:22:22 -08:00
Paul Wilkins	6fb8953c19	Restrict ref mv search range. Experiment to test speed trade off of reducing the extent of the ref mv search. Reducing the maximum number of tested candidates to 9 had minimal net effect on quality in any of the test sets. Reduction to 7 has a small negative impact (worst was STD-HD at about -0.2%). This change is in response to the apparently high number of decode cycles reported in regard to mv-ref selection. Change-Id: I0e92e92e324337689358495a1ec9ccdeb23dc774	2012-11-12 11:31:12 +00:00
Ronald S. Bultje	dd9d4f9e1a	Fix data type for eobs[] array in SB 4x4 IDCT code. This fixes encoder/decoder mismatches with the superblock experiment turned on whenever a superblock is encoded using the 4x4 transform. Change-Id: Iefec7055e8d25f8efdbba66c4261bbd322d335a3	2012-11-10 12:08:27 -08:00
Ronald S. Bultje	73987d140a	Remove 'thismb' data pointer when superblock experiment is on. This should prevent inconsistent results between identical encodes with the superblock experiment turned on. Change-Id: I41a005fae53f2eb59736cc70041185fb7d63cfca	2012-11-10 08:39:51 -08:00
Deb Mukherjee	d01357bbad	New b-intra mode where direction is contextual Preliminary patch on a new 4x4 intra mode B_CONTEXT_PRED where the dominant direction from the context is used to encode. Various decoder changes are needed to support decoding of B_CONTEXT_PRED in conjunction with hybrid transforms since the scan order and tokenization depends on the actual direction of prediction obtained from the context. Currently the traditional directional modes are used in conjunction with the B_CONTEXT_PRED, which also seems to provide the best results. The gains are small - in the 0.1% range. Change-Id: I5a7ea80b5218f42a9c0dfb42d3f79a68c7f0cdc2	2012-11-10 07:12:30 -08:00
Deb Mukherjee	3f7182cb0d	Build fix in decoder/decodframe.c Missing eobs argument in vp9_dequant_idct_add_16x16_c Change-Id: I826b1afa0a4ee6398f7373325aa0c75e6a866937	2012-11-09 12:48:35 -08:00
John Koleszar	3a0cfb3617	Merge "Packing Altref along with succeeding frame and length encoding frames" into experimental	2012-11-09 12:31:37 -08:00
Vignesh Venkatasubramanian	bc9670eee0	Packing Altref along with succeeding frame and length encoding frames The altref frame is packed along with the next P frame. So that outside of the codec there are now only two types of frames P and I. Also, now it is one frame in and one frame out with respect to the codec. Apart from that, all the frames are length encoded with the length of each frame appended to the frame itself. There are two categories of frames and each of them will look as follows: - Packed frames (an altref along with the succeeding p frame) - altref_frame_data | altref_length | frame_data | length - Unpacked frames (all frames other than the above) - frame_data | length Change-Id: If1eabf5c473f7d46b3f2d026bd30c803588c5330	2012-11-09 12:04:53 -08:00
Yunqing Wang	71b1885403	Merge "Optimize 16x16 dequant and idct" into experimental	2012-11-09 08:30:53 -08:00
Jim Bankoski	a186eb7f1b	Merge "remove macros obfuscating mv costing" into experimental	2012-11-08 15:51:21 -08:00
Jim Bankoski	c72be96b0a	remove macros obfuscating mv costing cleanup Change-Id: I565eee40d900e0441ad211b65ac829fc5b93d94a	2012-11-08 15:44:39 -08:00
Ronald S. Bultje	1d4fbeb32a	Implement tx_select for superblock encoding. Also split superblock handling code out of decode_macroblock() into a new function decode_superblock(), for easier readability. Derf +0.05%, HD +0.2%, STDHD +0.1%. We can likely get further gains by allowing to select mb_skip_coeff for a subset of the complete SB or something along those lines, because although this change allows coding smaller transforms for bigger predictors, it increases the overhead of coding EOBs to skip the parts where the residual is near-zero, and thus the overall gain is not as high as we'd expect. Change-Id: I552ce1286487267f504e3090b683e15515791efa	2012-11-08 11:03:00 -08:00
Yunqing Wang	6c17c9fae0	Optimize 16x16 dequant and idct As suggested by Yaowu, simplified 16x16 dequant and idct. In decoder, after detoken step, we know the number of non-zero dct coefficients (eobs) in a macroblock. Idct calculation can be skipped or simplified based on eobs, which improves the decoder performance. Change-Id: I9ffa1cb134bcb5a7d64fcf90c81871a96d1b4018	2012-11-07 20:04:09 -08:00
John Koleszar	8959c8b11d	Merge with upstream experimental changes (2) Include upstream changes (variance fixes) into the merged code base. Change-Id: I4182654c1411c1b15cd23235d3822702613abce1	2012-11-07 14:32:26 -08:00
James Zern	5338d983d6	Merge "Fix variance (signed integer) overflow" into experimental	2012-11-07 12:49:36 -08:00
John Koleszar	2c08c28191	Merge with upstream experimental changes Include upstream changes (unit test fixes, in particular) into the merged code base. Change-Id: I096f8a9d09e2532fbec0c95d7a995ab22fa54b29	2012-11-07 11:46:23 -08:00
John Koleszar	7b8dfcb5a2	Rough merge of master into experimental Creates a merge between the master and experimental branches. Fixes a number of conflicts in the build system to allow either VP8 or VP9 to be built. Specifically either: $ configure --disable-vp9 $ configure --disable-vp8 --disable-unit-tests VP9 still exports its symbols and files as VP8, so that will be resolved in the next commit. Unit tests are broken in VP9, but this isn't a new issue. They are fixed upstream on origin/experimental as of this writing, but rebasing this merge proved difficult, so will tackle that in a second merge commit. Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21	2012-11-07 11:30:16 -08:00
Yaowu Xu	0cedaa3631	merge full pixel refmv experiment Change-Id: Ib39ad47a7d188f3b45416937b7eeb28c3e79b74c	2012-11-07 10:52:45 -08:00
James Zern	984734436d	Fix variance (signed integer) overflow In the variance calculations the difference is summed and later squared. When the sum exceeds sqrt(2^31) the value is treated as a negative when it is shifted which gives incorrect results. To fix this we force the multiplication to be unsigned. The alternative fix is to shift sum down by 4 before multiplying. However that will reduce precision. For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and change). This change is based on: `1698234` Missed some variance casts `fea3556` Fix variance overflow Change-Id: I2c61856cca9db54b9b81de83b4505ea81a050a0f	2012-11-06 23:06:44 -08:00
Yaowu Xu	a879b4e6d4	fixed function prototype so they are consistent with actual definitions of the functions Change-Id: Ie4b4e81b3da3e288fc2edbbd2b393a5c54d2556b	2012-11-06 15:55:11 -08:00
Yaowu Xu	acadcec5c5	group refmv experiment related functions Change-Id: Iedaa108ddb65f54d768424f9c47ad4d069b656fd	2012-11-06 15:54:47 -08:00
James Zern	182f99f0c6	Merge "fix test builds" into experimental	2012-11-06 12:18:01 -08:00
James Zern	2e3e685799	fix test builds s/([vV][pP])8/$19/ additionally dct.h was removed; declare the _c functions that are used in the tests. the TODO for conversion to parameterized tests still remains. Change-Id: I73db9425a57075bbb78a92693ba6b320578981cd	2012-11-06 12:12:58 -08:00
John Koleszar	83b1d907da	vpx: merge with master Change-Id: I44b3ad780cef6f448fa17ff8e28fea87ef9cd518	2012-11-06 12:04:53 -08:00
Yunqing Wang	4626faf1e7	Convert 16x16 dct/idct to integer forms Converted vp9_short_fdct16x16_c and vp9_short_idct16x16_c to integer versions. Change-Id: Ie3ec985a890ac0f4f4f5818e6f0122e00c8af69f	2012-11-06 11:25:25 -08:00
James Zern	0078d2f3dc	vp9/encoder/bitstream.c: fix unused variable warnings Change-Id: Ibfac7e000509d2017eac9a108060e534a19fec33	2012-11-06 11:08:34 -08:00
Yaowu Xu	55f2f14f10	Merge "silent a lot of MSVC compiler warnings" into experimental	2012-11-06 09:39:47 -08:00
Yaowu Xu	8a336b0d0d	silent a lot of MSVC compiler warnings there are still a couple of types of warning left, which are related to double constants assigned to float type. As those would be addressed by the conversion of transforms into integer versions, this commit has left those un-dealt with. Change-Id: I48fd9b489c0c27ad6b543f4177423419f929f2bb	2012-11-06 09:09:25 -08:00
Jim Bankoski	8ce914f5fd	Merge "remove invoke_search macro" into experimental	2012-11-06 06:31:52 -08:00
James Zern	e47d9f1d07	rd_pick_inter_mode: prevent signed integer overflow calculate the txfm_cache difference first as both values may be INT64_MAX with the intent that they cancel each other out. Change-Id: I214d072458e1b24f60289974e6302af1aff7b66c	2012-11-05 17:14:32 -08:00
Jim Bankoski	7849aa20ed	remove invoke_search macro Removed invoke search from encoder Change-Id: I3d809b795abe6df0e71366edfe94026aaede14fb	2012-11-05 16:58:03 -08:00
James Zern	f2541f8a4a	rdopt: fix use of uninitialized value in addition rd_pick_intra4x4mby_modes / rd_pick_intra8x8mby_modes would both use the input value of 'rate_y' in the return calculation. In many places this value is uninitialized. Remove the unneeded sum. Change-Id: Icbd3df685303000301e69291c0ebc06f74bd548d	2012-11-05 12:50:16 -08:00
Ronald S. Bultje	849c9540d5	Merge "Don't generate residual 3x when doing a macroblock luma RD estimate." into experimental	2012-11-05 06:21:03 -08:00
James Zern	ee38c4184b	loopfilter: prevent signed integer overflow use unsigned ints to extended filter values in vp9_mbloop_filter_horizontal_edge_c_sse2 Change-Id: I55ec3ac2bcb9baf55626b0384d151b07fc8e087d	2012-11-03 09:45:21 -07:00
Yunqing Wang	28826a913c	Merge "Fix eobs data type" into experimental	2012-11-02 16:00:56 -07:00
Yunqing Wang	d41b0e6498	Fix eobs data type The block sizes for decoding tokens are up to 16x16, which means eobs is within [0, 256]. Using (signed) char is not enough. Changed eobs data type to unsigned short to fix the problem. Change-Id: I88a7d3098e1f1604c336d6adb88ffec971fb03a6	2012-11-02 13:22:29 -07:00
Ronald S. Bultje	6cd2541379	Don't generate residual 3x when doing a macroblock luma RD estimate. Change-Id: Ia601e96fcb4fc547884b6ab894f9f2ad22a98078	2012-11-02 11:46:57 -07:00
Ronald S. Bultje	3c4f47e843	Place non-static function prototypes in a header file. Change-Id: I7cd21b9f1e69f4e0b3338bfe27b3c67e4b47de58	2012-11-02 11:22:57 -07:00
John Koleszar	06f3e51da6	vpx_scale: sync from master Update vpx_scale from current code in master, run style transform, fix lint warnings. Change-Id: I47eadeb5b6881d448ea3728537f9b8a5b5aac78e	2012-11-02 08:44:54 -07:00
Ronald S. Bultje	4b2c2b9aa4	Rename vp8/ codec directory to vp9/. Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4	2012-11-01 16:31:22 -07:00
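
Note on 984734436d ("Fix variance (signed integer) overflow"): the arithmetic in that message can be sketched in C as below. This is an illustrative sketch only, not the libvpx code; the function name `variance16x16_sketch` and its parameters are hypothetical. It assumes a 16x16 block of 8-bit samples, so the sum of differences has magnitude at most 255 * 256 = 65280, whose square exceeds 2^31 but still fits in 32 unsigned bits.

```c
#include <stdint.h>

/* Hypothetical helper (not from libvpx): variance of the difference between
 * two 16x16 blocks of 8-bit samples, computed as sse - sum^2 / 256. */
static uint32_t variance16x16_sketch(const uint8_t *src, int src_stride,
                                     const uint8_t *ref, int ref_stride) {
  int sum = 0;      /* sum of differences, in [-65280, 65280] */
  uint32_t sse = 0; /* sum of squared differences, at most 256 * 255^2 */
  int r, c;

  for (r = 0; r < 16; ++r) {
    for (c = 0; c < 16; ++c) {
      const int diff = src[c] - ref[c];
      sum += diff;
      sse += (uint32_t)(diff * diff);
    }
    src += src_stride;
    ref += ref_stride;
  }

  /* A signed sum * sum overflows once |sum| exceeds sqrt(2^31) (about 46340).
   * Forcing the multiplication to be unsigned keeps the product exact,
   * because 65280^2 = 4261478400 is below 2^32. */
  return sse - (((uint32_t)sum * (uint32_t)sum) >> 8);
}
```

With every source sample at 255 and every reference sample at 0, the sum reaches 65280 and the unsigned product still yields a variance of 0, as expected for a constant difference.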

2723 Commits