generic-library/vpx

Author	SHA1	Message	Date
John Koleszar	d48ea5a2ab	Merge "Remove legacy integer types"	2011-12-22 13:00:23 -08:00
John Koleszar	f56918ba9c	Remove legacy integer types Remove BOOL, INTn, UINTn, etc, in favor of C99-style fixed width types. Change-Id: I396636212fb5edd6b347d43cc940186d8cd1e7b5	2011-12-22 09:58:40 -08:00
John Koleszar	a2407935d2	Merge "Remove opaque pointer VP8D_PTR"	2011-12-22 09:36:11 -08:00
John Koleszar	0c2f8e77cc	Remove useless g_common.h This file declared a bunch of nonexistent, unreferenced global function pointers. Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728	2011-12-21 15:02:23 -08:00
John Koleszar	bf1a8073c3	Remove opaque pointer VP8D_PTR Use an opaque struct rather than typecasting through VP8D_PTR, an int*. Change-Id: Ia260b7d53d7e0950cfa1e00f4ecead1099bd3b87	2011-12-21 14:48:10 -08:00
James Zern	b651875e24	squash some signed/unsigned comparison warnings Change-Id: Ifc64cf990ae04d77934da3324d0afb3993f061e7	2011-12-21 13:49:19 -08:00
Scott LaVarnway	0ccefd2c8f	Fixed mb_skip_coeff bug When mb_skip_coeff is set, the idct is not necessary. Prior to this patch, the code would call idcts based on leftover eob information. This patch will now skip the idct for SPLIT_MV and clear out the eobs for B_PRED, forcing the idct to be skipped. Change-Id: If5b0d2ed3ebd07789d30ec5160df927485fcaa17	2011-12-16 13:48:01 -05:00
Scott LaVarnway	a53d5a4c44	Moved dequant idct into common These functions are now used by the encoder. This is WIP with the goal of creating a common idct/add for the encoder and decoder. A boost of 1.8% was seen for the HD rt test clip used. [Tero] Added needed changes to ARM side. Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf	2011-12-15 14:23:41 -05:00
Scott LaVarnway	34d7c8b3d4	Added vp8_dequant_idct_add_y_block_sse2 setup In Change I83202ffd, I deleted one too many lines. Change-Id: If05d7c8988eb5c00898dc7c833ad7d99b5eb23e7	2011-11-28 13:06:13 -05:00
Scott LaVarnway	f46e17fd6f	Merge "Modified the inverse walsh to output directly"	2011-11-28 07:26:07 -08:00
Scott LaVarnway	4a91541c94	Modified the inverse walsh to output directly to the dqcoeff or qcoeff buffer. The encoder would populate the dc coeffs of the y blocks as a separate stage (recon_dcblock) and the decoder would use a special version of the idct. This change eliminates the extra copy and reduces the code footprint. [Tero] Added needed changes to armv6 and NEON assembly. Change-Id: I83202ffdbaf83f6e5dd69f4ba2519fcf0b13b3ba	2011-11-25 09:24:04 +02:00
Stefan Holmer	b5ee7b12d2	Decoder fixes to better support reference picture selection. Change-Id: Id3388985d754706b9fd1f079c47121e79a63efdf	2011-11-21 10:25:21 +01:00
Tero Rintaluoma	5a2fd63a2a	ARMv6 optimized Intra4x4 prediction Added ARM optimized intra 4x4 prediction - 2x faster on Profiler compared to C-code compiled with -O3 - Function interface changed a little to improve BLOCKD structure access Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54	2011-11-09 09:13:51 +02:00
Stefan Holmer	1427205215	Changing decoder input partition API to input fragments. Adding support for several partitions within one input fragment. This is necessary to fully support all possible packetization combinations in the VP8 RTP profile. Several partitions can be transmitted in the same packet, and they can only be split by reading the partition lengths from the bitstream. Change-Id: If7d7ea331cc78cb7efd74c4a976b720c9a655463	2011-11-01 14:44:37 -07:00
John Koleszar	9bf3bc9a72	Correct SPLITMV clamping Prior to this fix, the clamping state of the last subblock partition dominated, whereas the correct behavior is to clamp if any partition needs clamping. This bug was introduced by v0.9.6-232-g6b25501 See also: [1]: http://code.google.com/p/webm/issues/detail?id=371 [2]: https://bugzilla.mozilla.org/show_bug.cgi?id=696390 Change-Id: I444db492b4c4f05f039c7da6f4216da8207dc138	2011-10-31 14:42:51 -07:00
Scott LaVarnway	6064384d59	Improved decode_split_mv() Tests showed ~1.2% performance boost on the HD clip used. Performance will vary based on material. Change-Id: Icbcf1a828750d5b4ae5252bf596b3ef594042e8a	2011-10-27 11:26:30 -04:00
Scott LaVarnway	efa69d26a1	Merge "Improved read_mb_modes_mv()"	2011-10-26 08:26:30 -07:00
Scott LaVarnway	ff1d170e69	Improved read_mb_modes_mv() Interleaved vp8_find_near_mvs and vp8_mv_ref_probs. 2.5% to 4% performance improvement for the HD clips used. Change-Id: Id888b667cf5ae2f0e19da18743140f055ff7de8d	2011-10-26 10:46:36 -04:00
Johann	f9dba66877	Merge "remove uninitialized variable warning"	2011-10-25 14:42:21 -07:00
Scott LaVarnway	e03330bd80	Merge "Improved token decoder"	2011-10-25 10:04:11 -07:00
Johann	9409af2083	remove uninitialized variable warning Restructure if statement to clarify the error condition. Trigger the error before clobbering pc-> variables. Change-Id: Id01cab798a341ce9899078fdcec265a0e942a0b7	2011-10-25 09:24:40 -07:00
Scott LaVarnway	49ea2bc3f4	Removed read_mv_ref Decode the mv mode with if-then-elses instead of traversing the vp8_mv_ref_tree data structure. This will make it easier to interleave vp8_find_near_mvs and vp8_mv_ref_probs. Change-Id: I1e798d6ec40fcaeeff06ccc82f81201978d12f74	2011-10-24 16:16:08 -04:00
Scott LaVarnway	f182376dd6	Moved the split motion vector decode into a function. Change-Id: Ia023a0587100a52cb084f5d9d5512efa6198dad3	2011-10-24 13:52:15 -04:00
Scott LaVarnway	231339932b	Merge "Removed redundant mv clamps for nearmv and nearestmv"	2011-10-24 10:27:53 -07:00
Scott LaVarnway	a99c20c0f4	Removed redundant mv clamps for nearmv and nearestmv Did some cleanup as well. Patchset 2: Fixed bug. Will revisit the segmentation logic. Change-Id: Idf9fbcff9aaf467bdace9fbd58ef2cea6c602049	2011-10-24 11:37:52 -04:00
Tero Rintaluoma	bdb4fb8991	Remove unused DETOK structure DETOK structure is not used anymore. Change-Id: Id22e1af78fb85d4bb151237a60290d9364faf217	2011-10-21 09:33:49 +03:00
Scott LaVarnway	5e54085703	Improved token decoder Tests showed over 2% improvement on various HD clips. Change-Id: I94a30d209c92cbd5fef285122f9fc570688635fe	2011-10-19 13:38:35 -04:00
Scott LaVarnway	ed9c66f584	Remove usage of predict buffer for decode Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6	2011-10-18 12:06:50 -04:00
John Koleszar	6f9457ec12	Merge "clamp_mvs() using the wrong motion vector information"	2011-09-22 11:54:15 -07:00
Attila Nagy	1a7d25a484	Replace vpx_ports/config.h with vpx_config.h Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a	2011-09-22 13:33:54 +03:00
Stefan Holmer	e529a825f7	Fix necessary for input partitions iface to match the RTP profile These changes fix a glitch between the RTP profile and the input partitions interface. Since there's no way for the user to know the actual number of partitions, the decoder has to read the multi_token_partition bits also when input partitions mode is enabled. Included are also a couple of fixes for issues with independent partitions and uninitialized memory reads. Change-Id: I6f93b15287d291169ed681898ed3fbcc5dc81837	2011-09-19 15:00:21 +02:00
John Koleszar	35ce4eb01d	Merge "Fixes the boundary checks for extrapolated and interpolated MVs."	2011-09-16 08:09:44 -07:00
Scott LaVarnway	c0ee870b0a	clamp_mvs() using the wrong motion vector information In the "Removed bmi copy to/from BLOCKD" commit, the copy to the bmi in BLOCKD was eliminated. The clamp_mvs() used the bmi in BLOCKD, which now contains incorrect values. This patch fixes this problem. Change-Id: I8eca1eaf4015052b0b63e90876f7ad321aba7cff	2011-09-16 11:03:53 -04:00
Stefan Holmer	b854bbd844	Fixes the boundary checks for extrapolated and interpolated MVs. Change-Id: I5b47d39d1604f2650d2f2d1ca2a3f40843c8e1ea	2011-09-16 11:58:57 +02:00
Scott LaVarnway	222c72e50f	Merge "Removed bmi copy to/from BLOCKD"	2011-08-31 06:57:20 -07:00
Scott LaVarnway	b870947d42	Removed bmi copy to/from BLOCKD for SPLITMV and B_PRED modes. Modified code to use the bmi found in mode_info_context instead of BLOCKD. On the decode side, the uvmvs are calculated only when required, instead of every macroblock. This is WIP. (bmi should eventually be removed from BLOCKD) Small performance gains noticed for RT encodes and decodes.(VGA) Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7	2011-08-24 14:42:26 -04:00
Fritz Koenig	112bd4e2b4	Fix naming of sse2 idct functions. Prepend idct function names with vp8_ so that under profiling they show up associated with libvpx. Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4	2011-08-24 10:25:32 -07:00
Stefan Holmer	99d870a472	Don't set the bmi mode when doing error concealment Since the block will be interpreted as an inter block, the mode will be interpreted as a motion vector, resulting in bad concealment. Change-Id: Ifcc685ae1cc883492bce6dbd61e418d91a89b053	2011-08-15 11:46:04 -04:00
John Koleszar	a4c2211ea3	Propagate macroblock MV to subblocks for error concealment EC expects the subblock MVs to be populated, but `f1d6cc79e4` removed this code. This commit restores it, protected by CONFIG_ERROR_CONCEALMENT. May move this to the EC code more directly in the future. Change-Id: I44f8f985720cb9a1bf222e59143f9e69abf56ad2	2011-08-12 14:49:35 -04:00
Stefan Holmer	a609be5633	Disable error concealment until first key frame is decoded When error concealment is enabled the first key frame must be successfully received before error concealment is activated. Error concealment will be activated when the following delta frame is received. Also fixed a couple of bugs related to error tracking in multi-threading, and avoided decoding corrupt residual when we have multiple non-resilient partitions. Change-Id: I45c4bb296e2f05f57624aef500a874faf431a60d	2011-08-12 14:49:34 -04:00
John Koleszar	cdae03a4eb	Fix potential OOB read with Error Concealment This patch fixes an OOB read when error concealment is enabled and the partition sizes are corrupt. The partition size read from the bitstream was not being validated in EC mode. Change-Id: Ia81dfd4bce1ab29ee78e42320abe52cee8318974	2011-08-12 14:49:34 -04:00
John Koleszar	06c3d5bb9a	Fix building with --disable-postproc Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9	2011-08-01 17:50:23 -04:00
John Koleszar	db8f0d2ca9	Merge "cosmetics: consistently use [u]int64_t"	2011-07-26 12:57:43 -07:00
James Zern	b45065d38b	cosmetics: consistently use [u]int64_t Removes mixed usage of (unsigned) long long and INT64. Fixes Issue #208. Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a	2011-07-26 11:34:36 -07:00
Scott LaVarnway	a11624497c	"Eliminated TOKENEXTRABITS" broke the windows build. Fixed. Change-Id: I3348e8dbcaee6ace263af413701101d77636e5df	2011-07-26 09:33:16 -04:00
Scott LaVarnway	76eb402668	Eliminated TOKENEXTRABITS Noticed small performance gains, depending on material. Change-Id: I334369f6312bc19aa73481fc3f790ab181e11867	2011-07-25 17:11:24 -04:00
Johann	a04ed0e8f3	fix sharpness bug and clean up sharpness was not recalculated in vp8cx_pick_filter_level_fast remove last_filter_type. all values are calculated, don't need to update the lfi data when it changes. always use cm->sharpness_level. the extra indirection was annoying. don't track last frame_type or sharpness_level manually. frame type only matters for motion search and sharpness_level is taken care of in frame_init move function declarations to their proper header Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db	2011-07-22 12:33:57 -04:00
Scott LaVarnway	b2d9700f53	Merge "Moved vp8_encode_bool into boolhuff.h"	2011-07-19 08:15:14 -07:00
Johann	6afafc313c	remove old armv5 code armv5 dequantizer is not referenced Change-Id: Id1cc617dcee35ebd6a406816ec6aaa26e8bbc8ad	2011-07-19 09:20:38 -04:00
Scott LaVarnway	a25f6a9c88	Moved vp8_encode_bool into boolhuff.h allowing the compiler to inline this function. For real-time encodes, this gave a boost of 1% to 2.5%, depending on the speed setting. Change-Id: I3929d176cca086b4261267b848419d5bcff21c02	2011-07-19 09:17:25 -04:00
Yunqing Wang	f1f28535c3	Merge "Fix unnecessary casting of B_PREDICTION_MODE (issue 349)"	2011-07-13 13:32:57 -07:00
Yunqing Wang	139577f937	Fix unnecessary casting of B_PREDICTION_MODE (issue 349) Minor fix. Change-Id: Iaf93f6e47e882a33c479e57c7a0d0bf321e291c0	2011-07-13 15:52:07 -04:00
Johann	d9b825cff2	Merge "New loop filter interface"	2011-07-13 04:09:26 -07:00
Attila Nagy	622958449b	New loop filter interface Separate simple filter with reduced no. of parameters. MB filter level picking based on precalculated table. Level table updated for each frame. Inside and edge limits precalculated and updated just when sharpness changes. HEV threshold is constant. ARM targets use scalars and others vectors. Change works only with --target=generic-gnu All other targets have to be updated! Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c	2011-07-08 09:31:41 +03:00
Johann	6611f66978	clean up warnings when building arm with rtcd Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df	2011-06-29 10:51:41 -04:00
John Koleszar	f3a13cb236	Merge "Use MAX_ENTROPY_TOKENS and ENTROPY_NODES more consistently"	2011-06-29 07:29:59 -07:00
Johann	dc004e8c17	Merge "Avoid text relocations in ARM vp8 decoder"	2011-06-28 16:34:10 -07:00
John Koleszar	b32da7c3da	Use MAX_ENTROPY_TOKENS and ENTROPY_NODES more consistently There were many instances in the code of vp8_coef_tokens and vp8_coef_tokens-1, which was a preprocessor macro despite the naming convention. Replace these with MAX_ENTROPY_TOKENS and ENTROPY_NODES, respectively. Change-Id: I72c4f6c7634c94e1fa066cd511471e5592c748da	2011-06-28 17:03:55 -04:00
John Koleszar	9bcf07ae4a	Merge "Simplify decode_macroblock."	2011-06-28 12:54:25 -07:00
Gaute Strokkenes	81c0546407	Simplify decode_macroblock. Change-Id: Ieb2f3827ae7896ae594203b702b3e8fa8fb63d37	2011-06-28 17:01:14 +01:00
Stefan Holmer	7296b3f922	New ways of passing encoded data between encoder and decoder. With this commit frames can be received partition-by-partition from the encoder and passed partition-by-partition to the decoder. At the encoder-side this makes it easier to split encoded frames at partition boundaries, useful when packetizing frames. When VPX_CODEC_USE_OUTPUT_PARTITION is enabled, several VPX_CODEC_CX_FRAME_PKT packets will be returned from vpx_codec_get_cx_data(), containing one partition each. The partition_id (starting at 0) specifies the decoding order of the partitions. All partitions but the last have the VPX_FRAME_IS_FRAGMENT flag set. At the decoder this opens up the possibility of decoding partition N even though partition N-1 was lost (given that independent partitioning has been enabled in the encoder) if more info about the missing parts of the stream is available through external signaling. Each partition is passed to the decoder through the vpx_codec_decode() function, with the data pointer pointing to the start of the partition, and with data_sz equal to the size of the partition. Missing partitions can be signaled to the decoder by setting data != NULL and data_sz = 0. When all partitions have been given to the decoder "end of data" should be signaled by calling vpx_codec_decode() with data = NULL and data_sz = 0. The first partition is the first partition according to the VP8 bitstream + the uncompressed data chunk + DCT address offsets if multiple residual partitions are used. Change-Id: I5bc0682b9e4112e0db77904755c694c3c7ac6e74	2011-06-28 11:10:17 -04:00
Mike Hommey	e3f850ee05	Avoid text relocations in ARM vp8 decoder The current code stores pointers to coefficient tables and loads them to access the tables contents. As these pointers are stored in the code sections, it means we end up with text relocations. eu-findtextrel will thus complain about code not compiled with -fpic/-fPIC. Since the pointers are stored in the code sections, we can actually cheat and let the assembler generate relative addressing when accessing the coefficient tables, and just load their location with adr. Change-Id: Ib74ae2d3f2bab80b29991355f2dbe6955f38f6ae	2011-06-28 09:11:40 +02:00
Stefan Holmer	ba0822ba96	Adding support for error concealment in multi-threaded decoding Also includes a couple of error concealment bug fixes: - the segment_id wasn't properly initialized when missing - when interpolating and no neighbors are found, set to zero - clear the qcoef buffer when concealing an MB Change-Id: Id79c876b41d78b559a2241e9cd0fd2cae6198f49	2011-06-27 09:03:06 -04:00
James Berry	2bd90c13a0	get/set reference buffer dimension check added vp8_yv12_copy_frame_ptr() expects same size buffers, which was not previously guaranteed. Using an improperly allocated buffer would cause a crash before. Change-Id: I904982313ce9352474f80de842013dcd89f48685	2011-06-22 13:36:24 -04:00
Scott LaVarnway	67a1f98c2c	Improved vp8dx_decode_bool Relocated the vp8dx_bool_decoder_fill() call, allowing the compiler to produce better assembly code. Tests showed a 1 - 2 % performance boost (x86 using gcc) for the 720p clip used. Change-Id: Ic5a4eefed8777e6eefa007d4f12dfc7e64482732	2011-06-20 14:44:16 -04:00
John Koleszar	5223016337	Merge "Remove redundant check for KEY_FRAME in multithreaded decoder"	2011-06-15 10:18:06 -07:00
John Koleszar	1ade44b352	Merge "fix --disable-runtime-cpu-detect on x86"	2011-06-15 07:09:09 -07:00
Attila Nagy	c7e6aabbca	Remove redundant check for KEY_FRAME in multithreaded decoder For intra blocks it is enough to check ref_frame == INTRA_FRAME. Change-Id: I3e2d3064c7642658a9e14011a4627de58878e366	2011-06-15 09:01:27 +03:00
Scott LaVarnway	7be5b6dae4	Merge "Populate bmi for B_PRED only"	2011-06-14 12:04:50 -07:00
Johann	92b0e544f3	fix --disable-runtime-cpu-detect on x86 Change-Id: Ib8e429152c9a8b6032be22b5faac802aa8224caa	2011-06-14 11:31:50 -04:00
James Zern	532c30c83e	fix corrupt frame leak If setup_token_decoder reported an internal error the memory allocated there would not be freed in the resulting call to _remove_decompressor. Change-Id: Ib459de222d76b1910d6f449cdcd01663447dbdf6	2011-06-13 17:32:19 -07:00
Scott LaVarnway	223d1b54cf	Populate bmi for B_PRED only Small decode performance gain (~1%) on keyframes. No noticeable gains on encode. Also changed pick_intra4x4mby_modes() to read the above and left block modes for keyframes only. Change-Id: I1f4885252f5b3e9caf04d4e01e643960f910aba5	2011-06-13 17:14:11 -04:00
Johann	79327be6c7	use GCC inline magic Better fix for #326. ICC happens to support the inline magic Change-Id: Ic367eea608c88d89475cb7b05d73500d2a1bc42b	2011-06-08 16:19:37 -04:00
Scott LaVarnway	1374a4db3b	Removed unused function vp8_treed_read_num Change-Id: Id66e70540ee7345876f099139887c1843093907f	2011-06-07 09:32:51 -04:00
Scott LaVarnway	f1d6cc79e4	Removed unnecessary bmi motion vector stores. left_block_mv and above_block_mv will return the MB motion vector for non SPLITMV macro blocks. Change-Id: I58dbd7833b4fdcd44b6b72e98ec732c93c2ce4f4	2011-06-03 13:09:46 -04:00
Scott LaVarnway	8c5b73de2a	Merge "Removed B_MODE_INFO"	2011-06-03 08:32:30 -07:00
Scott LaVarnway	773768ae27	Removed B_MODE_INFO Declared the bmi in BLOCKD as a union instead of B_MODE_INFO. Then removed B_MODE_INFO completely. Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67	2011-06-02 13:46:41 -04:00
John Koleszar	4101b5c5ed	Merge "Bugfix in vp8dx_set_reference"	2011-06-01 13:57:23 -07:00
Henrik Lundin	69ba6bd142	Bugfix in vp8dx_set_reference The fb_idx_ref_cnt book-keeping was in error. Added an assert to prevent future errors in the reference count vector. Also fixed a pointer syntax error. Change-Id: I563081090c78702d82199e407df4ecc93da6f349	2011-06-01 21:41:12 +02:00
Scott LaVarnway	4f586f7bd0	Broken EC after MODE_INFO size reduction This patch fixes the compiler errors and the seg fault when running decode_with_partial_drops. Change-Id: I7c75369e2fef81d53b790d5dabc327218216838b	2011-05-26 15:13:00 -04:00
Scott LaVarnway	a39321f37e	Use int_mv instead of MV in vp8_mv_cont Fewer operations. Change-Id: Ibb9cd5ae66b8c7c681c9a654d551c8729c31c3ae	2011-05-24 16:01:12 -04:00
Scott LaVarnway	cfab2caee1	Removed unused variable warnings Change-Id: I6e5e921f03dc15a72da89a457848d519647677a3	2011-05-24 15:17:03 -04:00
Scott LaVarnway	b5278f38b0	Merge "MODE_INFO size reduction"	2011-05-24 12:08:24 -07:00
Scott LaVarnway	e11f21af9a	MODE_INFO size reduction Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO. This reduced the memory footprint by 518,400 bytes for 1080 resolutions. The decoder performance improved by ~4% for the clip used and the encoder showed very small improvements. (0.5%) This reduction was first mentioned to me by John K. and in a later discussion by Yaowu. This is WIP. Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29	2011-05-24 13:24:52 -04:00
Henrik Lundin	a126cd1760	Fixing bug in VP8_SET_REFERENCE decoder control command In vp8dx_set_reference, the new reference image is written to an unused reference frame buffer. Change-Id: I9e4f2cef5a011094bb7ce7b2719cbfe096a773e8	2011-05-24 09:03:43 +02:00
Stefan Holmer	d04f852368	Adding error-concealment to the decoder. The error-concealer is plugged in after any motion vectors have been decoded. It tries to estimate any missing motion vectors from the motion vectors of the previous frame. Intra blocks with missing residual are replaced with inter blocks with estimated motion vectors. This feature was developed in a separate sandbox (sandbox/holmer/error-concealment). Change-Id: I5c8917b031078d79dbafd90f6006680e84a23412	2011-05-19 13:46:33 -04:00
Scott LaVarnway	6b25501bf1	Using int_mv instead of MV The compiler produces better assembly when using int_mv for assignments. The compiler shifts and ors the two 16bit values when assigning MV. Change-Id: I52ce4bc2bfbfaf3f1151204b2f21e1e0654f960f	2011-05-12 11:08:16 -04:00
Johann	a7d4d3c550	clean up unused variable warnings Change-Id: I9467d7a50eac32d8e8f3a2f26db818e47c93c94b	2011-05-09 12:56:20 -04:00
Yunqing Wang	aeb86d615c	Merge "Runtime detection of available processor cores."	2011-05-05 04:59:54 -07:00
Scott LaVarnway	ccd6f7ed77	Consolidated build inter predictors Code cleanup. Change-Id: Ic8b0167851116c64ddf08e8a3d302fb09ab61146	2011-04-28 10:53:59 -04:00
John Koleszar	085fb4b737	Merge "SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}()."	2011-04-27 12:02:55 -07:00
Ronald S. Bultje	1083fe4999	SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}(). decoding before 10.425 10.432 10.423 =10.426 after: 10.405 10.416 10.398 =10.406, 0.2% faster encoding before 14.252 14.331 14.250 14.223 14.241 14.220 14.221 =14.248 after 14.095 14.090 14.085 14.095 14.064 14.081 14.089 =14.086, 1.1% faster Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2	2011-04-27 11:31:27 -07:00
John Koleszar	64355ecad3	Merge "Speed up VP8DX_BOOL_DECODER_FILL"	2011-04-27 09:03:45 -07:00
John Koleszar	f8ffecb176	Merge "Update VP8DX_BOOL_DECODER_FILL to better detect EOS"	2011-04-27 09:03:24 -07:00
John Koleszar	5e1fd41357	Speed up VP8DX_BOOL_DECODER_FILL The end-of-buffer check is hoisted out of the inner loop. Gives about 0.5% improvement on x86_64. Change-Id: I8e3ed08af7d33468c5c749af36c2dfa19677f971	2011-04-27 10:25:03 -04:00
John Koleszar	9594370e0c	Update VP8DX_BOOL_DECODER_FILL to better detect EOS Allow more reliable detection of truncated bitstreams by being more precise with the count of "virtual" bits in the value buffer. Specifically, the VP8_LOTS_OF_BITS value is accumulated into count, rather than being assigned, which was losing the prior value, increasing the required tolerance when testing for the error condition. Change-Id: Ib5172eaa57323b939c439fff8a8ab5fa38da9b69	2011-04-27 10:24:39 -04:00
Scott LaVarnway	0da77a840b	Merge "Test vector mismatch fix"	2011-04-26 10:12:37 -07:00
Scott LaVarnway	7a2b9c50a3	Test vector mismatch fix Fixed test vector mismatch that was introduced in the "Removed dc_diff from MB_MODE_INFO" (Ie2b9cdf9e0f4e8b932bbd36e0878c05bffd28931) Change-Id: I98fa509b418e757b5cdc4baa71202f4168dc14ec	2011-04-26 09:37:19 -04:00
Johann	01527e743f	remove simpler_lpf the decision to run the regular or simple loopfilter is made outside the function and managed with pointers stop tracking the option in two places. use filter_type exclusively Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15	2011-04-25 17:37:41 -04:00
Scott LaVarnway	6f6cd3abb9	Removed unnecessary frame type checks ref_frame is set to INTRA_FRAME for keyframes. The B_PRED mode is only used in intra frames. Change-Id: I9bac8bec7c736300d47994f3cb570329edf11ec0	2011-04-21 14:59:42 -04:00
Scott LaVarnway	3698c1f620	Removed dc_diff from MB_MODE_INFO The dc_diff flag is used to skip loopfiltering. Instead of setting this flag in the decoder/encoder, we now check for this condition in the loopfilter. Change-Id: Ie2b9cdf9e0f4e8b932bbd36e0878c05bffd28931	2011-04-21 14:38:36 -04:00
Scott LaVarnway	09c933ea80	Removed redundant checks of the mode_info_context flags Code cleanup. The build inter predictor functions are redundantly checking the mode_info_context for either INTRA_FRAME or SPLITMV. Change-Id: I4d58c3a5192a4c2cec5c24ab1caf608bf13aebfb	2011-04-20 14:06:40 -04:00
Scott LaVarnway	e1a8b6c8d5	Removed unused timers Change-Id: I209803b9dbed2b2f6d02258fd7a3963a6645f4ab	2011-04-18 09:09:57 -04:00
Johann	f64f425a50	remove executable bit source files are not executable Change-Id: Id2c7294695a22217468426423979f68f02d82340	2011-04-15 13:43:24 -04:00
Gaute Strokkenes	15f03c2f13	Slightly simplify vp8_decode_mb_tokens. Change-Id: I0058ba7dcfc50a3374b712197639ac337f8726be	2011-04-04 16:47:22 +01:00
Attila Nagy	297b27655e	Runtime detection of available processor cores. Detect the number of available cores and limit the thread allocation accordingly. On decoder side limit the number of threads to the max number of token partitions. Core detection works on Windows and Posix platforms, which define _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN. Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078	2011-03-31 10:23:01 +03:00
John Koleszar	769c74c0ac	Merge "Increase static linkage, remove unused functions"	2011-03-21 04:51:51 -07:00
John Koleszar	429dc676b1	Increase static linkage, remove unused functions A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779	2011-03-17 20:53:47 -04:00
John Koleszar	de50520a8c	apple: include proper mach primitives Fixes implicit declaration warning for 'mach_task_self'. This change is an update to Change I9991dedd1ccfddc092eca86705ecbc3b764b799d, which fixed this issue for the decoder but not the encoder. Change-Id: I9df033e81f9520c4f975b7a7cf6c643d12e87c96	2011-03-16 13:59:32 -04:00
John Koleszar	8c48c943e7	Merge "Fix an unused variable warning."	2011-03-14 14:13:53 -07:00
John Koleszar	27972d2c1d	Move build_intra_predictors_mby to RTCD framework The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534	2011-03-11 13:04:50 -05:00
Ralph Giles	56efffdcd1	Fix an unused variable warning. Move the update of the loopfilter info to the same block where it is used. GCC 4.5 is not able to trace the initialization of the local filter_info across the other calls between the two conditionals on pbi->common and issues an uninitialized variable warning. Change-Id: Ie4487b3714a096b3fb21608f6b0c74e745e3c6fc	2011-03-08 14:56:15 -08:00
John Koleszar	c764c2a20f	Merge "clean up unused files"	2011-02-18 06:33:05 -08:00
John Koleszar	3ed8fe8778	remove unused vp8_predict_dc function Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42	2011-02-18 09:12:20 -05:00
John Koleszar	cbf923b12c	clean up unused files Removed a number of files that were unused or little-used. Change-Id: If9ae5e5b11390077581a9a879e8a0defe709f5da	2011-02-18 09:09:49 -05:00
John Koleszar	c351aa7f1b	Merge "Fix relative include paths"	2011-02-17 04:13:44 -08:00
James Zern	0030303b69	Remove redundant ptr checks in calls to vpx_free vpx_free if used contains this check. If replaced, well behaved free will behave similarly. Change-Id: I25483aaa8b39255b9a8cf388d6e5eaa20a908ae1	2011-02-15 12:43:35 -08:00
Johann	bb6bcbccda	remove assembly detokenizer hasn't been kept up to date. remove it to avoid confusion. Change-Id: I52ffde19b59fec5c7a381299ca2e85cb38330be7	2011-02-11 11:09:00 -05:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
John Koleszar	a39b5af10b	Merge "Put more code under #if CONFIG_MULTITHREAD."	2011-02-09 08:31:36 -08:00
Gaute Strokkenes	315e3c2518	Put more code under #if CONFIG_MULTITHREAD. Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b	2011-02-09 11:21:18 -05:00
Johann	40dcae9c2e	clarify _offsets.asm differences it's difficult to mux the _offsets.c files because of header conflicts. make three instead, name them consistently and partition the contents to allow building them as required. Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796	2011-02-08 16:35:43 -05:00
Johann	3273c7b679	move one of the offset files common/arm/vpx_asm_offsets moves up a level. prepare for muxing with encoder/arm/vpx_vp8_enc_asm_offsets Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24	2011-02-07 11:35:30 -05:00
Johann	bb9c95ea53	remove unused dboolhuff code we were holding on to this "just in case." purge it instead Change-Id: I77a367b36d0821d731019f2566ecfffdae1d4b8a	2011-02-04 16:00:00 -05:00
Gaute Strokkenes	bf5f585b0d	Make vp8_adjust_mb_lf_value return the updated value rather than manipulating it in situ via a pointer. Change-Id: If4a87a4eccd84f39577c0e91e171245f4954c5cf	2011-02-03 19:24:16 +00:00
Tero Rintaluoma	11a222f5d9	Adds "armvX-none-rvct" targets Adds following targets to configure script to support RVCT compilation without operating system support (for Profiler or bare metal images). - armv5te-none-rvct - armv6-none-rvct - armv7-none-rvct To strip OS specific parts from the code "os_support"-config was added to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS specific parts such as OS specific includes and function calls for timers and threads etc. This was done to enable RVCT compilation for profiling purposes or running the image on bare metal target with Lauterbach. Removed separate AREA directives for READONLY data in armv6 and neon assembly files to fix the RVCT compilation. Otherwise "ldr <reg>, =label" syntax would have been needed to prevent linker errors. This syntax is not supported by older gnu assemblers. Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43	2011-01-28 12:47:39 +02:00
John Koleszar	2f0331c90c	Merge "Implement error tracking in the decoder"	2011-01-19 05:51:00 -08:00
Henrik Lundin	67fb3a5155	Implement error tracking in the decoder A new vpx_codec_control called VP8D_GET_FRAME_CORRUPTED. The output from the function is non-zero if the last decoded frame contains corruption due to packet losses. The decoder is also modified to accept encoded frames of zero length. A zero length frame indicates to the decoder that one or more frames have been completely lost. This will mark the last decoded reference buffer as corrupted. The data pointer can be NULL if the length is zero. Change-Id: Ic5902c785a281c6e05329deea958554b7a6c75ce	2011-01-19 09:53:21 +01:00
John Koleszar	f97f2b1bb6	Merge "fix last frame buffer copy logic regression"	2011-01-18 12:54:57 -08:00
Henrik Lundin	48c28fc42c	Remove unused local variables Removing unused local variables causing compiler warnings in Visual Studio. Change-Id: I0e2096303be1fdbc01428a6e57cca9796bb32c8a	2011-01-11 15:22:19 +01:00
John Koleszar	1942eeb886	fix last frame buffer copy logic regression Commit `0ce3901` introduced a change in the frame buffer copy logic where the NEW frame could be copied to the ARF or GF buffer through the copy_buffer_to_{arf,gf}==1 flags, if the LAST frame was not being refreshed. This is not correct. The intent of the copy_buffer_to_{arf,gf}==1 flag is to copy the LAST buffer. To copy the NEW buffer, the refresh_{alt_ref,golden}_frame flag should be used. The original buffer copy logic is fairly convoluted. For example: if (cm->refresh_last_frame) { vp8_swap_yv12_buffer(&cm->last_frame, &cm->new_frame); cm->frame_to_show = &cm->last_frame; } else { cm->frame_to_show = &cm->new_frame; } ... if (cm->copy_buffer_to_arf) { if (cm->copy_buffer_to_arf == 1) { if (cm->refresh_last_frame) vp8_yv12_copy_frame_ptr(&cm->new_frame, &cm->alt_ref_frame); else vp8_yv12_copy_frame_ptr(&cm->last_frame, &cm->alt_ref_frame); } else if (cm->copy_buffer_to_arf == 2) vp8_yv12_copy_frame_ptr(&cm->golden_frame, &cm->alt_ref_frame); } Effectively, if refresh_last_frame, then new and last are swapped, so when "new" is copied to ARF, it's equivalent to copying LAST to ARF. If not refresh_last_frame, then LAST is copied to ARF. So LAST is copied to ARF in both cases. Commit `0ce3901` removed the first buffer swap but kept the refresh_last_frame?new:last behavior, changing the sense since the first swap wasn't done to the more readable refresh_last_frame?last:new, but this logic is not correct when !refresh_last_frame. This commit restores the correct behavior from v0.9.1 and prior. This case is missing from the test vector set. Change-Id: I8369fc13a37ae882e31a8a104da808a08bc8428f	2011-01-06 13:07:42 -05:00
John Koleszar	8d94796cad	vp8mt_alloc_temp_buffers: make prototype return void This function was never called in a context expecting a return value, the return value was always a constant, and the !CONFIG_MULTITHREAD path didn't have a return statement, which caused a compiler warning. This patch changes the function to return void instead. Fixes issue #231 Change-Id: I9ef7f56e54418b7265026c54fc4ed5660c1418d1	2010-11-17 09:13:57 -05:00
Fritz Koenig	647df00f30	postproc : Re-work postproc calling to allow more flags. Debugging in postproc needs more flags to allow for specific block types to be turned on or off in the visualizations. Must be enabled with --enable-postproc-visualizer during configuration time. Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d	2010-11-10 14:14:46 -08:00
John Koleszar	4d1b0d2a2d	Merge commit 'fix integer promotion bug in partition size check' Change-Id: I4081917b46013fa8f4218cade8bd12cb2d013aee	2010-11-05 16:49:32 -04:00
John Koleszar	9fb80f7170	fix integer promotion bug in partition size check The check '(user_data_end - partition < partition_size)' must be evaluated as a signed comparison, but because partition_size was unsigned, the LHS was promoted to unsigned, causing an incorrect result on 32-bit. Instead, check the upper and lower bounds of the segment separately. Change-Id: I6266aba7fd7de084268712a3d2a81424ead7aa06	2010-11-05 14:52:53 -04:00
Timothy B. Terriberry	c4d7e5e67e	Eliminate more warnings. This eliminates a large set of warnings exposed by the Mozilla build system (Use of C++ comments in ISO C90 source, commas at the end of enum lists, a couple incomplete initializers, and signed/unsigned comparisons). It also eliminates many (but not all) of the warnings exposed by newer GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite without checking the return values). There are a few spurious warnings left on my system: ../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used uninitialized in this function gcc seems to be unable to figure out that the value shortcut doesn't change between the two if blocks that test it here. ../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned expression >= 0 is always true ../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned expression >= 0 is always true This is true, so far as it goes, but it's comparing against an enum, and the C standard does not mandate that enums be unsigned, so the checks can't be removed. Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395	2010-10-27 18:08:04 -07:00
Johann	b90a072f10	fix implicit declarations ARM used to explicitly remove this file from the build. With the RTCD changes, that's no longer possible. These errors also exist for x86 w/o RTCD, but that's not the default configuration Change-Id: I3e10e5553ddf3278e8d3c9365ca6fb84f52f5066	2010-10-27 11:21:02 -04:00
Timothy B. Terriberry	b71962fdc9	Add runtime CPU detection support for ARM. The primary goal is to allow a binary to be built which supports NEON, but can fall back to non-NEON routines, since some Android devices do not have NEON, even if they are otherwise ARMv7 (e.g., Tegra). The configure-generated flags HAVE_ARMV7, etc., are used to decide which versions of each function to build, and when CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen at run time. In order for this to work, the CFLAGS must be set to something appropriate (e.g., without -mfpu=neon for ARMv7, and with appropriate -march and -mcpu for even earlier configurations), or the native C code will not be able to run. The ASFLAGS must remain set for the most advanced instruction set required at build time, since the ARM assembler will refuse to emit them otherwise. I have not attempted to make any changes to configure to do this automatically. Doing so will probably require the addition of new configure options. Many of the hooks for RTCD on ARM were already there, but a lot of the code had bit-rotted, and a good deal of the ARM-specific code is not integrated into the RTCD structs at all. I did not try to resolve the latter, merely to add the minimal amount of protection around them to allow RTCD to work. Those functions that were called based on an ifdef at the calling site were expanded to check the RTCD flags at that site, but they should be added to an RTCD struct somewhere in the future. The functions invoked with global function pointers still are, but these should be moved into an RTCD struct for thread safety (I believe every platform currently supported has atomic pointer stores, but this is not guaranteed). The encoder's boolhuff functions did not even have _c and armv7 suffixes, and the correct version was resolved at link time. The token packing functions did have appropriate suffixes, but the version was selected with a define, with no associated RTCD struct. 
However, for both of these, the only armv7 instruction they actually used was rbit, and this was completely superfluous, so I reworked them to avoid it. The only non-ARMv4 instruction remaining in them is clz, which is ARMv5 (not even ARMv5TE is required). Considering that there are no ARM-specific configs which are not at least ARMv5TE, I did not try to detect these at runtime, and simply enable them for ARMv5 and above. Finally, the NEON register saving code was completely non-reentrant, since it saved the registers to a global, static variable. I moved the storage for this onto the stack. A single binary built with this code was tested on an ARM11 (ARMv6) and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder, and produced identical output, while using the correct accelerated functions on each. I did not test on any earlier processors. Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa	2010-10-25 09:23:29 -04:00
John Koleszar	3b9e72b210	Merge "Improve handling of invalid frames." Change-Id: Icef5226a70260607c190126c1c0cc28b796e759c	2010-10-22 11:54:49 -04:00
Timothy B. Terriberry	09bcc1f710	Improve handling of invalid frames. The code was not checking for frame sizes smaller than 3 bytes, and the partition size checks might have failed if the input buffer was within 16MB of the top of the heap. In addition, the reference count on the current frame buffer was not being decremented on error, so after a small number of errors, no new frame buffer could be found and it would run off the list of them. Change-Id: I0c60dba6adb1e2a29df39754f72a56ab6c776b46	2010-10-22 11:50:56 -04:00
Timothy B. Terriberry	8f75ea6b5c	Convert [4][4] matrices to [16] arrays. Most of the code that actually uses these matrices indexes them as if they were a single contiguous array, and coverity produces reports about the resulting accesses that overflow the static bounds of the first row. This is perfectly legal in C, but converting them to actual [16] arrays should eliminate the report, and removes a good deal of extraneous indexing and address operators from the code. Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23	2010-10-21 17:04:30 -07:00
Jan Kratochvil	5cdc3a4c29	nasm: address labels 'rel label' vice 'wrt rip' nasm does not support `label wrt rip', it requires `rel label'. It is still fully compatible with yasm. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50	2010-10-04 19:47:54 -04:00
Jan Kratochvil	e114f699f6	nasm: match instruction length (movd/movq) to parameters nasm requires the instruction length (movd/movq) to match to its parameters. I find it more clear to really use 64bit instructions when we use 64bit registers in the assembly. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91	2010-10-04 23:36:29 +02:00
John Koleszar	2b521ab551	move reconintra_mt to decoder (fixup) Missed the .h file in the move. Change-Id: Ib408183fbb4d019fd46394b362f89ca6ea9d10bc	2010-09-27 12:48:31 -04:00
John Koleszar	dbd57c2663	Merge "move reconintra_mt to decoder (for now)"	2010-09-24 08:46:35 -07:00
John Koleszar	48e76ff4fd	move reconintra_mt to decoder (for now) reconintra_mt.c is only required for building the decoder right now. It could definitely be used for the encoder in the future, but it currently depends on decoder only data structures. (onyxd_int.h, VP8D_COMP, etc). Move it from common/ to decoder/ until the necessary changes to the common multithread code are complete. This patch is needed to build with --disable-vp8-decoder. Change-Id: I568c52221a2b309234d269675cba97131ce35c86	2010-09-24 11:23:06 -04:00
Yunqing Wang	8db5da2906	Adjust multi-thread sync ranges according to image sizes In multi-threaded decoder, set different sync ranges for different video resolutions. Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7	2010-09-23 13:53:09 -04:00
John Koleszar	cdd2066687	unset execute bit on c source Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217	2010-09-21 19:48:06 -04:00
John Koleszar	6f4c0435d1	Merge "Don't reset mb clamping state during splitmv decoding"	2010-09-21 09:06:59 -07:00
John Koleszar	4d391e8ed2	Don't reset mb clamping state during splitmv decoding The MV decoding changes in `c5fb0eb` introduced a bug where the macroblock clamping state was reset for each partition, so if an earlier partition needed clamping but a subsequent one didn't, the MB wouldn't receive clamping. Instead, the state is only set during splitmv decoding, never cleared. Change-Id: I224fe258493405ee0f6a04596acdb622c475e845	2010-09-21 11:58:48 -04:00
Yunqing Wang	a23ccf8f8c	Merge "Restructure multi-threaded decoder"	2010-09-21 05:00:30 -07:00
Johann	6cf2b4aa0e	Merge "reorder data to use wider instructions"	2010-09-20 10:47:33 -07:00
Johann	9c9afbab85	Merge "Update NEON wide idcts"	2010-09-20 10:47:22 -07:00
Johann	022323bf85	reorder data to use wider instructions the previous commit laid the groundwork by doing two sets of idcts together. this moved that further by grouping the interesting data (q[0], q+16[0]) together to allow using wider instructions. also managed to drop a few instructions by recognizing that the constant for sinpi8sqrt2 could be downshifted all the time, which avoided a downshift as well as workarounds for a function which only accepted signed data. looks like a modest gain for performance: at qcif, went from ~180 fps to ~183 Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf	2010-09-17 16:47:39 -04:00
Yunqing Wang	f857a85088	Restructure multi-threaded decoder On each MB, loopfiltering is done right after MB decoding. This combines two loops in multi-threaded code into one, which reduces number of synchronizations to half. The above-row/left-col data are saved in temp buffers for next-row/next MB decoding. Tests on 4-core gLucid machine showed 10% decoder performance gain with threads=4 (tulip clip). Testing on other platforms isn't done yet. Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9	2010-09-17 09:56:05 -04:00
John Koleszar	9100073e8d	cleanup: remove unused xprintf These files aren't currently used, and we can get them back if we need them. Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5	2010-09-16 13:14:12 -04:00
Scott LaVarnway	c5fb0eb8d9	Improved subset block search Improved the subset block search and fill. (about 3% improvement for 32 bit) Modified/merged the code in order to create vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock level. This will allow the decode loop (in the future) to decode modes/mvs on a frame, row, or mb level. Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3	2010-09-09 14:42:48 -04:00
Johann	14ba764219	Update NEON wide idcts Expand `93c32a55` which used SSE2 instructions to do two idct/dequant/recons at a time to NEON. Initial working commit. More work needs to be put into rearranging and interlacing the data to take advantage of quadword operations, which is when we'll hopefully see a much better boost Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1	2010-09-09 14:08:12 -04:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
Scott LaVarnway	0de458f6b9	Reduced the size of MB_MODE_INFO Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613	2010-09-03 16:43:23 -04:00
Frank Galligan	d45e55015e	Fix rare deadlock before loop filter There was an extremely rare deadlock that happened when one thread was waiting to start the loop filter on frame n while the other threads were starting to work on frame n+1. Change-Id: Icc94f728b3b6663405435640d9a2996735ba19ef	2010-09-01 22:01:21 -04:00
Yunqing Wang	0e78efad0b	Replace sleep(0) calls in multi-threaded decoder This is a workaround for gLucid problem. Change-Id: I188a016a07e4c2ea212444c5a6284ff3c48a5caa	2010-08-31 20:37:11 -04:00
Johann	0b94f5d6e8	followup arm patch make the arm asm detokenizer work with the new structures Change-Id: I7cd92c2a018ec24032bb1cfd1bb9739bc84b444a	2010-08-31 11:41:10 -04:00
Scott LaVarnway	e85e631504	Changed above and left context data layout The main reason for the change was to reduce cycles in the token decoder. (~1.5% gain for 32 bit) This layout should be more cache friendly. As a result of this change, the encoder had to be updated. Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837 Note: dixie uses a similar layout	2010-08-31 11:24:30 -04:00
Johann	5c244398e1	clean up compiler warnings did a test compile with clang and got rid of some warnings that have been annoying me for a while: vp8/decoder/detokenize.c: In function 'vp8_init_detokenizer': vp8/decoder/detokenize.c:121: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:122: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:123: warning: assignment from incompatible pointer type vp8/decoder/detokenize.c:124: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:125: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:128: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:129: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:130: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:131: warning: assignment discards qualifiers from pointer target type Change-Id: I78ddab176fe47cbeed30379709dc7bab01c0c2e4	2010-08-24 18:23:16 -04:00
Johann	d73217ab17	update structures mbmi and eob moved in previous commits Change-Id: I30a2eba36addf89ee50b406ad4afdd059a832711	2010-08-23 13:44:56 -04:00
Fritz Koenig	93c32a55c2	Rework idct calling structure. Moving the eob structure allows for a non-struct based function to handle decoding an entire mb of idct/dequant/recon data. This allows for SIMD functions to idct/dequant/recon multiple blocks at once. SSE2 implementation gives 3% gain on Atom. Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2	2010-08-23 08:58:54 -07:00
Johann	9602799cd9	framework for assembly version of the detokenizer adds a compile time option: --enable-arm-asm-detok which pulls in vp8/decoder/arm/detokenize.asm currently about break even speed wise, but changes are pending to the fill code (branch and load 3 bytes versus conditionally always load one) and the error handling. Currently it doesn't handle zero runs or overrunning the buffer. this is really just so i don't have to rebase my changes all the time to run benchmarks - now just need to replace one file! Change-Id: I56d0e2354dc0ca3811bffd0e88fe1f952fa6c797	2010-08-12 16:39:56 -04:00
Scott LaVarnway	9c7a0090e0	Removed unnecessary MB_MODE_INFO copies These copies occurred for each macroblock in the encoder and decoder. The temp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result, a large number of compile errors had to be fixed. Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3	2010-08-12 16:25:43 -04:00
Scott LaVarnway	99f46d62d9	Moved gf_active code to encoder only The gf_active code is only used by the encoder, so it was moved from common and decoder. Change-Id: Iada15acd5b2b33ff70c34668ca87d4cfd0d05025	2010-08-11 11:54:25 -04:00
Yunqing Wang	ba2e107d28	First modification of multi-thread decoder This is the first modification of VP8 multi-thread decoder, which uses same threads to decode macroblocks and then do loopfiltering for each frame. Inspired by Rob Clark, synchronization was done on every 8 macroblocks instead of every macroblock to reduce lock contention. Comparing with the original code, this implementation gave about 15%- 20% performance gain while decoding my test clips on a Core2 Quad platform (Linux). The work is not done yet. Test on other platforms are needed. Change-Id: Ice9ddb0b511af1359b9f71e65066143c04fef3b5	2010-08-10 14:09:57 -04:00
John Koleszar	675298216d	Merge "Replace pinsrw (SSE) with MMX instructions"	2010-08-02 06:16:26 -07:00
Philip Jägenstedt	7d243701d9	Replace pinsrw (SSE) with MMX instructions Fixes http://code.google.com/p/webm/issues/detail?id=136 Change-Id: I5a3e294061644a1a9718e8ba4a39548ede25cc42	2010-08-02 09:15:45 -04:00
John Koleszar	38a20e030f	apple: include proper mach primitives Fixes implicit declaration warning for 'mach_task_self'. Patch courtesy of timeless at gmail.com Change-Id: I9991dedd1ccfddc092eca86705ecbc3b764b799d	2010-07-29 17:04:44 -04:00
Johann	b9a038a5ed	Fix build w/o RTCD So many places to update ... Change-Id: Ide957b40cc833f99c2d1849acade6850fbf7585d	2010-07-27 11:56:19 -04:00
Johann	56f5a9a060	update arm idct functions Jeff Muizelaar posted some changes to the idct/reconstruction c code. This is the equivalent update for the arm assembly. This shows a good boost on v6, and a minor boost on neon. Here are some numbers for highway in qcif, 2641 frames: HEAD neon: ~161 fps new neon: ~162 fps HEAD v6: ~102 fps new v6: ~106 fps The following functions have been updated for armv6 and neon: vp8_dc_only_idct_add vp8_dequant_idct_add vp8_dequant_dc_idct_add Conflicts: vp8/decoder/arm/armv6/dequantdcidct_v6.asm vp8/decoder/arm/armv6/dequantidct_v6.asm Resolved by removing these files. When I rewrote the functions, I also moved the files to dequant_dc_idct_v6.asm/dequant_idct_v6.asm Change-Id: Ie3300df824d52474eca1a5134cf22d8b7809a5d4	2010-07-26 08:55:19 -04:00
Jeff Muizelaar	98fcccfe97	Change the x86 idct functions to do reconstruction at the same time Change-Id: I896fe6f9664e6849c7cee2cc6bb4e045eb42540f	2010-07-23 15:21:36 -04:00
Jeff Muizelaar	b2fa74ac18	Combine idct and reconstruction steps This moves the prediction step before the idct and combines the idct and reconstruction steps into a single step. Combining them seems to give an overall decoder performance improvement of about 1%. Change-Id: I90d8b167ec70d79c7ba2ee484106a78b3d16e318	2010-07-23 15:21:36 -04:00
Fritz Koenig	0ce3901282	Swap alt/gold/new/last frame buffer ptrs instead of copying. At the end of the decode, frame buffers were being copied. The frames are not updated after the copy, they are just for reference on later frames. This change allows multiple references to the same frame buffer instead of copying it. Changes needed to be made to the encoder to handle this. The encoder is still doing frame buffer copies in similar places where pointer reference could be done. Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66	2010-07-23 14:53:59 -04:00
Fritz Koenig	08eed049d4	Remove CONFIG_NEW_TOKENS files. These files were out of date and no longer maintained. Token decoding has implemented the no-crash code which is incompatible with this arm assembly code. Change-Id: Ibf729886c56fca48181af60b44bda896c30023fc	2010-07-22 19:00:21 -04:00
Michael Kohler	80f0e7a7d0	limit range checking code for L[k] to CONFIG_DEBUG. patch by timeless@gmail.com	2010-07-12 18:41:45 +02:00
John Koleszar	308e867f91	Update loopfilter frame/filter/sharp info for multithread Change I9fd1a5a4 updated the multithreaded loopfilter to avoid reinitializing several parameters if they haven't changed from the last frame, but the code to update the last frame's parameters wasn't invoked in the multithreaded case. Change-Id: Ia23d937af625c01dd739608e02d110f742b7e1f2	2010-06-30 10:23:53 -04:00
Yunqing Wang	29d586b462	Add loopfilter initialization fix in multithreading code Modified loopfilter initialization to avoid unnecessary operations. Change-Id: I9fd1a5a49edc1cb8116c2a72a6908b1e437459ec	2010-06-30 09:42:39 -04:00
John Koleszar	94c52e4da8	cosmetics: trim trailing whitespace When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d	2010-06-18 13:06:11 -04:00
Timothy B. Terriberry	c17b62e1bd	Change bitreader to use a larger window. Change bitreading functions to use a larger window which is refilled less often. This makes it cheap enough to do bounds checking each time the window is refilled, which avoids the need to copy the input into a large circular buffer. This uses less memory and speeds up the total decode time by 1.6% on an ARM11, 2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64. Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving the refill loop to the front of vp8dx_decode_bool(). However, having the refill loop between computation of the split values and the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to memory latency and code size: refilling after normalization duplicates the code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases). Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the beginning of each decode step in vp8_decode_mb_tokens() means the latter requires an extra refill at the end. Platform-specific versions could avoid the problem, but would require most of detokenize.c to be duplicated. Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7	2010-06-15 19:55:14 -07:00
Paul Wilkins	7a81b29d38	Use local pointer to pbi->common.	2010-06-11 15:17:57 +01:00
John Koleszar	fb220d257b	replace while(0) construct with if/else No good reason to be tricky here. I don't know why 'break' occurred to me as the natural replacement for the 'return', but an if/else block is definitely clearer. Change-Id: I08a336307afeb0dc7efa494b37398f239f66c2cf	2010-06-10 20:15:21 -04:00
Timothy B. Terriberry	05c6eca4db	Fix new MV clamping scheme for chroma MVs. The new scheme introduced in I68d35a2f did not clamp chroma MVs in the SPLITMV case, and clamped them incorrectly (to the luma plane bounds) in every other case. Because chroma MVs are computed from the luma MVs before clamping occurs, they could still point outside of the frame buffer and cause crashes. This clamping happens outside of the MV prediction loop, and so should not affect bitstream decoding.	2010-06-10 18:42:24 -04:00
John Koleszar	3085025fa1	Remove secondary mv clamping from decode stage This patch removes the secondary MV clamping from the MV decoder. This behavior was consistent with limits placed on non-split MVs by the reference encoder, but was inconsistent with the MVs generated in the split case. The purpose of this secondary clamping was only to prevent crashes on invalid data. It was not intended to be a behaviour an encoder could or should rely on. Instead of doing additional clamping in a way that changes the entropy context, the secondary clamp is removed and the border handling is made implementation specific. With respect to the spec, the border is treated as essentially infinite, limited only by the clamping performed on the near/nearest reference and the maximum encodable magnitude of the residual MV. This does not affect any currently produced streams. Change-Id: I68d35a2fbb51570d6569eab4ad233961405230a3	2010-06-09 11:47:24 -04:00
Philip Jägenstedt	0dd78af3e9	remove unreferenced variable i	2010-06-07 11:35:33 -04:00
John Koleszar	09202d8071	LICENSE: update with latest text Change-Id: Ieebea089095d9073b3a94932791099f614ce120c	2010-06-04 16:19:40 -04:00
Yunqing Wang	d33bf3d664	Remove costly memory reads/writes in vp8_reset_mb_tokens_context() Tests on x86 showed this function cost 2.7% of total decoding time because of all the memory reads/writes. After modification, it only costs about 0.7% of decoding time, which gives a 2% gain. Change-Id: I5003ee30b6dc6dea0bfa42a6ad7e7c22fcc7b215	2010-06-01 07:59:50 -04:00
John Koleszar	b7492341ac	install includes in DIST_DIR/include/vpx, move vpx_codec/ to vpx/ This renames the vpx_codec/ directory to vpx/, to allow applications to more consistently reference these includes with the vpx/ prefix. This allows the includes to be installed in /usr/local/include/vpx rather than polluting the system includes directory with an excessive number of includes. Change-Id: I7b0652a20543d93f38f421c60b0bbccde4d61b4f	2010-05-24 20:27:42 -04:00
John Koleszar	0ea50ce9cb	Initial WebM release	2010-05-18 11:58:33 -04:00
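
Note on commit 9fb80f7170 ("fix integer promotion bug in partition size check"): the pitfall it describes can be illustrated with a small sketch. This is not the libvpx code. The function names below are hypothetical, and the byte count left in the buffer is modeled as a plain int32_t (standing in for `user_data_end - partition` on a 32-bit target) rather than a pointer difference. The explicit cast in the first function reproduces the conversion that the compiler applies implicitly when a signed 32-bit value is compared with an unsigned one.

```c
#include <stdint.h>

/* Hypothetical model of the buggy check. `remaining` stands in for
 * (user_data_end - partition) on a 32-bit target. Comparing it with an
 * unsigned size converts it to unsigned first; the cast makes that
 * implicit conversion visible. A negative `remaining` becomes a huge
 * positive value, so the truncation goes unnoticed. */
int partition_truncated_buggy(int32_t remaining, uint32_t partition_size)
{
    return (uint32_t)remaining < partition_size;
}

/* Hypothetical model of the fix: check the lower and upper bounds
 * separately, so a negative `remaining` is rejected before any
 * unsigned comparison takes place. */
int partition_truncated_fixed(int32_t remaining, uint32_t partition_size)
{
    if (remaining < 0)
        return 1; /* partition starts past the end of the data */
    return (uint32_t)remaining < partition_size;
}
```

With remaining = -4 and partition_size = 16, the buggy form reports the partition as complete (returns 0), while the fixed form reports it as truncated (returns 1). For non-negative values of remaining the two forms agree.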


344 Commits