generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	a25f6a9c88	Moved vp8_encode_bool into boolhuff.h allowing the compiler to inline this function. For real-time encodes, this gave a boost of 1% to 2.5%, depending on the speed setting. Change-Id: I3929d176cca086b4261267b848419d5bcff21c02	2011-07-19 09:17:25 -04:00
John Koleszar	b5ea2fbc2c	Improved 1-pass CBR rate control This patch attempts to improve the handling of CBR streams with respect to the short term buffering requirements. The "buffer level" is changed to be an average over the rc buffer, rather than a long running average. Overshoot is also tracked over the same interval and the golden frame targets suppressed accordingly to correct for overly aggressive boosting. Testing shows that this is fairly consistently positive in one metric or another -- some clips that show significant decreases in quality have better buffering characteristics, others show improvements in both. Change-Id: I924c89aa9bdb210271f2e03311e63de3f1f8f920	2011-07-18 11:48:05 -04:00
Scott LaVarnway	e68894fa03	Merge "Tokenize MB optimized"	2011-07-15 07:54:14 -07:00
Tero Rintaluoma	4e82f01547	Tokenize MB optimized Optimized C-code of the following functions: - vp8_tokenize_mb - tokenize1st_order_b - tokenize2nd_order_b Gives ~1-5% speed-up for RT encoding on Cortex-A8/A9 depending on encoding parameters. Change-Id: I6be86104a589a06dcbc9ed3318e8bf264ef4176c	2011-07-15 11:26:54 +03:00
James Berry	6b6f367c3d	bug fix vpx_copy_and_extend_frame size issue vpx_copy_and_extend_frame could incorrectly resize uv frames which could result in a crash. Change-Id: Ie96f7078b1e328b3907a06eebeee44ca39a2e898	2011-07-14 15:58:15 -04:00
John Koleszar	04dce631a2	Remove unused speed features min_fs_radius, max_fs_radius, full_freq were set but never read. Change-Id: I82657f4e7f2ba2acc3cbc3faa5ec0de5b9c6ec74	2011-07-14 14:20:25 -04:00
Yunqing Wang	f1f28535c3	Merge "Fix unnecessary casting of B_PREDICTION_MODE (issue 349)"	2011-07-13 13:32:57 -07:00
Yunqing Wang	139577f937	Fix unnecessary casting of B_PREDICTION_MODE (issue 349) Minor fix. Change-Id: Iaf93f6e47e882a33c479e57c7a0d0bf321e291c0	2011-07-13 15:52:07 -04:00
Yunqing Wang	0e9a6ed72a	Add improvements made in good-quality mode to real-time mode Several improvements we made in good-quality mode can be added into real-time mode to speed up encoding in speed 1, 2, and 3 with small quality loss. Tests using tulip clip showed: --rt --cpu-used=-1 (before change) PSNR: 38.028 time: 1m33.195s (after change) PSNR: 38.014 time: 1m20.851s --rt --cpu-used=-2 (before change) PSNR: 37.773 time: 0m57.650s (after change) PSNR: 37.759 time: 0m54.594s --rt --cpu-used=-3 (before change) PSNR: 37.392 time: 0m42.865s (after change) PSNR: 37.375 time: 0m41.949s Change-Id: I76ab2a38d72bc5efc91f6fe20d332c472f6510c9	2011-07-13 14:51:02 -04:00
Fritz Koenig	84c3cd79d1	Merge "Reduce motion vector search on alt-ref frame."	2011-07-13 10:07:30 -07:00
Johann	211694f67e	Merge "update x86 asm for loopfilter"	2011-07-13 04:10:03 -07:00
Johann	8f910594bd	Merge "Update armv6 loopfilter to new interface"	2011-07-13 04:09:55 -07:00
Johann	1a219c22b1	Merge "Update armv7 loopfilter to new interface"	2011-07-13 04:09:42 -07:00
Johann	d9b825cff2	Merge "New loop filter interface"	2011-07-13 04:09:26 -07:00
Attila Nagy	c231b0175d	Update armv6 loopfilter to new interface Change-Id: I5fe581d797571a7a9432fbd17fc557591d0c1afa	2011-07-12 12:14:51 +03:00
Attila Nagy	283b0e25ac	Update armv7 loopfilter to new interface Change-Id: I65105a9c63832669237e6a6a7fcb4ea3ea683346	2011-07-12 12:12:25 +03:00
Fritz Koenig	ede0b15c9d	Reduce motion vector search on alt-ref frame. Clamp mv search to accommodate subpixel filtering of UV mv. Change-Id: Iab3ed405993ef6bf779ad7cf60863153068fb7d1	2011-07-11 09:05:43 -07:00
Yunqing Wang	587ca06da9	Minor change in pick_inter_mode() Scott suggested to move vp8_mv_pred() under "case NEWMV" to save extra checks. Change-Id: I09e69892f34a08dd425a4d81cfcc83674e344a20	2011-07-08 14:08:45 -04:00
Yunqing Wang	e83d36c053	Merge "Adjust full-pixel clamping and motion vector limit calculation"	2011-07-08 08:39:32 -07:00
Yunqing Wang	40991faeae	Adjust full-pixel clamping and motion vector limit calculation Do mvp clamping in full-pixel precision instead of 1/8-pixel precision to avoid error caused by right shifting operation. Also, further fixed the motion vector limit calculation in change: `b748045470` Change-Id: Ied88a4f7ddfb0476eb9f7afc6ceeddbf209fffd7	2011-07-08 11:34:28 -04:00
Johann	01433c5043	update x86 asm for loopfilter Change-Id: I1ed739522db7c00c189851c7095c1b64ef6412ce	2011-07-08 09:23:38 -04:00
Johann	6ae12c415e	Merge "clean up warnings when building arm with rtcd"	2011-07-08 05:16:09 -07:00
Attila Nagy	622958449b	New loop filter interface Separate simple filter with reduced no. of parameters. MB filter level picking based on precalculated table. Level table updated for each frame. Inside and edge limits precalculated and updated just when sharpness changes. HEV threshold is constant. ARM targets use scalars and others vectors. Change works only with --target=generic-gnu. All other targets have to be updated! Change-Id: I6b73aca6b525075b20129a371699b2561bd4d51c	2011-07-08 09:31:41 +03:00
John Koleszar	973a9c075d	Merge "Set VPX_FRAME_IS_DROPPABLE"	2011-07-07 08:11:05 -07:00
John Koleszar	37de0b8bdf	Set VPX_FRAME_IS_DROPPABLE Allow the encoder to inform the application that the encoded frame will not be used as a reference. Change-Id: I90e41962325ef73d44da03327deb340d6f7f4860	2011-07-07 10:38:45 -04:00
John Koleszar	b4f70084cc	Merge "Properly use GET_GOT/RESTORE_GOT when using GLOBAL()."	2011-07-01 07:14:34 -07:00
Ronald S. Bultje	c8a23ad3f4	Properly use GET_GOT/RESTORE_GOT when using GLOBAL(). This should fix binaries using PIC on x86-32. Also should fix issue 343. Change-Id: I591de3ad68c8a8bb16054bd8f987a75b4e2bad02	2011-06-30 14:04:27 -07:00
Yunqing Wang	ae8aa836d5	Merge "Copy macroblock data to a buffer before encoding it"	2011-06-30 11:14:24 -07:00
Yunqing Wang	80c3bbf657	Merge "Bug fix in motion vector limit calculation"	2011-06-30 09:52:03 -07:00
Yunqing Wang	b748045470	Bug fix in motion vector limit calculation Motion vector limits are calculated using right shifts, which could give wrong results for negative numbers. James Berry's test on one clip showed encoder produced some artifacts. This change fixed that. Change-Id: I035fc02280b10455b7f6eb388f7c2e33b796b018	2011-06-30 11:20:13 -04:00
Johann	3e4a80cc35	Merge "remove incorrect initialization"	2011-06-30 07:59:08 -07:00
Paul Wilkins	eacaabc592	Merge "Change to arf boost calculation."	2011-06-29 10:03:57 -07:00
Paul Wilkins	11694aab66	Change to arf boost calculation. In this commit I have added an experimental function that tests prediction quality either side of a central position to calculate a suggested boost number for an ARF frame. The function is passed an offset from the current position and a number of frames to search forwards and backwards. It returns a forward, backward and compound boost number. The new code can be deactivated using #define NEW_BOOST 0 In its current default state the code searches forwards and backwards from the proposed position of the next alt ref. The old code used a boost number calculated by scanning forward from the previous GF up to the proposed alt ref frame position. I have also added some code to try and prevent placement of a gf/arf where there is a brief flash. Change-Id: I98af789a5181148659f10dd5dd2ff2d4250cd51c	2011-06-29 18:01:25 +01:00
Johann	fe53107fda	remove incorrect initialization Values were set, then reset. Only set them once. Change-Id: Iaf43c8467129f2f261f04fa9188b603aa46216b5	2011-06-29 11:54:27 -04:00
Johann	6611f66978	clean up warnings when building arm with rtcd Change-Id: I3683cb87e9cb7c36fc22c1d70f0799c7c46a21df	2011-06-29 10:51:41 -04:00
John Koleszar	f3a13cb236	Merge "Use MAX_ENTROPY_TOKENS and ENTROPY_NODES more consistently"	2011-06-29 07:29:59 -07:00
Johann	dc004e8c17	Merge "Avoid text relocations in ARM vp8 decoder"	2011-06-28 16:34:10 -07:00
Johann	02c30cdeef	Merge "utilize preload in ARMv6 MC/LPF/Copy routines"	2011-06-28 16:33:45 -07:00
John Koleszar	b32da7c3da	Use MAX_ENTROPY_TOKENS and ENTROPY_NODES more consistently There were many instances in the code of vp8_coef_tokens and vp8_coef_tokens-1, which was a preprocessor macro despite the naming convention. Replace these with MAX_ENTROPY_TOKENS and ENTROPY_NODES, respectively. Change-Id: I72c4f6c7634c94e1fa066cd511471e5592c748da	2011-06-28 17:03:55 -04:00
John Koleszar	9bcf07ae4a	Merge "Simplify decode_macroblock."	2011-06-28 12:54:25 -07:00
Gaute Strokkenes	81c0546407	Simplify decode_macroblock. Change-Id: Ieb2f3827ae7896ae594203b702b3e8fa8fb63d37	2011-06-28 17:01:14 +01:00
Stefan Holmer	7296b3f922	New ways of passing encoded data between encoder and decoder. With this commit frames can be received partition-by-partition from the encoder and passed partition-by-partition to the decoder. At the encoder-side this makes it easier to split encoded frames at partition boundaries, useful when packetizing frames. When VPX_CODEC_USE_OUTPUT_PARTITION is enabled, several VPX_CODEC_CX_FRAME_PKT packets will be returned from vpx_codec_get_cx_data(), containing one partition each. The partition_id (starting at 0) specifies the decoding order of the partitions. All partitions but the last has the VPX_FRAME_IS_FRAGMENT flag set. At the decoder this opens up the possibility of decoding partition N even though partition N-1 was lost (given that independent partitioning has been enabled in the encoder) if more info about the missing parts of the stream is available through external signaling. Each partition is passed to the decoder through the vpx_codec_decode() function, with the data pointer pointing to the start of the partition, and with data_sz equal to the size of the partition. Missing partitions can be signaled to the decoder by setting data != NULL and data_sz = 0. When all partitions have been given to the decoder "end of data" should be signaled by calling vpx_codec_decode() with data = NULL and data_sz = 0. The first partition is the first partition according to the VP8 bitstream + the uncompressed data chunk + DCT address offsets if multiple residual partitions are used. Change-Id: I5bc0682b9e4112e0db77904755c694c3c7ac6e74	2011-06-28 11:10:17 -04:00
Stefan Holmer	4cb0ebe5b2	Adding support for independent partitions Adding support in the encoder for generating independent residual partitions by forcing equal probabilities over the prev coef entropy contexts. Change-Id: I402f5c353255f3ca20eae2620af739f6a498cd21	2011-06-28 11:10:17 -04:00
Mike Hommey	e3f850ee05	Avoid text relocations in ARM vp8 decoder The current code stores pointers to coefficient tables and loads them to access the tables contents. As these pointers are stored in the code sections, it means we end up with text relocations. eu-findtextrel will thus complain about code not compiled with -fpic/-fPIC. Since the pointers are stored in the code sections, we can actually cheat and let the assembler generate relative addressing when accessing the coefficient tables, and just load their location with adr. Change-Id: Ib74ae2d3f2bab80b29991355f2dbe6955f38f6ae	2011-06-28 09:11:40 +02:00
Fritz Koenig	be99868bd1	Fix after removal of B_MODE_INFO Change Ieb746989: Removed B_MODE_INFO missed this. Change-Id: I32202555581cc2a5d45e729c6650ada4d2df55d3	2011-06-27 09:43:21 -07:00
Johann	8a9a11e8dc	Merge "configuration, support disabling any subset of ARM arch"	2011-06-27 08:55:18 -07:00
Stefan Holmer	ba0822ba96	Adding support for error concealment in multi-threaded decoding Also includes a couple of error concealment bug fixes: - the segment_id wasn't properly initialized when missing - when interpolating and no neighbors are found, set to zero - clear the qcoef buffer when concealing an MB Change-Id: Id79c876b41d78b559a2241e9cd0fd2cae6198f49	2011-06-27 09:03:06 -04:00
Adrian Grange	deca8cfc44	Fixed initialization of frame buffer ref counters Only the first frame buffer ref counter was being initialized because the index was fixed at 0 rather than using i. Change-Id: Ib842298be4a5e3607f9e21c2cd4bfbee4054ffc4	2011-06-24 08:43:40 -07:00
Yunqing Wang	0d87098e08	Copy macroblock data to a buffer before encoding it I got this idea from Pascal (Thanks). Before encoding a macroblock, copy it to a 16x16 buffer, and then read source data from there instead. This will help keep the source data in cache, and help with the performance. Change-Id: Id05f4cb601299150511d59dcba0ae62c49b5b757	2011-06-23 13:54:02 -04:00
John Koleszar	db67dcba6a	Revert "Reduce overshoot in 1 pass rate control" This reverts commit `212f618373`. Further testing shows that the overshoot accumulation/damping is too aggressive on some clips. Allowing the accumulated overshoot to decay and limiting the damping to golden frames shows some promise. But some clips show significant overshoot in the buffer window, so I think this still needs work. Change-Id: Ic02a9ca34f55229f9cc04786f4fab54cdc1a3ef5	2011-06-23 11:52:12 -04:00
James Berry	2bd90c13a0	get/set reference buffer dimension check added vp8_yv12_copy_frame_ptr() expects same size buffers which was not previously guaranteed. Using an improperly allocated buffer would cause a crash before. Change-Id: I904982313ce9352474f80de842013dcd89f48685	2011-06-22 13:36:24 -04:00
Yaowu Xu	76495617e0	Merge "adjusting the calculation of errorperbit"	2011-06-21 09:47:42 -07:00
Scott LaVarnway	55c3963c88	Merge "Improved vp8dx_decode_bool"	2011-06-21 07:45:51 -07:00
Yunqing Wang	109c20299c	Merge "Remove unnecessary bounds checking in motion search"	2011-06-21 07:23:24 -07:00
Attila Nagy	6f23f24afe	configuration, support disabling any subset of ARM arch Useful for leaving out any version specific asm files. Change-Id: I233514410eb9d7ca88d2d2c839673122c507fa99	2011-06-21 10:39:01 +03:00
Yaowu Xu	10ed60dc71	adjusting the calculation of errorperbit RDMULT/RDDIV defines a bit's worth of distortion in terms of sum squared difference. This has also been used as errorperbit in subpixel motion search, where the distortions are computed as variance of the difference. The variance of differences differs from the sum squared differences by the amount of DC squared. Typically, for inter predicted MBs, this difference averages around 10% between the two distortions, so this patch introduces a 110% constant in deriving errorperbit from RDMULT/RDDIV. Test on CIF set shows small but positive gain on overall PSNR (.03%) and SSIM (.07%), overall impact on average PSNR is 0. Change-Id: I95425f922d037b4d96083064a10c7cdd4948ee62	2011-06-20 16:32:30 -07:00
Scott LaVarnway	67a1f98c2c	Improved vp8dx_decode_bool Relocated the vp8dx_bool_decoder_fill() call, allowing the compiler to produce better assembly code. Tests showed a 1 - 2 % performance boost (x86 using gcc) for the 720p clip used. Change-Id: Ic5a4eefed8777e6eefa007d4f12dfc7e64482732	2011-06-20 14:44:16 -04:00
Taekhyun Kim	458fb8f491	utilize preload in ARMv6 MC/LPF/Copy routines About 9~10% decoding perf improvement on non-Neon ARM cpus Change-Id: I7dc2a026764e84e9c2faf282b4ae113090326837	2011-06-17 14:04:53 -07:00
Yunqing Wang	2cd1c2855e	Remove unnecessary bounds checking in motion search The starting points are always within the limits, and bounds checking on these points is not needed. For speed < 5, the encoded result changes a little because different treatment is taken while starting point equals the bounds. Change-Id: I09a402d310f51e305a3519f1601b1d17b05c6152	2011-06-17 14:19:51 -04:00
John Koleszar	a60fc419f5	Merge "Use SSE as BPRED distortion metric consistently"	2011-06-17 09:48:32 -07:00
Ronald S. Bultje	87fd66bb0e	Assign boost to GF bit allocation if past frame had no ARF. Modify the second-pass code to provide a full golden-frame (GF) bit allocation boost if the past GF group (GFG) had no alt-ref frame (ARF), even if the current GFG does contain an ARF. This mostly has no effect on clips, since switching ARFs on/off between GFGs is not very common. Has a positive effect on e.g. cheer (+0.45 SSIM at 600kbps) and football (+0.25 SSIM at 600kbps), particularly at high bitrates. Has a negative effect (-0.04 SSIM at 300kbps) at pamphlet, which appears only marginally related to this patch, and crew (-0.1 SSIM at 700kbps). Change-Id: I2e32899638b59f857e26efeac18a82e0c0b77089	2011-06-16 13:01:27 -04:00
John Koleszar	eb645abeac	Merge "Disable specialcase for last frames if the sequence contains ARFs."	2011-06-16 09:56:05 -07:00
John Koleszar	5223016337	Merge "Remove redundant check for KEY_FRAME in multithreaded decoder"	2011-06-15 10:18:06 -07:00
John Koleszar	61599fb59f	Use SSE as BPRED distortion metric consistently The BPRED mode selection uses SSE as a distortion metric, but the early breakout threshold being used was a variance value. Change-Id: I42d4602fb9b548bf681a36445701fada5e73aff1	2011-06-15 10:53:37 -04:00
John Koleszar	1ade44b352	Merge "fix --disable-runtime-cpu-detect on x86"	2011-06-15 07:09:09 -07:00
Ronald S. Bultje	299193dd1c	Disable specialcase for last frames if the sequence contains ARFs. firstpass.c contains some rate adjustment code that assures that the last few frames in a sequence abide by rate limits. If the second-to- last group of frames contains an alt-ref frame (ARF), the last golden frame (GF) is zero bytes, and we will thus spend a ridiculously high number of bits on regular P-frames trying to hit the target rate. This does slightly enhance the quality of these last few frames, but has no perceptual value (other than hitting the target rate). Disabling this code means we consistently (slightly) undershoot the target rate and consequently do worse on the last few frames of a clip, which is particularly noticeable for small clips. The quality- per-bitrate is generally better, ~0.2% better overall on derf-set, especially on clips such as garden, tennis, foreman at low bitrates. Has a negative effect on hallmonitor at high bitrates. Change-Id: I1d63452fef5fee4a0ad2fb2e9af4c9f2e0d86d23	2011-06-15 09:47:00 -04:00
Attila Nagy	c7e6aabbca	Remove redundant check for KEY_FRAME in multithreaded decoder For intra blocks it is enough to check ref_frame == INTRA_FRAME. Change-Id: I3e2d3064c7642658a9e14011a4627de58878e366	2011-06-15 09:01:27 +03:00
Scott LaVarnway	7be5b6dae4	Merge "Populate bmi for B_PRED only"	2011-06-14 12:04:50 -07:00
Johann	92b0e544f3	fix --disable-runtime-cpu-detect on x86 Change-Id: Ib8e429152c9a8b6032be22b5faac802aa8224caa	2011-06-14 11:31:50 -04:00
Tero Rintaluoma	9909047461	Fix RT only build Moved encode_intra function from firstpass.c to encodeintra.c to prevent linking problem in real-time only build. Also changed name of the function to vp8_encode_intra because it is not a static. Change-Id: Ibf3c6c1de3152567347e5fbef47d1d39564620a5	2011-06-14 13:39:06 +03:00
James Zern	532c30c83e	fix corrupt frame leak If setup_token_decoder reported an internal error the memory allocated there would not be freed in the resulting call to _remove_decompressor. Change-Id: Ib459de222d76b1910d6f449cdcd01663447dbdf6	2011-06-13 17:32:19 -07:00
Scott LaVarnway	223d1b54cf	Populate bmi for B_PRED only Small decode performance gain (~1%) on keyframes. No noticeable gains on encode. Also changed pick_intra4x4mby_modes() to read the above and left block modes for keyframes only. Change-Id: I1f4885252f5b3e9caf04d4e01e643960f910aba5	2011-06-13 17:14:11 -04:00
Scott LaVarnway	e71a010646	Calc ref_frame_cost once per frame instead of every macro block. Change-Id: I2604e94c6b89e3a8457777e21c8c38406d55b165	2011-06-13 09:58:03 -04:00
John Koleszar	f3ba4c6b82	Merge "bug fix mode_info_context not initialized for error-resilient"	2011-06-09 13:39:47 -07:00
Yaowu Xu	361717d2be	remove one set of 16x16 variance functions Calls to this set of functions are replaced by var16x16. Change-Id: I5ff1effc6c1358ea06cda1517b88ec28ef551b0d	2011-06-09 11:23:05 -07:00
James Berry	45feea4cf0	bug fix mode_info_context not initialized for error-resilient uninitialized xd->mode_info_context would crash vpxenc for --error-resilient=1. Change-Id: I31849e40281e3d65ab63257cfec5e93398997f0b	2011-06-09 12:46:31 -04:00
John Koleszar	af49c11250	Update keyframe activity in non-RD mode Activity update is no longer dependent on being in RD mode, so update it unconditionally. Change-Id: Ib617a6fc210dfc045455e3e4467d7ee5e3d1fa0e	2011-06-09 12:05:31 -04:00
Johann	79327be6c7	use GCC inline magic Better fix for #326. ICC happens to support the inline magic Change-Id: Ic367eea608c88d89475cb7b05d73500d2a1bc42b	2011-06-08 16:19:37 -04:00
John Koleszar	8767ac3bc7	Merge "vp8_pick_inter_mode: remove best_bmodes"	2011-06-08 10:59:30 -07:00
John Koleszar	9e4df2bcf5	Merge "vp8_pick_intra_mode: correct returned rate"	2011-06-08 10:58:36 -07:00
John Koleszar	254a7483e5	Merge "Move RD intra block mode selection to rdopt.c"	2011-06-08 10:51:50 -07:00
John Koleszar	001bd51ceb	vp8_pick_inter_mode: remove best_bmodes Since BPRED will be tested at most once, and SPLITMV is not enabled, there's nothing to clobber the subblock modes, so there's no need to save and restore them. Change-Id: I7c3615b69190c10bd068a44df5488d6e8b85a364	2011-06-08 13:50:50 -04:00
Scott LaVarnway	dce64343d6	Merge "Removed unused function parameters"	2011-06-08 10:20:28 -07:00
John Koleszar	91907e0bf4	vp8_pick_intra_mode: correct returned rate The returned rate was always the 4x4 rate, instead of the rate matching the selected mode. Change-Id: I51da31f80884f5e37f3bcc77d1047d31e612ded4	2011-06-08 13:19:12 -04:00
Scott LaVarnway	69d8d386ed	Removed unused function parameters Change-Id: Ib641c624faec28ad9eb99e2b5de51ae74bbcb2a2	2011-06-08 13:01:09 -04:00
Yaowu Xu	1fba1e38ea	Adjust errorperbit according to RDMULT in activity masking In activity masking, RDO constant RDMULT is adjusted on a per MB basis adaptive to activity with the MB. errorperbit, which is defined as RDMULT/RDDIV, is a constant used in motion estimation. Previously, in activity masking, errorperbit is not changed even when RDMULT is changed. This commit changed to adjust errorperbit according to the change in RDMULT. Test in cif set showed a very small but consistent gain by all quality metrics (average, overall psnr and ssim) when activity masking is on. Change-Id: I07ded3e852919ab76757691939fe435328273823	2011-06-08 09:45:47 -07:00
Yaowu Xu	5fafa2d524	Merge "Further activity masking changes:"	2011-06-08 09:30:31 -07:00
John Koleszar	96a42aaa2d	Move RD intra block mode selection to rdopt.c This change is analogous to I0b67dae1f8a74902378da7bdf565e39ab832dda7, which made the move for the non-RD path. Change-Id: If63fc1b0cd1eb7f932e710f83ff24d91454f8ed1	2011-06-08 12:05:05 -04:00
John Koleszar	e90d17d240	Move intra block mode selection to pickinter.c This commit moves the intra block mode selection from encodeframe.c to pickinter.c (in the non-RD case). This allowed pick_intra_mbuv_mode and pick_intra4x4mby_modes to be made static, and is a step towards refactoring intra mode selection in the main pickinter loop. Gave a small perf increase (~0.5%). Change-Id: I0b67dae1f8a74902378da7bdf565e39ab832dda7	2011-06-08 11:44:57 -04:00
Paul Wilkins	4e81a68af7	Further activity masking changes: Some further re-structuring of activity masking code. Still has various experimental switches. Supports a metric based on intra encode. Experimental comparison against a fixed activity target rather than a frame average, for altering rd and zbin. Overall the SSIM performance is similar to TT's original code but there is a much smaller PSNR hit of circa 0.5% instead of 3.2% Change-Id: I0fd53b2dfb60620b3f74d7415e0b81c1ac58c39a	2011-06-08 16:03:37 +01:00
Yaowu Xu	7368dd4f8f	Merge "remove redundant functions"	2011-06-07 16:36:37 -07:00
Yaowu Xu	59129afc05	Merge "adjust sad per bit constants"	2011-06-07 12:37:04 -07:00
Yaowu Xu	221e00eaa9	adjust sad per bit constants While investigating the effect of DC values on SAD and SSE in motion estimation, a side finding indicates the two tables of constants need to be adjusted. The adjustment was done by multiplying old constants by 90% with rounding. Also absorb the 1/2 scaling constant into the two tables. Refer to change Ifa285c3e for background of the 1/2 factor. Cif set test showed a very small gain on all metrics. Change-Id: I04333527a823371175dd46cb04a817e5b9a8b752	2011-06-07 12:35:03 -07:00
John Koleszar	5c166470a5	Merge "Reduce overshoot in 1 pass rate control"	2011-06-07 12:30:37 -07:00
Scott LaVarnway	346358a5b7	Merge "Wrapped asserts in critical code with CONFIG_DEBUG"	2011-06-07 06:53:51 -07:00
Scott LaVarnway	afb84bb1cc	Merge "Removed unused function vp8_treed_read_num"	2011-06-07 06:51:24 -07:00
Scott LaVarnway	0e3bcc6f32	Wrapped asserts in critical code with CONFIG_DEBUG Change-Id: I5b0aaca06f2e0f40588cb24fb0642b6865da8970	2011-06-07 09:34:47 -04:00
Scott LaVarnway	1374a4db3b	Removed unused function vp8_treed_read_num Change-Id: Id66e70540ee7345876f099139887c1843093907f	2011-06-07 09:32:51 -04:00
Yaowu Xu	d4700731ca	remove redundant functions The encoder defined about 4 sets of similar functions to calculate sum, variance or sse or a combination of them. This commit removed one set of these functions, get8x8var and get16x16var, where calls to the latter function are replaced with var16x16 by using the fact on a 16x16 MB: variance == sse - sum*sum/256 Change-Id: I803eabd1fb3ab177780a40338cbd596dffaed267	2011-06-06 16:44:05 -07:00
Yunqing Wang	03973017a7	Remove hex search's variance calculation while in real-time mode In real-time mode motion search, there is no need to calculate variance. This change improved encoding speed by 1% ~ 2%(speed=-5). Change-Id: I65b874901eb599ac38fe8cf9cad898c14138d431	2011-06-06 19:11:05 -04:00
Johann	04edde2b11	Merge "neon fast quantize block pair"	2011-06-06 13:42:58 -07:00
Johann	da8eb716e8	Merge "adds preload for armv6 encoder asm"	2011-06-06 13:32:13 -07:00
Scott LaVarnway	d1c0ba8f7a	Merge "Removed unnecessary bmi motion vector stores."	2011-06-06 07:57:39 -07:00
John Koleszar	824e9410c6	Merge "Don't allow very short GF groups even when the GF is predicted from an ARF."	2011-06-06 07:02:29 -07:00
John Koleszar	212f618373	Reduce overshoot in 1 pass rate control This patch attempts to reduce the peak bitrate hit by the encoder when using small buffer windows. Tested on the CIF set over 200-500kbps using these settings: --buf-sz=500 --buf-initial-sz=250 --buf-optimal-sz=250 \ --undershoot-pct=100 Two pass encodes were tested at best quality. One pass encodes were tested only at realtime speed 4: --rt --cpu-used=-4 The peak datarate (over the specified 500ms window) was measured for each encode, and averaged together to get metric for "average peak," computed as SUM(peak)/SUM(target). This patch reduces the average peak datarate as follows: One pass: baseline: 1.29715 this patch: 1.23664 Two pass: baseline: 1.32702 this patch: 1.37824 This change had a positive effect on our quality metrics as well: One pass CBR: Min / Mean / Max (pct) Average PSNR -0.42 / 2.86 / 27.32 Overall PSNR -0.90 / 2.00 / 17.27 SSIM -0.05 / 3.95 / 37.46 Two pass CBR: Min / Mean / Max (pct) Average PSNR -4.47 / 4.35 / 35.99 Overall PSNR -3.40 / 4.18 / 36.46 SSIM -4.56 / 6.98 / 53.67 One pass VBR: Min / Mean / Max (pct) Average PSNR -5.21 / 0.01 / 3.30 Overall PSNR -8.10 / -0.38 / 1.21 SSIM -7.38 / -0.11 / 3.17 (note: most values here were close to the mean, there were a few outliers on files that were very sensitive to golden frame size) Two pass VBR: Min / Mean / Max (pct) Average PSNR 0.00 / 0.00 / 0.00 Overall PSNR 0.00 / 0.00 / 0.00 SSIM 0.00 / 0.00 / 0.00 Neither one pass or two pass CBR mode adheres particularly strictly to the short term buffer constraints, and two pass is less consistent, even in the baseline commit. This should be addressed in a later commit. This likely will hurt the quality numbers, as it will have to reduce the burstiness of golden frames. Aside: My work on this commit makes it clear that we need to make rate control modes "pluggable", where you can easily write a new one or work on one in isolation. Change-Id: I1ea9a48f2beedd59891f1288aabf7064956b4716	2011-06-03 16:38:11 -04:00
Scott LaVarnway	f1d6cc79e4	Removed unnecessary bmi motion vector stores. left_block_mv and above_block_mv will return the MB motion vector for non SPLITMV macro blocks. Change-Id: I58dbd7833b4fdcd44b6b72e98ec732c93c2ce4f4	2011-06-03 13:09:46 -04:00
Scott LaVarnway	8c5b73de2a	Merge "Removed B_MODE_INFO"	2011-06-03 08:32:30 -07:00
Yunqing Wang	e5c236c210	Adjust bounds checking for hex search in real-time mode Currently, hex search couldn't guarantee the motion vector(MV) found is within the limit of maximum MV. Therefore, very large motion vectors resulting from big motion in the video could cause encoding artifacts. This change adjusted hex search bounds checking to make sure the resulting motion vector won't go out of the range. James Berry, thank you for finding the bug. Change-Id: If2c55edd9019e72444ad9b4b8688969eef610c55	2011-06-03 08:53:42 -04:00
Scott LaVarnway	773768ae27	Removed B_MODE_INFO Declared the bmi in BLOCKD as a union instead of B_MODE_INFO. Then removed B_MODE_INFO completely. Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67	2011-06-02 13:46:41 -04:00
Ronald S. Bultje	9f002bee53	Don't allow very short GF groups even when the GF is predicted from an ARF. This is basically a slightly modified version of the previous patch, and it has a moderately positive effect (SSIM/PSNR both +0.08% avg on derf-set). Most clips show no change, except waterfall/coastguard, each ~ +0.8% SSIM/PSNR. You can see similar effects in other clips by shortening their length to terminate at a very short last group of frames. Change-Id: I7a70de99ca1f9fe6a8b6ca7a6e30e8a4b64383e4	2011-06-02 09:14:51 -07:00
Yaowu Xu	4ce6928d5b	Merge "further clean up of errorperbit and sadperbit"	2011-06-02 08:58:03 -07:00
Yaowu Xu	5b2fb32961	further clean up of errorperbit and sadperbit this commit makes the usage of errorperbit and sadperbit consistent for encoding modes and passes. Removed all different magic weight factors associated with errorperbit. Now 1/2 is used for both sadperbit16 and sadperbit4, the /2 operation is merged into initializations of the 2 variables. Tests on cif set show .23%, 0.18% and 0.19% gain by avg psnr, overall psnr and ssim respectively. Change-Id: Ifa285c3e065ce0a5a77addfc9f95aabf54ee270d	2011-06-01 14:44:06 -07:00
John Koleszar	4101b5c5ed	Merge "Bugfix in vp8dx_set_reference"	2011-06-01 13:57:23 -07:00
Henrik Lundin	69ba6bd142	Bugfix in vp8dx_set_reference The fb_idx_ref_cnt book-keeping was in error. Added an assert to prevent future errors in the reference count vector. Also fixed a pointer syntax error. Change-Id: I563081090c78702d82199e407df4ecc93da6f349	2011-06-01 21:41:12 +02:00
John Koleszar	5610970fe9	Merge "Fix code under #if CONFIG_INTERNAL_STATS."	2011-06-01 11:14:17 -07:00
Ronald S. Bultje	34ba18760f	Fix code under #if CONFIG_INTERNAL_STATS. Change-Id: Iccbd78d91c3071b16fb3b2911523a22092652ecd	2011-06-01 11:10:13 -07:00
Yaowu Xu	50916c6a7d	remove some magic weights associated with sad_per_bit sad_per_bit has been used for a number of motion vector search routines with different magic weights: 1, 1/2 and 1/4. This commit removes these magic numbers and uses 1/2 for all motion search routines, and also reformats a number of source code lines to within the 80 column limit. Test on cif set shows overall effect is neutral on all metrics. <=0.01% Change-Id: I8a382821fa4cffc9c0acf8e8431435a03df74885	2011-06-01 10:10:44 -07:00
Tero Rintaluoma	61f0c090df	neon fast quantize block pair vp8_fast_quantize_b_pair_neon function added to quantize two adjacent blocks at the same time to improve performance. - Additional 3-6% speedup compared to neon optimized fast quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16) Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e	2011-06-01 10:48:05 +03:00
Scott LaVarnway	9e4f76c154	Merge "vp8_pick_inter_mode code cleanup"	2011-05-31 12:31:46 -07:00
Scott LaVarnway	1a5a1903ea	vp8_pick_inter_mode code cleanup Small code cleanups before attempting to reduce the size of bmi found in BLOCKD. Change-Id: Ie9c14adb53afd847716a75bcce067d0e6c04f225	2011-05-31 14:24:42 -04:00
John Koleszar	0a72f568ec	Initialize first_time_stamp_ever Misplaced #endif caused first_time_stamp_ever to only be initialized if CONFIG_INTERNAL_STATS was set. Change-Id: I2296a4ab00f7dfb767583edcc5d59b94f48c0621	2011-05-31 12:37:45 -04:00
Tero Rintaluoma	5305e79eae	adds preload for armv6 encoder asm Added preload instructions to armv6 encoder optimizations. About 5% average speed-up on Tegra2 for VGA@30fps sequence. Change-Id: I41d74737720fb71ce7a316f07555357822f3347e	2011-05-30 11:10:03 +03:00
John Koleszar	4a4ade6dc8	Merge "bug fix check frame buffer index before copy"	2011-05-27 12:35:06 -07:00
James Berry	8795b52512	bug fix check frame buffer index before copy in onyx_if.c update_reference_frames() make sure that frame buffer indexes are not equal before performing a buffer copy. If two frames share the same buffer the flags will already be set correctly. Change-Id: Ida9b5516d08e3435c90f131d2dc19d842cfb536e	2011-05-27 14:59:29 -04:00
Yunqing Wang	4fb5ce6a92	Merge "Use hex search for realtime mode speed>4"	2011-05-27 11:12:50 -07:00
Yunqing Wang	4d052bdd91	Use hex search for realtime mode speed>4 Test showed using hex search in realtime mode largely speed up encoding process, and still achieves similar quality like the diamond search we have. Therefore, removed the diamond search option. Change-Id: I975767d0ec0539f9f6ed7fdfc09506e39761b66c	2011-05-27 14:05:02 -04:00
Scott LaVarnway	ba420f1097	Merge "Broken EC after MODE_INFO size reduction"	2011-05-27 07:52:04 -07:00
Yunqing Wang	5a8cbb8955	Merge "Remove unused code"	2011-05-27 07:25:25 -07:00
Yunqing Wang	2dc24635ec	Remove unused code Hex search is not called in rdopt.c Change-Id: I67347f03e13684147a7c77fb9e9147e440bb5e8e	2011-05-27 10:20:49 -04:00
Scott LaVarnway	4f586f7bd0	Broken EC after MODE_INFO size reduction This patch fixes the compiler errors and the seg fault when running decode_with_partial_drops. Change-Id: I7c75369e2fef81d53b790d5dabc327218216838b	2011-05-26 15:13:00 -04:00
John Koleszar	1fe5070b76	Merge "Do not copy data between encoder reference buffers."	2011-05-26 09:58:26 -07:00
Yaowu Xu	9a248f1593	Merge "fix the mix use of errorperbit and sadperbit"	2011-05-26 09:39:41 -07:00
Scott LaVarnway	40b850b458	Merge "Use int_mv instead of MV in vp8_mv_cont"	2011-05-26 07:01:38 -07:00
Yaowu Xu	d8c525b8b1	fix the mix use of errorperbit and sadperbit error_per_bit and sad_per_bit were designed as estimates of a bit worth of sum squared error and sum absolute difference respectively. Under this assumption, error_per_bit should be used in combination with 2nd order errors (variance or sum squared error) while sad_per_bit should be used in combination with 1st order SADs in motion estimation. There were a few places where sad_per_bit has been misused with variances, this commit changes to use error_per_bit for those places, also changes parameter names to properly indicate which constant is being used. On cif set, the change has a universal gain by all metrics: 0.13% by average/overall psnr and 0.1% by ssim. Change-Id: I4850fdcc3fd6886b30f784bd843f13dd401215fb	2011-05-25 16:48:10 -07:00
Yunqing Wang	13b56eeb7a	Merge " Use var8x8 instead of get8x8var in VP8_UVSSE"	2011-05-25 11:35:42 -07:00
Yunqing Wang	f299d628f3	Merge "Return sse value in vp8_variance SSE2 functions"	2011-05-25 11:31:07 -07:00
Yaowu Xu	22c05c0575	remove code not in use Change-Id: I6e5e86235d341cce3b02abda26dbeb71940ed955	2011-05-25 09:46:37 -07:00
Yunqing Wang	b6679879b8	Return sse value in vp8_variance SSE2 functions Minor modification. Change-Id: I09511d38fd1451d5c4106a48acdb3f766ce59cb7	2011-05-25 11:55:41 -04:00
Attila Nagy	a615c40499	Use var8x8 instead of get8x8var in VP8_UVSSE 'sum' returned by get8x8var is not used and var8x8 has optimizations for more platforms. Change-Id: I4a907fb1a05f285669fb0b95dc71d42182c980f6	2011-05-25 12:54:34 +03:00
Yunqing Wang	d75eb73653	Fix a bug happening while encoding at profile=3 While profile=3, there is no sub-pixel search. Distortion and SSE have to be calculated using get_inter_mbpred_error(). Change-Id: Ifb36e17eef7750af93efa7d0e2870142ef540184	2011-05-24 16:28:23 -04:00
Scott LaVarnway	a39321f37e	Use int_mv instead of MV in vp8_mv_cont Less operations. Change-Id: Ibb9cd5ae66b8c7c681c9a654d551c8729c31c3ae	2011-05-24 16:01:12 -04:00
Scott LaVarnway	cfab2caee1	Removed unused variable warnings Change-Id: I6e5e921f03dc15a72da89a457848d519647677a3	2011-05-24 15:17:03 -04:00
Scott LaVarnway	b5278f38b0	Merge "MODE_INFO size reduction"	2011-05-24 12:08:24 -07:00
Scott LaVarnway	e11f21af9a	MODE_INFO size reduction Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO. This reduced the memory footprint by 518,400 bytes for 1080 resolutions. The decoder performance improved by ~4% for the clip used and the encoder showed very small improvements. (0.5%) This reduction was first mentioned to me by John K. and in a later discussion by Yaowu. This is WIP. Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29	2011-05-24 13:24:52 -04:00
John Koleszar	fbea372817	Merge "Fixing bug in VP8_SET_REFERENCE decoder control command"	2011-05-24 05:57:44 -07:00
Yunqing Wang	69aad3a720	Merge "Rewrite hex search function"	2011-05-24 05:26:16 -07:00
Henrik Lundin	a126cd1760	Fixing bug in VP8_SET_REFERENCE decoder control command In vp8dx_set_reference, the new reference image is written to an unused reference frame buffer. Change-Id: I9e4f2cef5a011094bb7ce7b2719cbfe096a773e8	2011-05-24 09:03:43 +02:00
Yaowu Xu	99fb568e67	Merge "use get8x8var directly for non-subpixel motion case in VP8_UVSSE"	2011-05-23 14:49:56 -07:00
Yunqing Wang	7838f4cfff	Rewrite hex search function Reduced some bound checks in hex search function. Change-Id: Ie5f73a6c227590341c960a74dc508cff80f8aa06	2011-05-23 16:18:52 -04:00
Yaowu Xu	ab2dfd22f3	use get8x8var directly for non-subpixel motion case in VP8_UVSSE VP8_UVSSE mistakenly used subpixvar8x8 to calculate SSE for non-subpixel motion cases. Change-Id: I4a5398bb9ef39c211039f6af4540546d4972e6a9	2011-05-23 09:11:28 -07:00
John Koleszar	ad6fe4a88c	Merge "bug fix active_worst_quality set below active_best_quality"	2011-05-20 11:23:10 -07:00
John Koleszar	8196cc85f8	Merge "cleanup: collect twopass variables"	2011-05-20 11:20:44 -07:00
Johann	6d82d2d22e	Merge "Fixed iwalsh_neon build problems with RVDS4.1"	2011-05-20 07:51:11 -07:00
Yaowu Xu	1fbc81a970	Merge "revise two function definitions with less parameters"	2011-05-20 07:45:42 -07:00
John Koleszar	a0c11928db	Merge "Remove unused members of VP8_COMP"	2011-05-20 07:39:03 -07:00
Yaowu Xu	a4c69e9a0f	revise two function definitions with less parameters Change-Id: Ia96e5bf915e4d3c0ac9c1795114bd9e5dd07327a	2011-05-19 19:06:03 -07:00
Yaowu Xu	1f3f18443d	Merge "disable trellis optimization for first pass"	2011-05-19 17:25:31 -07:00
Yaowu Xu	d5b8f7860f	disable trellis optimization for first pass also remove 2 #defines and 1 function declaration that are not in use. Change-Id: I8f743d0e3dd9ebf1de24a8b0c30ff09f29b00c53	2011-05-19 17:22:14 -07:00
James Berry	caa1b28be3	bug fix active_worst_quality set below active_best_quality fixed a bug where active_worst_quality could be set below active_best_quality which could result in an infinite loop. Change-Id: I93c229c3bc5bff2a82b4c33f41f8acf4dd194039	2011-05-19 18:10:31 -04:00
John Koleszar	63cb1a7ce0	cleanup: collect twopass variables This patch collects the twopass specific members of VP8_COMP into a dedicated struct. This is a first step towards isolating the two pass rate control and aids readability by decorating these variables with the 'twopass.' namespace. This makes it clear to the reader in what contexts the variable will be valid, and is a hint that a section of code might be a good candidate to move to firstpass.c in later refactoring. There likely will be other rate control modes that need their own specific data as well. This notation is probably overly verbose in firstpass.c, so an alternative would be to access this struct through a pointer like 'rc->' instead of 'cpi->firstpass.' in that file. Feel free to make a review comment to that effect if you prefer. Change-Id: I0ab8254647cb4b493a77c16b5d236d0d4a94ca4d	2011-05-19 17:26:09 -04:00
Scott LaVarnway	dba79821f0	Merge "Using partition_info instead of blockd info for splitmv"	2011-05-19 13:22:59 -07:00
John Koleszar	048497720c	Remove unused members of VP8_COMP Various members that were either completely unreferenced or written and not read. Change-Id: Ie41ebac0ff0364a76f287586e4fe09a68907806e	2011-05-19 15:49:09 -04:00
Scott LaVarnway	99b9757685	Using partition_info instead of blockd info for splitmv The partition_info struct contains info just for SPLITMV, so it should be used instead of BLOCKD. Eventually, I want to reduce the size of B_MODE_INFO struct found in BLOCKD, so this is the first step toward that goal. Also, since SPLITMV is not supported in vp8_pick_inter_mode(), the unnecessary mem copies and checks were removed. For rt encodes, this gave a slight performance improvement. Change-Id: I5585c98fa9d5acbde1c7e0f452a01d9ecc080574	2011-05-19 15:03:36 -04:00
Scott LaVarnway	914f7c36d7	Merge "Make hor UV predict ~2x faster (73 vs 132 cycles) using SSSE3."	2011-05-19 11:22:01 -07:00
John Koleszar	c684d5e5f2	Merge "changed configure option name to reduce confusion"	2011-05-19 11:17:08 -07:00
John Koleszar	ff39958cee	Merge "Make activity masking functions static"	2011-05-19 11:12:18 -07:00
John Koleszar	21ca4c4d5d	Merge "Fix segv without --enable-error-concealment"	2011-05-19 10:58:24 -07:00
John Koleszar	7def902261	Fix segv without --enable-error-concealment Missed wrapping one function call in #if CONFIG_ERROR_CONCEALMENT. Change-Id: I5746b1e6e4531670dbed1130467331fe309bdcae	2011-05-19 13:57:45 -04:00
John Koleszar	e3081b2502	Merge "Adding error-concealment to the decoder."	2011-05-19 10:48:58 -07:00
Stefan Holmer	d04f852368	Adding error-concealment to the decoder. The error-concealer is plugged in after any motion vectors have been decoded. It tries to estimate any missing motion vectors from the motion vectors of the previous frame. Intra blocks with missing residual are replaced with inter blocks with estimated motion vectors. This feature was developed in a separate sandbox (sandbox/holmer/error-concealment). Change-Id: I5c8917b031078d79dbafd90f6006680e84a23412	2011-05-19 13:46:33 -04:00
John Koleszar	a84177b432	Make activity masking functions static These don't need extern linkage. Change-Id: I21220ada926380a75ff654f24df84376ccc49323	2011-05-19 11:14:13 -04:00
John Koleszar	87254e0b7b	Move quantizer init functions to quantize.c Group related functions together. Change-Id: I92fd779225b75a7204650f1decb713142c655d71	2011-05-19 11:07:41 -04:00
Attila Nagy	f96d56c4aa	Fixed iwalsh_neon build problems with RVDS4.1 rvct 4.1 was complaining about vstmia.16, store multiple expects 64 data type. optimized the implementation. Change-Id: I0701052cabd685c375637bbc3796ff6d88f5972c	2011-05-19 10:27:26 +03:00
Yunqing Wang	00a1e2f8e4	Merge "Modify MVcount in pick_inter_mode to eliminate calling of vp8_find_near_mvs"	2011-05-18 12:53:27 -07:00
Yunqing Wang	9c62f94129	Fix a bug in vp8_clamp_mv function Scott fixed the bug in MV clamping function in encoder, which could cause artifacts. Change-Id: Id05f2794c43c31cdd45e66179c8811f3ee452cb9	2011-05-18 09:52:56 -04:00
Yunqing Wang	f62b33f140	Modify MVcount in pick_inter_mode to eliminate calling of vp8_find_near_mvs Moved MVcount modification in pick_inter_mode, and eliminated calling of vp8_find_near_mvs. Change-Id: Icd47448a1dfc8fdf526f86757d0e5a7f218cb5e8	2011-05-17 10:59:42 -04:00
John Koleszar	eafdc5e10a	Merge "Improve framerate adaptation"	2011-05-13 11:18:42 -07:00
Yaowu Xu	5608c14020	Merge "adjusting rd constant slightly by ~10%"	2011-05-13 09:28:26 -07:00
Paul Wilkins	0e86235265	Merge "Restructure of activity masking code."	2011-05-13 09:23:50 -07:00
Paul Wilkins	ff52bf3691	Restructure of activity masking code. This commit restructures the mb activity masking code to better facilitate experimentation using different metrics etc. and also allows for adjustment of the zero bin either for encode only or both the encode and mode selection stages It also uses information from the current frame rather than the previous frame and the default strength has been reduced. Change-Id: Id39b19eace37574dc429f25aae810c203709629b	2011-05-13 10:37:50 +01:00
John Koleszar	5ed116e220	Improve framerate adaptation This patch improves the accuracy of frame rate estimation by using a larger, 1 second window. It also more quickly adapts to step changes in the input frame rate (ie 30fps to 15fps) Change-Id: I39e48a8f5ac880b4c4b2ebd81049259b81a0218e	2011-05-12 15:07:50 -04:00
Scott LaVarnway	71a7501bcf	Removed mv_bits_sadcost This sad cost is being generated but never used. Change-Id: I562eebdcb792b743770954feca365b5b37491ecd	2011-05-12 11:20:41 -04:00
Scott LaVarnway	6b25501bf1	Using int_mv instead of MV The compiler produces better assembly when using int_mv for assignments. The compiler shifts and ors the two 16bit values when assigning MV. Change-Id: I52ce4bc2bfbfaf3f1151204b2f21e1e0654f960f	2011-05-12 11:08:16 -04:00
Yunqing Wang	6ed81fa5b3	Merge "Modification and issue fix in full-pixel refining search"	2011-05-12 07:20:44 -07:00
Yunqing Wang	b4da1f83e6	Modification and issue fix in full-pixel refining search Further modification and wrong implementation fix which caused refining_search and refining_searchx4 result mismatching. Change-Id: I80cb3a44bf5824413fd50c972e383eebb75f9b6f	2011-05-12 10:18:40 -04:00
Yaowu Xu	bd9d890605	adjusting rd constant slightly by ~10% This is to reflect the RD improvement in the encoder. The change has a small positive impact on quality (0.25% by VPXSSIM and 0.05% by PSNR) Change-Id: Ic66ffc19b10870645088c0624c85556f009fd210	2011-05-11 23:32:06 -07:00
Yaowu Xu	ba6f60dba7	Merge "remove a variable no longer in use"	2011-05-10 20:20:59 -07:00
Yaowu Xu	1bcf4e66bb	Merge "fix a bug related to gf_active_flags in multi-threaded encoder"	2011-05-10 19:59:52 -07:00
Yaowu Xu	f7cf439b34	remove a variable no longer in use The variable is introduced in commit `2e53e9e53` to make more use of trellis quantization, but this is no longer necessary after RDMULT was made adaptive in a number of later commits. Change-Id: I7420522ec7723f38cf77033466c25afb405d52ae	2011-05-10 19:57:51 -07:00
Johann	df2023a6cb	set up Global Offset Table in recon global values were being referenced, but the GOT was not being set up. as the GOT is only required for PIC, this issue wasn't caught in the default configuration. Change-Id: I8006e53776139362a76f2c80cf9d0f8458602b2f http://code.google.com/p/webm/issues/detail?id=328	2011-05-10 15:58:56 -04:00
Yunqing Wang	c7a56f677d	Merge "Use diamond search to replace full search in full-pixel refining search"	2011-05-10 06:59:38 -07:00
Yunqing Wang	cb7b1fb144	Use diamond search to replace full search in full-pixel refining search In NEWMV mode, currently, full search is used as the refining search after n-step search. Replacing it with an iterative diamond search of radius 1 largely reduced the computation complexity, but still maintained the same encoding quality since the refining search is done for every macroblock instead of only a small percentage of macroblocks while using full search. Tests on the test set showed a 3.4% encoding speed increase with no psnr & ssim loss. Change-Id: Ife907d7eb9544d15c34f17dc6e4cfd97cb743d41	2011-05-09 14:07:06 -04:00
Johann	a7d4d3c550	clean up unused variable warnings Change-Id: I9467d7a50eac32d8e8f3a2f26db818e47c93c94b	2011-05-09 12:56:20 -04:00
Yaowu Xu	89c6017cc0	fix a bug related to gf_active_flags in multi-threaded encoder Paul pointed out that the pointer to the gf_active_flags is not being properly incremented in multithreaded encoder. This commit fixes the issue by making sure the gf_active_ptr points to the starting of next group of mb rows. Change-Id: I3246e657d23beabb614dfb880733a68a5fd7e34c	2011-05-06 09:00:44 -07:00
John Koleszar	5c756005aa	Merge "Don't override active_worst_quality in 2 pass"	2011-05-06 08:59:05 -07:00
Johann	52490354f3	Merge "neon fast quantizer updated"	2011-05-06 08:54:14 -07:00
John Koleszar	abc9958c52	Don't override active_worst_quality in 2 pass Commit `db5057c` introduced a bug in that the active_worst_quality selected by the 2 pass rate controller was being overridden for key frames, causing a severe quality loss. Change-Id: I4865a6fbe3e94e9b4fb9271c7dd68b455d7b371d	2011-05-06 11:48:53 -04:00
Tero Rintaluoma	33fa7c4ebe	neon fast quantizer updated vp8_fast_quantize_b_neon function updated and further optimized. - match current C implementation of fast quantizer - updated to use asm_enc_offsets for structure members - updated ads2gas scripts to handle alignment issues Change-Id: I5cbad9c460ad8ddb35d2970a8684cc620711c56d	2011-05-06 08:59:52 +03:00
Aron Rosenberg	eeb8117303	Fix semaphore emulation on Windows The existing emulation of posix semaphores on Windows uses SetEvent() and WaitForSingleObject(), which implements a binary semaphore, not a counting semaphore as implemented by posix. This causes deadlock when used with the expected posix semantics. Instead, this patch uses the CreateSemaphore() and ReleaseSemaphore() calls (introduced in Windows 2000) which have the expected behavior. This patch also reverts commit `eb16f00`, which split a semaphore that was being used with counting semantics into two binary semaphores. That commit is unnecessary with corrected emulation. Change-Id: If400771536a27af4b0c3a31aa4c4e9ced89ce6a0	2011-05-06 00:13:59 -04:00
Yunqing Wang	eb16f00cf2	Fix rare hang in multi-thread encoder on Windows This patch is to fix a rare hang in multi-thread encoder that was only seen on Windows. Thanks for John's help in debugging the problem. More test is needed. Change-Id: Idb11c6d344c2082362a032b34c5a602a1eea62fc	2011-05-05 10:42:29 -04:00
Johann	ca5c1b17a2	Merge "Loopfilter NEON: Use VMOV for constant vectors instead of VLD."	2011-05-05 06:16:21 -07:00
Yunqing Wang	aeb86d615c	Merge "Runtime detection of available processor cores."	2011-05-05 04:59:54 -07:00
Attila Nagy	a6aa389d2f	Loopfilter NEON: Use VMOV for constant vectors instead of VLD. Change-Id: I562b6e01c32bb51d00f3b95faf757fc7dc29a3a3	2011-05-04 11:29:23 +03:00
Yunqing Wang	3fbade23a2	Merge "Modify HEX search"	2011-05-03 11:59:32 -07:00
Yunqing Wang	04ec930abc	Modify HEX search Changed 8-neighbor searching to 4-neighbor searching, and continued searching until the center point is the best match. Test on test set showed 1.3% encoding speed improvement as well as 0.1% PSNR and SSIM improvement at speed=-5 (rt mode). Will continue to improve it. Change-Id: If4993b1907dd742b906fd3f86fee77cc5932ee9a	2011-05-03 14:26:33 -04:00
Yaowu Xu	e9465daee3	Merge "change to use fast ssim code for internal ssim calculations"	2011-05-03 11:20:52 -07:00
Yaowu Xu	6c565fada0	change to use fast ssim code for internal ssim calculations The commit also removed the slow ssim calculation that uses a 7x7 kernel, and revised the comments to better describe how sample ssim values are computed and averaged Change-Id: I1d874073cddca00f3c997f4b9a9a3db0aa212276	2011-05-03 08:36:17 -07:00
John Koleszar	c09d8c1419	Merge "Fix documentation typos"	2011-05-02 06:50:22 -07:00
John Koleszar	a66d8d33dd	Fix compile error with --enable-postproc-visualizer Typo. Change-Id: I9cc6a4587c3d93c9f0da5e101d376741fc9622a4	2011-05-02 09:28:37 -04:00
Thijs Vermeir	8942f70cdf	Fix documentation typos Change-Id: I97124670926433bf1593c91660d8b8f8482ea9ce	2011-04-30 09:34:59 +02:00
Ronald S. Bultje	5a23352c03	Make hor UV predict ~2x faster (73 vs 132 cycles) using SSSE3. Change-Id: I658a1df7d825f820573cb2d11ad402f9d2791035	2011-04-29 11:52:09 -07:00
Yaowu Xu	57ad189129	changed configure option name to reduce confusion Renamed configure option "enable-psnr" to "enable-internal-stats" to better reflect the purpose of the option and eliminate the confusion reported in http://code.google.com/p/webm/issues/detail?id=35 Change-Id: If72df6fdb9f1e33dab1329240ba4d8911d2f1f7a	2011-04-29 09:39:05 -07:00
Yunqing Wang	dfa9e2c5ea	Merge "Use insertion sort instead of quick sort"	2011-04-29 08:27:58 -07:00
Scott LaVarnway	1b2abc5f49	Merge "Consolidated build inter predictors"	2011-04-29 07:13:49 -07:00
James Berry	f10732554b	bug fix removed inline from recon_wrapper_sse2.c removed inline from recon_wrapper_sse2.c to build for Visual Studio Change-Id: I74a3482950448e2cdb30e9cd7087145b440d8a22	2011-04-28 15:12:00 -04:00
Scott LaVarnway	219ba87a93	Merge "Use psadbw to get the sum of bytes in a line."	2011-04-28 07:58:20 -07:00
Scott LaVarnway	ccd6f7ed77	Consolidated build inter predictors Code cleanup. Change-Id: Ic8b0167851116c64ddf08e8a3d302fb09ab61146	2011-04-28 10:53:59 -04:00
Ronald S. Bultje	1e7ded69cf	Use psadbw to get the sum of bytes in a line. Thanks Jason for pointing that out on #vp8. ;-). Change-Id: I5330a753e752a8704b78a409597472628e0b26a5	2011-04-27 13:49:21 -07:00
Scott LaVarnway	2e102855f4	Removed unused code in reconinter The skip flag is never set by the encoder for SPLITMV. Change-Id: I5ae6457edb3a1193cb5b05a6d61772c13b1dc506	2011-04-27 15:25:32 -04:00
John Koleszar	085fb4b737	Merge "SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}()."	2011-04-27 12:02:55 -07:00
Ronald S. Bultje	1083fe4999	SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}(). decoding before 10.425 10.432 10.423 =10.426 after: 10.405 10.416 10.398 =10.406, 0.2% faster encoding before 14.252 14.331 14.250 14.223 14.241 14.220 14.221 =14.248 after 14.095 14.090 14.085 14.095 14.064 14.081 14.089 =14.086, 1.1% faster Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2	2011-04-27 11:31:27 -07:00
Yunqing Wang	5abafcc381	Use insertion sort instead of quick sort Insertion sort performs better for sorting small arrays. In real- time encoding (speed=-5), test on test set showed 1.7% performance gain with 0% PSNR change in average. Change-Id: Ie02eaa6fed662866a937299194c590d41b25bc3d	2011-04-27 13:53:28 -04:00
John Koleszar	64355ecad3	Merge "Speed up VP8DX_BOOL_DECODER_FILL"	2011-04-27 09:03:45 -07:00
John Koleszar	f8ffecb176	Merge "Update VP8DX_BOOL_DECODER_FILL to better detect EOS"	2011-04-27 09:03:24 -07:00
John Koleszar	5e1fd41357	Speed up VP8DX_BOOL_DECODER_FILL The end-of-buffer check is hoisted out of the inner loop. Gives about 0.5% improvement on x86_64. Change-Id: I8e3ed08af7d33468c5c749af36c2dfa19677f971	2011-04-27 10:25:03 -04:00
John Koleszar	9594370e0c	Update VP8DX_BOOL_DECODER_FILL to better detect EOS Allow more reliable detection of truncated bitstreams by being more precise with the count of "virtual" bits in the value buffer. Specifically, the VP8_LOTS_OF_BITS value is accumulated into count, rather than being assigned, which was losing the prior value, increasing the required tolerance when testing for the error condition. Change-Id: Ib5172eaa57323b939c439fff8a8ab5fa38da9b69	2011-04-27 10:24:39 -04:00
John Koleszar	db5057c742	Refactor calc_iframe_target_size Combine calc_iframe_target_size, previously only used for forced keyframes, with calc_auto_iframe_target_size, which handled most keyframes. Change-Id: I227051361cf46727caa5cd2b155752d2c9789364	2011-04-26 16:55:35 -04:00
John Koleszar	81d2206ff8	Move pick_frame_size() to ratectrl.c This is a first step in cleaning up the redundancies between vp8_calc_{auto_,}iframe_target_size. The pick_frame_size() function is moved to ratectrl.c, and made to be the primary interface. This means that the various calc_*_target_size functions can be made private. Change-Id: I66a9a62a5f9c23c818015e03f92f3757bf3bb5c8	2011-04-26 16:49:54 -04:00
Scott LaVarnway	0da77a840b	Merge "Test vector mismatch fix"	2011-04-26 10:12:37 -07:00
Scott LaVarnway	7a2b9c50a3	Test vector mismatch fix Fixed test vector mismatch that was introduced in the "Removed dc_diff from MB_MODE_INFO" (Ie2b9cdf9e0f4e8b932bbd36e0878c05bffd28931) Change-Id: I98fa509b418e757b5cdc4baa71202f4168dc14ec	2011-04-26 09:37:19 -04:00
Johann	d5c46bdfc0	Merge "remove simpler_lpf"	2011-04-25 14:51:07 -07:00
Johann	01527e743f	remove simpler_lpf the decision to run the regular or simple loopfilter is made outside the function and managed with pointers stop tracking the option in two places. use filter_type exclusively Change-Id: I39d7b5d1352885efc632c0a94aaf56b72cc2fe15	2011-04-25 17:37:41 -04:00
John Koleszar	fd6da3b2e7	Fix duplicate vp8_compute_frame_size_bounds Likely introduced by a bad automatic merge from gerrit. Change-Id: I0c6dd6ec18809cf9492f524d283fa4a3a8f4088b	2011-04-25 14:30:57 -04:00
John Koleszar	1f32b1489c	Merge "Remove unused functions"	2011-04-25 11:05:00 -07:00
John Koleszar	47bc1c7013	Remove unused functions Remove estimate_min_frame_size() and calc_low_ss_err(), as they are never referenced. Change-Id: I3293363c14ef70b79c4678ca27aa65b345077726	2011-04-25 13:54:23 -04:00
John Koleszar	cfbfd39de8	Merge "Change rc undershoot/overshoot semantics"	2011-04-25 10:49:32 -07:00
John Koleszar	76557e34d2	Merge "Limit size of initial keyframe in one-pass."	2011-04-25 10:48:13 -07:00
John Koleszar	d9f898ab6d	Merge "Add rc_max_intra_bitrate_pct control"	2011-04-25 10:47:57 -07:00
John Koleszar	454cbc96b7	Limit size of initial keyframe in one-pass. Rather than using a default size of 1/2 or 3/2 seconds for the first frame, use a fraction of the initial buffer level to give the application some control. This will likely undergo further refinement as size limits on key frames are currently under discussion on codec-devel@, but this gives much better behavior for small buffer sizes as a starting point. Change-Id: Ieba55b86517b81e51e6f0a9fe27aabba295acab0	2011-04-25 13:47:20 -04:00
John Koleszar	aa926fbd27	Add rc_max_intra_bitrate_pct control Adds a control to limit the maximum size of a keyframe, as a function of the per-frame bitrate. See this thread[1] for more detailed discussion: [1]: http://groups.google.com/a/webmproject.org/group/codec-devel/browse_thread/thread/271b944a5e47ca38 Change-Id: I7337707642eb8041d1e593efc2edfdf66db02a94	2011-04-25 13:47:14 -04:00
John Koleszar	2089b2cee5	Merge "bug fix possible keyframe context divide by zero"	2011-04-25 09:35:12 -07:00
James Berry	8d5ce819dd	bug fix possible keyframe context divide by zero vp8_adjust_key_frame_context() divides by estimate_keyframe_frequency() which can return 0 in the case where --kf-max-dist=0. Change-Id: Idfc59653478a0073187cd2aa420e98a321103daa	2011-04-25 12:16:36 -04:00
Johann	aeca599087	Merge "keep values in registers during quantization"	2011-04-25 06:52:38 -07:00
Scott LaVarnway	c36b6d4d01	Merge "Removed unnecessary frame type checks"	2011-04-25 06:45:43 -07:00
Scott LaVarnway	5b67329747	Merge "Removed dc_diff from MB_MODE_INFO"	2011-04-25 06:45:32 -07:00
Ronald S. Bultje	496bcbb0de	Fix overflow in temporal_filter_apply_sse2(). The accumulator array is an integer array, so use paddd instead of paddw to add values to it. Fixes overflows when using large --arnr-maxframes (>8) values. Change-Id: Iad83794caa02400a65f3ab5760f2517e082d66ae	2011-04-22 10:00:38 -04:00
John Koleszar	73c3d32705	Merge "Remove unused kf rate variables"	2011-04-21 16:54:14 -07:00
Adrian Grange	d2a6eb4b1e	Corrected format specifiers in debug print statements The arguments to these fprintfs are int not long int so the format specifier should be "%d" and not "%ld". This was writing garbage in the linux build. Change-Id: I3d2aa8a448d52e6dc08858d825bf394929b47cf3	2011-04-21 15:45:57 -07:00
Johann	508ae1b3d5	keep values in registers during quantization add an sse4 quantizer so we can use pinsrw/pextrw and keep values in xmm registers instead of proxying through the stack. and as long as we're bumping up, use some ssse3 instructions in the EOB detection (see ssse3 fast quantizer) pick up about a percent on 32bit and about two on 64bit. Change-Id: If15abba0e8b037a1d231c0edf33501545c9d9363	2011-04-21 15:47:55 -04:00
Scott LaVarnway	6f6cd3abb9	Removed unnecessary frame type checks ref_frame is set to INTRA_FRAME for keyframes. The B_PRED mode is only used in intra frames. Change-Id: I9bac8bec7c736300d47994f3cb570329edf11ec0	2011-04-21 14:59:42 -04:00
Scott LaVarnway	3698c1f620	Removed dc_diff from MB_MODE_INFO The dc_diff flag is used to skip loopfiltering. Instead of setting this flag in the decoder/encoder, we now check for this condition in the loopfilter. Change-Id: Ie2b9cdf9e0f4e8b932bbd36e0878c05bffd28931	2011-04-21 14:38:36 -04:00
Scott LaVarnway	7a49accd0b	Removed force_no_skip force_no_skip is always set to zero. Change-Id: I89b61c5e0bee34627a9c07c05f3517e1db76af77	2011-04-20 15:45:12 -04:00
Scott LaVarnway	09c933ea80	Removed redundant checks of the mode_info_context flags Code cleanup. The build inter predictor functions are redundantly checking the mode_info_context for either INTRA_FRAME or SPLITMV. Change-Id: I4d58c3a5192a4c2cec5c24ab1caf608bf13aebfb	2011-04-20 14:06:40 -04:00
Attila Nagy	43464e94ed	Do not copy data between encoder reference buffers. Golden and ALT reference buffers were refreshed by copying from the new buffer. Replaced this by index manipulation. Also moved all the reference frame updates to one function for easier tracking. Change-Id: Icd3e534e7e2c8c5567168d222e6a64a96aae24a1	2011-04-20 15:26:55 +03:00
John Koleszar	ad6a8ca58b	Remove unused kf rate variables Remove tot_key_frame_bits and prior_key_frame_size[] as they were tracked but never used. Remove intra_frame_target, as it was only used to initialize prior_key_frame_size. Refactor vp8_adjust_key_frame_context() some to remove unnecessary calculations. Change-Id: Icbc2c83d2b90e184be03e6f9679e678f3a4bce8f	2011-04-19 16:14:57 -04:00
Johann	4a2b684ef4	modify SAVE_XMM for potential 64bit use the win64 abi requires saving and restoring xmm6:xmm15. currently SAVE_XMM and RESTORE XMM only allow for saving xmm6:xmm7. allow specifying the highest register used and if the stack is unaligned. Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867	2011-04-19 10:42:45 -04:00
Johann	a9b465c5c9	Merge "Add save/restore xmm registers in x86 assembly code"	2011-04-19 06:32:10 -07:00
Johann	c7cfde42a9	Add save/restore xmm registers in x86 assembly code Went through the code and fixed it. Verified on Windows. Where possible, remove dependencies on xmm[67] Current code relies on pushing rbp to the stack to get 16 byte alignment. This broke when rbp wasn't pushed (vp8/encoder/x86/sad_sse3.asm). Work around this by using unaligned memory accesses. Revisit this and the offsets in vp8/encoder/x86/sad_sse3.asm in another change to SAVE_XMM. Change-Id: I5f940994d3ebfd977c3d68446cef20fd78b07877	2011-04-18 16:30:38 -04:00
Yunqing Wang	48438d6016	Merge "Use sub-pixel search's SSE in mode selection"	2011-04-18 13:20:04 -07:00
Yunqing Wang	b8f0b59985	Use sub-pixel search's SSE in mode selection Passed SSE from sub-pixel search back to pick_inter_mode function, which is compared with the encode_breakout to see if we could skip evaluating the remaining modes. Change-Id: I4a86442834f0d1b880a19e21ea52d17d505f941d	2011-04-18 16:12:28 -04:00
Yunqing Wang	d5069b5af0	Merge "Handle long delay between video frames in multi-thread decoder(issue 312)"	2011-04-18 10:11:41 -07:00
Johann	cd103a5721	Merge "store quant_shift as an unsigned char"	2011-04-18 10:03:40 -07:00
Yaowu Xu	c619f6cb0f	Merge "fixed an overflow in ssim calculation"	2011-04-18 07:44:34 -07:00
Scott LaVarnway	e1a8b6c8d5	Removed unused timers Change-Id: I209803b9dbed2b2f6d02258fd7a3963a6645f4ab	2011-04-18 09:09:57 -04:00
Yunqing Wang	8ba58951e9	Handle long delay between video frames in multi-thread decoder(issue 312) This is reported by m...@hesotech.de (see issue 312): "The decoder causes an access violation when you decode the first frame, then make a pause of about 60 seconds and then decode further frames. But only if vpx_codec_dec_cfg_t.threads> 1. This is caused by a timeout of WaitForSingleObject. When I change the definition of VPXINFINITE to INFINITE(0xFFFFFFFF), the problem is solved." Reproduced the crash and verified the changes on Windows platform. This brings the behavior inline with the other platforms using sem_wait(). Change-Id: I27b32f90bce05846ef2684b50f7a88f292299da1	2011-04-15 17:27:26 -04:00
Johann	d889035fe6	Merge "remove dead code, add missing RESTORE_XMM"	2011-04-15 13:32:54 -07:00
Johann	f64f425a50	remove executable bit source files are not executable Change-Id: Id2c7294695a22217468426423979f68f02d82340	2011-04-15 13:43:24 -04:00
Adrian Grange	0d2abe3084	Merge "Fix usage of value returned by vp8_pick_intra4x4mby_modes"	2011-04-15 08:37:19 -07:00
Yunqing Wang	1312a7a2e2	Merge "Reduce unnecessary distortion computation"	2011-04-15 08:17:03 -07:00
Johann	487c0299c9	remove dead code, add missing RESTORE_XMM vp8_filter_block1d16_h4_ssse3 was never called because UNSHADOW_ARGS moves the stack by 'mov rsp, rbp', the issue was masked. However, if/when win64 used those registers for persistent data, issues could/will arise. Change-Id: I56d6effca0aeba1f86082689771cb10145d39651	2011-04-15 10:11:53 -04:00
John Koleszar	a3399291ad	Fix off-by-one in copy_and_extend_plane Should only copy h lines, not h+1. Change-Id: I802a85686635900459c6dc79596189033e5298d8	2011-04-15 08:44:39 -04:00
Yunqing Wang	918fb5487e	Reduce unnecessary distortion computation In vp8_pick_inter_mode(), for NEWMV mode, use the error result obtained from motion search as distortion. This helps performance in real-time mode. Change-Id: I398c4e46cc5381f7d874e748cf78827ef0e0860c	2011-04-14 15:53:33 -04:00
John Koleszar	63f15987a5	Merge "Refactor lookahead ring buffer"	2011-04-14 12:35:01 -07:00
Fritz Koenig	e749ae510f	Merge "Use consistent delimiters."	2011-04-14 11:56:18 -07:00
Adrian Grange	8608de1c6f	Fix usage of value returned by vp8_pick_intra4x4mby_modes The value of distortion2 returned by vp8_pick_intra4x4mby_modes was being overwritten by the value returned by get16x16prederror before it was tested. Change-Id: If00e80332b272c5545c3a7e381c8041e8319b41a	2011-04-14 10:50:00 -07:00
Fritz Koenig	33cefd6f6e	Use consistent delimiters. opsnr.stt file was using \t for delimiters on everything except between VPXSSIM and Time. Change-Id: I6284c4e40c05ff642bf4b0170dca062c279a42df	2011-04-13 15:06:17 -07:00
Adrian Grange	8861174624	Fixed use of early breakout in vp8_pick_intra4x4mby_modes Index i is used to detect early breakout from the first loop, but its value is lost due to reuse in the second for loop. I moved the position of the second loop and did some format cleanup. Change-Id: I02780eae1bd89df4b6c000fb8a018b0837aac2e5	2011-04-13 12:56:46 -07:00
John Koleszar	88841f1059	Refactor lookahead ring buffer This patch cleans up the source buffer storage and copy mechanism to allow access through a standard push/pop/peek interface. This approach also avoids an extra copy in the case where the source is not a multiple of 16, fixing issue #102. Change-Id: I05808c39f5743625cb4c7af54cc841b9b10fdbd9	2011-04-13 14:26:45 -04:00
Johann	70f30aa95d	store quant_shift as an unsigned char in encodframe.c, quant_shift is set to 0 or 1 in vp8cx_invert_quant only use 8 bits to store this, instead of 16. will allow saving an xmm register in an updated version of the regular quantize Change-Id: Ie88c47fe2aff5af0283dab1147fb2791e4b12f90	2011-04-13 13:50:12 -04:00
John Koleszar	c99f9d7abf	Change rc undershoot/overshoot semantics This patch changes the rc_undershoot_pct and rc_overshoot_pct controls to set the "aggressiveness" of rate adaptation, by limiting the amount of difference between the target buffer level and the actual buffer level which is applied to the target frame rate for this frame. This patch was initially provided by arosenberg at logitech.com as an attachment to issue #270. It was modified to separate these controls from the other unrelated modifications in that patch, as well as to use the pre-existing variables rather than introducing new ones. Change-Id: Id542e3f5667dd92d857d5eabf29878f2fd730a62	2011-04-12 20:49:33 -04:00
John Koleszar	538f110407	Merge "Bugfix for error accumulator stats"	2011-04-12 06:59:00 -07:00
John Koleszar	e689a27d62	Bugfix for error accumulator stats Previous to commit `de4e9e3`, there was an early return in the alt-ref case that was inadvertently removed when the function was refactored to return void. This patch restores the prior behavior. Change-Id: I783ffd594a4690297e2742f99526fd7ad67698b2	2011-04-12 08:47:33 -04:00
John Koleszar	fd09009227	Merge "Fix encoder range check for frame width and height"	2011-04-12 05:34:12 -07:00
Attila Nagy	1aadcedcfb	Fix encoder range check for frame width and height 14 bits available in the bitstream => valid range [1..16383] Removed unused local vars. Change-Id: Icf3385e47a9fa13af70053129c2248671f285583	2011-04-12 15:07:37 +03:00
Yunqing Wang	4fd81a99f8	Set cpu_used range to [-16, 16] in real-time mode Remove encoding speed limitation in real-time mode. Change-Id: Ib5e35d8bb522b2a25f3e4ad5cfe2788ebebb3617	2011-04-11 15:55:04 -04:00
Yunqing Wang	d1abe62d1c	Define RDCOST only once Clean up the code. Change-Id: I7db048efa4d972b528d553a7921bc45979621129	2011-04-11 11:53:56 -04:00
John Koleszar	a9ce3e3834	Remove unused files Change-Id: I36ca3f2f4620358033da34daf764f0b388dacd08	2011-04-11 10:34:40 -04:00
Yunqing Wang	4b43167ad1	Fix input MV for full search Input MV needs to be modified to full-pixel precision. Change-Id: Ic5d78e41bf27077e325024332b9fe89f76c44f0c	2011-04-08 16:29:41 -04:00
Johann Koenig	6e156a4cd7	Merge "use asm_offsets with vp8_fast_quantize_b_sse3"	2011-04-08 10:05:47 -07:00
John Koleszar	921a32a306	Merge "Error accumulator stats bug."	2011-04-08 08:20:32 -07:00
Paul Wilkins	de4e9e3b44	Error accumulator stats bug. The error accumulator stats values cpi->prediction_error and cpi->intra_error were being populated with rd values not distortion values. These are only "currently" used in a limited way for RT compress key frame detection. Change-Id: I2702ba1cab6e49ab8dc096ba75b6b34ab3573021	2011-04-08 14:21:36 +01:00
Jim Bankoski	d4cdb683a4	fixed an overflow in ssim calculation This commit fixed an overflow in ssim calculation and added register save and restore to make sure the assembly code works on the x64 platform. It also changed the sampling points to every 4x4 instead of 8x8 and adjusted the constants in SSIM calculation to match the scale of previous VPXSSIM. Change-Id: Ia4dbb8c69eac55812f4662c88ab4653b6720537b	2011-04-07 14:25:25 -07:00
Johann Koenig	08702002e8	use asm_offsets with vp8_fast_quantize_b_sse3 on the same order as the sse2 fast quantize change: ~2% except for 32bit. only a slight improvement there. Change-Id: Iff80e5f1ce7e646eebfdc8871405458ff911986b	2011-04-07 16:40:05 -04:00
James Berry	aec5487cdd	Use correct 32 bit comparisons for SAD breakout. Rax updated to eax to avoid uninitialized memory usage. Change-Id: Iedb953f104329ede2a786fc648a47f1be2f3798a	2011-04-07 15:08:03 -04:00
Johann	2de858b9fc	Merge "use asm_offsets with vp8_fast_quantize_b_sse2"	2011-04-06 10:53:55 -07:00
Yunqing Wang	9e9f61a317	Merge "Minor modification"	2011-04-06 06:12:13 -07:00
Yunqing Wang	02423b2e92	Minor modification A small change. Change-Id: I2e7726e58370a95d0319361f4f6ad231138d1328	2011-04-06 09:08:47 -04:00
Johann	c32e0ecc59	use asm_offsets with vp8_fast_quantize_b_sse2 on the same order as the regular quantize change: ~2% Change-Id: I5c9eec18e89ae7345dd96945cb740e6f349cee86	2011-04-04 16:23:29 -04:00
Scott LaVarnway	f212a98ee7	Fixed unused variable warnings for firstpass.c Change-Id: I8378a9a541ade2f098359a7b20fa08e6c1596d80	2011-04-04 14:18:31 -04:00
John Koleszar	91036996ac	Merge "Slightly simplify vp8_decode_mb_tokens."	2011-04-04 08:58:25 -07:00
Johann	610dd90288	Merge "tweak vp8_regular_quantize_b_sse2"	2011-04-04 08:56:25 -07:00
Gaute Strokkenes	15f03c2f13	Slightly simplify vp8_decode_mb_tokens. Change-Id: I0058ba7dcfc50a3374b712197639ac337f8726be	2011-04-04 16:47:22 +01:00
Yunqing Wang	f5c0d95e8c	Merge "Use full-pixel MV in mvsadcost calculation"	2011-04-04 08:40:51 -07:00
Yunqing Wang	3d6815817c	Use full-pixel MV in mvsadcost calculation MV sad cost error is only used in full-pixel motion search, which only needs full-pixel resolution instead of quarter-pixel resolution. This change reduced mvsadcost table size, and removed unnecessary parameter passing since this table is constant once it is generated. Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0	2011-04-01 16:41:58 -04:00
Johann	8520b5c785	tweak vp8_regular_quantize_b_sse2 rather than look up rc in the zig zag table, embed it in the macro. this also allows us to shuffle some values in the macro and keep *d in rsi gains of about the same order as the obj_int_extract implementation: ~2% Change-Id: Ib7252dd10eee66e0af8b0e567426122781dc053d	2011-04-01 09:58:23 -04:00
Johann	ba11e24d47	Merge "Wrapper function removed from vp8_subtract_b_neon function call"	2011-04-01 05:47:21 -07:00
Tero Rintaluoma	cec76a36d6	Wrapper function removed from vp8_subtract_b_neon function call Address calculations moved from encodemb_arm.c file to neon optimized assembly function to save cycles in function calls. - vp8_subtract_b_neon_func replaced with vp8_subtract_b_neon that contains all needed address calculations - unnecessary file encodemb_arm.c removed - consistent with ARMv6 optimized version Change-Id: I6cbc1a2670b56c2077f59995fcf8f70786b4990b	2011-04-01 10:06:44 +03:00
Johann	9d138379a2	Merge "ARMv6 optimized subtract functions"	2011-03-31 08:40:10 -07:00
Attila Nagy	297b27655e	Runtime detection of available processor cores. Detect the number of available cores and limit the thread allocation accordingly. On decoder side limit the number of threads to the max number of token partitions. Core detection works on Windows and Posix platforms, which define _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN. Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078	2011-03-31 10:23:01 +03:00
Attila Nagy	7d335868df	Fix: lpf semaphore was signaled in single threaded run After picking filter level, post the loopfilter semaphore just when multiple threads are in use. Change-Id: If7bfb64601d906adef703f454dafc25e978b93c6	2011-03-30 15:55:29 +03:00
Johann	0e43668546	Merge "Half pixel variance further optimized for ARMv6"	2011-03-29 12:14:54 -07:00
Yunqing Wang	534ea700bd	Merge "Fix a crash while enabling shared (--enable-shared)"	2011-03-29 09:04:22 -07:00
Yunqing Wang	b843aa4eda	Fix a crash while enabling shared (--enable-shared) Fixed a bug in SSSE3 sub-pixel filter functions. Change-Id: I2e2126652970eb78307ffcefcace1efd5966fb0a	2011-03-29 11:31:06 -04:00
Johann	f0c22a3f33	use GLOBAL correctly on 32bit shared libraries http://code.google.com/p/webm/issues/detail?id=309 Change-Id: I6fce9e2f74bc09a9f258df7f91ab599812324e8c	2011-03-29 11:27:03 -04:00
Tero Rintaluoma	6fdc9aa79f	ARMv6 optimized subtract functions Adds following ARMv6 optimized functions to encoder: - vp8_subtract_b_armv6 - vp8_subtract_mby_armv6 - vp8_subtract_mbuv_armv6 Gives 1-5% speed-up depending on input sequence and encoding parameters. Functions have one stall cycle inside the loop body on Cortex pipeline. Change-Id: I19cca5408b9861b96f378e818eefeb3855238639	2011-03-29 16:52:00 +03:00
Johann	4be062bbc3	add asm_enc_offsets.c for all targets now that we need asm_enc_offsets.c for x86 and arm and it is harmless to build it for other targets, add it unconditionally Change-Id: I320c5220afd94fee2b98bda9ff4e5e34c67062f3	2011-03-28 10:43:47 -04:00
Tero Rintaluoma	f5e433464b	Half pixel variance further optimized for ARMv6 Half pixel interpolations optimized in variance calculations. Separate function calls to vp8_filter_block2d_bil_x_pass_armv6 are avoided. On average, performance improvement is 6-7% for VGA@30fps sequences. Change-Id: Idb5f118a9d51548e824719d2cfe5be0fa6996628	2011-03-28 09:51:51 +03:00
Johann	beaafefcf1	Merge "use asm_offsets with vp8_regular_quantize_b_sse2"	2011-03-24 11:06:36 -07:00
Johann	8edaf6e2f2	use asm_offsets with vp8_regular_quantize_b_sse2 remove helper function and avoid shadowing all the arguments to the stack on 64bit systems when running with --good --cpu-used=0: ~2% on linux x86 and x86_64 ~2% on win32 x86 msys and visual studio more on darwin10 x86_64 significantly more on x86_64-win64-vs9 Change-Id: Ib7be12edf511fbf2922f191afd5b33b19a0c4ae6	2011-03-24 13:34:48 -04:00
Johann	4cde2ab765	Merge "ARMv6 optimized fdct4x4"	2011-03-23 07:52:51 -07:00
Yunqing Wang	73065b67e4	Merge "Fix multithreaded encoding for 1 MB wide frame"	2011-03-21 07:41:31 -07:00
John Koleszar	2cbd962088	Remove unused vp8_get4x4sse_cs_mmx declaration This declaration did not match the prototype_sad() prototype, but was unused in this translation unit, so it is removed instead. Fixes issue 290. Change-Id: I168854f88a85f73ca9aaf61d1e5dc0f43fc3fdb3	2011-03-21 07:53:53 -04:00
John Koleszar	769c74c0ac	Merge "Increase static linkage, remove unused functions"	2011-03-21 04:51:51 -07:00
Tero Rintaluoma	a61785b6a1	ARMv6 optimized fdct4x4 Optimized fdct4x4 (8x4) for ARMv6 instruction set. - No interlocks in Cortex-A8 pipeline - One interlock cycle in ARM11 pipeline - About 2.16 times faster than current C-code compiled with -O3 Change-Id: I60484ecd144365da45bb68a960d30196b59952b8	2011-03-21 13:33:45 +02:00
Attila Nagy	bfe803bda3	Fix multithreaded encoding for 1 MB wide frame Thread synchronization was not correct when frame width was 1 MB. Number of allocated encoding threads is limited by the sync_range. There is no point having more because each thread lags sync_range MBs behind the thread processing the row above. http://code.google.com/p/webm/issues/detail?id=302 Change-Id: Icaf67a883beecc5ebf2f11e9be47b6997fdf6f26	2011-03-18 12:35:30 +02:00
John Koleszar	429dc676b1	Increase static linkage, remove unused functions A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779	2011-03-17 20:53:47 -04:00
Ralph Giles	185557344a	Set bounds from the array when iterating mmaps. The mmap allocation code in vp8_dx_iface.c was inconsistent. The static array vp8_mem_req_segs defines two descriptors, but only the first is real. The second is a sentinel and isn't actually allocated, so vpx_codec_alg_priv is declared with mmaps[NELEMENTS(vp8_mem_req_segs)-1]. Some functions use this reduced upper bound when iterating through the mmap array, but these two functions did not. Instead, this commit calls NELEMENTS(...->mmaps) to directly query the bounds of the dereferenced array. This fixes an array-bounds warning from gcc 4.6 on vp8_xma_set_mmap. Change-Id: I918e2721b401d134c1a9764c978912bdb3188be1	2011-03-17 14:52:05 -07:00
Ralph Giles	de5182eef3	Remove commented-out VP6 code from vp8_finalize_mmaps Change-Id: I48642c380353043bed96026f56de5908fcee270a	2011-03-17 14:51:31 -07:00
John Koleszar	8431e768c9	Merge "Fix "used uninitialized" warning in vp8_pack_bitstream()"	2011-03-17 14:25:04 -07:00
John Koleszar	de50520a8c	apple: include proper mach primitives Fixes implicit declaration warning for 'mach_task_self'. This change is an update to Change I9991dedd1ccfddc092eca86705ecbc3b764b799d, which fixed this issue for the decoder but not the encoder. Change-Id: I9df033e81f9520c4f975b7a7cf6c643d12e87c96	2011-03-16 13:59:32 -04:00
Attila Nagy	71bcd9f1af	Add vp8_variance8x8_armv6 and vp8_sub_pixel_variance8x8_armv6 functions Change-Id: I08edaffc62514907fa5e90e1689269e467c857f5	2011-03-15 15:50:44 +02:00
John Koleszar	8c48c943e7	Merge "Fix an unused variable warning."	2011-03-14 14:13:53 -07:00
Johann	d0ec28b3d3	Merge "Add vp8_mse16x16_armv6 function"	2011-03-14 12:47:42 -07:00
Attila Nagy	e54dcfe88d	Add vp8_mse16x16_armv6 function Change-Id: I77e9f2f521a71089228f96e2db72524189364ffb	2011-03-14 14:38:31 +02:00
Johann	3788b3564c	Merge "Move build_intra_predictors_mby to RTCD framework"	2011-03-11 10:23:48 -08:00
John Koleszar	27972d2c1d	Move build_intra_predictors_mby to RTCD framework The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534	2011-03-11 13:04:50 -05:00
Johann	5c60a646f3	Merge "ARMv6 optimized quantization"	2011-03-11 08:29:00 -08:00
John Koleszar	75051c8b59	Merge "Only enable ssim_opt.asm on X86_64"	2011-03-11 08:28:05 -08:00
John Koleszar	5db0eeea21	Only enable ssim_opt.asm on X86_64 Fix compiling on 32 bit x86. Change-Id: I6210573e1d9287ac49acbe3d7e5181e309316107	2011-03-11 11:27:08 -05:00
Paul Wilkins	6e73748492	Clean up of vp8_init_config() Clean up vp8_init_config() a bit and remove null pointer case, as this code can't be called any more and is not an adequate trap anyway, as a null pointer would cause exceptions before hitting the test. Change-Id: I937c00167cc039b3aa3f645f29c319d58ae8d3ee	2011-03-11 11:06:51 -05:00
John Koleszar	170b87390e	Merge "1 Pass CQ and VBR bug fixes"	2011-03-11 08:06:09 -08:00
Paul Wilkins	2ae91fbef0	1 Pass CQ and VBR bug fixes Issue 291 highlighted the fact that CQ mode was not working as expected in 1 pass mode. This commit fixes that specific problem, but in so doing I also uncovered an overflow issue in the VBR code for 1 pass and some data values not being correctly initialized. For some clips (particularly short clips), the resulting improvement is dramatic. Change-Id: Ieefd6c6e4776eb8f1b0550dbfdfb72f86b33c960	2011-03-11 10:59:34 -05:00
John Koleszar	e34e417d94	Merge "Fix incorrect macroblock counts in twopass rate control"	2011-03-11 06:06:04 -08:00
Yunqing Wang	3c9dd6c3ef	Merge "Align SAD output array to be 16-byte aligned"	2011-03-11 05:56:02 -08:00
John Koleszar	c5c5dcd0be	Merge "vp8cx - psnr converted to call assemblerized sse"	2011-03-11 05:54:00 -08:00
John Koleszar	29c46b64a2	Merge "vp8cx- alternate ssim function with optimizations"	2011-03-11 05:53:41 -08:00
Jim Bankoski	3dc382294b	vp8cx - psnr converted to call assemblerized sse Change-Id: Ie388d4618c44b131f96b9fe526618b457f020dfa	2011-03-11 08:51:22 -05:00
Jim Bankoski	3f6f7289aa	vp8cx- alternate ssim function with optimizations Change-Id: I91921b0a90dbaddc7010380b038955be347964b3	2011-03-11 08:51:21 -05:00
Yunqing Wang	b2aa401776	Align SAD output array to be 16-byte aligned Use aligned store. Change-Id: Icab4c0c53da811d0c52bb7e8134927f249ba2499	2011-03-11 08:24:23 -05:00
Yunqing Wang	76ec21928c	Merge "Encoder loopfilter running in its own thread"	2011-03-11 04:55:05 -08:00
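
The range check described in commit 1aadcedcfb above (14 bits available in the bitstream, so width and height must lie in [1..16383]) can be sketched roughly as follows. This is only an illustration of the arithmetic, not the code from that commit: the names DIM_FIELD_BITS, DIM_FIELD_MAX, dim_in_range and pack_dim are hypothetical and do not come from libvpx.

```c
#include <assert.h>

/* Hypothetical names for illustration; these are not libvpx identifiers. */
#define DIM_FIELD_BITS 14u
#define DIM_FIELD_MAX ((1u << DIM_FIELD_BITS) - 1u) /* 2^14 - 1 = 16383 */

/* Returns 1 if v is non-zero and fits in the 14-bit field, else 0. */
static int dim_in_range(unsigned int v) {
    return v >= 1u && v <= DIM_FIELD_MAX;
}

/* Masks an already validated dimension down to its 14-bit field. */
static unsigned int pack_dim(unsigned int v) {
    assert(dim_in_range(v)); /* caller is expected to validate first */
    return v & DIM_FIELD_MAX;
}
```

Under these assumptions a caller would reject a configuration when dim_in_range() returns 0 for either the width or the height, before anything is written to the bitstream. Zero is rejected as well as values above 16383, which matches the lower bound of 1 given in the commit message.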

1247 Commits