generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	3ac73173a4	Merge "fix a bug that "optimize" flag is not set for sub-threads"	2010-12-14 13:32:04 -08:00
Yunqing Wang	23aa13d92c	Merge "Fix a bug in motion search code"	2010-12-14 13:25:34 -08:00
Yunqing Wang	7fb0f86863	Fix a bug in motion search code The MV's range is 256. Since the new motion search uses a different starting MV than the center ref MV, a MV range checking needs to be done to avoid corruption. Change-Id: I8ae0721d1bd203639e13891e2e54a2e87276f306	2010-12-14 13:59:38 -05:00
Yaowu Xu	64f3d91579	fix a bug that "optimize" flag is not set for sub-threads The flag for quantization optimization was not properly propagated to mb row encoding threads. Change-Id: Ic561599c35acd94cd5698c9b314bccd596ac2deb	2010-12-14 10:12:21 -08:00
Johann	825adc464f	shrink TOKENEXTRA and vp8_extra_bit_struct Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes original: `b7b1e6fb` reverted: `41f4458a` Also drop unused field from vp8_extra_bit_struct Update ARM ASM to deal with this change. In particular, Extra is signed and needs to be sign-extended when loaded. Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41	2010-12-14 10:32:50 -05:00
John Koleszar	7211ac407b	Merge remote branch 'internal/upstream' into HEAD	2010-12-14 00:05:07 -05:00
John Koleszar	6a80032280	Merge remote branch 'origin/master' into experimental Change-Id: Ic88e9b2fcf1dcb2852a7205bcda3f181103f5612	2010-12-14 00:05:05 -05:00
John Koleszar	41f4458a03	Revert "Reduce size of TOKENEXTRA struct" This reverts commit `b7b1e6fb55`. Previous fix is incomplete, breaks ARM. Itchy submit finger. Change-Id: I939dc0d3bf4173cf951c1d152338ab6ea2184bb9	2010-12-13 17:12:51 -05:00
John Koleszar	3809d7bbd9	Merge "remove unused temporal preproc code"	2010-12-13 13:57:59 -08:00
John Koleszar	398aa81849	Merge "Reduce size of TOKENEXTRA struct"	2010-12-13 13:57:55 -08:00
John Koleszar	b1aa54ab26	remove unused temporal preproc code This code is unused, as the current preproc implementation uses the same spatial filter that postproc uses. Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7	2010-12-13 16:47:59 -05:00
John Koleszar	b7b1e6fb55	Reduce size of TOKENEXTRA struct Change the size of structure elements to reduce memory utilization. Removed the 'section' member entirely, as it is set but never read. Change-Id: Iad043830392fb4168cb3cd6075fb0eb70c7f691c	2010-12-13 16:37:37 -05:00
John Koleszar	b6905e36d9	Merge remote branch 'origin/master' into experimental Change-Id: Ibbe41ff2356aa8583c728e9ab1b0814958a51752	2010-12-11 00:05:08 -05:00
John Koleszar	eb1c033731	Merge remote branch 'internal/upstream' into HEAD	2010-12-11 00:05:08 -05:00
Yaowu Xu	97a86c5b13	fix a bug in multithreaded encoding with active_map enabled Added the initialization of the pointer to active map. Also added the same logic for cyclic refresh in mbrow encoding threads. Change-Id: Ic48d0849dc706b27fba72d07dcc498075725663d	2010-12-10 10:48:30 -08:00
Fritz Koenig	0ced701487	Merge "vp8 fast quantizer sse2 optimizations for eob."	2010-12-10 09:25:04 -08:00
John Koleszar	3bf929f18e	Merge remote branch 'internal/upstream' into HEAD	2010-12-10 00:05:09 -05:00
John Koleszar	0f02b37992	Merge remote branch 'origin/master' into experimental Change-Id: Iada4d917df4af42b16404e1b54b30ba2ca74df39	2010-12-10 00:05:07 -05:00
Fritz Koenig	e0cf330cde	vp8 fast quantizer sse2 optimizations for eob. Changed the end of block computation to use pmaxw. Removed additional pushing and popping of registers that was not needed. Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98	2010-12-09 15:00:30 -08:00
John Koleszar	cb9698951c	fix uninitialized read in encode breakout Change I3430820 performed an uninitialized read when encode_breakout == 0, since the sum and sse wouldn't be set: if(x->encode_breakout) VARIANCE_INVOKE(..., get16x16var)(..., &sum, &sse); if (cpi->active_map_enabled && x->active_ptr[0] == 0) { ... } else if (sse < x->encode_breakout) Change-Id: I915eb76d1227b4b6d1137a0dedf2c143860098a2	2010-12-09 16:05:26 -05:00
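The uninitialized read described in the row above can be illustrated in miniature. This is a minimal sketch, not the libvpx code: `compute_sse` and `should_break_out` are hypothetical names, and the fixed SSE value is made up for the example. It shows one way to make sure `sse` is only read on the path where it was written.

```c
#include <assert.h>

/* Hypothetical stand-in for the 16x16 variance call: reports fixed values. */
static void compute_sse(unsigned int *sum, unsigned int *sse)
{
    assert(sum && sse);
    *sum = 10;
    *sse = 100;
}

/* Returns 1 when the encoder may break out early, 0 otherwise.
 * The comparison against sse lives inside the same branch that fills it in,
 * so encode_breakout == 0 never touches an unset value. */
int should_break_out(unsigned int encode_breakout)
{
    unsigned int sum = 0, sse = 0;

    if (encode_breakout)
    {
        compute_sse(&sum, &sse);
        return sse < encode_breakout;
    }

    return 0;
}
```

With the stand-in SSE of 100, a threshold of 200 breaks out and a threshold of 50 does not; a threshold of 0 skips the variance call entirely.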
Paul Wilkins	c63fc881e1	Correct q_low and q_high limits for the recode loop Corrected the initial Q range limits for the recode loop to reflect the current allowed range for the frame. In experimental work on constrained quality this bug was causing unnecessary recodes. Change-Id: I7e256fbfa681293b0223fe21ec329933d76c229f	2010-12-09 15:02:04 +00:00
John Koleszar	d9c50b8103	Merge remote branch 'internal/upstream' into HEAD	2010-12-09 00:05:09 -05:00
John Koleszar	808f3814fc	Merge remote branch 'origin/master' into experimental Change-Id: I2b70793a97f80039ad23feea164744b1c236ac74	2010-12-09 00:05:07 -05:00
Yaowu Xu	160f3c7e9e	Merge "vp8e - static threshold play"	2010-12-08 13:08:04 -08:00
Yaowu Xu	d88da98614	Merge "vp8e - remove unnecessary variance calc"	2010-12-08 09:19:22 -08:00
John Koleszar	2905656159	Merge remote branch 'internal/upstream' into HEAD	2010-12-08 00:05:09 -05:00
John Koleszar	c5795b673d	Merge remote branch 'origin/master' into experimental Change-Id: I76ed5f6c24f3f71bba47679ff09d28e046ec1db9	2010-12-08 00:05:06 -05:00
Jim Bankoski	718c19711a	vp8e - static threshold play Realized there is no need for new assembly code: sum is already calculated. Change-Id: Ie2d94feb4b7c1f77c5359bca29b66228e41638c9	2010-12-07 16:07:23 -05:00
Scott LaVarnway	f661fa1f24	Merge "vp8_rd_pick_best_mbsegmentation code restructure"	2010-12-07 07:53:12 -08:00
Yaowu Xu	062980cc48	Merge "adjust RDMULT for UV plane in quantization RDO"	2010-12-06 22:04:45 -08:00
John Koleszar	f68820390b	Merge remote branch 'internal/upstream' into HEAD	2010-12-07 00:05:11 -05:00
John Koleszar	727abbb38a	Merge remote branch 'origin/master' into experimental Change-Id: I1baeedb24f321d3e200f00412cc657ab92c43143	2010-12-07 00:05:08 -05:00
Yaowu Xu	7c03a1c308	adjust RDMULT for UV plane in quantization RDO This patch adds a weighting factor on RDMULT for UV blocks. The change has an overall gain of about 0.5% based on ssim, and between 0.1 and 0.2% by psnr numbers. Change-Id: I97781b077ce3bb7e34241b03268491917e8d1d72	2010-12-06 20:53:59 -08:00
Yunqing Wang	9520f4b3cc	Fix a memory leak problem in encoder Deallocating the buffers before re-allocating them. The fix passed James Berry's test program for memory leak check. Change-Id: I18c3cf665412c0e313a523e3d435106c03ca438d	2010-12-06 17:21:37 -05:00
Scott LaVarnway	2fa5d5a26d	vp8_rd_pick_best_mbsegmentation code restructure Moved the code from the segmentation loop into a function which is now called for each segment. This will allow us to change the segment order checking more easily. Change-Id: I9510d26f0acae5a73043fcca8f1984b121d3e052	2010-12-06 16:42:52 -05:00
Scott LaVarnway	d283d9bb30	Merge "Improve MV prediction accuracy to achieve performance gain"	2010-12-06 09:41:09 -08:00
Patrik Westin	8534071de0	Fix for manual Golden frame frequency When auto_golden wasn't set it forced all frames to be a golden frame. Now the manual configured frequency is adhered to. Change-Id: I360acac9bc487db0d9c4d4da6ee41f70c227c539	2010-12-06 09:53:41 -05:00
John Koleszar	b170d20111	Merge remote branch 'internal/upstream' into HEAD	2010-12-05 00:05:10 -05:00
John Koleszar	7e9910c69b	Merge remote branch 'origin/master' into experimental Change-Id: I2a47e43cb3ad61620bfef9e8caf578f321487f2c	2010-12-05 00:05:06 -05:00
Paul Wilkins	ccb0348473	Merge "Change to inter_minq table."	2010-12-04 02:06:33 -08:00
Paul Wilkins	cec6a596b5	Change to inter_minq table. The inter_minq table controls the range of quantizers available for a particular frame in two pass relative to a max Q value. The change reduces the range somewhat. The effect of this was a small increase (0.3% average) in psnr for the test set but it should also help encode speed somewhat for higher quality modes as it will reduce the number of iterations in the recode loop. The change damps the range of quantizers available locally within a section of a clip and should therefore help keep quality more uniform. If there is systematic overshoot or undershoot the range can shift gradually to accommodate. However, there is some increased risk of overshoot or undershoot against the target bit rate in VBR mode and this risk will be more pronounced for short clips. Change-Id: I84465567d49ae767c6c73ff2a2aac30c895adb52	2010-12-04 10:04:12 +00:00
John Koleszar	46214a2c64	Merge remote branch 'internal/upstream' into HEAD	2010-12-04 00:05:10 -05:00
John Koleszar	16724b7c93	Merge remote branch 'origin/master' into experimental Change-Id: I11cd10dba54d0f3f96640dadc97199e5733f1888	2010-12-04 00:05:08 -05:00
Yunqing Wang	c3bbb29164	Improve MV prediction accuracy to achieve performance gain Add vp8_mv_pred() to better predict starting MV for NEWMV mode in vp8_rd_pick_inter_mode(). Set different search ranges according to MV prediction accuracy, which improves encoder performance without hurting the quality. Also, as Yaowu suggested, using the diamond search result as the full search starting point, and therefore adjusting (reducing) the full search range, helps the performance. Change-Id: Ie4a3c8df87e697c1f4f6e2ddb693766bba1b77b6	2010-12-03 15:23:35 -05:00
John Koleszar	5e76dfcc70	Merge 'Add simple version of activity masking.' Merge commit 'refs/changes/79/779/2' of https://review.webmproject.org/p/libvpx Conflicts: vp8/encoder/encodeintra.c vp8/encoder/encodemb.c Change-Id: Id607063fabe92d99eeb3c380e8ca670b01bfb3ef	2010-12-03 13:30:50 -05:00
John Koleszar	ea2a5754b4	Merge remote branch 'origin/master' into experimental Change-Id: If95cb994d898d3f29b28db0d118a1f9c973e88d9	2010-12-02 08:20:43 -05:00
John Koleszar	cfc44aae03	Merge remote branch 'internal/upstream' into HEAD	2010-12-02 00:05:06 -05:00
Fritz Koenig	9c8ad79fdc	Set refresh_alt_ref_frame on keyframe encode. On a keyframe alt ref and golden are refreshed. The flag was not being set and so on the frame after a keyframe, motion search would occur on the alt ref frame. This is not necessary because the alt ref frame is identical to the last frame in this scenario. Handle corner case where a forward alt-ref frame is put directly after a keyframe. Change-Id: I9be4cf290d694f8cf2f9a31852014b5ccf1504d3	2010-12-01 12:48:22 -08:00
John Koleszar	f99e91b65b	Merge remote branch 'internal/upstream' into HEAD	2010-11-30 00:05:07 -05:00
John Koleszar	1b70862916	Merge remote branch 'origin/master' into experimental	2010-11-30 00:05:05 -05:00
Jim Bankoski	3430820bbe	vp8e - remove unnecessary variance calc only do the variance calculation if necessary ( eg needed for breakout test)	2010-11-27 14:02:59 -05:00
Pascal Massimino	fd9f9dc054	allow dimensions as low as 1 pixel remove warning comment in vpxenc.c: in case of 1x1 picture, detect_bytes will be equal to '3' and we'll fall back to RAW_TYPE. fix read_frame() by tracking the pre-read buffer length in the struct detect Change-Id: If1ed86ee5260dcdbc8f9d10da6cbb84a4cc2f151	2010-11-24 16:44:33 -08:00
John Koleszar	c9baf67f6c	Merge remote branch 'internal/upstream' into HEAD	2010-11-24 00:05:05 -05:00
John Koleszar	394b68c8f8	Merge remote branch 'origin/master' into experimental	2010-11-24 00:05:04 -05:00
John Koleszar	78cbe51bc3	Merge changes I3aed713e,I9ef7f56e,Ic18c60df * changes: vp8_set_maps: remove hard-coded width/height; vp8mt_alloc_temp_buffers: make prototype return void; Disable compile warning for ERROR macro	2010-11-23 12:38:20 -08:00
John Koleszar	849763d05f	Merge remote branch 'internal/upstream' into HEAD	2010-11-23 00:05:05 -05:00
John Koleszar	8416312095	Merge remote branch 'origin/master' into experimental	2010-11-23 00:05:05 -05:00
Paul Wilkins	ad6150f769	Recalibration of bits per MB tables The baseline bits per MB prediction tables have been re calibrated based on the assumption that bits per mb is inversely proportional to the quantizer level. Change-Id: Ibd355c7acac4b8053dda1baf1032fe35f11da7f7	2010-11-22 13:17:35 +00:00
Paul Wilkins	1753f0d208	Merge "Added extra two pass stats gathering."	2010-11-22 04:11:20 -08:00
John Koleszar	006247ac33	Merge remote branch 'internal/upstream' into HEAD	2010-11-21 00:05:06 -05:00
John Koleszar	b13d1c307e	Merge remote branch 'origin/master' into experimental	2010-11-21 00:05:05 -05:00
Paul Wilkins	70b885a0e8	Added extra two pass stats gathering. Added code to record spend so far against planned budget. Change-Id: I5a3335346fa1771b2b1219df9f6127f9993d2594	2010-11-19 14:12:33 -05:00
John Koleszar	1a1a8ea4df	Merge remote branch 'internal/upstream-experimental' into HEAD	2010-11-19 00:05:03 -05:00
Yaowu Xu	0867b81678	remove low pass filtering from two 4x4 intra prediction In the process of developing new intra prediction modes, tests have shown removal of the low pass filtering from B_HE_PRED and B_VE_PRED has an overall minor positive impact in both PSNR and SSIM metrics. Overall difference is about 0.1%. The change shall also have a small positive impact on speed. Intuitively, this change should also reduce some of the tendency of "flattening". Change-Id: I3c43b0daca833c6eff77d00f19c811f9ef9368a3	2010-11-18 10:42:08 -08:00
Yaowu Xu	39ceef38a7	changed MAX_PSNR to 100 Changing the MAX_PSNR to 100 to allow testing of further experiments on extending quantizer range to near lossless. With an effective quantizer of 1, encoder achieves ~68 dB, which is consistent with fdct/idct round trip error. Change-Id: I7b6d0e94a8936968ef42e82e63ebb13999c36832	2010-11-18 09:12:02 -08:00
Yaowu Xu	06c70d304f	extends the range of tokens Extending the value range of tokens allows further experiments on extending quantizer range. Encoder and decoder were verified to produce matching reconstructed buffers by tests with forced quantized value of 1. Change-Id: I12faf92832867870b6f71ddeafbf643f1040086d	2010-11-18 09:07:16 -08:00
John Koleszar	a2ebd0f3e4	Merge remote branch 'origin/master' into experimental	2010-11-18 00:05:05 -05:00
John Koleszar	f3919f1879	Merge remote branch 'internal/upstream' into HEAD	2010-11-18 00:05:04 -05:00
Pascal Massimino	ed5ab7fa49	remove warning was having: "vp8/encoder/onyx_if.c:5365: warning: comparison of unsigned expression >= 0 is always true"	2010-11-17 16:50:02 -08:00
Scott LaVarnway	9a6740af80	Merge "Removed unnecessary checks."	2010-11-17 11:28:22 -08:00
Scott LaVarnway	f7670acc68	Removed unnecessary checks. macro_block_yrd and vp8_rdcost_mby are not called for SPLITMV. Change-Id: I2224d3c8725df526d48426447482768d543752f1	2010-11-17 14:25:48 -05:00
Suman Sunkara	15a1dca2fb	Merge "FIXED bug in when CONFIG_SEGMENTATION NOT DEFINED" into experimental	2010-11-17 12:07:24 -05:00
Jim Bankoski	c35057f0f6	FIXED bug in when CONFIG_SEGMENTATION NOT DEFINED	2010-11-17 11:30:24 -05:00
Paul Wilkins	f874391e02	Replaced recode loop test with a function call Replaced existing code to decide if a frame recode is required with a function call. This is to simplify addition of extra clauses that may be needed for the planned constrained quality mode. Also fixed a bug whereby alt ref was not considered in the test. Change-Id: I3d40bb21abe3e19f8456761e6849deb171738b60	2010-11-17 15:12:04 +00:00
John Koleszar	8d94796cad	vp8mt_alloc_temp_buffers: make prototype return void This function was never called in a context expecting a return value, the return value was always a constant, and the !CONFIG_MULTITHREAD path didn't have a return statement, which caused a compiler warning. This patch changes the function to return void instead. Fixes issue #231 Change-Id: I9ef7f56e54418b7265026c54fc4ed5660c1418d1	2010-11-17 09:13:57 -05:00
John Koleszar	79e2b1f39b	Disable compile warning for ERROR macro The ERROR macro collides with the MS SDK on Windows. Since we're not making any win32 calls in this function, just #undef it first to take ownership. Change-Id: Ic18c60dfa3a33c52e6c49d3f4f8d3e7e3ac3341d	2010-11-17 09:08:51 -05:00
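The `#undef` approach from the row above might look roughly like this. It is a sketch under stated assumptions: `validate_dimensions` and the local `ERROR(str)` macro body are hypothetical, and the `#ifndef` block only simulates a platform header that already claims the name so the example behaves the same everywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate a platform header that already defines ERROR (assumption:
 * this mimics the collision the commit describes on Windows). */
#ifndef ERROR
#define ERROR 0
#endif

/* Take ownership of the name before giving it a local meaning. */
#undef ERROR
#define ERROR(str)        \
    do {                  \
        *detail = (str);  \
        return -1;        \
    } while (0)

/* Hypothetical validator: returns 0 on success, or -1 with a message. */
int validate_dimensions(int width, int height, const char **detail)
{
    assert(detail != NULL);
    *detail = NULL;

    if (width <= 0)
        ERROR("width must be positive");
    if (height <= 0)
        ERROR("height must be positive");

    return 0;
}
```

Undefining first avoids a macro redefinition warning, which is safe only because the function makes no use of the platform's own meaning of the name.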
John Koleszar	3a778de77a	Merge remote branch 'origin/master' into experimental	2010-11-17 00:05:05 -05:00
John Koleszar	8232a3a0e4	Merge remote branch 'internal/upstream' into HEAD	2010-11-17 00:05:04 -05:00
Fritz Koenig	99d02c0f9f	Merge "Comments for alt ref flags."	2010-11-16 16:11:39 -08:00
Fritz Koenig	69ee697fef	Comments for alt ref flags. Clarify what the alt ref flags do when encoding. Change-Id: I71f78e0f42edae633fb91840f29dfbe64362c44c	2010-11-16 15:16:24 -08:00
Suman Sunkara	388546bc93	Merge branch 'experimental' of ssh://on2-git.corp.google.com:29418/libvpx into test Conflicts: configure Change-Id: Id874dc46b13e8b5da4179fc3b48e354ec313a2cd	2010-11-16 16:31:59 -05:00
Suman Sunkara	4b3f72001d	Merge branch 'experimental' of ssh://on2-git.corp.google.com:29418/libvpx into test Conflicts: vp8/common/blockd.h vp8/decoder/decodemv.c vp8/decoder/decodframe.c vp8/decoder/demode.c vp8/decoder/onyxd_if.c vp8/decoder/onyxd_int.h vp8/encoder/encodeframe.c Change-Id: Ic379f4dffaded9796dc19d56be304d3f8527c61f	2010-11-16 16:30:59 -05:00
Jim Bankoski	b4a3602f66	changes to start experimenting with color segmentation prediction modes.	2010-11-16 14:38:40 -05:00
Yaowu Xu	d49da085c0	correct errors in token alphabet descriptions There were a few errors in the comment section that describe VP8 token alphabet table. Change-Id: Ie6728a0e08bc3798893221b60408d5b201064bdc	2010-11-16 10:51:43 -08:00
John Koleszar	791cae74da	Merge remote branch 'origin/master' into experimental	2010-11-16 00:05:04 -05:00
John Koleszar	00fe7441e9	Merge remote branch 'internal/upstream' into HEAD	2010-11-16 00:05:04 -05:00
Fritz Koenig	e180255375	Remove stack shadowing for x86-64 for SAD functions. x86-64 passes arguments in registers. There is no need to push them to the stack before using them. This fixes `15acc84f10` where ebx was not getting preserved on x86. Change-Id: I1214b5f818a0201f75ab6ad7d5c6f448e09b16c2	2010-11-15 10:56:02 -08:00
Paul Wilkins	f4709d2895	Merge "Bad cost tables used in ARNR filtering."	2010-11-15 09:55:35 -08:00
Paul Wilkins	373f5c3144	Bad cost tables used in ARNR filtering. The use of incorrect mv costing tables in the ARNR sub-pel filtering code led to corruption of the altref buffer in some cases, particularly at low data rates. The average gain from this fix is about 0.3% but there are a few extreme cases where nasty and visible artifacts manifested and for these few data points the improvement is > 10%. PGW and AWG Change-Id: I95cc02b196a433e71d0d2bd2b933fe68ed31e796	2010-11-15 17:47:12 +00:00
Yaowu Xu	73189f21b3	Merge "make rdmult adaptive for intra in quantizer RDO"	2010-11-15 09:22:45 -08:00
John Koleszar	9b5cd1c3ff	Merge remote branch 'internal/upstream' into HEAD	2010-11-13 00:05:05 -05:00
John Koleszar	25fa447acb	Merge remote branch 'origin/master' into experimental	2010-11-13 00:05:04 -05:00
John Koleszar	08d45a99b5	Merge remote branch 'internal/upstream' into HEAD	2010-11-12 00:05:03 -05:00
John Koleszar	7d799d2ced	Merge remote branch 'origin/master' into experimental	2010-11-12 00:05:03 -05:00
Yaowu Xu	ef2f27f10e	make rdmult adaptive for intra in quantizer RDO This intends to correct the tendency that VP8 aggressively favors rate on intra coded frames. Experiments tested different numbers in [0, 1] and found 9/16 overall provided about 2-4% gains for all-intra coded clips based on vpx-ssim metric. The impact on regular encoded clips is much smaller but positive overall. Overall impact on psnr is also positive even though very small. Change-Id: If808553aaaa87fdd44691f9787820ac9856d9f8a	2010-11-11 11:33:35 -08:00
John Koleszar	0a49747b01	quantizer: fix assertion in fast quantizer path The fast quantizer assembly code has not been updated to match the new exact quantizer, which was made the default in commit `6adbe09`. Specifically, they are not aware of the potential for the coefficient to be scaled, which results in the quantized result exceeding the range of the DCT. This patch restores the previous behavior of using the non-shifted coefficients when in the fast quantizer code path, but unfortunately requires rebuilding the tables when switching between the two. Change-Id: I0a33f5b3850335011a06906f49fafed54dda9546	2010-11-11 13:05:20 -05:00
Suman Sunkara	b9a18344cf	Use of temporal context for encoding delta updates. Used a three-probability approach for temporal context, as follows: P0 - probability of no change if both above and left have not changed; P1 - probability of no change if one of above and left has changed; P2 - probability of no change if both above and left have changed. In addition, 1 bit per frame is used to decide whether to use temporal context or to encode directly. The cost of both schemes is calculated ahead of time, and the temporal_update flag is set if the cost of using temporal context is lower than encoding the segment ids directly. This approach has given around a 20% reduction in the cost of bits needed to encode segmentation ids. Change-Id: I44a5509599eded215ae5be9554314280d3d35405	2010-11-11 11:31:36 -05:00
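The context selection described in the row above can be sketched as follows. The function names and the probability values used in any caller are hypothetical; only the P0/P1/P2 mapping and the "use temporal context if it is cheaper" rule come from the commit message.

```c
#include <assert.h>

/* Context index: how many of the two neighbours (above, left) changed
 * their segment id. 0 selects P0, 1 selects P1, 2 selects P2. */
int temporal_context_index(int above_changed, int left_changed)
{
    return (above_changed != 0) + (left_changed != 0);
}

/* Look up the "no change" probability for the current block from a
 * three-entry table {P0, P1, P2}. */
unsigned char no_change_probability(const unsigned char probs[3],
                                    int above_changed, int left_changed)
{
    int ctx = temporal_context_index(above_changed, left_changed);

    assert(ctx >= 0 && ctx <= 2);
    return probs[ctx];
}

/* Per-frame decision: returns 1 (set the temporal_update flag) only when
 * the temporal scheme costs strictly fewer bits than direct coding. */
int choose_temporal_update(unsigned int cost_temporal,
                           unsigned int cost_direct)
{
    return cost_temporal < cost_direct;
}
```

Because both costs are computed before the choice is made, the flag adds one bit per frame but can never make the segment map more expensive than direct coding by more than that bit.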
Fritz Koenig	58083cb34d	Revert "Remove stack shadowing for x86-64" This reverts commit `15acc84f10`. Change-Id: Ia640be8cbc134432914849c1750f62575ea084e6	2010-11-11 08:20:02 -08:00
John Koleszar	f225211256	Merge remote branch 'origin/master' into experimental Conflicts: configure Change-Id: Ifa63e4610657f75cb953aa7ca08f997267612cc0	2010-11-11 09:25:10 -05:00
John Koleszar	1ea4c2924c	Merge remote branch 'internal/upstream' into HEAD Conflicts: configure Change-Id: I1c7bae5241f999387cae3f2abf2dfc84fe3f6651	2010-11-11 09:22:46 -05:00
Paul Wilkins	213f7b0907	Merge "Relax rate control for last few frames"	2010-11-11 02:39:20 -08:00
Fritz Koenig	9b1ece2cca	Merge "Remove stack shadowing for x86-64"	2010-11-10 14:36:10 -08:00
Fritz Koenig	5f0e0617ba	FDCT optimizations. Fixed up the fdct for mmx and 8x4 sse2 to match the most recent changes. Change-Id: Ibee2d6c536fe14dcf75cd6eb1c73f4848a56d719	2010-11-10 14:34:02 -08:00
Fritz Koenig	647df00f30	postproc : Re-work postproc calling to allow more flags. Debugging in postproc needs more flags to allow for specific block types to be turned on or off in the visualizations. Must be enabled with --enable-postproc-visualizer during configuration time. Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d	2010-11-10 14:14:46 -08:00
Paul Wilkins	513f8e6814	Relax rate control for last few frames VBR rate control can become very noisy for the last few frames. If there are a few bits to spare or a small overshoot then the target rate and hence quantizer may start to fluctuate wildly. This patch prevents further adjustment of the active Q limits for the last few frames. Patch also removes some redundant variables and makes one small bug fix. Change-Id: Ic167831bec79acc9f0d7e4698bcc4bb188840c45	2010-11-10 10:09:45 +00:00
Paul Wilkins	6adbe09058	Tuning for the more exact quantizer. Small changes to the default zero bin and rounding tables. Though the tables are currently the same for the Y1 and Y2 cases I have left them as separate tables in case we want to tune this later. There is now some adjustment of the zbin based on the prediction mode. Previously this was restricted to an adjustment for gf/arf 0,0 MV. The exact quantizer now marginally outperforms and is the default. The overall average gain is about 0.5%. Change-Id: I5e4353f3d5326dde4e86823684b236a1e9ea7f47	2010-11-10 09:52:58 +00:00
John Koleszar	3a99784b5e	Merge remote branch 'origin/master' into experimental	2010-11-10 00:05:06 -05:00
John Koleszar	cac05c832e	Merge remote branch 'internal/upstream' into HEAD	2010-11-10 00:05:05 -05:00
John Koleszar	458f4fedd2	Merge "improve average framerate calculation"	2010-11-09 08:52:16 -08:00
John Koleszar	2fa664a4e2	Merge remote branch 'origin/master' into experimental	2010-11-06 00:05:08 -04:00
John Koleszar	4d1b0d2a2d	Merge commit 'fix integer promotion bug in partition size check' Change-Id: I4081917b46013fa8f4218cade8bd12cb2d013aee	2010-11-05 16:49:32 -04:00
John Koleszar	9fb80f7170	fix integer promotion bug in partition size check The check '(user_data_end - partition < partition_size)' must be evaluated as a signed comparison, but because partition_size was unsigned, the LHS was promoted to unsigned, causing an incorrect result on 32-bit. Instead, check the upper and lower bounds of the segment separately. Change-Id: I6266aba7fd7de084268712a3d2a81424ead7aa06	2010-11-05 14:52:53 -04:00
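The promotion problem in the row above arises because, on a 32-bit target, the signed pointer difference in `(user_data_end - partition < partition_size)` is converted to unsigned before the comparison, so a negative difference becomes a very large value. A minimal sketch of checking the bounds separately is shown below; `partition_fits` is a hypothetical helper, not the function the commit changed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if partition_size bytes starting at partition lie inside
 * [user_data, user_data_end), 0 otherwise. */
int partition_fits(const unsigned char *user_data,
                   const unsigned char *user_data_end,
                   const unsigned char *partition,
                   unsigned int partition_size)
{
    assert(user_data != NULL && user_data <= user_data_end);

    /* Lower and upper bound of the start pointer, checked on their own. */
    if (partition < user_data || partition > user_data_end)
        return 0;

    /* The difference is now known to be non-negative, so converting it
     * to an unsigned type cannot wrap around. */
    return (size_t)(user_data_end - partition) >= partition_size;
}
```

The sketch assumes all three pointers refer to the same buffer, which is what makes the pointer comparisons well defined in C.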
John Koleszar	7a590c902b	Merge remote branch 'origin/master' into experimental Conflicts: configure ivfenc.c vp8/common/alloccommon.c vp8/common/onyxc_int.h vp8/vp8_cx_iface.c	2010-11-05 12:30:33 -04:00
John Koleszar	f7e187d362	improve average framerate calculation Change Ice204e86 identified a problem with bitrate undershoot due to low precision in the timestamps passed to the library. This patch takes a different approach by calculating the duration of this frame and passing it to the library, rather than using a fixed duration and letting the library average it out with higher precision timestamps. This part of the fix only applies to vpxenc. This patch also attempts to fix the problem for generic applications that may have made the same mistake vpxenc did. Instead of calculating this frame's duration by the difference of this frame's and the last frame's start time, we use the end times instead. This allows the framerate calculation to scavenge "unclaimed" time from the last frame. For instance (start, end, calculated duration): 0ms, 33ms, 33ms; 33ms, 66ms, 33ms; 66ms, 99ms, 33ms; 100ms, 133ms, 34ms. Change-Id: I92be4b3518e0bd530e97f90e69e75330a4c413fc	2010-11-05 08:42:46 -04:00
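The end-time approach from the row above can be sketched in a few lines. The `duration_tracker` type and `frame_duration_ms` function are hypothetical names for illustration; the arithmetic is just the difference between consecutive end times, which is what lets the fourth frame in the commit's example pick up the unclaimed millisecond between 99ms and 100ms.

```c
#include <assert.h>
#include <stdint.h>

/* Remembers where the previous frame ended, in milliseconds. */
typedef struct
{
    int64_t last_end_ms;
} duration_tracker;

/* Duration of the current frame, measured from the previous frame's end
 * time to this frame's end time, so no time between frames is lost. */
int64_t frame_duration_ms(duration_tracker *tracker, int64_t end_ms)
{
    int64_t duration;

    assert(tracker && end_ms >= tracker->last_end_ms);
    duration = end_ms - tracker->last_end_ms;
    tracker->last_end_ms = end_ms;
    return duration;
}
```

Feeding the end times 33, 66, 99 and 133 from the commit message through a tracker that starts at 0 yields durations of 33, 33, 33 and 34 ms.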
Fritz Koenig	0e7b60617f	postproc : Update visualizations. Change color reference frame to blend the macro block edge. This helps with layering of visualizations. Add block coloring for intra prediction modes. Change-Id: Icefe0e189e26719cd6937cebd6727efac0b4d278	2010-11-04 10:35:02 -07:00
Fritz Koenig	0a29bd9793	postproc : Fix display of motion vectors. Split motion vectors were all being treated as 4x4 blocks. Now correctly handle 16x8, 8x16, 8x8, 4x4 blocks. Change-Id: Icf345c5e69b5e374e12456877ed7c41213ad88cc	2010-11-02 13:29:13 -07:00
Scott LaVarnway	b8f43aec66	Merge "SSSE3 version of fast quantizer"	2010-11-02 06:27:29 -07:00
Fritz Koenig	90c505f218	Merge "postproc : Added SPLITMV visualization, fix line constrain."	2010-11-01 14:41:41 -07:00
Fritz Koenig	9f61a83bf9	postproc : Added SPLITMV visualization, fix line constrain. Now draw 16 vectors for SPLITMV mode. Fixed constrain line to block divide by zero issues. Blend block was not centering the shaded area correctly. Change-Id: I1edabd8b4e553aac8d980f7b45c80159e9202434	2010-11-01 13:27:13 -07:00
Scott LaVarnway	ff4a71f4c2	SSSE3 version of fast quantizer (test clip: tulip) For good quality mode with speed=1, this gave the encoder a small (2 - 3%) performance boost. Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35	2010-11-01 16:24:15 -04:00
Scott LaVarnway	dcee88ea37	Finding first label Using tables for the label count and label offset. Change-Id: Iac3d5b292c37341a881be0af282f5cac3b3e01eb	2010-10-29 10:01:04 -04:00
Yunqing Wang	6614563b8f	Save XMM registers in asm functions XMM6/7 are used in these functions, and need to be saved. Change-Id: I3dfaddaf2a69cd4bf8e8735c7064b17bac5a14e5	2010-10-28 16:59:03 -04:00
Yunqing Wang	f57fc7bcc6	Merge "Fix full-search SAD function crash in Visual Studio"	2010-10-28 13:46:35 -07:00
Yunqing Wang	7e3a1e7361	Fix full-search SAD function crash in Visual Studio Unlike GCC, Visual Studio compiler doesn't allocate SAD output array 16-byte aligned, which causes crash in visual studio. Change-Id: Ia755cf5a807f12929bda8db94032bb3c9d0c2362	2010-10-28 15:26:58 -04:00
Timothy B. Terriberry	c4d7e5e67e	Eliminate more warnings. This eliminates a large set of warnings exposed by the Mozilla build system (Use of C++ comments in ISO C90 source, commas at the end of enum lists, a couple incomplete initializers, and signed/unsigned comparisons). It also eliminates many (but not all) of the warnings exposed by newer GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite without checking the return values). There are a few spurious warnings left on my system: ../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used uninitialized in this function gcc seems to be unable to figure out that the value shortcut doesn't change between the two if blocks that test it here. ../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned expression >= 0 is always true ../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned expression >= 0 is always true This is true, so far as it goes, but it's comparing against an enum, and the C standard does not mandate that enums be unsigned, so the checks can't be removed. Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395	2010-10-27 18:08:04 -07:00
Fritz Koenig	a097e18964	postproc: Tweaks to line drawing and blending. Turned down the blending level to make colored blocks obscure the video less. Not blending the entire block to give distinction to macro block edges. Added configuration so that macro block blending function can be optimized. Change to constrain line as to when dx and dy are computed. Now draw two lines to form an arrow. Change-Id: Id3ef0fdeeab2949a6664b2c63e2a3e1a89503f6c	2010-10-27 13:20:03 -07:00
Yunqing Wang	71ecb5d7d9	Full search SAD function optimization in SSE4.1 Use mpsadbw, and calculate 8 SADs at once. Function list: vp8_sad16x16x8_sse4, vp8_sad16x8x8_sse4, vp8_sad8x16x8_sse4, vp8_sad8x8x8_sse4, vp8_sad4x4x8_sse4. (test clip: tulip) For best quality mode, this gave encoder a 5% performance boost. For good quality mode with speed=1, this gave encoder a 3% performance boost. Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134	2010-10-27 13:36:31 -04:00
John Koleszar	a0ae3682aa	Fix half-pixel variance RTCD functions This patch fixes the system dependent entries for the half-pixel variance functions in both the RTCD and non-RTCD cases: - The generic C versions of these functions are now correct. Before all three cases called the hv code. - Wire up the ARM functions in RTCD mode - Created stubs for x86 to call the optimized subpixel functions with the correct parameters, rather than falling back to C code. Change-Id: I1d937d074d929e0eb93aacb1232cc5e0ad1c6184	2010-10-27 13:00:30 -04:00
Johann	927f29a644	Merge "fix implicit declarations"	2010-10-27 09:59:28 -07:00
Johann	787733d855	Merge "RTCD build is bringing old errors to light"	2010-10-27 09:59:01 -07:00
Fritz Koenig	cf127474d8	vpxdec : Change --pp-debug-info to be a bit field. This allows multiple post processor debug levels to be overlaid. For example, it can show colored reference blocks and visual motion vectors together. Change-Id: Ic4a1df438445b9f5780fe73adb3126e803472e53	2010-10-27 09:53:37 -07:00
Fritz Koenig	36ff6a6743	Merge "postproc: Add mode and reference frame visualizers."	2010-10-27 09:04:39 -07:00
Johann	b90a072f10	fix implicit declarations ARM used to explicitly remove this file from the build. With the RTCD changes, that's no longer possible. These errors also exist for x86 w/o RTCD, but that's not the default configuration Change-Id: I3e10e5553ddf3278e8d3c9365ca6fb84f52f5066	2010-10-27 11:21:02 -04:00
Johann	abcf36c758	RTCD build is bringing old errors to light needs to be _recon_ not _recon_recon_ Change-Id: I7a8b9ddcb4fb72c2b723c563932c9ea52ff15982	2010-10-27 10:47:48 -04:00
John Koleszar	1747207700	Merge "Add half-pixel variance RTCD functions"	2010-10-26 20:05:02 -07:00
John Koleszar	1320e54d95	Merge "make vp8_recon16x16mb{,y} RTCD functions"	2010-10-26 20:02:57 -07:00
John Koleszar	87e17737e9	Merge "make arm hex search the generic implementation"	2010-10-26 20:02:37 -07:00
John Koleszar	53f64a7736	Merge "arm: move unrolled loops back to generic code"	2010-10-26 20:02:18 -07:00
John Koleszar	9fdd90c9aa	Merge "arm: remove duplicate functions"	2010-10-26 20:01:54 -07:00
John Koleszar	209d82ad72	Add half-pixel variance RTCD functions NEON has optimized 16x16 half-pixel variance functions, but they were not part of the RTCD framework. Add these functions to RTCD, so that other platforms can make use of this optimization in the future and special-case ARM code can be removed. A number of functions were taking two variance functions as parameters. These functions were changed to take a single parameter, a pointer to a struct containing all the variance functions for that block size. This provides additional flexibility for calling additional variance functions (the half-pixel special case, for example) and by initializing the table for all block sizes, we don't have to construct this function pointer table for each macroblock. Change-Id: I78289ff36b2715f9a7aa04d5f6fbe3d23acdc29c	2010-10-26 20:00:56 -07:00
Fritz Koenig	a0ccc97d8a	postproc: Add mode and refrence frame visualizers. Post process option to color the block for either the mode of the macro block, or the frame that the macro block references. Change-Id: Ie498175497f2d20e3319924d352dc4ddc16f4134	2010-10-26 16:00:14 -07:00
John Koleszar	d6c67f02c9	make vp8_recon16x16mb{,y} RTCD functions ARM NEON has a platform specific version of vp8_recon16x16mb, though it's just a stub to extract the various parameters from the MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using that function's prototype directly will be a better long term solution, but it's quite an invasive change. Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620	2010-10-26 13:23:36 -04:00
John Koleszar	96cf6588de	make arm hex search the generic implementation The ARM version of vp8_hex_search() is a faster implementation of the same algorithm. Since it doesn't use any ARM specific code, it can be made the default implementation. This removes a linking error. Change-Id: I77d10f2c16b2515bff4522c350004e03b7659934	2010-10-26 10:46:31 -04:00
John Koleszar	1e7c05e0b4	Merge "add missing GET_GOT/RESTORE_GOT pairs"	2010-10-26 07:05:21 -07:00
John Koleszar	19638c2309	arm: move unrolled loops back to generic code Some of the ARM functions differed from their generic counterparts only by unrolling their loops. Since this change may be useful on other platforms, or might even supersede the looped version in the generic case, move it back to the generic file. This code is left under #if ARCH_ARM for now, but it may be worth considering a different (possibly new) conditional for these. If it turns out that this should be runtime selectable, these functions will have to move to the RTCD infrastructure. Don't want to take that step at this time without more profile data. Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f	2010-10-26 09:51:35 -04:00
John Koleszar	d330a5876b	arm: remove duplicate functions These functions were true duplicates of functions present in the generic code. This fixes some of the link errors when building with --enable-shared --enable-pic. Change-Id: Idff26599d510d954e439207883607ad6b74df20c	2010-10-26 09:37:44 -04:00
Jim Bankoski	0a5a638c60	Merge commit 'refs/changes/09/809/1' of https://review.webmproject.org/p/libvpx	2010-10-26 07:34:57 -04:00
John Koleszar	b523dd51bd	add missing GET_GOT/RESTORE_GOT pairs These functions made global references but did not set up the GOT, causing compilation failures in PIC mode. Change-Id: Iac473bf46733f87eb2e001cd736af4acf73fa51d	2010-10-25 23:45:02 -04:00
Fritz Koenig	1d70aaf08b	Merge "Debug option for drawing motion vectors."	2010-10-25 15:40:22 -07:00
Fritz Koenig	d1a4cce809	Debug option for drawing motion vectors. Postproc level that uses Bresenham's line algorithm to draw motion vectors onto the postproc buffer. Change-Id: I34c7daa324f2bdfee71e84fcb1c50b90fa06f6fb	2010-10-25 15:39:04 -07:00
Johann	a3b002fc90	Merge "quiet compiler"	2010-10-25 13:26:55 -07:00
Martin Ettl	c3fd2c4ea7	Fix leaked file descriptor with ENTROPY_STATS cppcheck found a leaked file descriptor in the debugging code enabled by defining ENTROPY_STATS. Fixes issue #60. Change-Id: I0c1d0669cb94d44fed77860f97b82763be06b7cb	2010-10-25 13:16:39 -04:00
Johann	385865f820	quiet compiler clean up compiler warnings, man in the yellow hat warnings, and start to remove unused #includes Change-Id: I6267e98d9b3024b6fb1ef2732b29067a33cb96f6	2010-10-25 10:07:35 -04:00
Johann	1376f061da	reuse common loopfilter code there were four versions for the regular and macroblock loopfilters: horizontal [y|uv] vertical [y|uv] this moves all the common code into 2 functions: vp8_loop_filter_neon vp8_mbloop_filter_neon this provides no gain in performance. there's a bit of jitter, but it trends down ~0.25-0.5%. however, this is a huge gain for maintenance. also, there is the potential to drop some stack usage in the macroblock loopfilter. Change-Id: I91506f07d2f449631ff67ad6f1b3f3be63b81a92	2010-10-25 09:48:50 -04:00
Timothy B. Terriberry	b71962fdc9	Add runtime CPU detection support for ARM. The primary goal is to allow a binary to be built which supports NEON, but can fall back to non-NEON routines, since some Android devices do not have NEON, even if they are otherwise ARMv7 (e.g., Tegra). The configure-generated flags HAVE_ARMV7, etc., are used to decide which versions of each function to build, and when CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen at run time. In order for this to work, the CFLAGS must be set to something appropriate (e.g., without -mfpu=neon for ARMv7, and with appropriate -march and -mcpu for even earlier configurations), or the native C code will not be able to run. The ASFLAGS must remain set for the most advanced instruction set required at build time, since the ARM assembler will refuse to emit them otherwise. I have not attempted to make any changes to configure to do this automatically. Doing so will probably require the addition of new configure options. Many of the hooks for RTCD on ARM were already there, but a lot of the code had bit-rotted, and a good deal of the ARM-specific code is not integrated into the RTCD structs at all. I did not try to resolve the latter, merely to add the minimal amount of protection around them to allow RTCD to work. Those functions that were called based on an ifdef at the calling site were expanded to check the RTCD flags at that site, but they should be added to an RTCD struct somewhere in the future. The functions invoked with global function pointers still are, but these should be moved into an RTCD struct for thread safety (I believe every platform currently supported has atomic pointer stores, but this is not guaranteed). The encoder's boolhuff functions did not even have _c and armv7 suffixes, and the correct version was resolved at link time. The token packing functions did have appropriate suffixes, but the version was selected with a define, with no associated RTCD struct. However, for both of these, the only armv7 instruction they actually used was rbit, and this was completely superfluous, so I reworked them to avoid it. The only non-ARMv4 instruction remaining in them is clz, which is ARMv5 (not even ARMv5TE is required). Considering that there are no ARM-specific configs which are not at least ARMv5TE, I did not try to detect these at runtime, and simply enable them for ARMv5 and above. Finally, the NEON register saving code was completely non-reentrant, since it saved the registers to a global, static variable. I moved the storage for this onto the stack. A single binary built with this code was tested on an ARM11 (ARMv6) and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder, and produced identical output, while using the correct accelerated functions on each. I did not test on any earlier processors. Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa	2010-10-25 09:23:29 -04:00
Johann	e81e30c25d	isolate new temporal filtering code onyx_if is getting pretty big. split out the temporal code to make it easier to look at. Change-Id: I207c3a94c90e91b32e3ea5e1836a53b7a990fabd	2010-10-25 09:11:03 -04:00
John Koleszar	3b9e72b210	Merge "Improve handling of invalid frames." Change-Id: Icef5226a70260607c190126c1c0cc28b796e759c	2010-10-22 11:54:49 -04:00
Timothy B. Terriberry	09bcc1f710	Improve handling of invalid frames. The code was not checking for frame sizes smaller than 3 bytes, and the partition size checks might have failed if the input buffer was within 16MB of the top of the heap. In addition, the reference count on the current frame buffer was not being decremented on error, so after a small number of errors, no new frame buffer could be found and it would run off the list of them. Change-Id: I0c60dba6adb1e2a29df39754f72a56ab6c776b46	2010-10-22 11:50:56 -04:00
Timothy B. Terriberry	8f75ea6b5c	Convert [4][4] matrices to [16] arrays. Most of the code that actually uses these matrices indexes them as if they were a single contiguous array, and coverity produces reports about the resulting accesses that overflow the static bounds of the first row. This is perfectly legal in C, but converting them to actual [16] arrays should eliminate the report, and removes a good deal of extraneous indexing and address operators from the code. Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23	2010-10-21 17:04:30 -07:00
Frank Galligan	45e6494177	Change altref times to preceding pts+1. Change the pts of the altref frame to be as close as possible to the pts of the preceding frame and still be strictly increasing. Change-Id: Iae3033a4c89ae5a9d0e5c4198e9196e5f3ee57c7	2010-10-21 14:11:58 -04:00
John Koleszar	1ee3ebcd66	Merge "Move firstpass motion map to stats packet"	2010-10-21 11:09:02 -07:00
John Koleszar	bb7dd5b1ba	Move firstpass motion map to stats packet The first implementation of the firstpass motion map for motion compensated temporal filtering created a file, fpmotionmap.stt, in the current working directory. This was not safe for multiple encoder instances. This patch merges this data into the first pass stats packet interface, so that it is handled like the other (numerical) firstpass stats. The new stats packet is defined as follows: Numerical Stats (16 doubles) -- 128 bytes Motion Map -- 1 byte / Macroblock Padding -- to align packet to 8 bytes The fpmotionmap.stt file can still be generated for debugging purposes in the same way that the textual version of the stats are available (defining OUTPUT_FPF in firstpass.c) Change-Id: I083ffbfd95e7d6a42bb4039ba0e81f678c8183ca	2010-10-21 14:04:20 -04:00
Yunqing Wang	4cefb4434f	Add MMWORD PTR/XMMWORD PTR in subtract_sse2.asm Change-Id: Ia649b500ef020225d8bbf611799d0f47658dc2ac	2010-10-21 13:42:24 -04:00
Yunqing Wang	31752f2f41	Merge "Rewrite vp8_short_walsh4x4_sse2()"	2010-10-21 10:31:23 -07:00
Yunqing Wang	0918747520	Merge "Add SSE2 subtract functions"	2010-10-21 10:30:27 -07:00
Fritz Koenig	15acc84f10	Remove stack shadowing for x86-64 x86-64 passes most arguments in registers. There is no need to push them to the stack before using them. Change-Id: I13c683f1358782682ecafaf1df3fb0af23b978ea	2010-10-21 10:28:08 -07:00
Yunqing Wang	fc94ffcea4	Rewrite vp8_short_walsh4x4_sse2() This rewriting reflects changes made in commit "Improve the accuracy of forward walsh-hadamard transform". Since this function is not called much, only a small encoder performance gain (~0.5%) is seen. Change-Id: Ie9df58a43028a11fd5b115c4bbe3141f7596578b	2010-10-21 13:02:55 -04:00
John Koleszar	bdf469c91e	Merge "Update arnr strength range form 1-6 to 0-6."	2010-10-19 20:20:31 -07:00
Frank Galligan	15542721ee	Update arnr strength range form 1-6 to 0-6. Change-Id: I8eb49c56f7509f0a8074d440e8345b9e3344b85b	2010-10-19 20:18:13 -07:00
Yaowu Xu	fc2f8dafaf	Merge "fixed a typo that mis-used Y plane stride for UV blocks."	2010-10-19 16:23:31 -07:00
Yaowu Xu	b9fe6d4da4	Merge "change to make use of more trellis quantization"	2010-10-19 08:11:52 -07:00
Yunqing Wang	4db2076594	Add SSE2 subtract functions Instead of doing 8-bit data unpack and 16-bit subtraction, use psubb to do 16 8-bit subtractions and pcmpgtb to preserve the sign information. This does not bring noticeable gain since these functions are not called frequently. Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e	2010-10-18 14:15:15 -04:00
Johann	ce1ce992ce	copy compiler warning fixes generic version got fixed, but not the arm version. fixes: vp8/encoder/arm/mcomp_arm.c: In function 'vp8_full_search_sadx3': vp8/encoder/arm/mcomp_arm.c:1208: warning: pointer targets in passing argument 5 of 'fn_ptr->sdx3f' differ in signedness vp8/encoder/arm/mcomp_arm.c:1208: note: expected 'unsigned int *' but argument is of type 'int *' and another unsigned change to keep the files similar Change-Id: I1b6255dc3a03b90394a791ee0d15d8167d9454db	2010-10-18 13:23:39 -04:00
Johann	963bcd6c87	remove dead code vp8_diamond_search_sadx4 isn't used in arm because there is no corresponding sdx4df as in x86. rather than keep it in sync with ../mcomp.c, delete it vp8_hex_search had the original, more readable/understandable code if'd out. it's also available in ../mcomp.c, so remove the dead copy Change-Id: Ia42aa6e23b3a2e88040f467280befec091ec080e	2010-10-15 15:37:09 -04:00
Yaowu Xu	2e53e9e53f	change to make use of more trellis quantization when a subsequent frame is encoded as an alt reference frame, it is unlikely that any mb in current frame will be used as reference for future frames, so we can enable quantization optimization even when the RD constant is slightly rate-biased. The change has an overall benefit between 0.1% to 0.2% bit savings on the test sets based on vpxssim scores. Change-Id: I9aa7bc5cd573ea84e3ee655d2834c18c4460ceea	2010-10-15 10:14:34 -07:00
Jim Bankoski	39f41a4f36	safety check to avoid divide by 0s	2010-10-14 16:19:06 -04:00
Yunqing Wang	a2b598a2f9	Merge "Fix one gcc compiler warning"	2010-10-14 12:20:25 -07:00
Yunqing Wang	7804befb55	Fix one gcc compiler warning ../libvpx/vp8/encoder/bitstream.c: In function ‘pack_inter_mode_mvs’: ../libvpx/vp8/encoder/bitstream.c:1026: warning: array subscript has type ‘char’ Change-Id: Ic77491e0a172fa1821e5b3e914d0dc41fe87c00f	2010-10-14 15:15:35 -04:00
Yunqing Wang	7f31d987f0	Merge "Improve bounds checking in vp8_diamond_search_sadx4()"	2010-10-14 11:29:24 -07:00
Yunqing Wang	d6da7b8ea1	Improve bounds checking in vp8_diamond_search_sadx4() In order to know if all 4/8 neighbor points are within the bounds, 4 bounds checks are enough, instead of checking 4 bounds for each point (16/32 checks). This improvement reduces cost of vp8_diamond_search_sadx4() by 30%, and gives encoder a 1.5% performance gain (test options: 1 pass, good, speed=4). Change-Id: Ie8da29d18a6ecfc9829e74ac02f6fa70e042331a	2010-10-14 11:06:37 -04:00
Fritz Koenig	1dc0ca1340	Fix compiler warning about vp8_fast_quantize_b_impl_ssse2. Typo had function defined as _ssse2 and prototyped as _sse2. Change-Id: If9f19da1a83cff40774a90cf936d601c0bf1b7fe	2010-10-13 17:08:13 -07:00
Fritz Koenig	92df4a06d2	Correct QWORD usage in assembly files QWORD was being undefined because it was being used incorrectly. Change-Id: I3610cefa3d6f0da4054316760f78b9694cde3876	2010-10-13 16:57:57 -07:00
John Koleszar	136857475e	Centralize mb skip state calculation This patch moves the scattered updates to the mb skip state (mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent changes to the quantizer exposed a bug where if a macroblock could be coded as a skip but isn't, the encoder would run the loopfilter but the decoder wouldn't, causing a reference buffer mismatch. The loopfilter is controlled by a flag called dc_diff. The decoder looks at the number of decoded coefficients when setting this flag. The encoder sets this flag based on the skip state, since any skippable macroblock should be transmitted as a skip. The coefficient optimization pass (vp8_optimize_b()) could change the coefficients such that a block that was not a skip becomes one. The encoder was not updating the skip state in this situation for intra coded blocks. The underlying issue predates it, but this bug was recently triggered by enabling trellis quantization on the Y2 block in commit `dcd29e3`, and by changing the quantizer range control in commit `305be4e`. Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802	2010-10-12 09:03:19 -04:00
John Koleszar	acff1627b8	Merge "Add const qualifiers to variance/SAD functions."	2010-10-12 05:44:20 -07:00
Timothy B. Terriberry	8d0f7a01e6	Add simple version of activity masking. This uses MB variance to change the RDO weight for mode decision and quantization. Activity is normalized against the average for the frame, which is currently tracked using feed-forward statistics. This could also be used to adjust the quantizer for the entire frame, but that requires more extensive rate control changes. This does not yet attempt to adapt the quantizer within the frame, but the signaling cost means that will likely only be useful at very high rates. Change-Id: I26cd7c755cac3ff33cfe0688b1da50b2b87b9c93	2010-10-12 08:41:03 -04:00
Timothy B. Terriberry	f4a8594492	Add const qualifiers to variance/SAD functions. These functions should never change their input, and there's no reason not to declare that. This allows them to be passed static const data. Change-Id: Ia49fe4b01e80e9afcb24b4844817694d4da5995c	2010-10-12 08:40:54 -04:00
John Koleszar	037345eb69	Merge "Move vp8_strict_quantize_b inside EXACT_QUANT #define."	2010-10-12 05:34:30 -07:00
John Koleszar	fc018e0d92	Merge "Remove INTRARDOPT #define and intra_rd_opt option."	2010-10-12 05:33:22 -07:00
Timothy B. Terriberry	82c4339885	Move vp8_strict_quantize_b inside EXACT_QUANT #define. There is currently no inexact version of this function, so do not even compile it without EXACT_QUANT. This will prevent someone from inadvertently trying to use it without the proper EXACT_QUANT setup. Change-Id: Ia13491e0128afb281c05c9222ee5987101e4010d	2010-10-11 13:51:35 -07:00
Timothy B. Terriberry	dd08db9315	Remove INTRARDOPT #define and intra_rd_opt option. This is just eliminating some cruft. Although a number of variables are declared only when INTRARDOPT is defined, they are used elsewhere without that protection, and no longer just for intra RDO. The intra_rd_opt flag was hard-coded to 1 and never checked. Change-Id: I83a81554ecee8053e7b4ccd8aa04e18fa60f8e4f	2010-10-11 11:53:57 -07:00
Scott LaVarnway	6b1b28a83c	Merge "Added vp8_fast_quantize_b_sse2"	2010-10-11 09:34:48 -07:00
Yunqing Wang	7e6f7b579a	Remove unused file in encoder Remove vp8/encoder/x86/csystemdependent.c Change-Id: I7c590dcd07b68704d463a1452f62f29ffb1402f4	2010-10-07 12:08:08 -04:00
Scott LaVarnway	d860f685b8	Added vp8_fast_quantize_b_sse2 Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into quantize_sse2.asm and renamed. Updated the assembly code to match the C version. Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200	2010-10-07 11:43:19 -04:00
Yaowu Xu	d338d14c6b	optimize fast_quantizer c version As the zbin and rounding constants are normalized, rounding effectively does the zbinning, therefore the zbin operation can be removed. In addition, the memset on the two arrays are no longer necessary. Change-Id: If39c353c42d7e052296cb65322e5218810b5cc4c	2010-10-06 13:28:36 -07:00
Paul Wilkins	2931b05ac5	Merge "Tune effect of motion on KF/GF boost in two pass;"	2010-10-05 06:58:24 -07:00
Jan Kratochvil	1fc294116a	nasm: movhps compatibility QWORD->MMWORD Filed for nasm as: https://sourceforge.net/tracker/?func=detail&atid=106208&aid=3081103&group_id=6208 nasm just does not accept any size parameter for movhps: 1.asm:2: error: mismatch in operand sizes Some parts of libvpx already use MMWORD for movhps and MMWORD is defined-out so it is compatible both with yasm and nasm. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Change-Id: I4008a317ca87ec07c9ada958fcdc10a0cb589bbc	2010-10-04 20:47:19 -04:00
Jan Kratochvil	5cdc3a4c29	nasm: address labels 'rel label' vice 'wrt rip' nasm does not support `label wrt rip', it requires `rel label'. It is still fully compatible with yasm. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50	2010-10-04 19:47:54 -04:00
Jan Kratochvil	e114f699f6	nasm: match instruction length (movd/movq) to parameters nasm requires the instruction length (movd/movq) to match to its parameters. I find it more clear to really use 64bit instructions when we use 64bit registers in the assembly. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91	2010-10-04 23:36:29 +02:00
Yaowu Xu	49fdb7c41e	fixed a typo that mis-used Y plane stride for UV blocks. Raised by Lei Yang, the Y plane stride was used for UV blocks. This is clearly a typo, but as the comments in the code suggest that this part of the code has not been used yet, the typo should not have caused any damage yet. Change-Id: Iea895edc17469a51c803a8cc6d0fce65a1a7fc2f	2010-10-04 11:31:14 -07:00
Yaowu Xu	2d4ef37507	Merge "enable trellis quantization for 2nd order blocks"	2010-10-04 10:41:20 -07:00
Paul Wilkins	788c0eb54e	Tune effect of motion on KF/GF boost in two pass; This code adjusts the impact of the amount and speed of motion on GF and KF boost. Sections with lots of slow motion will tend to have a somewhat bigger boost and sections with fast motion may have less. There is a knock-on effect to the selection of the active quantizer range. This will likely require further tuning but helps with a couple of particularly bad edge cases. Change-Id: Ic2449cda7305672b69acf42fc0a845b77ac98d40	2010-10-02 17:31:46 +01:00
Yaowu Xu	dcd29e369f	enable trellis quantization for 2nd order blocks Experimented with different values for Y2_RD_MULT in the range [1, 32]. Without adapting the value to MB coding mode/frame type/Q value, 4 works out best among all values, providing overall 0.1% coding gain on the test set. Change-Id: I6b2583a8aa5db5e7e5c65c646301909c0c58f876	2010-10-02 06:20:33 -07:00
Johann	f143a81191	Merge "Fix valgrind errors in the NEON loop filters."	2010-10-01 06:18:53 -07:00
Adrian Grange	999bc00301	Made temporal filter default to use centered mode If temporal filtering is enabled but a filter type is not specified centered filter mode is used by default. Change-Id: I87306f267c1390074c806c506a69b4ba914d92a2	2010-10-01 10:14:01 +01:00
Timothy B. Terriberry	a465076e02	Fix valgrind errors in the NEON loop filters. Like the ARMv6 code, these functions were accessing values below the stack pointer, which can be corrupted by signal delivery at any time.	2010-09-30 20:40:45 -07:00
John Koleszar	0faa8a0861	Merge "Rename mode_ref_lf_test_function"	2010-09-30 10:26:31 -07:00
John Koleszar	a047fee606	Merge "Fix loopfilter delta zero transitions"	2010-09-30 10:26:10 -07:00
Adrian Grange	8ee7284d60	Changed defaults & range checking for AltRef params Modified the range checking of parameters used in the AltRef temporal filter (arnr-max-frames, arnr-strength, arnr-type) and default values for each of them. Change-Id: Ib261028d501b9523f6e44cb4790cc52167b6e92b	2010-09-30 10:06:09 +01:00
John Koleszar	7e5e31516c	Rename mode_ref_lf_test_function This function graduated from being a test func to something that's on by default. Rename it and remove some spurious comments that confuse its status. Change-Id: I689695a3ad29c35e9a72a43ec93766733ac6c20b	2010-09-29 13:53:14 -04:00
Fritz Koenig	439b2ecd74	Merge "Optimizations on the loopfilters."	2010-09-29 10:47:01 -07:00
John Koleszar	b9be7a464f	Fix loopfilter delta zero transitions Loopfilter deltas are initialized to zero on keyframes in the decoder. The values then persist from the previous frame unless an update bit is set in the bitstream. This data is not included in the entropy data saved by the 'refresh entropy' bit in the bitstream, so it is effectively an additional contextual element beyond the 3 ref-frames and the entropy data. The encoder was treating this delta update bit as update-if-nonzero, meaning that the value would be refreshed even if it hadn't changed, and more significantly, if the correct value for the delta changed to zero, the update wouldn't be sent, and the decoder would preserve the last (presumably non-zero) value. This patch updates the encoder to send an update only if the value has changed from the previously transmitted value. It also forces the value to be transmitted in error resilient mode, to account for lost context in the event of lost frames. Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868	2010-09-29 13:04:04 -04:00
Paul Wilkins	7288cdf79d	Change to coefficient optimization rules. Allow coefficient optimization for good quality speed 0. Change-Id: Id0cb363df6823c6798671584fbba097916a7df2c	2010-09-29 13:22:05 +01:00
Adrian Grange	4f92b96bdb	Merge "Moved row-specific computation of MV bounds out of col loop"	2010-09-29 05:13:41 -07:00
Adrian Grange	0e7c45b391	Moved row-specific computation of MV bounds out of col loop Moved the bounds computation on vertical MV component out of the loop that processes MBs within a MB row.	2010-09-29 13:03:07 +01:00
Paul Wilkins	ff3068d6da	Control of active min quantizer for two pass. Create look up tables for controlling the active quantizer range. Some initial tuning to improve quality circa 0.5% on test set. Clean up of some stats output code Change-Id: Ia698a8525f8b8129a503cadace3ee73fe888f543	2010-09-29 12:03:19 +01:00
Fritz Koenig	0964ef0e71	Optimizations on the loopfilters. - Scheduling for Atom processors - Combining of macros to allow for better interleaving - Change from multiplies to adds for main filter - Use of movhps/movlps to fill xmm registers without shifting and orring Change-Id: I0b3500a5f58abf7085253ec92d64c8a96723040b	2010-09-28 12:01:34 -07:00
Adrian Grange	47fc8f2683	Enabled AltRef motion map creation Enabled the first-pass encode to output the map of macroblock coding modes required by the AltRef filter.	2010-09-28 16:52:19 +01:00
Adrian Grange	0090328164	Merge "Made AltRef filter adaptive & added motion compensation"	2010-09-28 08:34:44 -07:00
Adrian Grange	1b2f8308e4	Made AltRef filter adaptive & added motion compensation Modified AltRef temporal filter to adapt filter length based on macroblock coding modes selected during first-pass encode. Also added sub-pixel motion compensation to the AltRef filter.	2010-09-28 15:23:41 +01:00
Timothy B. Terriberry	18dc92fd66	Add 4-tap version of 2nd-pass ARMv6 MC filter. The existing code applied a 6-tap filter with 0's on either end. We're already paying the branch penalty to avoid computing the two extra columns needed as input to this filter. We might as well save time computing the filter as well. This reduces the inner loop from 21 instructions to 16, the number of loads per iteration from 4 to 1, and the number of multiplies from 7 to 4. The gain in overall decoding performance, however, is small (less than 1%). This change also means we are now valgrind clean on ARMv6, which is its real purpose. The errors reported here were valgrind's fault (it does not detect that 0 times an uninitialized value is initialized), but Julian Seward says it would slow down valgrind considerably to make such checks. Speeding up libvpx instead, even by a small amount, seems a much better idea if only to enable proper valgrind checking of the rest of the codec. Change-Id: Ifb376ea195e086b60f61daf1097d8910c4d8ff16	2010-09-27 18:25:45 -07:00
Paul Wilkins	305be4e417	Badly placed initialization of rolling rate monitors. This affects control of the active quantizer range. Change-Id: I30511fc81ac9f75ff20d9f1372382423d56739da	2010-09-27 12:50:55 -04:00
John Koleszar	2b521ab551	move reconintra_mt to decoder (fixup) Missed the .h file in the move. Change-Id: Ib408183fbb4d019fd46394b362f89ca6ea9d10bc	2010-09-27 12:48:31 -04:00
John Koleszar	9fdcdc511d	Merge "disable compilation of debugging code"	2010-09-27 07:00:03 -07:00
Johann	063be9b82a	Merge "combine max values and compare once"	2010-09-27 06:39:20 -07:00
Timothy B. Terriberry	e2795e9978	Fix valgrind errors in vp8_sixtap_predict8x4_armv6(). This function was accessing values below the stack pointer, which can be corrupted by signal delivery at any time. Change-Id: I92945b30817562eb0340f289e74c108da72aeaca	2010-09-24 14:34:18 -07:00
Johann	f30e8dd7bd	combine max values and compare once previous implementation compared each set of values to limit and then &'d them together, requiring a compare and & for each value. this does the accumulation first, requiring only one compare Change-Id: Ia5e3a1a50e47699c88470b8c41964f92a0dc1323	2010-09-24 15:42:50 -04:00
John Koleszar	dbd57c2663	Merge "move reconintra_mt to decoder (for now)"	2010-09-24 08:46:35 -07:00
John Koleszar	8ca779aba8	disable compilation of debugging code This patch avoids compiling some debugging code in onyx_if.c. The most significant fix is to avoid generating code for vp8_write_yuv_frame, which is never called. Some other code was removed by the dead code elimination performed by the compiler, and this patch does it with the preprocessor instead. There are advantages both ways. Change-Id: I044fd43179d2e947553f0d6f2cad5b40907ac458	2010-09-24 11:42:22 -04:00
Yunqing Wang	aab0f5b121	Merge "Adjust multi-thread sync ranges according to image sizes"	2010-09-24 08:34:07 -07:00
John Koleszar	48e76ff4fd	move reconintra_mt to decoder (for now) reconintra_mt.c is only required for building the decoder right now. It could definitely be used for the encoder in the future, but it currently depends on decoder only data structures. (onyxd_int.h, VP8D_COMP, etc). Move it from common/ to decoder/ until the necessary changes to the common multithread code are complete. This patch is needed to build with --disable-vp8-decoder. Change-Id: I568c52221a2b309234d269675cba97131ce35c86	2010-09-24 11:23:06 -04:00
John Koleszar	329aaaf453	Merge "Add getter functions for the interface data symbols"	2010-09-24 05:39:48 -07:00
John Koleszar	fa7a55bb04	Add getter functions for the interface data symbols Having these symbols be available as functions rather than data is occasionally more convenient. Implemented this way rather than a get-codec-by-id style to avoid creating a link-time dependency between the encoder and the decoder. Fixes issue #169 Change-Id: I319f281277033a5e7e3ee3b092b9a87cce2f463d	2010-09-23 14:58:43 -04:00
Yunqing Wang	8db5da2906	Adjust multi-thread sync ranges according to image sizes In multi-threaded decoder, set different sync ranges for different video resolutions. Change-Id: Iea48fd36f51919e0152c8ed3b1f10e1b723c0ca7	2010-09-23 13:53:09 -04:00
Johann	7fed3832e7	Remove dead code The new loopfilter was originally introduced as an experimental change. It's permanent now. Change-Id: I25dbedb6ceff3e9f9c04e18bb29f84c3ecb7e546	2010-09-22 11:07:34 -04:00
John Koleszar	cdd2066687	unset execute bit on c source Change-Id: I6625ee41f8872908cb015ce0729e1c7a105b5217	2010-09-21 19:48:06 -04:00
John Koleszar	6f4c0435d1	Merge "Don't reset mb clamping state during splitmv decoding"	2010-09-21 09:06:59 -07:00
John Koleszar	4d391e8ed2	Don't reset mb clamping state during splitmv decoding The MV decoding changes in `c5fb0eb` introduced a bug where the macroblock clamping state was reset for each partition, so if an earlier partition needed clamping but a subsequent one didn't, the MB wouldn't receive clamping. Instead, the state is only set during splitmv decoding, never cleared. Change-Id: I224fe258493405ee0f6a04596acdb622c475e845	2010-09-21 11:58:48 -04:00
John Koleszar	015cfcafbd	Merge "Add high limit check for unsigned parameters"	2010-09-21 05:36:46 -07:00
Yunqing Wang	a23ccf8f8c	Merge "Restructure multi-threaded decoder"	2010-09-21 05:00:30 -07:00
Fritz Koenig	b7dc9398f2	Use movq instead of movdqu. Movdqu is more expensive (throughput, uops) than movq. Minimal impact for newer big cores, but ~2.25% gain on Atom. Change-Id: I62c80bb1cc01d8a91c350c4c7719462809a4ef7f	2010-09-20 11:34:26 -07:00
Fritz Koenig	1c906448cc	Merge "Better choice of instruction filter mask comparison."	2010-09-20 11:01:51 -07:00
Johann	6cf2b4aa0e	Merge "reorder data to use wider instructions"	2010-09-20 10:47:33 -07:00
Johann	9c9afbab85	Merge "Update NEON wide idcts"	2010-09-20 10:47:22 -07:00
Fritz Koenig	8eae7fe7e8	Better choice of instruction filter mask comparison. Use pmaxub instead of a combination of psubusb/por to determine if any comparisons go over the limit. Change-Id: I3f0bd7d2aabe5fee9ba6620508e2b60605abcb82	2010-09-20 10:20:38 -07:00
Guillermo Ballester Valor	236906863a	Add high limit check for unsigned parameters The patch related to issue #55 (`5a72620`) fixed some warnings, but the fix was not optimal. It was actually a trick to confuse the compiler rather than a fix. This patch fixes it by creating a new macro, used when only a high limit check is needed for an unsigned parameter. Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5	2010-09-20 10:03:05 -04:00
Johann	022323bf85	reorder data to use wider instructions the previous commit laid the groundwork by doing two sets of idcts together. this moved that further by grouping the interesting data (q[0], q+16[0]) together to allow using wider instructions. also managed to drop a few instructions by recognizing that the constant for sinpi8sqrt2 could be downshifted all the time, which avoided a downshift as well as workarounds for a function which only accepted signed data. looks like a modest gain for performance: at qcif, went from ~180 fps to ~183 Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf	2010-09-17 16:47:39 -04:00
Yunqing Wang	f857a85088	Restructure multi-threaded decoder On each MB, loopfiltering is done right after MB decoding. This combines two loops in multi-threaded code into one, which reduces number of synchronizations to half. The above-row/left-col data are saved in temp buffers for next-row/next MB decoding. Tests on 4-core gLucid machine showed 10% decoder performance gain with threads=4 (tulip clip). Testing on other platforms isn't done yet. Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9	2010-09-17 09:56:05 -04:00
John Koleszar	9100073e8d	cleanup: remove unused xprintf These files aren't currently used, and we can get them back if we need them. Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5	2010-09-16 13:14:12 -04:00
John Koleszar	147b125b15	Reduce size of tokenizer tables This patch reduces the size of the global tables maintained by the tokenizer to 16k from 80k-96k. See issue #177. Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe	2010-09-16 10:00:04 -04:00
Suman Sunkara	00cec8f9e9	Changed code to remove extra read/write loops when not necessary Modified code so that: -When above and left contexts are same and not equal to current segment id, it needs to read a maximum of 2 segment_tree_probabilities. - When above and left contexts are different and not equal to current segment id, it needs to read only a single segment_tree_probability. Change-Id: Idc2cf2c4afcc6179b8162ac5a32c948ff5a9a2ba	2010-09-14 16:05:42 -04:00
Fritz Koenig	769f2424cc	Removed unnecessary pxor. There is no need to make sure that the lower byte of the register is 0 because the downshift by 11 overwrites that byte. Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1	2010-09-13 18:34:34 -07:00
Fritz Koenig	71a1c19754	Merge "Make block access to frame buffer sequential"	2010-09-13 11:04:22 -07:00
Suman Sunkara	be7e4e854c	Delta updates to segmentation map using left and above contexts. -Updates by making use of spatial correlation. -Checks if the segment_id is same as above or left context and encodes only the update to the map instead of updating individual segment_ids. Change-Id: Ib861df97e8aa2b37516219eeddcdbaf552b6a249	2010-09-13 10:01:21 -04:00
Fritz Koenig	a65cd3def0	Make block access to frame buffer sequential Sequentially accessing memory from a low address to a high address should make it easier for the processor to predict the cache. Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d	2010-09-10 16:27:28 -07:00
Scott LaVarnway	a32ded1d5f	Merge "Improved subset block search"	2010-09-09 11:51:29 -07:00
Scott LaVarnway	c5fb0eb8d9	Improved subset block search Improved the subset block search and fill. (about 3% improvement for 32 bit) Modified/merged the code in order to create vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock level. This will allow the decode loop (in the future) to decode modes/mvs on a frame, row, or mb level. Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3	2010-09-09 14:42:48 -04:00
Johann	14ba764219	Update NEON wide idcts Expand `93c32a55` which used SSE2 instructions to do two idct/dequant/recons at a time to NEON. Initial working commit. More work needs to be put into rearranging and interlacing the data to take advantage of quadword operations, which is when we'll hopefully see a much better boost Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1	2010-09-09 14:08:12 -04:00
John Koleszar	edcbb1c199	Fix GF interval for non-lagged ARFs When ARFs are enabled in non-lagged compress modes, the GF interval was being reset to zero. Non-lagged ARF updates were enabled in commit `63ccfbd`, but this incorrect GF interval caused a quality regression. Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3	2010-09-09 13:18:54 -04:00
Fritz Koenig	6d90f867e4	Merge branch 'master' of git://review.webmproject.org/libvpx	2010-09-09 08:54:21 -07:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
Jim Bankoski	69ae8f475d	Skip unnecessary search of identical frames vp8_get_compressed_data() was defeating logic in encode_frame_to_datarate() that determined the reference buffers to search and forcing all frames to be eligible to search. In cases where buffers have identical contents, this is unnecessary extra work. Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114	2010-09-08 11:31:34 -04:00
Jim Bankoski	63ccfbd545	Enable ARFs for non-lagged compress ARFs were explicitly disabled except in lagged compress mode. New ARF logic allows for the ARF buffer to hold an older golden frame, which does not require lagged compress. Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79	2010-09-08 11:26:13 -04:00
Fritz Koenig	3fb37162a8	Bilinear subpixel optimizations for ssse3. Used pmaddubsw for multiply and add of two filter taps at once for 16x16 and 8x8 blocks. Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea	2010-09-07 17:19:40 -07:00
Scott LaVarnway	0de458f6b9	Reduced the size of MB_MODE_INFO Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613	2010-09-03 16:43:23 -04:00
John Koleszar	4496db45e3	Whitespace: nuke CRLFs Change-Id: I8b9fdf9875a8fcff4cb49a3357ce44f18108c2e7	2010-09-02 13:33:01 -04:00
James Zern	76640f85da	encoder: remove postproc dependency Remove the dependency on postproc.c for the encoder in general, the only unchecked need for it is when CONFIG_PSNR is enabled. All other cases are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file will still be included. Additionally, when VP8_SET_POSTPROC is used with the encoder when post processing has been disabled an error will be returned. This addresses issue #153. Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089	2010-09-02 11:52:37 -04:00
John Koleszar	7a3e0a1d93	Merge "added separate rounding/zbin constants for 2nd order"	2010-09-02 08:42:29 -07:00
John Koleszar	9398be0f46	Merge "Disable frame dropping by default"	2010-09-02 08:41:46 -07:00
Yaowu Xu	fca129203a	added separate rounding/zbin constants for 2nd order This allows experiments of using different rounding and zerobin constants for 2nd order blocks. Change-Id: Idd829adba3edd1f713c66151a8d29bb245e33a71	2010-09-02 10:27:03 -04:00
John Koleszar	23216211bc	Disable frame dropping by default This is not the behavior that most users expect. Change-Id: I226126ea400c22cf1f7918e80ea7fe0771c569cb	2010-09-02 09:32:03 -04:00
Frank Galligan	d45e55015e	Fix rare deadlock before loop filter There was an extremely rare deadlock that happened when one thread was waiting to start the loop filter on frame n while the other threads were starting to work on frame n+1. Change-Id: Icc94f728b3b6663405435640d9a2996735ba19ef	2010-09-01 22:01:21 -04:00
Paul Wilkins	18c902f8a4	Merge "Improved Force Key Frame Behaviour"	2010-09-01 02:45:12 -07:00
Yunqing Wang	0e78efad0b	Replace sleep(0) calls in multi-threaded decoder This is a workaround for gLucid problem. Change-Id: I188a016a07e4c2ea212444c5a6284ff3c48a5caa	2010-08-31 20:37:11 -04:00
Paul Wilkins	c239a1b67c	Improved Force Key Frame Behaviour These changes improve the behaviour of the code with forced key frames sent in by a calling application. The sizing of the frames is still suboptimal for two pass in particular but the behaviour is much better than it was. Change-Id: I35fae610c67688ccc69d11f385e87dfc884e65a1	2010-08-31 14:32:40 -04:00
Johann	0b94f5d6e8	followup arm patch make the arm asm detokenizer work with the new structures Change-Id: I7cd92c2a018ec24032bb1cfd1bb9739bc84b444a	2010-08-31 11:41:10 -04:00
Scott LaVarnway	e85e631504	Changed above and left context data layout The main reason for the change was to reduce cycles in the token decoder. (~1.5% gain for 32 bit) This layout should be more cache friendly. As a result of this change, the encoder had to be updated. Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837 Note: dixie uses a similar layout	2010-08-31 11:24:30 -04:00
John Koleszar	aaad6d1b54	Merge "Fix harmless off-by-1 error."	2010-08-30 12:40:42 -07:00
John Koleszar	674e477b81	Merge "increase rate control buffer level precision"	2010-08-30 07:49:35 -07:00
Timothy B. Terriberry	7a8e0a2935	Fix harmless off-by-1 error. The memory being zeroed in vp8_update_mode_info_border() was just allocated with calloc, and so the entire function is actually redundant, but it should be made correct in case someone expects it to actually work in the future. Change-Id: If7a84e489157ab34ab77ec6e2fe034fb71cf8c79	2010-08-27 16:07:54 -07:00
Johann	5c244398e1	clean up compiler warnings did a test compile with clang and got rid of some warnings that have been annoying me for a while: vp8/decoder/detokenize.c: In function 'vp8_init_detokenizer': vp8/decoder/detokenize.c:121: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:122: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:123: warning: assignment from incompatible pointer type vp8/decoder/detokenize.c:124: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:125: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:128: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:129: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:130: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:131: warning: assignment discards qualifiers from pointer target type Change-Id: I78ddab176fe47cbeed30379709dc7bab01c0c2e4	2010-08-24 18:23:16 -04:00
Johann	d73217ab17	update structures mbmi and eob moved in previous commits Change-Id: I30a2eba36addf89ee50b406ad4afdd059a832711	2010-08-23 13:44:56 -04:00
Fritz Koenig	93c32a55c2	Rework idct calling structure. Moving the eob structure allows for a non-struct based function to handle decoding an entire mb of idct/dequant/recon data. This allows for SIMD functions to idct/dequant/recon multiple blocks at once. SSE2 implementation gives 3% gain on Atom. Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2	2010-08-23 08:58:54 -07:00
John Koleszar	8e7ebacb19	increase rate control buffer level precision The external API exposes the RC initial/optimal/full buffer level in milliseconds, but this value was truncated internally to seconds. This patch allows the use of the full precision during the conversion from time to bits. Change-Id: If8dd2a87614c05747f81432cbe75dd9e6ed2f04e	2010-08-20 11:04:48 -04:00
Jim Bankoski	b0660457fe	Revert "Removed ssse3 sixtap code" This reverts commit `6ea5bb85cd`.	2010-08-19 15:58:27 -04:00
Johann	52852da7c9	cleanup simple loop filter move some things around, reorder some instructions constant 0 is used several times. load it once per call in horiz, once per loop in vert. separate saturating instructions to avoid stalls. just use one usub8 call to set GE flags, rather than uqsub8 followed by usub8 w/ 0 document some stalls for further consideration Change-Id: Ic3877e0ddbe314bb8a17fd5db73501a7d64570ec	2010-08-19 13:37:40 -04:00
Johann	a522be2941	Merge "fix armv6 simpleloop filter"	2010-08-19 08:31:57 -07:00
Johann	467a0b99ab	fix armv6 simpleloop filter test cases were causing a crash because the count was being read incorrectly. after fixing that, noticed that the output was not matching. fixed that. Change-Id: Idb0edb887736bd566a3cf6d4aa1a03ea8d20eb27	2010-08-19 11:29:21 -04:00
Scott LaVarnway	6ea5bb85cd	Removed ssse3 sixtap code Change-Id: I0f20fbb898ee31eb94a143471aa6f1ca17a229a4	2010-08-18 15:34:09 -04:00
John Koleszar	496cf8cc48	Merge "store more vars than we removed"	2010-08-16 07:54:48 -07:00
Johann	c75f3993c0	store more vars than we removed only saved r4-11+lr, but were storing r4-r12+lr Change-Id: If77df1998af50e9badee7d99ef53543046434675	2010-08-16 10:32:15 -04:00
John Koleszar	9aa498b82a	arm: fix missing dependency with --enable-shared The C version of the dequant/idct/add function depends on the C version of the IDCT, but this isn't compiled in on ARM. Since this code has asm version, we can just remove this file to eliminate the link error. Change-Id: I21de74d89d3765a1db2da27292b20727c53178e9	2010-08-16 09:34:34 -04:00
John Koleszar	80d3923a78	move segmentation_common to encoder vp8_update_gf_useage_maps() is only used by the encoder. This patch fixes the ability to build in decode-only or encode-only configurations. Change-Id: I3a5211428e539886ba998e09e8abd747ac55c9aa	2010-08-13 14:54:24 -04:00
Johann	9602799cd9	framework for assembly version of the detokenizer adds a compile time option: --enable-arm-asm-detok which pulls in vp8/decoder/arm/detokenize.asm currently about break even speed wise, but changes are pending to the fill code (branch and load 3 bytes versus conditionally always load one) and the error handling. Currently it doesn't handle zero runs or overrunning the buffer. this is really just so i don't have to rebase my changes all the time to run benchmarks - now just need to replace one file! Change-Id: I56d0e2354dc0ca3811bffd0e88fe1f952fa6c797	2010-08-12 16:39:56 -04:00
Johann	633646b73b	update structure mode_info_context->mbmi no longer gets copied up a level Change-Id: Icd2d27d381909721326c34594a1ccdc26d48a995	2010-08-12 16:37:55 -04:00
Johann	1ec7981c34	remove unused definition asm_offsets contains some definitions which are no longer used. this was one of them. v6 build works now Change-Id: If370cfa8acd145de4fead2d9a11b048fccc090df	2010-08-12 16:37:55 -04:00
Scott LaVarnway	9c7a0090e0	Removed unnecessary MB_MODE_INFO copies These copies occurred for each macroblock in the encoder and decoder. The temp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result, a large number of compile errors had to be fixed. Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3	2010-08-12 16:25:43 -04:00
Scott LaVarnway	f5615b6149	Merge "Finished vp8_sixtap_predict4x4_ssse3 function"	2010-08-11 12:23:24 -07:00
John Koleszar	d22e2968a8	cosmetics: add missing 2D array braces Silences compile warning. Change-Id: I4b207d97f8570fe29aa2710e4ce4f02e7e43b57a	2010-08-11 13:55:38 -04:00
John Koleszar	392a958274	avoid negative array subscript warnings The mv_ref and sub_mv_ref token encodings are indexed from NEARESTMV and LEFT4X4, respectively, rather than being zero-based like the other token encodings. Change-Id: I3699c3f84111209ecfb91097c4b900773e9a3ad5	2010-08-11 13:49:12 -04:00
Scott LaVarnway	b07e5b6fa1	Finished vp8_sixtap_predict4x4_ssse3 function Added vp8_filter_block1d4_h6_ssse3 and vp8_filter_block1d4_v6_ssse3 assembly routines. Also removed unused assembly. Change-Id: I01c1021835f2edda9da706822345f217087ca0d0	2010-08-11 13:49:00 -04:00
Johann	c0ba42d3c0	rename DETOK_[AL] everything else uses lowercase detok Change-Id: I9671e2e90eb2961208dfa81c00b3accb5749ec04	2010-08-11 13:36:35 -04:00
Scott LaVarnway	99f46d62d9	Moved gf_active code to encoder only The gf_active code is only used by the encoder, so it was moved from common and decoder. Change-Id: Iada15acd5b2b33ff70c34668ca87d4cfd0d05025	2010-08-11 11:54:25 -04:00
Yaowu Xu	c404fa42ac	Removed duplicate functions Change-Id: Ie587972ccefd3c762b8cdf8ef39345cd22924b9b	2010-08-10 21:45:34 -07:00
Yaowu Xu	3b95a46c55	Normalize quantizer's zero bin and rounding factors This patch changes a few numbers in the two constant arrays for quantizer's zerobin and rounding factors, in general to make the sum of the two factors for any Q equal 128. While it might be beneficial to calibrate the two arrays for best quantizer performance, it is not the purpose of this patch. Normalizing the two arrays will enable quick optimization of the current faster quantizer, i.e. the zerobin check can be removed. Change-Id: If9abfd7929bf4b8e9ecd64a79d817c6728c820bd	2010-08-10 21:12:04 -07:00
Timothy B. Terriberry	8fa38096a3	Add trellis quantization. Replace the exponential search for optimal rounding during quantization with a linear Viterbi trellis and enable it by default when using --best. Right now this operates on top of the output of the adaptive zero-bin quantizer in vp8_regular_quantize_b() and gives a small gain. It can be tested as a replacement for that quantizer by enabling the call to vp8_strict_quantize_b(), which uses normal rounding and no zero bin offset. Ultimately, the quantizer will have to become a function of lambda in order to take advantage of activity masking, since there is limited ability to change the quantization factor itself. However, currently vp8_strict_quantize_b() plus the trellis quantizer (which is lambda-dependent) loses to vp8_regular_quantize_b() alone (which is not) on my test clip. Patch Set 3: Fix an issue related to the cost evaluation of successor states when a coefficient is reduced to zero. With this issue fixed, the trellis search now almost exactly matches the exponential search. Patch Set 2: Overall, the goal of this patch set is to make the "trellis" search produce encodings that match the exponential search version. There are three main differences between Patch Set 2 and 1: a. Patch set 1 did not properly account for the scale of 2nd order error, so patch set 2 disables it altogether for 2nd order blocks. b. Patch set 1 was not consistent on when to enable the quantization optimization. Patch set 2 restores the condition to be consistent. c. Patch set 1 checks quantized levels L-1 and L for any input coefficient that was quantized to L. Patch set 2 limits the candidate coefficients to those that were rounded up to L. It is worth noting here that a strategy to check L and L+1 for coefficients that were truncated down to L might work. (a and b get trellis quant to basically match the exponential search on all mid/low rate encodings on cif set; without a and b, trellis quant can hurt the psnr by 0.2 to 0.3 dB at 200kbps for some cif clips) (c gets trellis quant to match the exponential search at Q0 encoding; without c, trellis quant can be 1.5 to 2 dB lower for encodings with fixed Q at 0 on most derf cif clips) Change-Id: Ib1a043b665d75fbf00cb0257b7c18e90eebab95e	2010-08-10 20:58:24 -07:00
Scott LaVarnway	e4fe866949	Added ssse3 version of sixtap filters Improved decoder performance by 9% for the clip used. Change-Id: I8fc5609213b7bef10248372595dc85b29f9895b9	2010-08-10 17:33:49 -04:00
Yunqing Wang	ba2e107d28	First modification of multi-thread decoder This is the first modification of the VP8 multi-thread decoder, which uses the same threads to decode macroblocks and then do loopfiltering for each frame. Inspired by Rob Clark, synchronization was done on every 8 macroblocks instead of every macroblock to reduce lock contention. Compared with the original code, this implementation gave about a 15%-20% performance gain while decoding my test clips on a Core2 Quad platform (Linux). The work is not done yet. Tests on other platforms are needed. Change-Id: Ice9ddb0b511af1359b9f71e65066143c04fef3b5	2010-08-10 14:09:57 -04:00
John Koleszar	618c7d27a0	Mark loopfilter C functions as static Clang defaults to C99 mode, and inline works differently in C99. (gcc, on the other hand, defaults to a special gnu-style inlining, which uses different syntax.) Making the functions static makes sure clang doesn't decide to discard a function because it's too large to inline. Thanks to eli.friedman for the patch. Fixes http://code.google.com/p/webm/issues/detail?id=114 Change-Id: If3c1c3c176eb855a584a60007237283b0cc631a4	2010-08-09 09:36:44 -04:00
John Koleszar	cfb204eaf7	Merge "Issue 150: Fixing linker warning in extend.c."	2010-08-02 09:35:05 -07:00
Jan Kratochvil	0e8f108fb0	nasm: avoid space before the :data symbol type. global label:data ^^ Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I10f17eb1e4d4a718d4ebd1d0ccddc807c365e021	2010-08-02 09:20:42 -04:00
Jan Kratochvil	0327d3df90	nasm: end labels with colon (':') Labels should end by colon (':'), nasm requires it. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I0b2ec6f01afb061d92841887affb5ca0084f936f	2010-08-02 09:20:03 -04:00
Jan Kratochvil	c8134bc54a	nasm: use OWORD vs DQWORD nasm knows only OWORD. yasm knows both OWORD and DQWORD. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I62151390089e90df9a7667822fa594ac20b00e78	2010-08-02 09:17:14 -04:00
John Koleszar	675298216d	Merge "Replace pinsrw (SSE) with MMX instructions"	2010-08-02 06:16:26 -07:00
Philip Jägenstedt	7d243701d9	Replace pinsrw (SSE) with MMX instructions Fixes http://code.google.com/p/webm/issues/detail?id=136 Change-Id: I5a3e294061644a1a9718e8ba4a39548ede25cc42	2010-08-02 09:15:45 -04:00
John Koleszar	38a20e030f	apple: include proper mach primitives Fixes implicit declaration warning for 'mach_task_self'. Patch courtesy of timeless at gmail.com Change-Id: I9991dedd1ccfddc092eca86705ecbc3b764b799d	2010-07-29 17:04:44 -04:00
Yaowu Xu	c2a8d8b54c	Merge "Enable the switch between two versions of quantizer"	2010-07-29 07:17:40 -07:00
Frank Galligan	062e6c1886	Removed two unused global variables. Removed the global variables vp8_an and vp8_cd. vp8_an was causing problems because it was increasing the .bss by 1572864 bytes. Change-Id: I6c12e294133c7fb6e770c0e4536d8287a5720a87	2010-07-28 17:25:09 -04:00
Yaowu Xu	f95c80b60f	Enable the switch between two versions of quantizer To facilitate more testing related to quantizer and rate control, the old version quantizer is added back. old and new quantizer can be switched back and forth by define or un-define the macro "EXACT_QUANT". Change-Id: Ia77e687622421550f10e9d65a9884128a79a65ff	2010-07-28 10:51:34 -07:00
John Koleszar	aa82363c46	Merge "msvs: fix install of codec sources"	2010-07-27 11:21:42 -07:00
Johann	a570bbd418	x86/sse2: disable asm quantizer follow up to Change I0e51492d: neon: disable asm quantizer Now x86 doesn't segfault with --disable-runtime-cpu-detect and -p=2 Change-Id: I8ca127bb299198efebbcbd5a661e81788361933f	2010-07-27 12:54:43 -04:00
Johann	b9a038a5ed	Fix build w/o RTCD So many places to update ... Change-Id: Ide957b40cc833f99c2d1849acade6850fbf7585d	2010-07-27 11:56:19 -04:00
John Koleszar	d8009c077a	neon: disable asm quantizer The assembly version of the quantizer has not been updated to match the new exact quantizer introduced in commit `e04e2935`. That commit tried to disable this code but missed the non-RTCD case. Thanks to David Baker <david.baker at openmarket.com> for isolating the issue and testing this fix. Change-Id: I0e51492dc6f8e44d2c10b587427448bf94135c65	2010-07-27 11:16:19 -04:00
Fritz Koenig	1743f9486b	Merge "update arm idct functions"	2010-07-26 06:05:39 -07:00
Fritz Koenig	3de8a95831	Merge changes I896fe6f9,I90d8b167 * changes: Change the x86 idct functions to do reconstruction at the same time Combine idct and reconstruction steps	2010-07-26 06:05:30 -07:00
Johann	56f5a9a060	update arm idct functions Jeff Muizelaar posted some changes to the idct/reconstruction c code. This is the equivalent update for the arm assembly. This shows a good boost on v6, and a minor boost on neon. Here are some numbers for highway in qcif, 2641 frames: HEAD neon: ~161 fps new neon: ~162 fps HEAD v6: ~102 fps new v6: ~106 fps The following functions have been updated for armv6 and neon: vp8_dc_only_idct_add vp8_dequant_idct_add vp8_dequant_dc_idct_add Conflicts: vp8/decoder/arm/armv6/dequantdcidct_v6.asm vp8/decoder/arm/armv6/dequantidct_v6.asm Resolved by removing these files. When I rewrote the functions, I also moved the files to dequant_dc_idct_v6.asm/dequant_idct_v6.asm Change-Id: Ie3300df824d52474eca1a5134cf22d8b7809a5d4	2010-07-26 08:55:19 -04:00
Justin Lebar	1d8277f8e8	Issue 150: Fixing linker warning in extend.c.	2010-07-23 16:42:25 -07:00
Fredrik Söderquist	2add72d9bc	Don't dereference ctx->priv if it hasn't been setup correctly.	2010-07-23 19:13:50 -04:00
Fredrik Söderquist	eafcf918a0	Only touch ctx->priv if vp8_mmap_alloc succeeded.	2010-07-23 19:13:34 -04:00
Jeff Muizelaar	98fcccfe97	Change the x86 idct functions to do reconstruction at the same time Change-Id: I896fe6f9664e6849c7cee2cc6bb4e045eb42540f	2010-07-23 15:21:36 -04:00
Jeff Muizelaar	b2fa74ac18	Combine idct and reconstruction steps This moves the prediction step before the idct and combines the idct and reconstruction steps into a single step. Combining them seems to give an overall decoder performance improvement of about 1%. Change-Id: I90d8b167ec70d79c7ba2ee484106a78b3d16e318	2010-07-23 15:21:36 -04:00
Fritz Koenig	0ce3901282	Swap alt/gold/new/last frame buffer ptrs instead of copying. At the end of the decode, frame buffers were being copied. The frames are not updated after the copy, they are just for reference on later frames. This change allows multiple references to the same frame buffer instead of copying it. Changes needed to be made to the encoder to handle this. The encoder is still doing frame buffer copies in similar places where pointer reference could be done. Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66	2010-07-23 14:53:59 -04:00
Paul Wilkins	68cf24310b	Merge commit 'refs/changes/51/351/1' of ssh://review.webmproject.org:29418/libvpx into KfRateBugMerged	2010-07-23 17:45:26 +01:00
Yaowu Xu	f5cf8553a2	Merge "Make the quantizer exact."	2010-07-23 09:26:26 -07:00
Paul Wilkins	9404c7db6d	Rate control bug with long key frame interval. In two pass encodes, the calculation of the number of bits allocated to a KF group had the potential to overflow for high data rates if the interval is very long. We observed the problem in one test clip where there was one section where there was an 8000 frame gap between key frames. Change-Id: Ic48eb86271775d7573b4afd166b567b64f25b787	2010-07-23 17:01:12 +01:00
Timothy B. Terriberry	e04e293522	Make the quantizer exact. This replaces the approximate division-by-multiplication in the quantizer with an exact one that costs just one add and one shift extra. The asm versions have not been updated in this patch, and thus have been disabled, since the new method requires different multipliers which are not compatible with the old method. Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206	2010-07-23 08:48:01 -07:00
Paul Wilkins	d576690ba1	80 character line length on Arnr LUT Tweaked table to fit to 80 characters. Change-Id: Ie6ba80e0b31b33e23d2bf78599abe223369fcefb	2010-07-23 16:47:54 +01:00
Fritz Koenig	08eed049d4	Remove CONFIG_NEW_TOKENS files. These files were out of date and no longer maintained. Token decoding has implemented the no-crash code which is incompatible with this arm assembly code. Change-Id: Ibf729886c56fca48181af60b44bda896c30023fc	2010-07-22 19:00:21 -04:00
John Koleszar	4d86ef3534	msvs: fix install of codec sources The libs.mk file must be installed for the vpx.vcproj file to be generated. It was being installed, but not in the src/ directory as expected. Also missed include files yasm.rules, quantize_x86.h Change-Id: Ic1a6f836e953bfc954d6e42a18c102a0114821eb	2010-07-22 18:33:25 -04:00
Johann	160d671e34	Merge "limit range checking code for L[k] to CONFIG_DEBUG. patch by timeless@gmail.com"	2010-07-21 12:59:39 -07:00
Yaowu Xu	7a89d4c3d4	Merge "Improve the accuracy of forward walsh-hadamard transform"	2010-07-19 07:50:26 -07:00
Paul Wilkins	0ba32632cd	ARNR Lookup Table. Change submitted for Adrian Grange. Convert threshold calculation in ARNR filter to a lookup table. Change-Id: I12a4bbb96b9ce6231ce2a6ecc2d295610d49e7ec	2010-07-19 14:46:42 +01:00
Paul Wilkins	02277b8aa3	Parameter limit change. Change maximum ARNR filter width to 15. Change-Id: I3b72450ea08e96287445ec18810630ee2292954c	2010-07-19 14:39:43 +01:00
Paul Wilkins	bf18069ceb	Rate control fix for ARNR filtered frames. Previously we had assumed that it was necessary to give a full frame's bit allocation to the alt ref frame if it has been created through temporal filtering. This is not the case. The active max quantizer control ensures that sufficient bits are allocated if needed and allocating a full frame's worth of bits creates an excessive overhead for the ARF. Change-Id: I83c95ed7bc7ce0e53ccae6ff32db5a97f145937a	2010-07-19 14:10:07 +01:00
Paul Wilkins	7c938f4d3c	Fix: Incorrect 'cols' calculation in temporal filter. Change-Id: I37f10fbe4fbb505c1d34980a59af3e817c287e22	2010-07-16 15:57:17 +01:00
Michael Kohler	80f0e7a7d0	limit range checking code for L[k] to CONFIG_DEBUG. patch by timeless@gmail.com	2010-07-12 18:41:45 +02:00
John Koleszar	16249382cb	Merge "Fix misspelled "skiped" in onyxc_int.h to "skipped"."	2010-07-07 16:57:08 -07:00
Yaowu Xu	3d0a1edadd	Fix a compiling error on armv6 The issue was caused by a bad merge in Change I5559d1e8 Change-Id: I6563f652bc1500202de361f8f51d11cc6ddf3331	2010-07-07 14:45:13 -04:00
Michael Kohler	1e23f45119	Fix misspelled "skiped" in onyxc_int.h to "skipped". Signed-off-by: Michael Kohler <michaelkohler@live.com>	2010-07-07 20:06:04 +02:00
Adrian Grange	0618ff14d6	Fix bug in 1st pass motion compensation In the case where the best reference mv is not (0,0) a secondary search is carried out centered on (0,0). However, rather than sending tmp_err into the search function, motion_error was inadvertently passed. As a result tmp_err remains set at INT_MAX and the (0,0)-based search result will never be selected, even if it is better. Change-Id: I3c82b246c8c82ba887b9d3fb4c9e0a0f2fe5a76c	2010-07-01 14:19:43 +01:00
Paul Wilkins	2e3d8d3263	Merge "Further adjustment of RD behaviour with Q and Zbin."	2010-07-01 01:53:40 -07:00
John Koleszar	b3eb3d2163	Merge "Remove INLINE/FORCEINLINE"	2010-06-30 07:59:39 -07:00
John Koleszar	308e867f91	Update loopfilter frame/filter/sharp info for multithread Change I9fd1a5a4 updated the multithreaded loopfilter to avoid reinitializing several parameters if they haven't changed from the last frame, but the code to update the last frame's parameters wasn't invoked in the multithreaded case. Change-Id: Ia23d937af625c01dd739608e02d110f742b7e1f2	2010-06-30 10:23:53 -04:00
Yunqing Wang	b2f77866aa	Merge "Add loopfilter initialization fix in multithreading code"	2010-06-30 06:56:36 -07:00
Yunqing Wang	29d586b462	Add loopfilter initialization fix in multithreading code Modified loopfilter initialization to avoid unnecessary operations. Change-Id: I9fd1a5a49edc1cb8116c2a72a6908b1e437459ec	2010-06-30 09:42:39 -04:00
Adrian Grange	cf49034b14	Merge "Fixed buffer selection for UV in AltRef filtering"	2010-06-30 02:43:47 -07:00
Yunqing Wang	bead039d4d	Improve SSE2 loopfilter functions Restructured and rewrote SSE2 loopfilter functions. Combined u and v into one function to take advantage of SSE2 128-bit registers. Tests on test clips showed a 4% decoder performance improvement on Linux desktop. Change-Id: Iccc6669f09e17f2224da715f7547d6f93b0a4987	2010-06-29 15:23:14 -04:00
Paul Wilkins	1ca39bf26d	Further adjustment of RD behaviour with Q and Zbin. Following conversations with Tim T (Derf) I ran a large number of tests comparing the existing polynomial expression with a simpler ^2 variant. Though the polynomial was sometimes a little better at the extremes of Q it was possible to get close for most clips and even a little better on some. This code also changes the way the RD multiplier is calculated when the ZBIN is extended to use a variant of the same ^2 expression. I hope that this simpler expression will be easier to tune further as we expand our test set and consider adjustments based on content. Change-Id: I73b2564346e74d1332c33e2c1964ae093437456c	2010-06-29 12:15:54 +01:00
Yaowu Xu	b62d093efa	Improve the accuracy of forward walsh-hadamard transform Besides the slight improvement in round trip error, this also fixes a sign bias in the forward transform, so the round trip errors are evenly distributed between +1s and -1s. The old bias seemed to work well with the dc sign bias in old fdct, which no longer exists in the improved fdct. Change-Id: I8635e7be16c69e69a8669eca5438550d23089cef	2010-06-28 22:10:48 -07:00
Adrian Grange	aa8fe0d269	Fixed buffer selection for UV in AltRef filtering Corrected setting of "which_buffer" for U & V cases to match that used for Y, i.e. to refer to the temporally most recent frame of those to be filtered. Change-Id: Idf94b287ef47a05f060da3e61134a0b616adcb6b	2010-06-28 16:45:06 +01:00
Scott LaVarnway	f1a3b1e0d9	Added first-pass sse2 version of Yaowu's new fdct. Change-Id: Ib479210067510162879c368428b92690591120b2	2010-06-24 16:40:56 -04:00
Yaowu Xu	d0dd01b8ce	Redo the forward 4x4 dct The new fdct lowers the round trip sum squared error for a 4x4 block ~0.12, or ~0.008/pixel. For reference, the old matrix multiply version has average round trip error 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79	2010-06-24 13:17:58 -07:00
Fritz Koenig	a5906668a3	vp8cx : bestsad declared and initialized incorrectly. bestsad needs to be an int and set to INT_MAX because at the end of the function it is compared to INT_MAX to determine if there was a match in the function. Change-Id: Ie80e88e4c4bb4a1ff9446079b794d14d5a219788	2010-06-24 14:30:48 -04:00
Fritz Koenig	cecdd73db7	vp8cx : bestsad declared and initialized incorrectly. bestsad should be an int initialized to INT_MAX. The optimized SAD function expects a signed value for bestsad to use for comparison and early loop termination. When no match is made, which is determined by a comparison of bestsad to INT_MAX, INT_MAX is returned.	2010-06-24 12:18:23 -04:00
John Koleszar	5e34461448	Remove INLINE/FORCEINLINE These are mostly vestigial, it's up to the compiler to decide what should be inlined, and this collided with certain Windows platform SDKs. Change-Id: I80dd35de25eda7773156e355b5aef8f7e44e179b	2010-06-24 09:24:33 -04:00
agrange	a08df4552a	Fix breakout thresh computation for golden & AltRef frames 1. Unavailability of each reference frame type should be tested independently, 2. Also, only the VP8_GOLD_FLAG needs to be tested before setting golden frame specific thresholds, and only VP8_ALT_FLAG needs testing before setting thresholds relevant to the AltRef frame. (Raised by gbvalor, in response to Issue 47) Change-Id: I6a06fc2a6592841d85422bc1661e33349bb6c3b8	2010-06-21 16:50:59 +01:00
agrange	daa5d0eb3d	Changed unary operator from ! to ~ Since the intent is to reset the appropriate bit in ref_frame_flags not to test a logic condition. Prior result would always have been ref_frame_flags being set to 0. (Issue reported by dgohman, issue 47) Change-Id: I2c12502ed74c73cf38e98c9680e0249c29e16433	2010-06-21 15:23:51 +01:00
agrange	d4b99b8e3a	Moved DOUBLE_DIVIDE_CHECK to denominator (was on numerator) The DOUBLE_DIVIDE_CHECK macro prevents division by 0, so must be on the denominator to work as intended. Change-Id: Ie109242d52dbb9a2c4bc1e11890fa51b5f87ffc7	2010-06-21 15:20:52 +01:00
Timothy B. Terriberry	9f81463454	Fix a linker error on x86-64 Linux when not using a version script. If the version script produced by the libvpx build system is not used when linking a shared library on x86-64 Linux, the constant data in the subpel filters produces R_X86_64_32 relocation errors due to the use of wrt rip addressing instead of wrt rip wrt ..gotpcrel. Instead of adding a new macro for this addressing mode, this patch sets the ELF visibility of these symbols to "hidden", which allows wrt rip addressing to work without a text relocation. This allows building a shared library without using the provided build system or a separate version script. Fixes http://code.google.com/p/webm/issues/detail?id=46 Change-Id: Ie108f9d9a4352e5af46938bf4750d2302c1b2dc2	2010-06-21 08:19:12 -04:00
Jim Bankoski	220daa00e0	vp8_block_error_xmm: remove unnecessary instructions Remove a couple instructions from this function which weren't necessary for correct execution. Change-Id: Ib649674f140689f7e5c1530c35686241688a3151	2010-06-18 13:34:43 -04:00
John Koleszar	94c52e4da8	cosmetics: trim trailing whitespace When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d	2010-06-18 13:06:11 -04:00
John Koleszar	c65e8e8e46	Merge "Change bitreader to use a larger window."	2010-06-17 18:08:36 -07:00
Timothy B. Terriberry	c17b62e1bd	Change bitreader to use a larger window. Change bitreading functions to use a larger window which is refilled less often. This makes it cheap enough to do bounds checking each time the window is refilled, which avoids the need to copy the input into a large circular buffer. This uses less memory and speeds up the total decode time by 1.6% on an ARM11, 2.8% on a Cortex A8, and 2.2% on x86-32, but less than 1% on x86-64. Inlining vp8dx_bool_decoder_fill() has a big penalty on x86-32, as does moving the refill loop to the front of vp8dx_decode_bool(). However, having the refill loop between computation of the split values and the branch in vp8_decode_mb_tokens() is a big win on ARM (presumably due to memory latency and code size: refilling after normalization duplicates the code in the DECODE_AND_BRANCH_IF_ZERO and DECODE_AND_LOOP_IF_ZERO cases). Unfortunately, refilling at the end of vp8dx_bool_decoder_fill() and at the beginning of each decode step in vp8_decode_mb_tokens() means the latter requires an extra refill at the end. Platform-specific versions could avoid the problem, but would require most of detokenize.c to be duplicated. Change-Id: I16c782a63376f2a15b78f8086d899b987204c1c7	2010-06-15 19:55:14 -07:00
Yunqing Wang	9fdfb8e928	Merge "More on "some XMM registers are non-volatile on windows x64 ABI""	2010-06-15 06:41:54 -07:00
Yunqing Wang	397aad3ec2	More on "some XMM registers are non-volatile on windows x64 ABI" Add same fix in subpixel_sse2.asm. Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d	2010-06-15 09:11:26 -04:00
John Koleszar	89c8b3dbc6	vp8_cx_iface: set default cpu used to 0 Change-Id: I7b35f4717cdd204224112f72471b551617262417	2010-06-14 17:28:15 -04:00
Guillermo Ballester Valor	5a72620de9	Fix compiler warnings Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d	2010-06-14 17:23:49 -04:00
Scott LaVarnway	48c84d138f	sse2 version of vp8_regular_quantize_b Added sse2 version of vp8_regular_quantize_b which improved encode performance (for the clip used) by ~10% for 32 bit builds and ~3% for 64 bit builds. Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments. Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af	2010-06-14 14:07:56 -04:00
Paul Wilkins	99c5745760	Merge "Use local pointer to pbi->common."	2010-06-14 09:55:02 -07:00
John Koleszar	900d0548db	Merge "Make this/next iiratio unsigned."	2010-06-13 14:35:21 -07:00
Paul Wilkins	5ef25a9728	Merge "Tuning of baseline Rd equation to improve behavior at the"	2010-06-13 04:01:46 -07:00
Paul Wilkins	b99d89d0bf	Merge "Incorrect comment."	2010-06-13 04:01:01 -07:00
John Koleszar	cd475da8ed	Make this/next iiratio unsigned. This patch addresses issue #79, which is a regression since commit `28de670` "Fix RD bug." If the coded error value is zero, the iiratio calculation effectively multiplies by 1000000 via the DOUBLE_DIVIDE_CHECK macro. This can result in a value larger than INT_MAX, giving a negative ratio. Since the error values are conceptually unsigned (though they're stored in a double) this patch makes the iiratio values unsigned, which allows the clamping to work as expected.	2010-06-12 14:11:51 -04:00
John Koleszar	00d566eae1	Merge "require --enable-psnr to build ssim"	2010-06-12 07:10:39 -07:00
John Koleszar	59c50966ac	Enable vp8_sad16x16x4d_sse3 in non-RTCD case Typo caused C version of 16x16x4 SAD to be called when built with --disable-runtime-cpu-detect. Change-Id: I0fe6fa67280b3a5f13acb3c8ed914f039aaaf316	2010-06-11 13:15:30 -04:00
John Koleszar	9099fc0d69	require --enable-psnr to build ssim ssim.c compiles in a huge (512M) amount of global scratch space. Allocating this data on the heap would be a better solution, but this file doesn't need to be built at all in most cases, so as a first pass, disable it except when doing opsnr.stt output (--enable-psnr). Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42	2010-06-11 13:05:08 -04:00
Makoto Kato	63ea8705eb	some XMM registers are non-volatile on windows x64 ABI XMM6 to XMM15 are non-volatile on Windows x64 ABI. We have to save these registers. Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076	2010-06-11 12:11:15 -04:00
Paul Wilkins	20f7332b34	Incorrect comment. (Thanks to Ronald S. Bultje)	2010-06-11 16:12:45 +01:00
Paul Wilkins	7a81b29d38	Use local pointer to pbi->common.	2010-06-11 15:17:57 +01:00
Paul Wilkins	f6a58d620d	Tuning of baseline Rd equation to improve behavior at the low and high Q ends.	2010-06-11 15:10:51 +01:00
Yunqing Wang	8389f1967c	Merge "Improve vp8_sixtap_predict functions"	2010-06-11 06:48:52 -07:00
John Koleszar	fb220d257b	replace while(0) construct with if/else No good reason to be tricky here. I don't know why 'break' occurred to me as the natural replacement for the 'return', but an if/else block is definitely clearer. Change-Id: I08a336307afeb0dc7efa494b37398f239f66c2cf	2010-06-10 20:15:21 -04:00
Timothy B. Terriberry	05c6eca4db	Fix new MV clamping scheme for chroma MVs. The new scheme introduced in I68d35a2f did not clamp chroma MVs in the SPLITMV case, and clamped them incorrectly (to the luma plane bounds) in every other case. Because chroma MVs are computed from the luma MVs before clamping occurs, they could still point outside of the frame buffer and cause crashes. This clamping happens outside of the MV prediction loop, and so should not affect bitstream decoding.	2010-06-10 18:42:24 -04:00
John Koleszar	317a66693b	Remove reference to 'vpx Technologies' Vestigial. Change-Id: Iffa9e6d5ba5199b136d7549890101da17c11e3c3	2010-06-10 12:08:01 -04:00
Yunqing Wang	8873a93811	Improve vp8_sixtap_predict functions Restructure vp8_sixtap_predict functions to eliminate extra 5-line calculation while doing first-pass only. Also, combine functions to eliminate usage of intermediate buffer. This gives decoder a 3% performance gain on my test clips. Change-Id: I13de49638884d1a57d0855c63aea719316d08c1b	2010-06-10 11:48:48 -04:00
Paul Wilkins	10ae99c67b	Merge "Adjust to avoid long line"	2010-06-10 03:24:54 -07:00
Paul Wilkins	a04ed23ff5	Adjust to avoid long line	2010-06-10 11:15:05 +01:00
Paul Wilkins	cd715faa50	Merge "Correct comment"	2010-06-10 03:05:32 -07:00
Paul Wilkins	ae244efb85	Merge "Fix RD bug."	2010-06-10 03:04:45 -07:00
John Koleszar	f6f0ffe96a	Merge "Remove secondary mv clamping from decode stage"	2010-06-09 17:55:57 -07:00
John Koleszar	3085025fa1	Remove secondary mv clamping from decode stage This patch removes the secondary MV clamping from the MV decoder. This behavior was consistent with limits placed on non-split MVs by the reference encoder, but was inconsistent with the MVs generated in the split case. The purpose of this secondary clamping was only to prevent crashes on invalid data. It was not intended to be a behaviour an encoder could or should rely on. Instead of doing additional clamping in a way that changes the entropy context, the secondary clamp is removed and the border handling is made implementation specific. With respect to the spec, the border is treated as essentially infinite, limited only by the clamping performed on the near/nearest reference and the maximum encodable magnitude of the residual MV. This does not affect any currently produced streams. Change-Id: I68d35a2fbb51570d6569eab4ad233961405230a3	2010-06-09 11:47:24 -04:00

781 Commits