generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	f212a98ee7	Fixed unused variable warnings for firstpass.c Change-Id: I8378a9a541ade2f098359a7b20fa08e6c1596d80	2011-04-04 14:18:31 -04:00
Johann	610dd90288	Merge "tweak vp8_regular_quantize_b_sse2"	2011-04-04 08:56:25 -07:00
Yunqing Wang	f5c0d95e8c	Merge "Use full-pixel MV in mvsadcost calculation"	2011-04-04 08:40:51 -07:00
Yunqing Wang	3d6815817c	Use full-pixel MV in mvsadcost calculation MV sad cost error is only used in full-pixel motion search, which only needs full-pixel resolution instead of quarter-pixel resolution. This change reduced mvsadcost table size, and removed unnecessary parameter passing since this table is constant once it is generated. Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0	2011-04-01 16:41:58 -04:00
Johann	8520b5c785	tweak vp8_regular_quantize_b_sse2 rather than look up rc in the zig zag table, embed it in the macro. this also allows us to shuffle some values in the macro and keep *d in rsi gains of about the same order as the obj_int_extract implementation: ~2% Change-Id: Ib7252dd10eee66e0af8b0e567426122781dc053d	2011-04-01 09:58:23 -04:00
Johann	ba11e24d47	Merge "Wrapper function removed from vp8_subtract_b_neon function call"	2011-04-01 05:47:21 -07:00
Tero Rintaluoma	cec76a36d6	Wrapper function removed from vp8_subtract_b_neon function call Address calculations moved from encodemb_arm.c file to neon optimized assembly function to save cycles in function calls. - vp8_subtract_b_neon_func replaced with vp8_subtract_b_neon that contains all needed address calculations - unnecessary file encodemb_arm.c removed - consistent with ARMv6 optimized version Change-Id: I6cbc1a2670b56c2077f59995fcf8f70786b4990b	2011-04-01 10:06:44 +03:00
Johann	9d138379a2	Merge "ARMv6 optimized subtract functions"	2011-03-31 08:40:10 -07:00
Attila Nagy	297b27655e	Runtime detection of available processor cores. Detect the number of available cores and limit the thread allocation accordingly. On decoder side limit the number of threads to the max number of token partitions. Core detection works on Windows and Posix platforms, which define _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN. Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078	2011-03-31 10:23:01 +03:00
Attila Nagy	7d335868df	Fix: lpf semaphore was signaled in single threaded run After picking filter level, post the loopfilter semaphore just when multiple threads are in use. Change-Id: If7bfb64601d906adef703f454dafc25e978b93c6	2011-03-30 15:55:29 +03:00
Johann	0e43668546	Merge "Half pixel variance further optimized for ARMv6"	2011-03-29 12:14:54 -07:00
Yunqing Wang	534ea700bd	Merge "Fix a crash while enabling shared (--enable-shared)"	2011-03-29 09:04:22 -07:00
Yunqing Wang	b843aa4eda	Fix a crash while enabling shared (--enable-shared) Fixed a bug in SSSE3 sub-pixel filter functions. Change-Id: I2e2126652970eb78307ffcefcace1efd5966fb0a	2011-03-29 11:31:06 -04:00
Johann	f0c22a3f33	use GLOBAL correctly on 32bit shared libraries http://code.google.com/p/webm/issues/detail?id=309 Change-Id: I6fce9e2f74bc09a9f258df7f91ab599812324e8c	2011-03-29 11:27:03 -04:00
Tero Rintaluoma	6fdc9aa79f	ARMv6 optimized subtract functions Adds following ARMv6 optimized functions to encoder: - vp8_subtract_b_armv6 - vp8_subtract_mby_armv6 - vp8_subtract_mbuv_armv6 Gives 1-5% speed-up depending on input sequence and encoding parameters. Functions have one stall cycle inside the loop body on Cortex pipeline. Change-Id: I19cca5408b9861b96f378e818eefeb3855238639	2011-03-29 16:52:00 +03:00
Tero Rintaluoma	f5e433464b	Half pixel variance further optimized for ARMv6 Half pixel interpolations optimized in variance calculations. Separate function calls to vp8_filter_block2d_bil_x_pass_armv6 are avoided. On average, performance improvement is 6-7% for VGA@30fps sequences. Change-Id: Idb5f118a9d51548e824719d2cfe5be0fa6996628	2011-03-28 09:51:51 +03:00
Johann	beaafefcf1	Merge "use asm_offsets with vp8_regular_quantize_b_sse2"	2011-03-24 11:06:36 -07:00
Johann	8edaf6e2f2	use asm_offsets with vp8_regular_quantize_b_sse2 remove helper function and avoid shadowing all the arguments to the stack on 64bit systems when running with --good --cpu-used=0: ~2% on linux x86 and x86_64 ~2% on win32 x86 msys and visual studio more on darwin10 x86_64 significantly more on x86_64-win64-vs9 Change-Id: Ib7be12edf511fbf2922f191afd5b33b19a0c4ae6	2011-03-24 13:34:48 -04:00
Johann	4cde2ab765	Merge "ARMv6 optimized fdct4x4"	2011-03-23 07:52:51 -07:00
Yunqing Wang	73065b67e4	Merge "Fix multithreaded encoding for 1 MB wide frame"	2011-03-21 07:41:31 -07:00
John Koleszar	2cbd962088	Remove unused vp8_get4x4sse_cs_mmx declaration This declaration did not match the prototype_sad() prototype, but was unused in this translation unit, so it is removed instead. Fixes issue 290. Change-Id: I168854f88a85f73ca9aaf61d1e5dc0f43fc3fdb3	2011-03-21 07:53:53 -04:00
John Koleszar	769c74c0ac	Merge "Increase static linkage, remove unused functions"	2011-03-21 04:51:51 -07:00
Tero Rintaluoma	a61785b6a1	ARMv6 optimized fdct4x4 Optimized fdct4x4 (8x4) for ARMv6 instruction set. - No interlocks in Cortex-A8 pipeline - One interlock cycle in ARM11 pipeline - About 2.16 times faster than current C-code compiled with -O3 Change-Id: I60484ecd144365da45bb68a960d30196b59952b8	2011-03-21 13:33:45 +02:00
Attila Nagy	bfe803bda3	Fix multithreaded encoding for 1 MB wide frame Thread synchronization was not correct when frame width was 1 MB. Number of allocated encoding threads is limited by the sync_range. There is no point having more because each thread lags sync_range MBs behind the thread processing the row above. http://code.google.com/p/webm/issues/detail?id=302 Change-Id: Icaf67a883beecc5ebf2f11e9be47b6997fdf6f26	2011-03-18 12:35:30 +02:00
John Koleszar	429dc676b1	Increase static linkage, remove unused functions A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779	2011-03-17 20:53:47 -04:00
John Koleszar	8431e768c9	Merge "Fix "used uninitialized" warning in vp8_pack_bitstream()"	2011-03-17 14:25:04 -07:00
Attila Nagy	71bcd9f1af	Add vp8_variance8x8_armv6 and vp8_sub_pixel_variance8x8_armv6 functions Change-Id: I08edaffc62514907fa5e90e1689269e467c857f5	2011-03-15 15:50:44 +02:00
Johann	d0ec28b3d3	Merge "Add vp8_mse16x16_armv6 function"	2011-03-14 12:47:42 -07:00
Attila Nagy	e54dcfe88d	Add vp8_mse16x16_armv6 function Change-Id: I77e9f2f521a71089228f96e2db72524189364ffb	2011-03-14 14:38:31 +02:00
Johann	3788b3564c	Merge "Move build_intra_predictors_mby to RTCD framework"	2011-03-11 10:23:48 -08:00
John Koleszar	27972d2c1d	Move build_intra_predictors_mby to RTCD framework The vp8_build_intra_predictors_mby and vp8_build_intra_predictors_mby_s functions had global function pointers rather than using the RTCD framework. This can show up as a potential data race with tools such as helgrind. See https://bugzilla.mozilla.org/show_bug.cgi?id=640935 for an example. Change-Id: I29c407f828ac2bddfc039f852f138de5de888534	2011-03-11 13:04:50 -05:00
Johann	5c60a646f3	Merge "ARMv6 optimized quantization"	2011-03-11 08:29:00 -08:00
Paul Wilkins	6e73748492	Clean up of vp8_init_config() Clean up vp8_init_config() a bit and remove null pointer case, as this code can't be called any more and is not an adequate trap anyway, as a null pointer would cause exceptions before hitting the test. Change-Id: I937c00167cc039b3aa3f645f29c319d58ae8d3ee	2011-03-11 11:06:51 -05:00
John Koleszar	170b87390e	Merge "1 Pass CQ and VBR bug fixes"	2011-03-11 08:06:09 -08:00
Paul Wilkins	2ae91fbef0	1 Pass CQ and VBR bug fixes Issue 291 highlighted the fact that CQ mode was not working as expected in 1 pass mode, This commit fixes that specific problem but in so doing I also uncovered an overflow issue in the VBR code for 1 pass and some data values not being correctly initialized. For some clips (particularly short clips), the resulting improvement is dramatic. Change-Id: Ieefd6c6e4776eb8f1b0550dbfdfb72f86b33c960	2011-03-11 10:59:34 -05:00
John Koleszar	e34e417d94	Merge "Fix incorrect macroblock counts in twopass rate control"	2011-03-11 06:06:04 -08:00
Yunqing Wang	3c9dd6c3ef	Merge "Align SAD output array to be 16-byte aligned"	2011-03-11 05:56:02 -08:00
John Koleszar	c5c5dcd0be	Merge "vp8cx - psnr converted to call assemblerized sse"	2011-03-11 05:54:00 -08:00
John Koleszar	29c46b64a2	Merge "vp8cx- alternate ssim function with optimizations"	2011-03-11 05:53:41 -08:00
Jim Bankoski	3dc382294b	vp8cx - psnr converted to call assemblerized sse Change-Id: Ie388d4618c44b131f96b9fe526618b457f020dfa	2011-03-11 08:51:22 -05:00
Jim Bankoski	3f6f7289aa	vp8cx- alternate ssim function with optimizations Change-Id: I91921b0a90dbaddc7010380b038955be347964b3	2011-03-11 08:51:21 -05:00
Yunqing Wang	b2aa401776	Align SAD output array to be 16-byte aligned Use aligned store. Change-Id: Icab4c0c53da811d0c52bb7e8134927f249ba2499	2011-03-11 08:24:23 -05:00
Yunqing Wang	76ec21928c	Merge "Encoder loopfilter running in its own thread"	2011-03-11 04:55:05 -08:00
Attila Nagy	9c836daf65	Fix "used uninitialized" warning in vp8_pack_bitstream() Change-Id: Iadcbdba717439f47a2c24e65fd69a3a1464174b5	2011-03-11 12:36:28 +02:00
Attila Nagy	3ae2465788	Encoder loopfilter running in its own thread In multithreaded mode the loopfilter is running in its own thread (filter level calculation and frame filtering). Filtering is mostly done in parallel with the bitstream packing. Before starting the packing the loopfilter level has to be calculated. Also any needed reference frame copying is done in the filter thread. Currently the encoder will create n+1 threads, where n > 1 is the number of threads specified by application and 1 is the extra filter thread. With n = 1 the encoder runs in single thread mode. There will never be more than n threads running concurrently. Change-Id: I4fb29b559a40275d6d3babb8727245c40fba931b	2011-03-11 10:52:51 +02:00
Tero Rintaluoma	7ab08e1fee	ARMv6 optimized quantization Adds new ARMv6 optimized function vp8_fast_quantize_b_armv6 to the encoder. Change-Id: I40277ec8f82e8a6cbc453cf295a0cc9b2504b21e	2011-03-11 10:48:42 +02:00
Adrian Grange	6daacdb785	Added missing format specifier in print statement Printout of firstpass stats for frame had one fewer format specifiers than arguments. Change-Id: I5a42c85aa79c471e1a70afd75e24a91546b7a1cd	2011-03-10 12:43:49 -08:00
Adrian Grange	ed40ff9e2d	Removed firstpass motion map The firstpass motion map consists of an 8-bit flag for each MB indicating how strongly the firstpass code believes it should be filtered during the second pass ARNR filtering. For long or large format material the motion map can become extremely large and hamper the operation of the encoding process. This change removes the motion map altogether, leaving the second pass to rely on the magnitude of the motion compensated error to determine the filter weight to use for the MB during ARNR filtering. Tests on the derf set indicate that the effect of this change is neutral, with some small wins and losses. The motion map has therefore been removed based on a cost/benefit evaluation. Change-Id: I53e07d236f5ce09a6f0c54e7c4ffbb490fb870f6	2011-03-10 11:32:48 -08:00
James Berry	f3e9e2a0f8	Fix incorrect macroblock counts in twopass rate control The previous calculation of macroblock count (w*h)/256 is not correct when the width/height are not multiples of 16. Use the precalculated macroblock count from cpi->common instead. This manifested itself as a divide by zero when the number of pixels was less than 256. num_mbs updated in estimate_max_q, estimate_q, estimate_kf_group_q, and estimate_cq Change-Id: I92ff98587864c801b1ee5485cfead964673a9973	2011-03-10 13:33:06 -05:00
Yunqing Wang	7b8e7f0f3a	Add vp8_sub_pixel_variance16x8_ssse3 function Added SSSE3 function Change-Id: I8c304c92458618d93fda3a2f62bd09ccb63e75ad	2011-03-09 12:33:21 -05:00
Yunqing Wang	4561109a69	Remove unused functions Removed some unused functions Change-Id: Ifdfc27453e53cfc75997b38492901d193a16b245	2011-03-09 10:45:03 -05:00
Yunqing Wang	7966dd5287	Merge "Improve SSE2 half-pixel filter functions"	2011-03-09 07:23:06 -08:00
John Koleszar	fa836faede	Merge "Configuration updates:Making a clear distinction between Init and Change"	2011-03-09 05:07:11 -08:00
Yunqing Wang	419f638910	Improve SSE2 half-pixel filter functions Rewrote these functions to process 16 pixels at once instead of 8. Change-Id: Ic67e80124467a446a3df4cfecfb76a4248602adb	2011-03-08 16:25:06 -05:00
Yunqing Wang	859abd6b5d	Merge "Add zero offset checking in SSE2 sub-pixel filter function"	2011-03-08 12:26:58 -08:00
Yunqing Wang	8432a1729f	Add zero offset checking in SSE2 sub-pixel filter function Skip filter at zero offset. Change-Id: I95fc7e211869bc0ab5bcfb7ab2e3259d1c0ccf38	2011-03-08 15:22:07 -05:00
Yunqing Wang	e8f7b0f7f5	Merge "Write SSSE3 sub-pixel filter function"	2011-03-08 10:58:30 -08:00
Yunqing Wang	244e2e1451	Write SSSE3 sub-pixel filter function 1. Process 16 pixels at one time instead of 8. 2. Add check for both xoffset =0 and yoffset=0, which happens during motion search. This change gave encoder 1%~3% performance gain. Change-Id: Idaa39506b48f4f8b2fbbeb45aae8226fa32afb3e	2011-03-08 13:29:01 -05:00
Ralph Giles	e6948bf0f9	Fix a multi-line format-string warning. GCC 4.5 and 4.6 both issue a warning about the multi-line format string introduced in `bc9c30a0`, which also changed the whitespace in the associated stt file by line-wrapping the long format string. Instead, use multiple string constants, which the compiler will concatenate. This maintains the original formatting, but remains legible within the standard line length. Change-Id: I27c9f92d46be82d408105a3a4091f145f677e00e	2011-03-08 07:14:12 -08:00
Paul Wilkins	de87c420ef	Corrected minor typos. Change-Id: Icc9f12bd1e1bdaf51256dc8a90d08aa9be89ef34	2011-03-08 14:46:22 +00:00
Paul Wilkins	0eccee4378	Merge changes I00c3e823,If8bca004 * changes: Improved key frame detection. Improved KF insertion after fades to still.	2011-03-08 06:40:11 -08:00
John Koleszar	5d1d9911cb	correct zbin boost for splitmv mode Disable zbin boost in SPLITMV mode as intended. Was incorrectly looking at vp8_ref_frame_order instead of vp8_mode_order when comparing against SPLITMV. This condition should have always been false, as SPLITMV is not in the range of valid reference frames. Change-Id: I0408cc7595eff68f00efef6d008e79f5b60d14bf	2011-03-07 20:58:37 -05:00
Paul Wilkins	bc9c30a003	Improved key frame detection. In some cases where clips have been encoded with borders (eg. some wide-screen content where there is a border top and bottom and slide shows containing portrait format photographs (border left and right)) key frames were not being correctly detected. The new code looks to measure cases where a portion of the image can be coded equally easily using intra or inter modes and where the resulting error score is also very low. These "neutral" areas are then discounted in the key frame detection code. Change-Id: I00c3e8230772b8213cdc08020e1990cf83b780d8	2011-03-07 15:58:07 +00:00
Paul Wilkins	9fc8cb39aa	Improved KF insertion after fades to still. This code extends what was previously done for GFs, to pick cases where insertion of a key frame after a fade (or other transition or complex motion) followed by a still section, will be beneficial and will reduce the number of forced key frames. Change-Id: If8bca00457f0d5f83dc3318a587f61c17d90f135	2011-03-07 15:11:09 +00:00
John Koleszar	0bc31f1887	Merge "Fixing divide by zero"	2011-03-04 05:40:33 -08:00
Mikhal Shemer	84f7f20985	Configuration updates:Making a clear distinction between Init and Change Change-Id: I7b2fb326e1aabc08b032177a7b914a5b8bb7376f	2011-03-03 10:35:09 -08:00
Mikhal Shemer	1de99a2a81	Fixing divide by zero Change-Id: I9d8a98a2f7ed1e3116d0bae35164618c41998bac	2011-03-03 10:33:36 -08:00
John Koleszar	36be4f7f06	Fix drastic undershoot in long form content When the modified_error_left accumulator exceeds INT_MAX, an incorrect cast to int resulted in a negative value, causing the rate control to allocate no bits to that keyframe group, leading to severe undershoot and subsequent poor quality. This error was exposed by the recent change to the rolling target and actual spend accumulators in commit `305be4e4` which fixed them to actually calculate the average value rather than be re-initialized on every frame to the average per-frame bitrate. When this bug was triggered, the target bitrate could be 0, so the rolling target becomes small, which causes the undershoot. The code prior to `305be4e4` did not exhibit this behavior because the rolling target was always set to a reasonable value and was independent of the actual target bitrate. With this patch, the actual target bitrate is calculated correctly, and the rate control tracks as expected. This cast was likely added to silence a compiler warning on a comparison between a double (modified_error_left) and an int (0). Instead, this patch removes the cast and changes the comparison to be against 0.0, which should prevent the warning from recurring. This fixes issue #289. Special thanks to gnafu for his efforts in reporting and debugging this fix. Change-Id: Ie5cc1a7b516c578a76c3a50c892a6f04a11621fe	2011-03-02 22:52:27 -05:00
Johann	6f5189c044	Merge "ARMv6 optimized half pixel variance calculations"	2011-03-02 05:48:46 -08:00
Yunqing Wang	cfaee9f7c6	Merge "Add prefetch before variance calculation"	2011-02-28 11:42:28 -08:00
Scott LaVarnway	3e6d476ac3	Merge "Avoid double copying of key frames into alt and golden buffer"	2011-02-28 10:16:33 -08:00
Yunqing Wang	d96ba65a23	Add prefetch before variance calculation This improved encoding performance by 0.5% (good, speed 1) to 1.5% (good, speed 5). Change-Id: I843d72a0d68a90b5f694adf770943e4a4618f50e	2011-02-28 11:25:55 -05:00
Johann	31dab574cc	Merge "Remove a second check for invalid ptr in vp8_get_compressed_data"	2011-02-25 11:44:18 -08:00
Johann	e4fa638653	Merge "Remove temporal alt ref from realtime only build"	2011-02-25 06:55:17 -08:00
Attila Nagy	d8fc974ac0	Avoid double copying of key frames into alt and golden buffer Change-Id: I726976a297a593a35ed6cba3c660e372562f7b27	2011-02-25 09:03:16 +02:00
Attila Nagy	6da2018789	Remove a second check for invalid ptr in vp8_get_compressed_data Check is done first when function is entered. Change-Id: Ief0d0cbd4860aaf492b78728f8d22f24029b1174	2011-02-25 08:41:13 +02:00
Scott LaVarnway	861175ef00	Removed vp8_block2type and used defines instead. Change-Id: Idb56e0295d004793f406dfd2d8d8c546aad62e03	2011-02-24 14:35:18 -05:00
Scott LaVarnway	d53492bba4	Merge "Revisited rd_pick_intra4x4block"	2011-02-24 11:25:21 -08:00
Scott LaVarnway	658454a04c	Revisited rd_pick_intra4x4block Removed unnecessary copies. No noticeable speed gains. Change-Id: I996c50c23fedd06d54ee7a3e762cbf559cc4a9d1	2011-02-24 13:31:47 -05:00
Paul Wilkins	b862c108dd	Overflow of frame error accumulators. This fixes an overflow problem in the frame error accumulators. The overflow condition is extreme but did trigger when Frank B. coded some high motion interlaced HD content. The observed effect was a catastrophic breakdown of the rate control leading to massive undershoot and poor bit allocation. All the error values should really be unsigned but I will look at this separately. Change-Id: I9745f5c5ca2783620426b66b568b2088b579151f	2011-02-24 15:49:41 +00:00
Tero Rintaluoma	8ae92aef66	ARMv6 optimized half pixel variance calculations Adds following ARMv6 optimized functions to the encoder: - vp8_variance_halfpixvar16x16_h_armv6 - vp8_variance_halfpixvar16x16_v_armv6 - vp8_variance_halfpixvar16x16_hv_armv6 Change-Id: I1e9c2af7acd2a51b72b3845beecd990db4bebd29	2011-02-23 13:27:27 +02:00
Attila Nagy	7af0d906e3	Remove temporal alt ref from realtime only build It is not used in realtime mode. Reduces memory footprint. Change-Id: I7f163225762368df5457cfd413050161d3704a3f	2011-02-22 12:53:32 +02:00
Johann	945dad277d	Revert "use unaligned load" This reverts commit `f50f2fd2a7`. Change Ib7506e3e aligns the buffer Change-Id: Ie0f8bd3e57cfdfef81d39638a1451458ebbae2e0	2011-02-18 10:23:02 -05:00
John Koleszar	c764c2a20f	Merge "clean up unused files"	2011-02-18 06:33:05 -08:00
John Koleszar	3ed8fe8778	remove unused vp8_predict_dc function Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42	2011-02-18 09:12:20 -05:00
John Koleszar	cbf923b12c	clean up unused files Removed a number of files that were unused or little-used. Change-Id: If9ae5e5b11390077581a9a879e8a0defe709f5da	2011-02-18 09:09:49 -05:00
John Koleszar	d371ca93e5	cosmetic: remove unnecessary scope Clean up some unnecessary scoping around pick_filter_level. Change-Id: Ic57fa33e3fcae37fe6beae977e5743783399d5af	2011-02-18 08:46:07 -05:00
John Koleszar	597d02b508	Merge "Dont pick encoder filter level when loopfilter is disabled."	2011-02-18 05:26:23 -08:00
Attila Nagy	fb5a692d27	Reinitialize quantizer only when any delta is changing No need to reinitialize for base Q changes. Change-Id: Ie76ec21dd3c5582d5183dbed75ed73a1eed3e291	2011-02-18 14:23:37 +02:00
Attila Nagy	c6ef75690f	Dont pick encoder filter level when loopfilter is disabled. Change-Id: I58154faf4f3ece24f9927a5c3ab7e830e0887fb6	2011-02-18 08:53:00 +02:00
John Koleszar	562f1470ce	Use endian-neutral bitstream packing/unpacking Eliminate unnecessary checks on target endianness and associated macros. Change-Id: I1d4e6a9dcee9bfc8940c8196838d31ed31b0e4aa	2011-02-17 15:20:53 -05:00
John Koleszar	c351aa7f1b	Merge "Fix relative include paths"	2011-02-17 04:13:44 -08:00
Yunqing Wang	da9402fbf6	Merge "Allocate source buffers to be multiples of 16"	2011-02-16 11:35:06 -08:00
Yunqing Wang	da227b901d	Allocate source buffers to be multiples of 16 Currently, when the video frame width is not multiples of 16, the source buffer has a stride of non-multiples of 16, which forces an unaligned load in SAD function and hurts the performance. To avoid that, this change allocates source buffers to be multiples of 16. Change-Id: Ib7506e3eb2cea06657d56be5a899f38dfe3eeb39	2011-02-16 12:57:17 -05:00
Johann	0c2cfff9b0	Merge "ARMv6 optimized sad16x16"	2011-02-16 05:22:38 -08:00
James Zern	0030303b69	Remove redundant ptr checks in calls to vpx_free vpx_free if used contains this check. If replaced, well behaved free will behave similarly. Change-Id: I25483aaa8b39255b9a8cf388d6e5eaa20a908ae1	2011-02-15 12:43:35 -08:00
Yunqing Wang	7725a7eb56	Merge "Improve vp8_sad16x16_sse3 function"	2011-02-14 14:09:25 -08:00
Yaowu Xu	27dad21548	Merge "Improved vp8_rd_pick_intra_mbuv_mode"	2011-02-14 13:58:12 -08:00
Scott LaVarnway	94d4fee08f	Improved vp8_rd_pick_intra_mbuv_mode Eliminated unnecessary calculations. Very small change to performance. Change-Id: Ib7213d43c64e36955177c4d47950ff472266f822	2011-02-14 16:34:33 -05:00
Yunqing Wang	2debd5b5f7	Improve vp8_sad16x16_sse3 function In real-time mode, vp8_sad16x16 function is called heavily in motion search part. Improvement of this function gives 1.2% encoding performance gain (real-time mode, tulip clip). Change-Id: I23c401fc40c061f732a9767e8d383737a179bd58	2011-02-14 16:23:49 -05:00
Yaowu Xu	404e998eb7	Merge "mem leak fix for cpi->tplist"	2011-02-14 11:29:22 -08:00
James Berry	d3dfcde0f7	mem leak fix for cpi->tplist checks added to make sure that cpi->tplist is freed correctly in vp8_dealloc_compressor_data and vp8_alloc_compressor_data. Change-Id: I66149dbbd25c958800ad94f4379d723191d9680d	2011-02-14 14:02:52 -05:00
Scott LaVarnway	d419b93e3e	Improved rd_pick_intra4x4block Eliminated unnecessary calculations. Improved performance by 10% on keyframes and 1.6% overall for the test clip used. Change-Id: I87671b26af5e2cc439e81d0fee3b15c7cd2a3309	2011-02-14 13:32:58 -05:00
Yunqing Wang	353246bd60	Merge "Add improved_mv_pred flag in real-time mode"	2011-02-11 07:20:17 -08:00
Yunqing Wang	9d0b2cbbce	Add improved_mv_pred flag in real-time mode As mentioned in check-in "Improve motion search in real-time mode", MV prediction calculation causes speed loss for speed 7 and above. This change added a flag to turn off this calculation for speed>6 in real-time mode. Change-Id: I9f4ae5a8bf449222d1784b54e7d315fc8347b2d1	2011-02-11 09:59:41 -05:00
Tero Rintaluoma	1ef86980b9	ARMv6 optimized sad16x16 Adds a new ARMv6 optimized function vp8_sad16x16_armv6 to encoder. Change-Id: Ibbd7edb8b25cb7a5b522d391b1e9a690fe150e57	2011-02-11 11:14:07 +02:00
Yaowu Xu	4f8a166058	Merge "Redefining good quality speed settings"	2011-02-10 21:38:19 -08:00
Yunqing Wang	6f53e59641	Merge "Improve motion search in real-time mode"	2011-02-10 12:42:44 -08:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
Yunqing Wang	41e6eceb28	Improve motion search in real-time mode Applied better MV prediction in real-time mode, which improves the encoding quality. Used quarter-pixel search instead of iterative sub-pixel search for speed >=5 to improve encoding performance. Tests on the test set showed: 1. For speed=-5, quality improvement: 1.7% on AvgPSNR and 2.1% on SSIM, performance improvement: 3.6% (This counts in the performance loss caused by MV prediction calculation in "Improve MV prediction in vp8_pick_inter_mode() for speed>3"). 2. For speed=-8, quality improvement: 2.1% on AvgPSNR and 2.5% on SSIM, but a 6.9% performance decrease because of MV prediction calculation. This should be improved later. Change-Id: I349a96c452bd691081d8c8e3e54419e7f477bebd	2011-02-10 13:40:24 -05:00
Johann	7d8199f0c3	Merge "Adds armv6 optimized variance calculation"	2011-02-10 06:06:46 -08:00
Scott LaVarnway	19054ab6da	Redefining good quality speed settings Created a new speed 1 which is in the middle of the old speed 0 and speed 1. (for both quality and performance) Change-Id: I4802133cdb43f359ca787646c090899679dd5d84	2011-02-09 17:18:28 -05:00
James Berry	fffa2a61d7	fixed stride in vp8_temporal_filter_predictors_mb_c stride would not be calculated correctly for material with odd sized frame widths. Change-Id: I1710f6aef9ebb93d36249c9239c68c5baa9791f8	2011-02-09 16:55:39 -05:00
John Koleszar	c2b43164bd	Merge "correct cost for implicit bit in mvs"	2011-02-09 11:20:12 -08:00
John Koleszar	9954d05ca6	correct cost for implicit bit in mvs Use 0xFFF0 vice 240 (0xF0) for determining whether the sometimes implicit bit 3 will be transmitted. This is consistent with the decoder and encode_mvcomponent(). Change-Id: Ic1304d0ab56844bed8236edd1c5243a6767fc6b1	2011-02-09 12:50:17 -05:00
John Koleszar	a39b5af10b	Merge "Put more code under #if CONFIG_MULTITHREAD."	2011-02-09 08:31:36 -08:00
Gaute Strokkenes	315e3c2518	Put more code under #if CONFIG_MULTITHREAD. Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b	2011-02-09 11:21:18 -05:00
Scott LaVarnway	85e79ce288	Merge "Added early breakout for vp8_rd_pick_intra4x4mby_modes"	2011-02-09 07:55:04 -08:00
Tero Rintaluoma	cb14764fab	Adds armv6 optimized variance calculation Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6 and adds new assembly file for variance16x16 calculation. - vp8_filter_block2d_bil_first_pass_armv6 (integrated) - vp8_filter_block2d_bil_second_pass_armv6 (integrated) - vp8_variance16x16_armv6 (new) - bilinearfilter_arm.h (new) Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db	2011-02-09 10:23:43 -05:00
Scott LaVarnway	13db80c282	Added early breakout for vp8_rd_pick_intra4x4mby_modes Improved performance of good quality, speed 0 (3% average) with no average quality loss. Change-Id: Ica34473f99bd74260eaebde6b132185e09e3c09d	2011-02-08 16:50:43 -05:00
Johann	40dcae9c2e	clarify _offsets.asm differences it's difficult to mux the _offsets.c files because of header conflicts. make three instead, name them consistently and partition the contents to allow building them as required. Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796	2011-02-08 16:35:43 -05:00
Yunqing Wang	58d2e70fc5	Fix link error in real-time mode make vp8_mv_pred() and vp8_cal_sad() available in real-time mode. Change-Id: I71dbae241b486ba943458dcbae552ec4a51689d3	2011-02-07 08:21:14 -05:00
Yunqing Wang	350ffe8dae	Merge "Improve MV prediction in vp8_pick_inter_mode() for speed>3"	2011-02-04 10:10:15 -08:00
John Koleszar	63fc44dfa5	correct quantizer initialization The encoder was not correctly catching transitions in the quantizer deltas. If a delta_q was set, then the quantizer would be reinitialized on every frame, but if they transitioned to 0, the quantizer would not be reinitialized, leading to a encode-decode mismatch. This bug was triggered by commit `999e155`, which sets a Y2 delta Q for very low base Q levels. Change-Id: Ia6733464a55ee4ff2edbb82c0873980d345446f5	2011-02-04 11:37:47 -05:00
John Koleszar	c0a9cbebe1	Merge "Delay auto key frame insertion in realtime configuration"	2011-02-04 05:16:15 -08:00
Scott LaVarnway	4aa12b6c5f	Merge "Zero out block mv when an intra mode is selected"	2011-02-03 07:16:52 -08:00
Yunqing Wang	a870315629	Merge "Improved encoder threading"	2011-02-03 05:44:57 -08:00
Attila Nagy	e5904f2d5e	Delay auto key frame insertion in realtime configuration When auto keyframe insertion is enabled and conditions are right (scene change) the encoder can decide to insert a key frame and does a re-encoding. This can introduce extra latency. In RT mode we do not do the re-encoding of the current frame but force the next frame to key frame. Change-Id: I15c175fa845ac4c1a1f18bea3676e154669522a7	2011-02-02 13:54:40 +02:00
Scott LaVarnway	07a7c08aef	Zero out block mv when an intra mode is selected instead of each time mode is tested. Change-Id: Ief0f5586dafde54cc14d348dcecdacb182e7c1d5	2011-02-01 12:55:51 -05:00
Scott LaVarnway	a5ecaca6a7	Removed unnecessary B_MODE_INFO memset. Change-Id: I2bcef6a8e47f88542861fd1356631ca934e2a0e7	2011-02-01 11:35:08 -05:00
Scott LaVarnway	b18df82e1d	Moved rd calculation into vp8_pick_intra4x4mby_modes Then removed unnecessary code. Change-Id: I142658815d843c9396b07881dbdd8d387c43c90e	2011-02-01 11:26:04 -05:00
Scott LaVarnway	4e7e79f770	Removed intra_modes from vp8cx_encode_intra_macro_block Restructured function in order to eliminate the prediction modes save/restore. Code cleanup also. Change-Id: I816e3b910de64d0f0f0ddc2398805c63263191e8	2011-02-01 10:05:35 -05:00
Attila Nagy	385c2a76d1	Improved encoder threading Reduce the number of sync points by letting each thread continue immediately with a new MB row. Better multicore scaling, improves performance by 5-20% on ARM multicore. Change-Id: Ic97e4d1c4886a842c85dd3539a93cb217188ed1b	2011-02-01 12:17:58 +02:00
Scott LaVarnway	9e7fec216e	Removed prediction_error accumulation from vp8cx_encode_intra_macro_block. prediction_error is used when deciding if a frame should be a keyframe. After reviewing this with Yaowu, it was pointed out that vp8cx_encode_intra_macro_block is only called for keyframes, so the accumulation is unnecessary. Change-Id: Id79dc81b80d4f5d124f3a0dba1b923887e2e1ec8	2011-01-31 19:53:02 -05:00
Scott LaVarnway	317f0da91e	Removed last_auto_filter_prediction_error last_auto_filter_prediction_error is not really used. Change-Id: Ic6e56c4076bbd250ef783ee1be46964c85f62864	2011-01-31 19:41:09 -05:00
Scott LaVarnway	4a15e55793	Possible bug in vp8cx_encode_intra_macro_block vp8_pick_intra4x4mby_modes uses the passed in distortion for an early breakout. The best distortion was never saved and the distortion for TM_PRED was always used. Change-Id: Idbaf73027408a4bba26601713725191a5d7b325e	2011-01-31 17:43:18 -05:00
Scott LaVarnway	60fde4d342	Merge "Performance improvement of first pass"	2011-01-31 13:02:23 -08:00
Yaowu Xu	6d19d40718	Merge "change the threshold of DC check for encode breakout"	2011-01-31 11:00:46 -08:00
Adrian Grange	408a8adc15	Merge "Changed condition for using RD in Intra Mode"	2011-01-31 02:18:40 -08:00
Yaowu Xu	8f279596cb	change the threshold of DC check for encode breakout Previously, the DC check is to make sure there is no code-able DC shift for quantizer Q0, which has been verified to be rather conservative. This commit changes the criteria to have two components, DC and AC, to address the conservativeness. First, it checks if all AC energy is enough to contribute a single non-zero quantized AC coefficient. Second, for DC, the decision to skip further considers two possible scenarios: 1. There is no code-able 2nd order DC coefficient at all; 2. The residue is relatively flat, but the uniform DC change is very small, i.e. less than 1/2 gray level per pixel. Compared to the previous criteria, the new criteria are about 10% to 15% faster in encoding time with a very small quality loss. (threshold ~1000 and quality range 33db-45db) It should be noted that this commit enables "automatic" static threshold for encodebreakout if a non-zero small value is passed in to encoder. Change-Id: I0f77719a1ac2c2dfddbd950d84920df374515ce3	2011-01-28 09:43:23 -08:00
Johann	f3cb9ae459	Merge "Adds "armvX-none-rvct" targets"	2011-01-28 09:03:58 -08:00
Yunqing Wang	7cbe684ef5	Improve MV prediction in vp8_pick_inter_mode() for speed>3 Applied same method used in vp8_rd_pick_inter_mode() to improve the accuracy of MV prediction. Change-Id: Ia50ae26208b18482695601f32febd99fe89fbc17	2011-01-28 10:00:20 -05:00
Adrian Grange	e9f513d74a	Changed condition for using RD in Intra Mode The condition for using RD when selecting the intra coding mode for a MB is that the RD flag is set AND we're not in real-time mode. Previously the code used RD if either the RD flag was set OR we were not using real-time mode. Change-Id: Ic711151298468a3f99babad39ba8375f66d55a08	2011-01-28 14:47:36 +00:00
Paul Wilkins	dcb23e2aaa	Inconsistent distortion metric in vp8_rd_pick_intra_mbuv_mode This function was using a variance metric compared to an SSE metric in other places (eg. vp8_rd_inter_uv) Change-Id: I9109fcc5a13bca9db1d7ead500fe14999ab233eb	2011-01-28 13:13:30 +00:00
Tero Rintaluoma	11a222f5d9	Adds "armvX-none-rvct" targets Adds following targets to configure script to support RVCT compilation without operating system support (for Profiler or bare metal images). - armv5te-none-rvct - armv6-none-rvct - armv7-none-rvct To strip OS specific parts from the code "os_support"-config was added to script and CONFIG_OS_SUPPORT flag is used in the code to exclude OS specific parts such as OS specific includes and function calls for timers and threads etc. This was done to enable RVCT compilation for profiling purposes or running the image on bare metal target with Lauterbach. Removed separate AREA directives for READONLY data in armv6 and neon assembly files to fix the RVCT compilation. Otherwise "ldr <reg>, =label" syntax would have been needed to prevent linker errors. This syntax is not supported by older gnu assemblers. Change-Id: I14f4c68529e8c27397502fbc3010a54e505ddb43	2011-01-28 12:47:39 +02:00
Johann	73207a1d8b	warning: pointer targets differ in signedness vp8/encoder/rdopt.c:728: warning: pointer targets in passing argument 3 of 'macro_block_yrd' differ in signedness vp8/encoder/rdopt.c:541: note: expected 'int ' but argument is of type 'unsigned int ' distortion is signed when calling macro_block_yrd in both other cases, as well as for RDCOST Change-Id: I5e22358b7da76a116f498793253aac8099cb3461	2011-01-27 11:53:26 -05:00
Johann	27000ed6d9	clean up implicit declaration warnings for neon Change-Id: I6ca2d89f355839c4c770773c09fc69dcea7c1406 warning: implicit declaration of function 'vp8_variance_halfpixvar16x16_[h|v|hv]_neon' 'vp8_sub_pixel_variance16x16_neon_func'	2011-01-27 11:31:59 -05:00
Scott LaVarnway	8a5c255b3d	Merge "Removed unused members from VP8_COMP"	2011-01-27 08:12:22 -08:00
Yunqing Wang	bb30ffc4dc	Merge "Remove copies of same functions"	2011-01-27 08:11:26 -08:00
Yunqing Wang	3ee4e1e79f	Merge "Refine motion vector prediction for NEWMV mode"	2011-01-27 08:10:53 -08:00
Scott LaVarnway	3c18a2bb2e	Performance improvement of first pass Improved the performance of the first pass only (~6% on 720p test clip) by making use of LUT instead of the float calculations. Might try a SIMD version later. Also started to make use of int_mv instead of MV. Change-Id: If2a217c7d6b59cd2c25c5553e0ca7e0502403af8	2011-01-26 16:42:56 -05:00
Yunqing Wang	cac54404b9	Remove copies of same functions Reduce the code size. Change-Id: I2e1998557a3c8776e262c442fd758c25e17aff7a	2011-01-26 15:37:00 -05:00
Scott LaVarnway	c4887da39c	Removed unused members from VP8_COMP Change-Id: I8f3f2642b02975fbdb14982984a29821f80d30d3	2011-01-26 15:07:17 -05:00
Paul Wilkins	35bb74a6bd	Rationalize vp8_rd_pick_intra16x16mby_mode() Use the function macro_block_yrd() to calculate error and distortion in keeping with what is done for inter frames. The old code was using a variance metric for one case and an SSE function for measuring distortion in the other case. The function vp8_encode_intra16x16mbyrd() is no longer used. Change-Id: Ic228cb00a78ff637f4365b43f58fbe5a9273d36f	2011-01-26 18:46:34 +00:00
Paul Wilkins	e8e09d33df	Merge "Correction to buffer update for non-viewable frames."	2011-01-26 09:33:48 -08:00
Yaowu Xu	82266a1ac9	Merge "cap the best quantizer for 2nd order DC"	2011-01-26 09:27:11 -08:00
Paul Wilkins	a3f71ccff6	Correction to buffer update for non-viewable frames. The code previously tested cpi->common.refresh_alt_ref_frame but there are situations where this flag may be set for viewable frames. The correct test should be !cm->show_frame. Change-Id: Ia1a600622992a4a68fe1d38ac23bf6b34b133688	2011-01-26 12:52:31 +00:00
Paul Wilkins	2caa36aa4f	Merge "Fix for incorrect variable declaration."	2011-01-26 01:53:53 -08:00
Yaowu Xu	999e155f55	cap the best quantizer for 2nd order DC This commit also removes artificial RDMULT cap for low quantizers. The intention is to address some abnormal behavior of mode selections at the low quantizer end, where many macroblocks were coded with SPLITMV with all partitions using same motion vector including (0,0). This change improves the compression quality substantially for high quality encodings in both PSNR and SSIM terms. Overall effect on mid/low rate range is also positive for all metrics, but smaller in magnitude. Change-Id: I864b29c4bd9ff610d2545fa94a19cc7e80c02667	2011-01-25 22:26:18 -08:00
Fritz Koenig	53d8e9dc97	Fix for incorrect variable declaration. Commit `336aa0b7da` incorrectly declared current_pos as an int, when it should have been a FIRSTPASS_STATS pointer. Change-Id: I0a51c7a86ebba8546c95dd5d9d1c1143d4613e40	2011-01-25 15:41:41 -08:00
Johann	907e98fbb5	Merge "update sse2 regular quantizer"	2011-01-25 13:40:28 -08:00
Johann	58f19cc697	Merge "move new neon subpixel function"	2011-01-25 13:09:05 -08:00
Yunqing Wang	dcaaadd8ed	Refine motion vector prediction for NEWMV mode Adjust checking points in motion vector prediction to better cover possible movements, and get a better prediction. Tests on test clips showed a 0.1% improvement in SSIM, and no change in PSNR and performance. Change-Id: Ifdab05d35e10faea1445c61bb73debf888c9d2f8	2011-01-25 15:54:34 -05:00
Johann	af7d23c9b4	Merge "Fix issue 262, vp8cx_pack_tokens_into_partitions_armv5"	2011-01-25 12:49:52 -08:00
Johann	2168a94495	move new neon subpixel function previously wasn't guarded with ifdef ARMV7, causing a link error with ARMV6 Change-Id: I0526858be0b5f49b2bf11e9090180b2a6c48926d	2011-01-25 15:48:37 -05:00
Yunqing Wang	4e149bb447	Merge "Modify calling of NEON code in sub-pixel search"	2011-01-25 09:54:23 -08:00
Attila Nagy	3bf235a4c9	Fix issue 262, vp8cx_pack_tokens_into_partitions_armv5 http://code.google.com/p/webm/issues/detail?id=262 Function was assuming that partitions have equal amount of mb_rows, which is not always true. Change-Id: I59ed40117fd408392a85c633beeb5340ed2f4b25	2011-01-25 15:55:02 +02:00
Paul Wilkins	a69c18980f	Merge "Incorrect bit allocation in forced KF groups."	2011-01-25 05:32:26 -08:00
Paul Wilkins	336aa0b7da	Incorrect bit allocation in forced KF groups. The old 2 pass code estimated error distribution when coding a forced (by interval) key frame. The result of this was that in some cases, when allocating bits at the GF group level within a KF group there was either a glut of bits or starvation of bits at the end of the KF group. Added code to rescan and get the correct data once the position of a forced key frame has been determined. Change-Id: I0c811675ef3f9e4109d14bd049d7641682ffcf11	2011-01-25 12:29:06 +00:00
Scott LaVarnway	0ee525d6de	Added vp8_update_zbin_extra vp8cx_mb_init_quantizer was being called for every mode checked in vp8_rd_pick_inter_mode. zbin_extra is the only value that really needs to be recalculated. This calculation is disabled when using the fast quantizer for mode selection. This gave a small performance boost (~.5% to 1%). Note: This needs to be verified with segmentation_enabled. Change-Id: I62716a870b3c82b4a998bdf95130ff0b02106f1e	2011-01-24 11:00:56 -05:00
Yunqing Wang	d3e9409bb0	Merge "Modify sub-pixel filters to eliminate unnecessary calculations"	2011-01-21 11:07:17 -08:00
Yunqing Wang	0822a62f40	Modify sub-pixel filters to eliminate unnecessary calculations In sub-pixel calculation, xoffset and yoffset mostly take some specific values. Modified sub-pixel filter functions according to these possible values to improve performance. Change-Id: I83083570af8b00ff65093467914fbb97a4e9ea21	2011-01-21 13:59:27 -05:00
Paul Wilkins	0cdfef1e22	Modified static scene check. Added code to scan ahead a few frames when we see what we think is a static scene in the two pass GF loop to see if the conditions persist. Moved calculation of decay rate out into a function. Change-Id: I6e9c67e01ec9f555144deafc8ae67ef25bffb449	2011-01-21 17:52:00 +00:00
Paul Wilkins	8064583d26	Further work to reduce pulsing. These changes are specifically targeted at fade transitions to static scenes. Here we want to place a GF/ARF immediately after the fade and prevent an ARF just before the fade. Also some code lines and comment lines shortened to 80 chars while I was there. Change-Id: Iefdc09a4fa7b265048fc017246b73e138693950f	2011-01-20 18:01:20 +00:00
Adrian Grange	815e1e9fe4	Fixed use of motion percentage in KF/GF group calc In both vp8_find_next_key_frame and define_gf_group, motion_pct was initialised at the top of the loop before next_frame stats had been read in. This fix sets motion_pct after next_frame stats have been read. Change-Id: I8c0bebf372ef8aa97b97fd35b42973d1d831ee73	2011-01-20 13:13:33 +00:00
Paul Wilkins	e867516843	First pass loop bug. Incorrect value loop_decay_rate used in GF loop. The intent was to test the cumulative value decay_accumulator. Change-Id: I62928c63eb09f4f6936a45ebd1c23784d1c9681b	2011-01-19 15:50:22 +00:00
Yunqing Wang	ce6c954d2e	Modify calling of NEON code in sub-pixel search In vp8_find_best_sub_pixel_step_iteratively(), many times xoffset and yoffset are specific values - (4,0) (0,4) and (4,4). Modified code to call simplified NEON version at these specific offsets to help with the performance. Change-Id: Iaf896a0f7aae4697bd36a49e182525dd1ef1ab4d	2011-01-18 14:19:52 -05:00
Jim Bankoski	edcf74c6ad	vp8e -removed undefined max call Change-Id: I42a86b0488f44115f09551fc5ad6d711fd470f0d	2011-01-18 11:21:32 -05:00
Paul Wilkins	d6d5d43708	Merge "Further CQ, Key frame and ARF changes"	2011-01-18 08:04:46 -08:00
Paul Wilkins	57136a268a	Further CQ, Key frame and ARF changes This code fixes a bug in the calculation of the minimum Q for alt ref frames. It also allows an extended gf/arf interval for sections of clips that are completely static (or nearly so). Change-Id: I1a21aaa16d4f0578e5f99b13bebd78d59403c73b	2011-01-18 15:19:05 +00:00
Attila Nagy	cb791aaa2f	Fix encoder real-time only configuration. Remove allocation/deallocation of stats storage. Remove full search functions in machine specific encoder inits. Remove last pass validation in validate_config. Change-Id: I7f29be69273981a4fef6e80ecdb6217c68cbad4e	2011-01-18 08:19:21 -05:00
Paul Wilkins	339c512762	Fix CQ range and experimental KF sizing changes. The CQ level was not using the q_trans[] array to convert to a 0-127 range as per min and maxq Experimental change to try and match the reconstruction error for forced key frames approximately to that of the previous frame by means of the recode loop. Though this may cause extra recodes and the recode behavior has not been optimized, it can only happen on forced key frames. Change-Id: I1f7e42d526f1b1cb556dd461eff1a692bd1b5b2f	2011-01-17 17:24:45 +00:00
Johann	15f9bea73b	update sse2 regular quantizer about ~5% gain on 32bit. disabled for 64bit unset executable bit on ssse3 version (cosmetic) Change-Id: I1a5860839eb294ce4261f819caea2dcfa78e57ca	2011-01-14 14:26:10 -05:00
Paul Wilkins	a1a4d23797	Merge "KF/GF Pulsing"	2011-01-14 09:20:37 -08:00
Paul Wilkins	3aafb47729	Merge "Testing of modes with Alt Ref frame"	2011-01-14 07:26:37 -08:00
Paul Wilkins	8f711db4e8	Merge "Experimental change to help with ARNR problem."	2011-01-14 07:26:01 -08:00
Paul Wilkins	415371c9d9	Testing of modes with Alt Ref frame Previously when a frame was being overlaid on a previously coded alt ref frame we only checked the alt ref 0,0 mode. Where there is a possibility that the alt ref buffer is a filtered frame we should allow the other prediction modes as normal or at the least allow use of the last frame buffer. Change-Id: I4d6227223d125c96b4f3066ec6ec9484fee7768c	2011-01-14 15:20:45 +00:00
Adrian Grange	2c1b06e672	ARNR filter pointer update bug fix In cases where the frame width is not a multiple of 16 the ARNR filter would go wrong. In vp8_temporal_filter_iterate_c when updating pointers at the end of a row of MBs, the image size was incorrectly used rather than using Num_MBs_In_Row times 16 (Y) or 8 (U,V). This worked when width is multiple of 16 but failed otherwise. Change-Id: I008919062715bd3d17c7aa2562ab58d1cb37053a	2011-01-14 15:04:39 +00:00
Paul Wilkins	72e22b0bb8	Experimental change to help with ARNR problem. Allow use of other reference frames for the ARF overlay frame when ARNR filtering is enabled Change-Id: Icd6a9fb38977a88fbe7cc9b9c18198eb454c0273	2011-01-14 12:07:12 +00:00
Paul Wilkins	c8338ebf7a	KF/GF Pulsing This change is designed to try and reduce pulsing effects when moving with a complex transition like a fade, into an easy or static section in an otherwise difficult clip in CQ mode. The active CQ level is relaxed down to the user entered level for frames that are generating less than the passed in minimum bandwidth. Change-Id: Id6d8b551daad4f489c087bd742bc95418a95f3f0	2011-01-14 11:37:26 +00:00
Scott LaVarnway	b082790c7d	Merge "Moved ref frame calculations"	2011-01-13 06:59:28 -08:00
Paul Wilkins	eda7d538bf	One pass rate control correction. Fixed discrepancy cpi->ni_frames vs cm->current_video_frame > 150. Make one pass path explicit. There is still scope for some odd behaviour around the transition point at cpi->ni_frames > 150. Change-Id: Icdee130fe6e2a832206d30e45bf65963edd7a74d	2011-01-13 12:51:41 +00:00
Paul Wilkins	55acda98f7	Limit key frame quantizer for forced key frames. Where a key frame occurs because of a minimum interval selected by the user, then these forced key frames ideally need to be more closely matched in quality to the surrounding frame. Change-Id: Ia55b1f047e77dc7fbd78379c45869554f25b3df7	2011-01-12 17:43:59 +00:00
Scott LaVarnway	96fd758ea9	Moved ref frame calculations Moved ref frame calculations to outside of the mode_index loop. Change-Id: I06103fc7e8af88b54b84443acf6691d29b1272ac	2011-01-11 15:00:00 -05:00
Yunqing Wang	6ff2b0883a	Merge "Add no_skip_block4x4_search flag in SPLITMV mode"	2011-01-11 08:34:24 -08:00
Johann	e88d7ab245	Merge "use unaligned load"	2011-01-11 08:25:22 -08:00
Johann	f50f2fd2a7	use unaligned load source buffer is not guaranteed to be aligned for odd size buffers Change-Id: Id0b1fd40ba3bd6c994bcfada788feccd2b53c5a9	2011-01-11 11:22:29 -05:00
Yunqing Wang	1546e6a8c9	Add no_skip_block4x4_search flag in SPLITMV mode Add a flag to always enable block4x4 search for speed=0 (good quality) to guarantee no quality loss for speed0. Change-Id: Ie04bbc25f7e6a33a7bfa30e05775d33148731c81	2011-01-11 09:50:13 -05:00
Henrik Lundin	48c28fc42c	Remove unused local variables Removing unused local variables causing compiler warnings in Visual Studio. Change-Id: I0e2096303be1fdbc01428a6e57cca9796bb32c8a	2011-01-11 15:22:19 +01:00
Yunqing Wang	3675b2291c	Fix bug in motion search The maximum possible MV in 1/8 pel units is (1<<11), which could cause mvcost out of its range that is 1023. Change maximum possible MV in 1/8 pel units to (1<<11)-8 will fix this problem. Change-Id: I5788ed1de773f66658c14f225fb4ab5b1679b74b	2011-01-10 16:16:59 -05:00
Paul Wilkins	cf7c4732e5	Two Pass VBR change Further experiment with restriction of the Q range. This uses the average non KF/GF/ARF quantizer, instead of just relying on the initial value. It is not such a strong constraint but there may be a reduced risk of rate misses. Change-Id: I424fe782a37a2f4e18c70805e240db55bfaa25ec	2011-01-10 16:41:53 +00:00
Paul Wilkins	405499d835	Revert BASE_ERRPERMB Constant value reverted pending more tests on different video formats. Change-Id: I07d11a0e0185e60724698c835416caf2e0774e61	2011-01-10 16:02:51 +00:00
Paul Wilkins	c28b10adeb	Merge "CQ Mode"	2011-01-07 11:05:56 -08:00
Paul Wilkins	e0846c9c8c	CQ Mode The merge includes hooks for CQ mode and other code changes merged from the test branch. CQ mode attempts to maintain a more stable quantizer within a clip whilst also trying to adhere to a guideline maximum bitrate. The existing target data rate parameter is used to specify the guideline maximum bitrate. A new parameter allows the user to specify a target CQ level. For normal (non kf/gf/arf) frames, the quantizer will not drop BELOW the user specified value (0-63). However, in some cases the encoder may choose to impose a target CQ that is above that specified by the user, if it estimates that consistent use of the target value is not compatible with guideline maximum bitrate. Change-Id: I2221f9eecae8cc3c431d36caf83503941b25e4c1	2011-01-07 18:46:29 +00:00
Paul Wilkins	ba976eaa9b	Merge "Limit Q variability in two pass."	2011-01-07 09:32:29 -08:00
Paul Wilkins	3af3593c8e	Limit Q variability in two pass. In two pass encoding each frame is given an active Q range to work with. This change limits how much this Q range can be altered over time from the initial estimate made for the clip as a whole. There is some danger this could lead to overshoot or undershoot in some corner cases but it helps considerably in regard to clips where either there is a glut or famine of bits in some sections, particularly near the end of a clip. Change-Id: I34fcd1af31d2ee3d5444f93e334645254043026e	2011-01-07 17:23:50 +00:00
Paul Wilkins	f7e2f1fedf	Merge "Disable some features for first pass."	2011-01-07 08:34:27 -08:00
Scott LaVarnway	dd314351e6	Merge "Removed cpi->target_bits_per_mb"	2011-01-07 06:46:45 -08:00
Scott LaVarnway	6dbdfe3422	Removed cpi->target_bits_per_mb cpi->target_bits_per_mb is currently not being used, so delete it. Also removed other unused code in rdopt.c. Change-Id: I98449f9030bcd2f15451d9b7a3b9b93dd1409923	2011-01-07 09:41:13 -05:00
Johann	8b0cf5f79d	x86 sse2 temporal_filter_apply count can be reduced to short because the max number of filtered frames is set to 15. the max value for any frame is 32 (modifier = 16, filter_weight = 2). 15*32 = 480 which requires 9 bits this function goes from about 7000 us / 1000 iterations for the C code to < 275 us / 1000 iterations for sse2 for block_size = 16 and from about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8 Change-Id: I64a32607f58a2d33c39286f468b04ccd457d9e6e	2011-01-06 14:00:30 -05:00
Paul Wilkins	431dac08d1	Disable some features for first pass. The following features don't make sense for the first pass in its current form and have a significant impact on its speed (up to 50%). Slow quantizer, slow dct and trellis optimization. Change-Id: Id9943f6765ffbd71fc0084ec7dfbc9d376fd6fcd	2011-01-06 17:10:07 +00:00
Paul Wilkins	b095d9df3c	Adjustment to boost calculation in two pass. Calculate a minimum intra value to be used in determining the IIratio scores used in two pass, second pass. This is to make sure sections that are low complexity in the intra domain are still boosted appropriately for KF/GF/ARF. For now I have commented out the Q based adjustment of KF boost. Change-Id: I15deb09c5bd9b53180a2ddd3e5f575b2aba244b3	2011-01-04 18:11:28 +00:00
Scott LaVarnway	de4e8185e9	Fixed encoder crash when multi-threading is enabled. Happens in real-time mode. Will happen in good quality, speed 1. Change-Id: I3e5b68827b1a5798d0431b088a709256d1ce2c95	2010-12-29 16:41:22 -05:00
Yunqing Wang	a864678cdb	Always update last_frame_type Scott pointed out that last_frame_type only gets updated while loopfilter exists. Since last_frame_type is also needed in motion search now, it needs to be updated every frame. Change-Id: I9203532fd67361588d4024628d9ddb8e391ad912	2010-12-29 10:28:35 -05:00
Scott LaVarnway	3fb4abf3d1	Merge "Use the fast quantizer for inter mode selection"	2010-12-28 11:56:11 -08:00
Scott LaVarnway	516ea8460b	Use the fast quantizer for inter mode selection Use the fast quantizer for inter mode selection and the regular quantizer for the rest of the encode for good quality, speed 1. Both performance and quality were improved. The quality gains will make up for the quality loss mentioned in I9dc089007ca08129fb6c11fe7692777ebb8647b0. Change-Id: Ia90bc9cf326a7c65d60d31fa32f6465ab6984d21	2010-12-28 14:51:46 -05:00
Yunqing Wang	bf53ec492d	Adjust MV borders for SPLITMV mode Add limits to avoid MV going out of range. Change-Id: I8a5deb40bf393488d29f694b5a56804d578e68b5	2010-12-28 13:23:07 -05:00
Yunqing Wang	e463b95b4e	Merge "Modify motion estimation for SPLITMV mode"	2010-12-28 08:12:26 -08:00
Yunqing Wang	a5a8d92976	Modify motion estimation for SPLITMV mode 1. Search for block8x16/block16x8 uses block8x8's search results. 2. Check block4x4 only if block8x8 is chosen. (This hurts quality, which will be improved in another check-in.) 3. In block4x4 search, the previous block's result is used as MV predictor for next block. This change improves performance. Change-Id: I9dc089007ca08129fb6c11fe7692777ebb8647b0	2010-12-28 10:34:42 -05:00
Yaowu Xu	0f5264b584	adjusted sad_per_bit to correlate with quantizer Re-calibrated sad_per_bit16 and sad_per_bit4 tables to correlate linearly with quantizer values; these two variables are used in motion search for costing motion vectors. This change has a small positive effect on compression. Change-Id: Ic9b5ea6fb8d5078ef663ba4899db019cc51f4166	2010-12-23 22:59:38 -08:00
Johann	20b855c33e	improve integer version of filter the lookup table is based on floating point calculations (see source) by moving the *3 before the downshift and adding the rounding bit, the delta (LUT - integer) goes from: ______________________________________ __ 1__ 1______________________________ __ 1__ 1______________________________ ____ 1______ 1________________________ ____ 1 2__ 2 1________________________ ______ 1 1 2__ 2__ 2__ 2 1 1__________ ________ 1 1 2 2__ 1 2 3 1 2__ 2__ 2__ to: __-1__-1______________________________ ______________________________________ ____-1______-1________________________ ______________________________________ ________-1______________-1____________ ______________________________________ it's important to be able to use the integer version because the LUT more or less precludes SIMD optimizations Change-Id: I45a81127dc7b72a06fba951649135d9d918386c0	2010-12-22 11:33:59 -05:00
Johann	4b6219cb33	temporal filter naming changes be more consistent with the naming pattern, especially wrt rtcd Change-Id: I3df50686a09f1dab0a9620b5adbb8a1577b40f2f	2010-12-22 11:32:15 -05:00
Johann	092b5bef37	abstract apply_temporal_filter allow for optimized versions of apply_temporal_filter (now vp8_apply_temporal_filter_c) the function was previously declared as static and appears to have been inlined. with this change, that's no longer possible. performance takes a small hit. the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because of a circular dependency. for rtcd, temporal_filter.h holds the definition for the rtcd table, so it needs to be included by onyx_int.h. however, onyx_int.h holds the definition for VP8_COMP which is needed for the function prototype. blah. Change-Id: I499c055fdc652ac4659c21c5a55fe10ceb7e95e3	2010-12-22 11:31:54 -05:00
John Koleszar	b0da9b399d	Add psnr/ssim tuning option Add a new encoder control, VP8E_SET_TUNING, to allow the application to inform the encoder that the material will benefit from certain tuning. Expose this control as the --tune option to vpxenc. The args helper is expanded to support enumerated arguments by name or value. Two tunings are provided by this patch, PSNR (default) and SSIM. Activity masking is made dependent on setting --tune=ssim, as the current implementation hurts speed (10%) and PSNR (2.7% avg, 10% peak) too much for it to be a default yet. Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257	2010-12-17 10:01:05 -05:00
Scott LaVarnway	64baa8df2e	Changed segmentation check order In SPLITMV, the 8x8 segment will be checked first. If the 8x8 rd is better than the best, we check the other segments. Otherwise bail. Adjustments to the thresh_mult were necessary to make up for the initial quality loss. The performance improved by 20% (average) for good quality, speed 0 and speed 1, while the overall quality remained the same. Change-Id: I717aef401323c8a254fba3e9777d2a316c774cc3	2010-12-16 17:01:27 -05:00
Scott LaVarnway	81cdeb7117	Adjusted breakout RD for SPLITMV vp8_rd_pick_best_mbsegmentation looks at y only. The new breakout does not include the frame cost, the prob_skip_false cost, or the uv rate. Performance improved by a few percent and the quality remained the same. Change-Id: I94ff013998ac51e8ecce7130870f7b6600758e15	2010-12-16 09:38:02 -05:00
Yunqing Wang	4fbd0227f5	Merge "Fix a bug in motion search code(2)"	2010-12-15 08:10:34 -08:00
Yunqing Wang	08706a3ea7	Fix a bug in motion search code(2) This fix added MV range checks for NEWMV mode as suggested by Jim. To reduce unnecessary MV range checks, I tried Yaowu's suggestion. Update UMV borders in NEWMV mode to also cover MV range check. Also, in this way, every MV that is valid gets checked in diamond search function. Change-Id: I95a89ce0daf6f178c454448f13d4249f19b30f3a	2010-12-14 17:39:25 -05:00
Yaowu Xu	3ac73173a4	Merge "fix a bug that "optimize" flag is not set for sub-threads"	2010-12-14 13:32:04 -08:00
Yunqing Wang	23aa13d92c	Merge "Fix a bug in motion search code"	2010-12-14 13:25:34 -08:00
Yunqing Wang	7fb0f86863	Fix a bug in motion search code The MV's range is 256. Since the new motion search uses a different starting MV than the center ref MV, a MV range checking needs to be done to avoid corruption. Change-Id: I8ae0721d1bd203639e13891e2e54a2e87276f306	2010-12-14 13:59:38 -05:00
Yaowu Xu	64f3d91579	fix a bug that "optimize" flag is not set for sub-threads The flag for quantization optimization was not properly propagated to mb row encoding threads. Change-Id: Ic561599c35acd94cd5698c9b314bccd596ac2deb	2010-12-14 10:12:21 -08:00
Johann	825adc464f	shrink TOKENEXTRA and vp8_extra_bit_struct Per John's previous change, shrink TOKENEXTRA from 20 to 8 bytes original: `b7b1e6fb` reverted: `41f4458a` Also drop unused field from vp8_extra_bit_struct Update ARM ASM to deal with this change. In particular, Extra is signed and needs to be sign-extended when loaded. Change-Id: Ibd0ddc058432bc7bb09222d6ce4ef77e93a30b41	2010-12-14 10:32:50 -05:00
John Koleszar	41f4458a03	Revert "Reduce size of TOKENEXTRA struct" This reverts commit `b7b1e6fb55`. Previous fix is incomplete, breaks ARM. Itchy submit finger. Change-Id: I939dc0d3bf4173cf951c1d152338ab6ea2184bb9	2010-12-13 17:12:51 -05:00
John Koleszar	3809d7bbd9	Merge "remove unused temporal preproc code"	2010-12-13 13:57:59 -08:00
John Koleszar	398aa81849	Merge "Reduce size of TOKENEXTRA struct"	2010-12-13 13:57:55 -08:00
John Koleszar	b1aa54ab26	remove unused temporal preproc code This code is unused, as the current preproc implementation uses the same spatial filter that postproc uses. Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7	2010-12-13 16:47:59 -05:00
John Koleszar	b7b1e6fb55	Reduce size of TOKENEXTRA struct Change the size of structure elements to reduce memory utilization. Removed the 'section' member entirely, as it is set but never read. Change-Id: Iad043830392fb4168cb3cd6075fb0eb70c7f691c	2010-12-13 16:37:37 -05:00
Yaowu Xu	97a86c5b13	fix a bug in multithreaded encoding with active_map enabled Added the initialization of the pointer to active map. Also added the same logic for cyclic refresh in mbrow encoding threads. Change-Id: Ic48d0849dc706b27fba72d07dcc498075725663d	2010-12-10 10:48:30 -08:00
Fritz Koenig	0ced701487	Merge "vp8 fast quantizer sse2 optimizations for eob."	2010-12-10 09:25:04 -08:00
Fritz Koenig	e0cf330cde	vp8 fast quantizer sse2 optimizations for eob. Changed the end of block computation to use pmaxw. Removed additional pushing and popping of registers that was not needed. Change-Id: I08cb9b424513cd8a2c7ad8cea53b4e2adc66ef98	2010-12-09 15:00:30 -08:00
John Koleszar	cb9698951c	fix uninitialized read in encode breakout Change I3430820 performed an uninitialized read when encode_breakout == 0, since the sum and sse wouldn't be set: if(x->encode_breakout) VARIANCE_INVOKE(..., get16x16var)(..., &sum, &sse); if (cpi->active_map_enabled && x->active_ptr[0] == 0) { ... } else if (sse < x->encode_breakout) Change-Id: I915eb76d1227b4b6d1137a0dedf2c143860098a2	2010-12-09 16:05:26 -05:00
Paul Wilkins	c63fc881e1	Correct q_low and q_high limits for the recode loop Corrected the initial Q range limits for the recode loop to reflect the current allowed range for the frame. In experimental work on constrained quality this bug was causing unnecessary recodes. Change-Id: I7e256fbfa681293b0223fe21ec329933d76c229f	2010-12-09 15:02:04 +00:00
Yaowu Xu	160f3c7e9e	Merge "vp8e - static threshold play"	2010-12-08 13:08:04 -08:00
Yaowu Xu	d88da98614	Merge "vp8e - remove unnecessary variance calc"	2010-12-08 09:19:22 -08:00
Jim Bankoski	718c19711a	vp8e - static threshold play Realized no need for new assembly code sum is already calculated. Change-Id: Ie2d94feb4b7c1f77c5359bca29b66228e41638c9	2010-12-07 16:07:23 -05:00
Scott LaVarnway	f661fa1f24	Merge "vp8_rd_pick_best_mbsegmentation code restructure"	2010-12-07 07:53:12 -08:00
Yaowu Xu	062980cc48	Merge "adjust RDMULT for UV plane in quantization RDO"	2010-12-06 22:04:45 -08:00
Yaowu Xu	7c03a1c308	adjust RDMULT for UV plane in quantization RDO This patch adds a weighting factor on RDMULT for UV blocks. The change has an overall gain about 0.5% based on ssim, between 0.1 and 0.2% by psnr numbers. Change-Id: I97781b077ce3bb7e34241b03268491917e8d1d72	2010-12-06 20:53:59 -08:00
Yunqing Wang	9520f4b3cc	Fix a memory leak problem in encoder Deallocating the buffers before re-allocating them. The fix passed James Berry's test program for memory leak check. Change-Id: I18c3cf665412c0e313a523e3d435106c03ca438d	2010-12-06 17:21:37 -05:00
Scott LaVarnway	2fa5d5a26d	vp8_rd_pick_best_mbsegmentation code restructure Moved the code from the segmentation loop into a function which is now called for each segment. This will allow us to change the segment order checking more easily. Change-Id: I9510d26f0acae5a73043fcca8f1984b121d3e052	2010-12-06 16:42:52 -05:00
Scott LaVarnway	d283d9bb30	Merge "Improve MV prediction accuracy to achieve performance gain"	2010-12-06 09:41:09 -08:00
Patrik Westin	8534071de0	Fix for manual Golden frame frequency When auto_golden wasn't set it forced all frames to be a golden frame. Now the manual configured frequency is adhered to. Change-Id: I360acac9bc487db0d9c4d4da6ee41f70c227c539	2010-12-06 09:53:41 -05:00
Paul Wilkins	ccb0348473	Merge "Change to inter_minq table."	2010-12-04 02:06:33 -08:00
Paul Wilkins	cec6a596b5	Change to inter_minq table. The inter_minq table controls the range of quantizers available for a particular frame in two pass relative to a max Q value. The change reduces the range somewhat. The effect of this was a small increase (0.3% average) in psnr for the test set but it should also help encode speed somewhat for higher quality modes as it will reduce the number of iterations in the recode loop. The change damps the range of quantizers available locally within a section of a clip and should therefore help keep quality more uniform. If there is systematic overshoot or undershoot the range can shift gradually to accommodate. However, there is some increased risk of overshoot or undershoot against the target bit rate in VBR mode and this risk will be more pronounced for short clips. Change-Id: I84465567d49ae767c6c73ff2a2aac30c895adb52	2010-12-04 10:04:12 +00:00
Yunqing Wang	c3bbb29164	Improve MV prediction accuracy to achieve performance gain Add vp8_mv_pred() to better predict starting MV for NEWMV mode in vp8_rd_pick_inter_mode(). Set different search ranges according to MV prediction accuracy, which improves encoder performance without hurting the quality. Also, as Yaowu suggested, using diamond search result as full search starting point and therefore adjusting(reducing) full search range helps the performance. Change-Id: Ie4a3c8df87e697c1f4f6e2ddb693766bba1b77b6	2010-12-03 15:23:35 -05:00
John Koleszar	5e76dfcc70	Merge 'Add simple version of activity masking.' Merge commit 'refs/changes/79/779/2' of https://review.webmproject.org/p/libvpx Conflicts: vp8/encoder/encodeintra.c vp8/encoder/encodemb.c Change-Id: Id607063fabe92d99eeb3c380e8ca670b01bfb3ef	2010-12-03 13:30:50 -05:00
Fritz Koenig	9c8ad79fdc	Set refresh_alt_ref_frame on keyframe encode. On a keyframe alt ref and golden are refreshed. The flag was not being set and so on the frame after a keyframe, motion search would occur on the alt ref frame. This is not necessary because the alt ref frame is identical to the last frame in this scenario. Handle corner case where a forward alt-ref frame is put directly after a keyframe. Change-Id: I9be4cf290d694f8cf2f9a31852014b5ccf1504d3	2010-12-01 12:48:22 -08:00
Jim Bankoski	3430820bbe	vp8e - remove unnecessary variance calc only do the variance calculation if necessary ( eg needed for breakout test)	2010-11-27 14:02:59 -05:00
Paul Wilkins	ad6150f769	Recalibration of bits per MB tables The baseline bits per MB prediction tables have been re calibrated based on the assumption that bits per mb is inversely proportional to the quantizer level. Change-Id: Ibd355c7acac4b8053dda1baf1032fe35f11da7f7	2010-11-22 13:17:35 +00:00
Paul Wilkins	1753f0d208	Merge "Added extra two pass stats gathering."	2010-11-22 04:11:20 -08:00
Paul Wilkins	70b885a0e8	Added extra two pass stats gathering. Added code to record spend so far against planned budget. Change-Id: I5a3335346fa1771b2b1219df9f6127f9993d2594	2010-11-19 14:12:33 -05:00
Pascal Massimino	ed5ab7fa49	remove warning was having: "vp8/encoder/onyx_if.c:5365: warning: comparison of unsigned expression >= 0 is always true"	2010-11-17 16:50:02 -08:00
Scott LaVarnway	9a6740af80	Merge "Removed unnecessary checks."	2010-11-17 11:28:22 -08:00
Scott LaVarnway	f7670acc68	Removed unnecessary checks. macro_block_yrd and vp8_rdcost_mby are not called for SPLITMV. Change-Id: I2224d3c8725df526d48426447482768d543752f1	2010-11-17 14:25:48 -05:00
Paul Wilkins	f874391e02	Replaced recode loop test with a function call Replaced existing code to decide if a frame recode is required with a function call. This is to simplify addition of extra clauses that may be needed for the planned constrained quality mode. Also fixed a bug whereby alt ref was not considered in the test. Change-Id: I3d40bb21abe3e19f8456761e6849deb171738b60	2010-11-17 15:12:04 +00:00
Fritz Koenig	69ee697fef	Comments for alt ref flags. Clarify what the alt ref flags do when encoding. Change-Id: I71f78e0f42edae633fb91840f29dfbe64362c44c	2010-11-16 15:16:24 -08:00
Fritz Koenig	e180255375	Remove stack shadowing for x86-x64 for SAD functions. x86-64 passes arguments in registers. There is no need to push them to the stack before using them. This fixes `15acc84f10` where ebx was not getting preserved on x86. Change-Id: I1214b5f818a0201f75ab6ad7d5c6f448e09b16c2	2010-11-15 10:56:02 -08:00
Paul Wilkins	f4709d2895	Merge "Bad cost tables used in ARNR filtering."	2010-11-15 09:55:35 -08:00
Paul Wilkins	373f5c3144	Bad cost tables used in ARNR filtering. The use of incorrect mv costing tables in the ARNR sub-pel filtering code led to corruption of the altref buffer in some cases, particularly at low data rates. The average gain from this fix is about 0.3% but there are a few extreme cases where nasty and visible artifacts manifested and for these few data points the improvement is > 10%. PGW and AWG Change-Id: I95cc02b196a433e71d0d2bd2b933fe68ed31e796	2010-11-15 17:47:12 +00:00
Yaowu Xu	73189f21b3	Merge "make rdmult adaptive for intra in quantizer RDO"	2010-11-15 09:22:45 -08:00
Yaowu Xu	ef2f27f10e	make rdmult adaptive for intra in quantizer RDO This intends to correct the tendency that VP8 aggressively favors rate on intra coded frames. Experiments tested different numbers in [0, 1] and found 9/16 overall provided about 2-4% gains for all-intra coded clips based on vpx-ssim metric. The impact on regular encoded clips is much smaller but positive overall. Overall impact on psnr is also positive even though very small. Change-Id: If808553aaaa87fdd44691f9787820ac9856d9f8a	2010-11-11 11:33:35 -08:00
John Koleszar	0a49747b01	quantizer: fix assertion in fast quantizer path The fast quantizer assembly code has not been updated to match the new exact quantizer, which was made the default in commit `6adbe09`. Specifically, they are not aware of the potential for the coefficient to be scaled, which results in the quantized result exceeding the range of the DCT. This patch restores the previous behavior of using the non-shifted coefficients when in the fast quantizer code path, but unfortunately requires rebuilding the tables when switching between the two. Change-Id: I0a33f5b3850335011a06906f49fafed54dda9546	2010-11-11 13:05:20 -05:00
Fritz Koenig	58083cb34d	Revert "Remove stack shadowing for x86-64" This reverts commit `15acc84f10`. Change-Id: Ia640be8cbc134432914849c1750f62575ea084e6	2010-11-11 08:20:02 -08:00
Paul Wilkins	213f7b0907	Merge "Relax rate control for last few frames"	2010-11-11 02:39:20 -08:00
Fritz Koenig	9b1ece2cca	Merge "Remove stack shadowing for x86-64"	2010-11-10 14:36:10 -08:00
Fritz Koenig	5f0e0617ba	FDCT optimizations. Fixed up the fdct for mmx and 8x4 sse2 to match the most recent changes. Change-Id: Ibee2d6c536fe14dcf75cd6eb1c73f4848a56d719	2010-11-10 14:34:02 -08:00
Fritz Koenig	647df00f30	postproc : Re-work postproc calling to allow more flags. Debugging in postproc needs more flags to allow for specific block types to be turned on or off in the visualizations. Must be enabled with --enable-postproc-visualizer during configuration time. Change-Id: Ia74f357ddc3ad4fb8082afd3a64f62384e4fcb2d	2010-11-10 14:14:46 -08:00
Paul Wilkins	513f8e6814	Relax rate control for last few frames VBR rate control can become very noisy for the last few frames. If there are a few bits to spare or a small overshoot then the target rate and hence quantizer may start to fluctuate wildly. This patch prevents further adjustment of the active Q limits for the last few frames. Patch also removes some redundant variables and makes one small bug fix. Change-Id: Ic167831bec79acc9f0d7e4698bcc4bb188840c45	2010-11-10 10:09:45 +00:00
Paul Wilkins	6adbe09058	Tuning for the more exact quantizer. Small changes to the default zero bin and rounding tables. Though the tables are currently the same for the Y1 and Y2 cases I have left them as separate tables in case we want to tune this later. There is now some adjustment of the zbin based on the prediction mode. Previously this was restricted to an adjustment for gf/arf 0,0 MV. The exact quantizer now marginally outperforms and is the default. The overall average gain is about 0.5% Change-Id: I5e4353f3d5326dde4e86823684b236a1e9ea7f47	2010-11-10 09:52:58 +00:00
John Koleszar	f7e187d362	improve average framerate calculation Change Ice204e86 identified a problem with bitrate undershoot due to low precision in the timestamps passed to the library. This patch takes a different approach by calculating the duration of this frame and passing it to the library, rather than using a fixed duration and letting the library average it out with higher precision timestamps. This part of the fix only applies to vpxenc. This patch also attempts to fix the problem for generic applications that may have made the same mistake vpxenc did. Instead of calculating this frame's duration by the difference of this frame's and the last frame's start time, we use the end times instead. This allows the framerate calculation to scavenge "unclaimed" time from the last frame. For instance (start, end, calculated duration): (0ms, 33ms, 33ms); (33ms, 66ms, 33ms); (66ms, 99ms, 33ms); (100ms, 133ms, 34ms). Change-Id: I92be4b3518e0bd530e97f90e69e75330a4c413fc	2010-11-05 08:42:46 -04:00
Scott LaVarnway	ff4a71f4c2	SSSE3 version of fast quantizer (test clip: tulip) For good quality mode with speed=1, this gave the encoder a small (2 - 3%) performance boost. Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35	2010-11-01 16:24:15 -04:00
Scott LaVarnway	dcee88ea37	Finding first label Using tables for the label count and label offset. Change-Id: Iac3d5b292c37341a881be0af282f5cac3b3e01eb	2010-10-29 10:01:04 -04:00
Yunqing Wang	6614563b8f	Save XMM registers in asm functions XMM6/7 are used in these functions, and need to be saved. Change-Id: I3dfaddaf2a69cd4bf8e8735c7064b17bac5a14e5	2010-10-28 16:59:03 -04:00
Yunqing Wang	f57fc7bcc6	Merge "Fix full-search SAD function crash in Visual Studio"	2010-10-28 13:46:35 -07:00
Yunqing Wang	7e3a1e7361	Fix full-search SAD function crash in Visual Studio Unlike GCC, Visual Studio compiler doesn't allocate SAD output array 16-byte aligned, which causes crash in visual studio. Change-Id: Ia755cf5a807f12929bda8db94032bb3c9d0c2362	2010-10-28 15:26:58 -04:00
Timothy B. Terriberry	c4d7e5e67e	Eliminate more warnings. This eliminates a large set of warnings exposed by the Mozilla build system (Use of C++ comments in ISO C90 source, commas at the end of enum lists, a couple incomplete initializers, and signed/unsigned comparisons). It also eliminates many (but not all) of the warnings exposed by newer GCC versions and _FORTIFY_SOURCE (e.g., calling fread and fwrite without checking the return values). There are a few spurious warnings left on my system: ../vp8/encoder/encodemb.c:274:9: warning: 'sz' may be used uninitialized in this function gcc seems to be unable to figure out that the value shortcut doesn't change between the two if blocks that test it here. ../vp8/encoder/onyx_if.c:5314:5: warning: comparison of unsigned expression >= 0 is always true ../vp8/encoder/onyx_if.c:5319:5: warning: comparison of unsigned expression >= 0 is always true This is true, so far as it goes, but it's comparing against an enum, and the C standard does not mandate that enums be unsigned, so the checks can't be removed. Change-Id: Iaf689ae3e3d0ddc5ade00faa474debe73b8d3395	2010-10-27 18:08:04 -07:00
Yunqing Wang	71ecb5d7d9	Full search SAD function optimization in SSE4.1 Use mpsadbw, and calculate 8 sad at once. Function list: vp8_sad16x16x8_sse4 vp8_sad16x8x8_sse4 vp8_sad8x16x8_sse4 vp8_sad8x8x8_sse4 vp8_sad4x4x8_sse4 (test clip: tulip) For best quality mode, this gave encoder a 5% performance boost. For good quality mode with speed=1, this gave encoder a 3% performance boost. Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134	2010-10-27 13:36:31 -04:00
John Koleszar	a0ae3682aa	Fix half-pixel variance RTCD functions This patch fixes the system dependent entries for the half-pixel variance functions in both the RTCD and non-RTCD cases: - The generic C versions of these functions are now correct. Before all three cases called the hv code. - Wire up the ARM functions in RTCD mode - Created stubs for x86 to call the optimized subpixel functions with the correct parameters, rather than falling back to C code. Change-Id: I1d937d074d929e0eb93aacb1232cc5e0ad1c6184	2010-10-27 13:00:30 -04:00
John Koleszar	1747207700	Merge "Add half-pixel variance RTCD functions"	2010-10-26 20:05:02 -07:00
John Koleszar	1320e54d95	Merge "make vp8_recon16x16mb{,y} RTCD functions"	2010-10-26 20:02:57 -07:00
John Koleszar	87e17737e9	Merge "make arm hex search the generic implementation"	2010-10-26 20:02:37 -07:00
John Koleszar	9fdd90c9aa	Merge "arm: remove duplicate functions"	2010-10-26 20:01:54 -07:00
John Koleszar	209d82ad72	Add half-pixel variance RTCD functions NEON has optimized 16x16 half-pixel variance functions, but they were not part of the RTCD framework. Add these functions to RTCD, so that other platforms can make use of this optimization in the future and special-case ARM code can be removed. A number of functions were taking two variance functions as parameters. These functions were changed to take a single parameter, a pointer to a struct containing all the variance functions for that block size. This provides additional flexibility for calling additional variance functions (the half-pixel special case, for example) and by initializing the table for all block sizes, we don't have to construct this function pointer table for each macroblock. Change-Id: I78289ff36b2715f9a7aa04d5f6fbe3d23acdc29c	2010-10-26 20:00:56 -07:00
John Koleszar	d6c67f02c9	make vp8_recon16x16mb{,y} RTCD functions ARM NEON has a platform specific version of vp8_recon16x16mb, though it's just a stub to extract the various parameters from the MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using that function's prototype directly will be a better long term solution, but it's quite an invasive change. Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620	2010-10-26 13:23:36 -04:00
John Koleszar	96cf6588de	make arm hex search the generic implementation The ARM version of vp8_hex_search() is a faster implementation of the same algorithm. Since it doesn't use any ARM specific code, it can be made the default implementation. This removes a linking error. Change-Id: I77d10f2c16b2515bff4522c350004e03b7659934	2010-10-26 10:46:31 -04:00
John Koleszar	1e7c05e0b4	Merge "add missing GET_GOT/RESTORE_GOT pairs"	2010-10-26 07:05:21 -07:00
John Koleszar	d330a5876b	arm: remove duplicate functions These functions were true duplicates of functions present in the generic code. This fixes some of the link errors when building with --enable-shared --enable-pic. Change-Id: Idff26599d510d954e439207883607ad6b74df20c	2010-10-26 09:37:44 -04:00
Jim Bankoski	0a5a638c60	Merge commit 'refs/changes/09/809/1' of https://review.webmproject.org/p/libvpx	2010-10-26 07:34:57 -04:00
John Koleszar	b523dd51bd	add missing GET_GOT/RESTORE_GOT pairs These functions made global references but did not set up the GOT, causing compilation failures in PIC mode. Change-Id: Iac473bf46733f87eb2e001cd736af4acf73fa51d	2010-10-25 23:45:02 -04:00
Johann	a3b002fc90	Merge "quiet compiler"	2010-10-25 13:26:55 -07:00
Martin Ettl	c3fd2c4ea7	Fix leaked file descriptor with ENTROPY_STATS cppcheck found a leaked file descriptor in the debugging code enabled by defining ENTROPY_STATS. Fixes issue #60. Change-Id: I0c1d0669cb94d44fed77860f97b82763be06b7cb	2010-10-25 13:16:39 -04:00
Johann	385865f820	quiet compiler clean up compiler warnings, man in the yellow hat warnings, and start to remove unused #includes Change-Id: I6267e98d9b3024b6fb1ef2732b29067a33cb96f6	2010-10-25 10:07:35 -04:00
Timothy B. Terriberry	b71962fdc9	Add runtime CPU detection support for ARM. The primary goal is to allow a binary to be built which supports NEON, but can fall back to non-NEON routines, since some Android devices do not have NEON, even if they are otherwise ARMv7 (e.g., Tegra). The configure-generated flags HAVE_ARMV7, etc., are used to decide which versions of each function to build, and when CONFIG_RUNTIME_CPU_DETECT is enabled, the correct version is chosen at run time. In order for this to work, the CFLAGS must be set to something appropriate (e.g., without -mfpu=neon for ARMv7, and with appropriate -march and -mcpu for even earlier configurations), or the native C code will not be able to run. The ASFLAGS must remain set for the most advanced instruction set required at build time, since the ARM assembler will refuse to emit them otherwise. I have not attempted to make any changes to configure to do this automatically. Doing so will probably require the addition of new configure options. Many of the hooks for RTCD on ARM were already there, but a lot of the code had bit-rotted, and a good deal of the ARM-specific code is not integrated into the RTCD structs at all. I did not try to resolve the latter, merely to add the minimal amount of protection around them to allow RTCD to work. Those functions that were called based on an ifdef at the calling site were expanded to check the RTCD flags at that site, but they should be added to an RTCD struct somewhere in the future. The functions invoked with global function pointers still are, but these should be moved into an RTCD struct for thread safety (I believe every platform currently supported has atomic pointer stores, but this is not guaranteed). The encoder's boolhuff functions did not even have _c and armv7 suffixes, and the correct version was resolved at link time. The token packing functions did have appropriate suffixes, but the version was selected with a define, with no associated RTCD struct. 
However, for both of these, the only armv7 instruction they actually used was rbit, and this was completely superfluous, so I reworked them to avoid it. The only non-ARMv4 instruction remaining in them is clz, which is ARMv5 (not even ARMv5TE is required). Considering that there are no ARM-specific configs which are not at least ARMv5TE, I did not try to detect these at runtime, and simply enable them for ARMv5 and above. Finally, the NEON register saving code was completely non-reentrant, since it saved the registers to a global, static variable. I moved the storage for this onto the stack. A single binary built with this code was tested on an ARM11 (ARMv6) and a Cortex A8 (ARMv7 w/NEON), for both the encoder and decoder, and produced identical output, while using the correct accelerated functions on each. I did not test on any earlier processors. Change-Id: I45cbd63a614f4554c3b325c45d46c0806f009eaa	2010-10-25 09:23:29 -04:00
Johann	e81e30c25d	isolate new temporal filtering code onyx_if is getting pretty big. split out the temporal code to make it easier to look at. Change-Id: I207c3a94c90e91b32e3ea5e1836a53b7a990fabd	2010-10-25 09:11:03 -04:00
Timothy B. Terriberry	8f75ea6b5c	Convert [4][4] matrices to [16] arrays. Most of the code that actually uses these matrices indexes them as if they were a single contiguous array, and coverity produces reports about the resulting accesses that overflow the static bounds of the first row. This is perfectly legal in C, but converting them to actual [16] arrays should eliminate the report, and removes a good deal of extraneous indexing and address operators from the code. Change-Id: Ibda479e2232b3e51f9edf3b355b8640520fdbf23	2010-10-21 17:04:30 -07:00
John Koleszar	1ee3ebcd66	Merge "Move firstpass motion map to stats packet"	2010-10-21 11:09:02 -07:00
John Koleszar	bb7dd5b1ba	Move firstpass motion map to stats packet The first implementation of the firstpass motion map for motion compensated temporal filtering created a file, fpmotionmap.stt, in the current working directory. This was not safe for multiple encoder instances. This patch merges this data into the first pass stats packet interface, so that it is handled like the other (numerical) firstpass stats. The new stats packet is defined as follows: Numerical Stats (16 doubles) -- 128 bytes Motion Map -- 1 byte / Macroblock Padding -- to align packet to 8 bytes The fpmotionmap.stt file can still be generated for debugging purposes in the same way that the textual version of the stats are available (defining OUTPUT_FPF in firstpass.c) Change-Id: I083ffbfd95e7d6a42bb4039ba0e81f678c8183ca	2010-10-21 14:04:20 -04:00
Yunqing Wang	4cefb4434f	Add MMWORD PTR/XMMWORD PTR in subtract_sse2.asm Change-Id: Ia649b500ef020225d8bbf611799d0f47658dc2ac	2010-10-21 13:42:24 -04:00
Yunqing Wang	31752f2f41	Merge "Rewrite vp8_short_walsh4x4_sse2()"	2010-10-21 10:31:23 -07:00
Yunqing Wang	0918747520	Merge "Add SSE2 subtract functions"	2010-10-21 10:30:27 -07:00
Fritz Koenig	15acc84f10	Remove stack shadowing for x86-64 x86-64 passes most arguments in registers. There is no need to push them to the stack before using them. Change-Id: I13c683f1358782682ecafaf1df3fb0af23b978ea	2010-10-21 10:28:08 -07:00
Yunqing Wang	fc94ffcea4	Rewrite vp8_short_walsh4x4_sse2() This rewriting reflects changes made in commit "Improve the accuracy of forward walsh-hadamard transform". Since this function is not called much, only a small encoder performance gain (~0.5% ) is seen. Change-Id: Ie9df58a43028a11fd5b115c4bbe3141f7596578b	2010-10-21 13:02:55 -04:00
Yaowu Xu	b9fe6d4da4	Merge "change to make use of more trellis quantization"	2010-10-19 08:11:52 -07:00
Yunqing Wang	4db2076594	Add SSE2 subtract functions Instead of doing 8-bit data unpack and 16-bit subtraction, use psubb to do 16 8-bit subtractions and pcmpgtb to preserve the sign information. This does not bring noticeable gain since these functions are not called frequently. Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e	2010-10-18 14:15:15 -04:00
Johann	ce1ce992ce	copy compiler warning fixes generic version got fixed, but not the arm version. fixes: vp8/encoder/arm/mcomp_arm.c: In function 'vp8_full_search_sadx3': vp8/encoder/arm/mcomp_arm.c:1208: warning: pointer targets in passing argument 5 of 'fn_ptr->sdx3f' differ in signedness vp8/encoder/arm/mcomp_arm.c:1208: note: expected 'unsigned int ' but argument is of type 'int ' and another unsigned change to keep the files similar Change-Id: I1b6255dc3a03b90394a791ee0d15d8167d9454db	2010-10-18 13:23:39 -04:00
Johann	963bcd6c87	remove dead code vp8_diamond_search_sadx4 isn't used in arm because there is no corresponding sdx4df as in x86. rather than keep it in sync with ../mcomp.c, delete it vp8_hex_search had the original, more readable/understandable code if`d out. it's also available in ../mcomp.c, so remove the dead copy Change-Id: Ia42aa6e23b3a2e88040f467280befec091ec080e	2010-10-15 15:37:09 -04:00
Yaowu Xu	2e53e9e53f	change to make use of more trellis quantization when a subsequent frame is encoded as an alt reference frame, it is unlikely that any mb in current frame will be used as reference for future frames, so we can enable quantization optimization even when the RD constant is slightly rate-biased. The change has an overall benefit between 0.1% to 0.2% bit savings on the test sets based on vpxssim scores. Change-Id: I9aa7bc5cd573ea84e3ee655d2834c18c4460ceea	2010-10-15 10:14:34 -07:00
Jim Bankoski	39f41a4f36	safety check to avoid divide by 0s	2010-10-14 16:19:06 -04:00
Yunqing Wang	7f31d987f0	Merge "Improve bounds checking in vp8_diamond_search_sadx4()"	2010-10-14 11:29:24 -07:00
Yunqing Wang	d6da7b8ea1	Improve bounds checking in vp8_diamond_search_sadx4() In order to know if all 4/8 neighbor points are within the bounds, 4 bounds checks are enough instead of checking 4 bounds for each point (16/32 checks). This improvement reduces cost of vp8_diamond_search_sadx4() by 30%, and gives encoder a 1.5% performance gain (test options: 1 pass, good, speed=4). Change-Id: Ie8da29d18a6ecfc9829e74ac02f6fa70e042331a	2010-10-14 11:06:37 -04:00
Fritz Koenig	1dc0ca1340	Fix compiler warning about vp8_fast_quantize_b_impl_ssse2. Typo had function defined as _ssse2 and prototyped as _sse2. Change-Id: If9f19da1a83cff40774a90cf936d601c0bf1b7fe	2010-10-13 17:08:13 -07:00
Fritz Koenig	92df4a06d2	Correct QWORD usage in assembly files QWORD was being undefined because it was being used incorrectly. Change-Id: I3610cefa3d6f0da4054316760f78b9694cde3876	2010-10-13 16:57:57 -07:00
John Koleszar	136857475e	Centralize mb skip state calculation This patch moves the scattered updates to the mb skip state (mode_info_context->mbmi.mb_skip_coeff) to vp8_tokenize_mb. Recent changes to the quantizer exposed a bug where if a macroblock could be coded as a skip but isn't, the encoder would run the loopfilter but the decoder wouldn't, causing a reference buffer mismatch. The loopfilter is controlled by a flag called dc_diff. The decoder looks at the number of decoded coefficients when setting this flag. The encoder sets this flag based on the skip state, since any skippable macroblock should be transmitted as a skip. The coefficient optimization pass (vp8_optimize_b()) could change the coefficients such that a block that was not a skip becomes one. The encoder was not updating the skip state in this situation for intra coded blocks. The underlying issue predates it, but this bug was recently triggered by enabling trellis quantization on the Y2 block in commit `dcd29e3`, and by changing the quantizer range control in commit `305be4e`. Change-Id: I5cce5da0dbc2d22f7d79ee48149f01e868a64802	2010-10-12 09:03:19 -04:00
John Koleszar	acff1627b8	Merge "Add const qualifiers to variance/SAD functions."	2010-10-12 05:44:20 -07:00
Timothy B. Terriberry	8d0f7a01e6	Add simple version of activity masking. This uses MB variance to change the RDO weight for mode decision and quantization. Activity is normalized against the average for the frame, which is currently tracked using feed-forward statistics. This could also be used to adjust the quantizer for the entire frame, but that requires more extensive rate control changes. This does not yet attempt to adapt the quantizer within the frame, but the signaling cost means that will likely only be useful at very high rates. Change-Id: I26cd7c755cac3ff33cfe0688b1da50b2b87b9c93	2010-10-12 08:41:03 -04:00
Timothy B. Terriberry	f4a8594492	Add const qualifiers to variance/SAD functions. These functions should never change their input, and there's no reason not to declare that. This allows them to be passed static const data. Change-Id: Ia49fe4b01e80e9afcb24b4844817694d4da5995c	2010-10-12 08:40:54 -04:00
John Koleszar	037345eb69	Merge "Move vp8_strict_quantize_b inside EXACT_QUANT #define."	2010-10-12 05:34:30 -07:00
John Koleszar	fc018e0d92	Merge "Remove INTRARDOPT #define and intra_rd_opt option."	2010-10-12 05:33:22 -07:00
Timothy B. Terriberry	82c4339885	Move vp8_strict_quantize_b inside EXACT_QUANT #define. There is currently no inexact version of this function, so do not even compile it without EXACT_QUANT. This will prevent someone from inadvertently trying to use it without the proper EXACT_QUANT setup. Change-Id: Ia13491e0128afb281c05c9222ee5987101e4010d	2010-10-11 13:51:35 -07:00
Timothy B. Terriberry	dd08db9315	Remove INTRARDOPT #define and intra_rd_opt option. This is just eliminating some cruft. Although a number of variables are declared only when INTRARDOPT is defined, they are used elsewhere without that protection, and no longer just for intra RDO. The intra_rd_opt flag was hard-coded to 1 and never checked. Change-Id: I83a81554ecee8053e7b4ccd8aa04e18fa60f8e4f	2010-10-11 11:53:57 -07:00
Scott LaVarnway	6b1b28a83c	Merge "Added vp8_fast_quantize_b_sse2"	2010-10-11 09:34:48 -07:00
Yunqing Wang	7e6f7b579a	Remove unused file in encoder Remove vp8/encoder/x86/csystemdependent.c Change-Id: I7c590dcd07b68704d463a1452f62f29ffb1402f4	2010-10-07 12:08:08 -04:00
Scott LaVarnway	d860f685b8	Added vp8_fast_quantize_b_sse2 Moved vp8_fast_quantize_b_sse from quantize_mmx.asm into quantize_sse2.asm and renamed. Updated the assembly code to match the C version. Change-Id: I1766d9e1ca60e173f65badc0ca0c160c2b51b200	2010-10-07 11:43:19 -04:00
Yaowu Xu	d338d14c6b	optimize fast_quantizer c version As the zbin and rounding constants are normalized, rounding effectively does the zbinning, therefore the zbin operation can be removed. In addition, the memset on the two arrays are no longer necessary. Change-Id: If39c353c42d7e052296cb65322e5218810b5cc4c	2010-10-06 13:28:36 -07:00
Paul Wilkins	2931b05ac5	Merge "Tune effect of motion on KF/GF boost in two pass;"	2010-10-05 06:58:24 -07:00
Jan Kratochvil	5cdc3a4c29	nasm: address labels 'rel label' vice 'wrt rip' nasm does not support `label wrt rip', it requires `rel label'. It is still fully compatible with yasm. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50	2010-10-04 19:47:54 -04:00
Jan Kratochvil	e114f699f6	nasm: match instruction length (movd/movq) to parameters nasm requires the instruction length (movd/movq) to match to its parameters. I find it more clear to really use 64bit instructions when we use 64bit registers in the assembly. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: Id9b1a5cdfb1bc05697e523c317a296df43d42a91	2010-10-04 23:36:29 +02:00
Yaowu Xu	2d4ef37507	Merge "enable trellis quantization for 2nd order blocks"	2010-10-04 10:41:20 -07:00
Paul Wilkins	788c0eb54e	Tune effect of motion on KF/GF boost in two pass; This code adjusts the impact of the amount and speed of motion on GF and KF boost. Sections with lots of slow motion will tend to have a somewhat bigger boost and sections with fast motion may have less. There is a knock on effect to the selection of the active quantizer range. This will likely require further tuning but helps with a couple of particularly bad edge cases. Change-Id: Ic2449cda7305672b69acf42fc0a845b77ac98d40	2010-10-02 17:31:46 +01:00
Yaowu Xu	dcd29e369f	enable trellis quantization for 2nd order blocks Experimented with different value for Y2_RD_MULT ranging over [1, 32], without adapting the value to MB coding mode/frame type/Q value, 4 works out best among all values, providing overall 0.1% coding gain on the test set. Change-Id: I6b2583a8aa5db5e7e5c65c646301909c0c58f876	2010-10-02 06:20:33 -07:00
Adrian Grange	999bc00301	Made temporal filter default to use centered mode If temporal filtering is enabled but a filter type is not specified centered filter mode is used by default. Change-Id: I87306f267c1390074c806c506a69b4ba914d92a2	2010-10-01 10:14:01 +01:00
John Koleszar	7e5e31516c	Rename mode_ref_lf_test_function This function graduated from being a test func to something that's on by default. Rename it and remove some spurious comments that confuse its status. Change-Id: I689695a3ad29c35e9a72a43ec93766733ac6c20b	2010-09-29 13:53:14 -04:00
John Koleszar	b9be7a464f	Fix loopfilter delta zero transitions Loopfilter deltas are initialized to zero on keyframes in the decoder. The values then persist from the previous frame unless an update bit is set in the bitstream. This data is not included in the entropy data saved by the 'refresh entropy' bit in the bitstream, so it is effectively an additional contextual element beyond the 3 ref-frames and the entropy data. The encoder was treating this delta update bit as update-if-nonzero, meaning that the value would be refreshed even if it hadn't changed, and more significantly, if the correct value for the delta changed to zero, the update wouldn't be sent, and the decoder would preserve the last (presumably non-zero) value. This patch updates the encoder to send an update only if the value has changed from the previously transmitted value. It also forces the value to be transmitted in error resilient mode, to account for lost context in the event of lost frames. Change-Id: I56671d5b42965d0166ac226765dbfce3e5301868	2010-09-29 13:04:04 -04:00
Paul Wilkins	7288cdf79d	Change to coefficient optimization rules. Allow coefficient optimization for good quality speed 0. Change-Id: Id0cb363df6823c6798671584fbba097916a7df2c	2010-09-29 13:22:05 +01:00
Adrian Grange	4f92b96bdb	Merge "Moved row-specific computation of MV bounds out of col loop"	2010-09-29 05:13:41 -07:00
Adrian Grange	0e7c45b391	Moved row-specific computation of MV bounds out of col loop Moved the bounds computation on vertical MV component out of the loop that processes MBs within a MB row.	2010-09-29 13:03:07 +01:00
Paul Wilkins	ff3068d6da	Control of active min quantizer for two pass. Create look up tables for controlling the active quantizer range. Some initial tuning to improve quality circa 0.5% on test set. Clean up of some stats output code Change-Id: Ia698a8525f8b8129a503cadace3ee73fe888f543	2010-09-29 12:03:19 +01:00
Adrian Grange	47fc8f2683	Enabled AltRef motion map creation Enabled the first-pass encode to output the map of macroblock coding modes required by the AltRef filter.	2010-09-28 16:52:19 +01:00
Adrian Grange	1b2f8308e4	Made AltRef filter adaptive & added motion compensation Modified AltRef temporal filter to adapt filter length based on macroblock coding modes selected during first-pass encode. Also added sub-pixel motion compensation to the AltRef filter.	2010-09-28 15:23:41 +01:00
Paul Wilkins	305be4e417	Badly placed initialization of rolling rate monitors. This affects control of the active quantizer range. Change-Id: I30511fc81ac9f75ff20d9f1372382423d56739da	2010-09-27 12:50:55 -04:00
John Koleszar	8ca779aba8	disable compilation of debugging code This patch avoids compiling some debugging code in onyx_if.c. The most significant fix is to avoid generating code for vp8_write_yuv_frame, which is never called. Some other code was removed by the dead code elimination performed by the compiler, and this patch does it with the preprocessor instead. There are advantages both ways. Change-Id: I044fd43179d2e947553f0d6f2cad5b40907ac458	2010-09-24 11:42:22 -04:00
John Koleszar	147b125b15	Reduce size of tokenizer tables This patch reduces the size of the global tables maintained by the tokenizer to 16k from 80k-96k. See issue #177. Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe	2010-09-16 10:00:04 -04:00
John Koleszar	edcbb1c199	Fix GF interval for non-lagged ARFs When ARFs are enabled in non-lagged compress modes, the GF interval was being reset to zero. Non-lagged ARF updates were enabled in commit `63ccfbd`, but this incorrect GF interval caused a quality regression. Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3	2010-09-09 13:18:54 -04:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
Jim Bankoski	69ae8f475d	Skip unnecessary search of identical frames vp8_get_compressed_data() was defeating logic in encode_frame_to_datarate() that determined the reference buffers to search and forcing all frames to be eligible to search. In cases where buffers have identical contents, this is unnecessary extra work. Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114	2010-09-08 11:31:34 -04:00
Jim Bankoski	63ccfbd545	Enable ARFs for non-lagged compress ARFs were explicitly disabled except in lagged compress mode. New ARF logic allows for the ARF buffer to hold an older golden frame, which does not require lagged compress. Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79	2010-09-08 11:26:13 -04:00
Scott LaVarnway	0de458f6b9	Reduced the size of MB_MODE_INFO Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613	2010-09-03 16:43:23 -04:00
John Koleszar	4496db45e3	Whitespace: nuke CRLFs Change-Id: I8b9fdf9875a8fcff4cb49a3357ce44f18108c2e7	2010-09-02 13:33:01 -04:00
Yaowu Xu	fca129203a	added separate rounding/zbin constants for 2nd order This allows experiments of using different rounding and zerobin constants for 2nd order blocks. Change-Id: Idd829adba3edd1f713c66151a8d29bb245e33a71	2010-09-02 10:27:03 -04:00
Paul Wilkins	c239a1b67c	Improved Force Key Frame Behaviour These changes improve the behaviour of the code with forced key frames sent in by a calling application. The sizing of the frames is still suboptimal for two pass in particular but the behaviour is much better than it was. Change-Id: I35fae610c67688ccc69d11f385e87dfc884e65a1	2010-08-31 14:32:40 -04:00
Scott LaVarnway	e85e631504	Changed above and left context data layout The main reason for the change was to reduce cycles in the token decoder. (~1.5% gain for 32 bit) This layout should be more cache friendly. As a result of this change, the encoder had to be updated. Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837 Note: dixie uses a similar layout	2010-08-31 11:24:30 -04:00
John Koleszar	8e7ebacb19	increase rate control buffer level precision The external API exposes the RC initial/optimal/full buffer level in milliseconds, but this value was truncated internally to seconds. This patch allows the use of the full precision during the conversion from time to bits. Change-Id: If8dd2a87614c05747f81432cbe75dd9e6ed2f04e	2010-08-20 11:04:48 -04:00
John Koleszar	80d3923a78	move segmentation_common to encoder vp8_update_gf_useage_maps() is only used by the encoder. This patch fixes the ability to build in decode-only or encode-only configurations. Change-Id: I3a5211428e539886ba998e09e8abd747ac55c9aa	2010-08-13 14:54:24 -04:00
Scott LaVarnway	9c7a0090e0	Removed unnecessary MB_MODE_INFO copies These copies occurred for each macroblock in the encoder and decoder. The temp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result, a large number of compile errors had to be fixed. Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3	2010-08-12 16:25:43 -04:00
John Koleszar	d22e2968a8	cosmetics: add missing 2D array braces Silences compile warning. Change-Id: I4b207d97f8570fe29aa2710e4ce4f02e7e43b57a	2010-08-11 13:55:38 -04:00
John Koleszar	392a958274	avoid negative array subscript warnings The mv_ref and sub_mv_ref token encodings are indexed from NEARESTMV and LEFT4X4, respectively, rather than being zero-based like the other token encodings. Change-Id: I3699c3f84111209ecfb91097c4b900773e9a3ad5	2010-08-11 13:49:12 -04:00
Scott LaVarnway	99f46d62d9	Moved gf_active code to encoder only The gf_active code is only used by the encoder, so it was moved from common and decoder. Change-Id: Iada15acd5b2b33ff70c34668ca87d4cfd0d05025	2010-08-11 11:54:25 -04:00
Yaowu Xu	c404fa42ac	Removed duplicate functions Change-Id: Ie587972ccefd3c762b8cdf8ef39345cd22924b9b	2010-08-10 21:45:34 -07:00
Yaowu Xu	3b95a46c55	Normalize quantizer's zero bin and rounding factors This patch changes a few numbers in the two constant arrays for the quantizer's zerobin and rounding factors, in general to make the sum of the two factors for any Q equal 128. While it might be beneficial to calibrate the two arrays for best quantizer performance, that is not the purpose of this patch. Normalizing the two arrays will enable quick optimization of the current faster quantizer, i.e. the zerobin check can be removed. Change-Id: If9abfd7929bf4b8e9ecd64a79d817c6728c820bd	2010-08-10 21:12:04 -07:00
Timothy B. Terriberry	8fa38096a3	Add trellis quantization. Replace the exponential search for optimal rounding during quantization with a linear Viterbi trellis and enable it by default when using --best. Right now this operates on top of the output of the adaptive zero-bin quantizer in vp8_regular_quantize_b() and gives a small gain. It can be tested as a replacement for that quantizer by enabling the call to vp8_strict_quantize_b(), which uses normal rounding and no zero bin offset. Ultimately, the quantizer will have to become a function of lambda in order to take advantage of activity masking, since there is limited ability to change the quantization factor itself. However, currently vp8_strict_quantize_b() plus the trellis quantizer (which is lambda-dependent) loses to vp8_regular_quantize_b() alone (which is not) on my test clip. Patch Set 3: Fix an issue related to the cost evaluation of successor states when a coefficient is reduced to zero. With this issue fixed, the trellis search now almost exactly matches the exponential search. Patch Set 2: Overall, the goal of this patch set is to make the "trellis" search produce encodings that match the exponential search version. There are three main differences between Patch Set 2 and 1: a. Patch set 1 did not properly account for the scale of 2nd order error, so patch set 2 disables it altogether for 2nd order blocks. b. Patch set 1 was not consistent on when to enable the quantization optimization. Patch set 2 restores the condition to be consistent. c. Patch set 1 checks quantized levels L-1 and L for any input coefficient that was quantized to L. Patch set 2 limits the candidate coefficients to those that were rounded up to L. It is worth noting here that a strategy to check L and L+1 for coefficients that were truncated down to L might work. (a and b get trellis quant to basically match the exponential search on all mid/low rate encodings on cif set; without a and b, trellis quant can hurt the psnr by 0.2 to .3db at 200kbps for some cif clips) (c gets trellis quant to match the exponential search at Q0 encoding; without c, trellis quant can be 1.5 to 2db lower for encodings with fixed Q at 0 on most derf cif clips) Change-Id: Ib1a043b665d75fbf00cb0257b7c18e90eebab95e	2010-08-10 20:58:24 -07:00
Jan Kratochvil	0327d3df90	nasm: end labels with colon (':') Labels should end by colon (':'), nasm requires it. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I0b2ec6f01afb061d92841887affb5ca0084f936f	2010-08-02 09:20:03 -04:00
Jan Kratochvil	c8134bc54a	nasm: use OWORD vs DQWORD nasm knows only OWORD. yasm knows both OWORD and DQWORD. Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. Few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe. Change-Id: I62151390089e90df9a7667822fa594ac20b00e78	2010-08-02 09:17:14 -04:00
Yaowu Xu	f95c80b60f	Enable the switch between two versions of quantizer To facilitate more testing related to quantizer and rate control, the old version of the quantizer is added back. The old and new quantizers can be switched back and forth by defining or undefining the macro "EXACT_QUANT". Change-Id: Ia77e687622421550f10e9d65a9884128a79a65ff	2010-07-28 10:51:34 -07:00
Johann	a570bbd418	x86/sse2: disable asm quantizer follow up to Change I0e51492d: neon: disable asm quantizer Now x86 doesn't segfault with --disable-runtime-cpu-detect and -p=2 Change-Id: I8ca127bb299198efebbcbd5a661e81788361933f	2010-07-27 12:54:43 -04:00
John Koleszar	d8009c077a	neon: disable asm quantizer The assembly version of the quantizer has not been updated to match the new exact quantizer introduced in commit `e04e2935`. That commit tried to disable this code but missed the non-RTCD case. Thanks to David Baker <david.baker at openmarket.com> for isolating the issue and testing this fix. Change-Id: I0e51492dc6f8e44d2c10b587427448bf94135c65	2010-07-27 11:16:19 -04:00
Fritz Koenig	0ce3901282	Swap alt/gold/new/last frame buffer ptrs instead of copying. At the end of the decode, frame buffers were being copied. The frames are not updated after the copy, they are just for reference on later frames. This change allows multiple references to the same frame buffer instead of copying it. Changes needed to be made to the encoder to handle this. The encoder is still doing frame buffer copies in similar places where pointer reference could be done. Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66	2010-07-23 14:53:59 -04:00
Paul Wilkins	68cf24310b	Merge commit 'refs/changes/51/351/1' of ssh://review.webmproject.org:29418/libvpx into KfRateBugMerged	2010-07-23 17:45:26 +01:00
Yaowu Xu	f5cf8553a2	Merge "Make the quantizer exact."	2010-07-23 09:26:26 -07:00
Paul Wilkins	9404c7db6d	Rate control bug with long key frame interval. In two pass encodes, the calculation of the number of bits allocated to a KF group had the potential to overflow for high data rates if the interval is very long. We observed the problem in one test clip where there was one section where there was an 8000 frame gap between key frames. Change-Id: Ic48eb86271775d7573b4afd166b567b64f25b787	2010-07-23 17:01:12 +01:00
Timothy B. Terriberry	e04e293522	Make the quantizer exact. This replaces the approximate division-by-multiplication in the quantizer with an exact one that costs just one add and one shift extra. The asm versions have not been updated in this patch, and thus have been disabled, since the new method requires different multipliers which are not compatible with the old method. Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206	2010-07-23 08:48:01 -07:00
Paul Wilkins	d576690ba1	80 character line length on Arnr LUT Tweaked table to fit to 80 characters. Change-Id: Ie6ba80e0b31b33e23d2bf78599abe223369fcefb	2010-07-23 16:47:54 +01:00
Yaowu Xu	7a89d4c3d4	Merge "Improve the accuracy of forward walsh-hadamard transform"	2010-07-19 07:50:26 -07:00
Paul Wilkins	0ba32632cd	ARNR Lookup Table. Change submitted for Adrian Grange. Convert threshold calculation in ARNR filter to a lookup table. Change-Id: I12a4bbb96b9ce6231ce2a6ecc2d295610d49e7ec	2010-07-19 14:46:42 +01:00
Paul Wilkins	bf18069ceb	Rate control fix for ARNR filtered frames. Previously we had assumed that it was necessary to give a full frame's bit allocation to the alt ref frame if it has been created through temporal filtering. This is not the case. The active max quantizer control ensures that sufficient bits are allocated if needed, and allocating a full frame's worth of bits creates an excessive overhead for the ARF. Change-Id: I83c95ed7bc7ce0e53ccae6ff32db5a97f145937a	2010-07-19 14:10:07 +01:00
Paul Wilkins	7c938f4d3c	Fix: Incorrect 'cols' calculation in temporal filter. Change-Id: I37f10fbe4fbb505c1d34980a59af3e817c287e22	2010-07-16 15:57:17 +01:00
Yaowu Xu	3d0a1edadd	Fix a compiling error on armv6 The issue was caused by a bad merge in Change I5559d1e8 Change-Id: I6563f652bc1500202de361f8f51d11cc6ddf3331	2010-07-07 14:45:13 -04:00
Adrian Grange	0618ff14d6	Fix bug in 1st pass motion compensation In the case where the best reference mv is not (0,0) a secondary search is carried out centered on (0,0). However, rather than sending tmp_err into the search function, motion_error was inadvertently passed. As a result tmp_err remains set at INT_MAX and the (0,0)-based search result will never be selected, even if it is better. Change-Id: I3c82b246c8c82ba887b9d3fb4c9e0a0f2fe5a76c	2010-07-01 14:19:43 +01:00
Paul Wilkins	2e3d8d3263	Merge "Further adjustment of RD behaviour with Q and Zbin."	2010-07-01 01:53:40 -07:00
Paul Wilkins	1ca39bf26d	Further adjustment of RD behaviour with Q and Zbin. Following conversations with Tim T (Derf) I ran a large number of tests comparing the existing polynomial expression with a simpler ^2 variant. Though the polynomial was sometimes a little better at the extremes of Q it was possible to get close for most clips and even a little better on some. This code also changes the way the RD multiplier is calculated when the ZBIN is extended to use a variant of the same ^2 expression. I hope that this simpler expression will be easier to tune further as we expand our test set and consider adjustments based on content. Change-Id: I73b2564346e74d1332c33e2c1964ae093437456c	2010-06-29 12:15:54 +01:00
Yaowu Xu	b62d093efa	Improve the accuracy of forward walsh-hadamard transform Besides the slight improvement in round trip error, this also fixes a sign bias in the forward transform, so the round trip errors are evenly distributed between +1s and -1s. The old bias seemed to work well with the dc sign bias in the old fdct, which no longer exists in the improved fdct. Change-Id: I8635e7be16c69e69a8669eca5438550d23089cef	2010-06-28 22:10:48 -07:00
Adrian Grange	aa8fe0d269	Fixed buffer selection for UV in AltRef filtering Corrected setting of "which_buffer" for U & V cases to match that used for Y, i.e. to refer to the temporally most recent frame of those to be filtered. Change-Id: Idf94b287ef47a05f060da3e61134a0b616adcb6b	2010-06-28 16:45:06 +01:00
Scott LaVarnway	f1a3b1e0d9	Added first-pass sse2 version of Yaowu's new fdct. Change-Id: Ib479210067510162879c368428b92690591120b2	2010-06-24 16:40:56 -04:00
Yaowu Xu	d0dd01b8ce	Redo the forward 4x4 dct The new fdct lowers the round trip sum squared error for a 4x4 block ~0.12. or ~0.008/pixel. For reference, the old matrix multiply version has average round trip error 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79	2010-06-24 13:17:58 -07:00
Fritz Koenig	a5906668a3	vp8cx : bestsad declared and initialized incorrectly. bestsad needs to be an int and set to INT_MAX because at the end of the function it is compared to INT_MAX to determine if there was a match in the function. Change-Id: Ie80e88e4c4bb4a1ff9446079b794d14d5a219788	2010-06-24 14:30:48 -04:00
Fritz Koenig	cecdd73db7	vp8cx : bestsad declared and initialized incorrectly. bestsad should be an int initialized to INT_MAX. The optimized SAD function expects a signed value for bestsad to use for comparison and early loop termination. When no match is made, which is determined by a comparison of bestsad to INT_MAX, INT_MAX is returned.	2010-06-24 12:18:23 -04:00
agrange	a08df4552a	Fix breakout thresh computation for golden & AltRef frames 1. Unavailability of each reference frame type should be tested independently, 2. Also, only the VP8_GOLD_FLAG needs to be tested before setting golden frame specific thresholds, and only VP8_ALT_FLAG needs testing before setting thresholds relevant to the AltRef frame. (Raised by gbvalor, in response to Issue 47) Change-Id: I6a06fc2a6592841d85422bc1661e33349bb6c3b8	2010-06-21 16:50:59 +01:00
agrange	daa5d0eb3d	Changed unary operator from ! to ~ Since the intent is to reset the appropriate bit in ref_frame_flags not to test a logic condition. Prior result would always have been ref_frame_flags being set to 0. (Issue reported by dgohman, issue 47) Change-Id: I2c12502ed74c73cf38e98c9680e0249c29e16433	2010-06-21 15:23:51 +01:00
agrange	d4b99b8e3a	Moved DOUBLE_DIVIDE_CHECK to denominator (was on numerator) The DOUBLE_DIVIDE_CHECK macro prevents division by 0, so it must be on the denominator to work as intended. Change-Id: Ie109242d52dbb9a2c4bc1e11890fa51b5f87ffc7	2010-06-21 15:20:52 +01:00
Jim Bankoski	220daa00e0	vp8_block_error_xmm: remove unnecessary instructions Remove a couple instructions from this function which weren't necessary for correct execution. Change-Id: Ib649674f140689f7e5c1530c35686241688a3151	2010-06-18 13:34:43 -04:00
John Koleszar	94c52e4da8	cosmetics: trim trailing whitespace When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d	2010-06-18 13:06:11 -04:00
Guillermo Ballester Valor	5a72620de9	Fix compiler warnings Change-Id: I2a97f08cc3c7808ce5be39e910cc5147ecf03a1d	2010-06-14 17:23:49 -04:00
Scott LaVarnway	48c84d138f	sse2 version of vp8_regular_quantize_b Added sse2 version of vp8_regular_quantize_b which improved encode performance(for the clip used) by ~10% for 32 bit builds and ~3% for 64 bit builds. Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments. Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af	2010-06-14 14:07:56 -04:00
John Koleszar	900d0548db	Merge "Make this/next iiratio unsigned."	2010-06-13 14:35:21 -07:00
Paul Wilkins	5ef25a9728	Merge "Tuning of baseline Rd equation to improve behavior at the"	2010-06-13 04:01:46 -07:00
John Koleszar	cd475da8ed	Make this/next iiratio unsigned. This patch addresses issue #79, which is a regression since commit `28de670` "Fix RD bug." If the coded error value is zero, the iiratio calculation effectively multiplies by 1000000 by the DOUBLE_DIVIDE_CHECK macro. This can result in a value larger than INT_MAX, giving a negative ratio. Since the error values are conceptually unsigned (though they're stored in a double) this patch makes the iiratio values unsigned, which allows the clamping to work as expected.	2010-06-12 14:11:51 -04:00
John Koleszar	59c50966ac	Enable vp8_sad16x16x4d_sse3 in non-RTCD case Typo caused C version of 16x16x4 SAD to be called when built with --disable-runtime-cpu-detect. Change-Id: I0fe6fa67280b3a5f13acb3c8ed914f039aaaf316	2010-06-11 13:15:30 -04:00
Paul Wilkins	f6a58d620d	Tuning of baseline Rd equation to improve behavior at the low and high Q ends.	2010-06-11 15:10:51 +01:00
Paul Wilkins	10ae99c67b	Merge "Adjust to avoid long line"	2010-06-10 03:24:54 -07:00
Paul Wilkins	a04ed23ff5	Adjust to avoid long line	2010-06-10 11:15:05 +01:00
Paul Wilkins	cd715faa50	Merge "Correct comment"	2010-06-10 03:05:32 -07:00
Paul Wilkins	ae244efb85	Merge "Fix RD bug."	2010-06-10 03:04:45 -07:00
Yaowu Xu	3225b893e8	minor cleanup of quantizer and fdct code Change-Id: I7ccc580410bea096a70dce0cc3d455348d4287c5	2010-06-08 15:13:50 -07:00
Yaowu Xu	4bb895e854	fix a typo Change-Id: I180a05ad57ee6164a6a169ee08e8affd09671eee	2010-06-08 09:37:01 -07:00
Paul Wilkins	6702a4047d	Correct comment	2010-06-08 09:59:57 +01:00
Paul Wilkins	28de670cd9	Fix RD bug.	2010-06-07 17:34:46 +01:00
Yaowu Xu	854c007a77	Remove duplicate and unused functions Change-Id: I944035e720ef834561a9da0d723879a4f787312c	2010-06-07 07:41:07 -07:00
John Koleszar	09202d8071	LICENSE: update with latest text Change-Id: Ieebea089095d9073b3a94932791099f614ce120c	2010-06-04 16:19:40 -04:00
Yaowu Xu	a7bb3360bc	Fix stats format and correct data size and bit rate output Change-ID: I093abe6094589a0d73f6ca85b825678a19e68285	2010-05-27 19:56:18 -07:00
Paul Wilkins	57d59f6ee7	Merge "Correct bit allocation when the alternative reference frame"	2010-05-26 09:06:49 -07:00
Paul Wilkins	ea4b6f18cb	Correct bit allocation when the alternative reference frame is constructed from multiple source frames Change-Id: I2e026c10d02b071b401c9fe8ab8dcfc0ac306103	2010-05-25 14:26:26 +01:00
John Koleszar	b7492341ac	install includes in DIST_DIR/include/vpx, move vpx_codec/ to vpx/ This renames the vpx_codec/ directory to vpx/, to allow applications to more consistently reference these includes with the vpx/ prefix. This allows the includes to be installed in /usr/local/include/vpx rather than polluting the system includes directory with an excessive number of includes. Change-Id: I7b0652a20543d93f38f421c60b0bbccde4d61b4f	2010-05-24 20:27:42 -04:00
John Koleszar	6be1d9337e	Merge "Fixed an encoder debug/relese mismatch in x86_64-win64-vs8"	2010-05-24 11:07:13 -07:00
Yunqing Wang	ad6a9d4e50	Fixed minor bug for realtime-only building	2010-05-24 11:30:04 -04:00
Paul Wilkins	c012d63ec9	Fixed incorrect casts that broke rate control in some situations.	2010-05-20 16:49:39 +01:00
Yaowu Xu	c15652bce1	Fixed an encoder debug/relese mismatch in x86_64-win64-vs8 Visual c++ compiler uses xmm registers for floating point operations for 64 bit architecture, therefore its calling convention requires the preservation of xmm6-xmm15 in any function that has used these registers. However, the sse2 functions, which were originally written for 32 bit windows, may have used xmm6 and xmm7 without preserving their contents. In this particular case, the compiler used xmm6 to save the variable "two_pass_min_rate"; the value of the variable was clobbered by our sse2 optimized loop filter functions, hence the release/debug mismatch.	2010-05-19 15:48:00 -07:00
Pavol Rusnak	0fc9abfbfd	remove unneeded variables	2010-05-19 21:15:32 +02:00
John Koleszar	0ea50ce9cb	Initial WebM release	2010-05-18 11:58:33 -04:00

828 Commits