generic-library/vpx

Author	SHA1	Message	Date
Guillermo Ballester Valor	236906863a	Add high limit check for unsigned parameters The patch related to issue #55 (`5a72620`) fixed some warnings, but the fix was not optimal. It was actually a trick to confuse the compiler rather than a fix. This patch fixes it by creating a new macro used when only a high limit check is needed for an unsigned parameter. Change-Id: I94b322e0f7fb07604b3b1df1f9321185f48cfcb5	2010-09-20 10:03:05 -04:00
Johann	022323bf85	reorder data to use wider instructions the previous commit laid the groundwork by doing two sets of idcts together. this moved that further by grouping the interesting data (q[0], q+16[0]) together to allow using wider instructions. also managed to drop a few instructions by recognizing that the constant for sinpi8sqrt2 could be downshifted all the time which avoided a downshift as well as workarounds for a function which only accepted signed data looks like a modest gain for performance: at qcif, went from ~180 fps to ~183 Change-Id: I842673f3080b8239e026cc9b50346dbccbab4adf	2010-09-17 16:47:39 -04:00
Yunqing Wang	f857a85088	Restructure multi-threaded decoder On each MB, loopfiltering is done right after MB decoding. This combines two loops in multi-threaded code into one, which reduces number of synchronizations to half. The above-row/left-col data are saved in temp buffers for next-row/next MB decoding. Tests on 4-core gLucid machine showed 10% decoder performance gain with threads=4 (tulip clip). Testing on other platforms isn't done yet. Change-Id: Id18ea7c1e84965dabea65d4c01ca5bc056ddeac9	2010-09-17 09:56:05 -04:00
John Koleszar	9100073e8d	cleanup: remove unused xprintf These files aren't currently used, and we can get them back if we need them. Change-Id: I62aa3bff828e491a80c80eeb84a7c44903df29b5	2010-09-16 13:14:12 -04:00
John Koleszar	147b125b15	Reduce size of tokenizer tables This patch reduces the size of the global tables maintained by the tokenizer to 16k from 80k-96k. See issue #177. Change-Id: If0275d5f28389af11ac83c5d929d1157cde90fbe	2010-09-16 10:00:04 -04:00
Fritz Koenig	746439ef6c	Modify GET_GOT macro for performance. GET_GOT was producing a zero length call. This resulted in pipeline flushes occurring when returning from the assembly functions. Masked on out of order cores, but evident on Atom cores. Change-Id: I8c375af313e8a169c77adbaf956693c0cfeb5ccd	2010-09-15 12:41:15 -07:00
Fritz Koenig	769f2424cc	Removed unnecessary pxor. There is no need to make sure that the lower byte of the register is 0 because the downshift by 11 overwrites that byte. Change-Id: I89cbf004b2ff532a2c68e0dc399c45a49cdad5a1	2010-09-13 18:34:34 -07:00
Fritz Koenig	71a1c19754	Merge "Make block access to frame buffer sequential"	2010-09-13 11:04:22 -07:00
John Koleszar	eeca6b786a	Remove legacy release.sh script This script is part of a legacy release process and is unsupported. Most of this functionality has been moved into 'make dist.' Change-Id: Id67936302083352b628869e2988876cf56558ca5	2010-09-13 09:46:51 -04:00
John Koleszar	887d6ef49a	configure: support for ppc32-linux-gcc Fixes issue 89. Thanks to josejx for the patch. Change-Id: I7e664fed703b49f2fb3af4c5e6ce1173742000c2	2010-09-13 09:04:55 -04:00
John Koleszar	7f1a908b97	cosmetics: expand tabs in configure Change-Id: I88ddb0afb56ef2be8184b56fe125ad938ead7a84	2010-09-13 09:02:18 -04:00
Fritz Koenig	a65cd3def0	Make block access to frame buffer sequential Sequentially accessing memory from a low address to a high address should make it easier for the processor to predict the cache. Change-Id: I1921ce996bdd547144fe864fea6435f527f5842d	2010-09-10 16:27:28 -07:00
Scott LaVarnway	a32ded1d5f	Merge "Improved subset block search"	2010-09-09 11:51:29 -07:00
Scott LaVarnway	c5fb0eb8d9	Improved subset block search Improved the subset block search and fill. (about 3% improvement for 32 bit) Modified/merged the code in order to create vp8_read_mb_modes_mv which can decode the modes/mvs on a macroblock level. This will allow the decode loop (in the future) to decode modes/mvs on a frame, row, or mb level. Change-Id: If637d994b508792f846d39b5d44a7bf9aa5cddf3	2010-09-09 14:42:48 -04:00
Johann	14ba764219	Update NEON wide idcts Expand `93c32a55` which used SSE2 instructions to do two idct/dequant/recons at a time to NEON. Initial working commit. More work needs to be put into rearranging and interlacing the data to take advantage of quadword operations, which is when we'll hopefully see a much better boost Change-Id: I86d59d96f15e0d0f9710253e2c098ac2ff2865d1	2010-09-09 14:08:12 -04:00
John Koleszar	edcbb1c199	Fix GF interval for non-lagged ARFs When ARFs are enabled in non-lagged compress modes, the GF interval was being reset to zero. Non-lagged ARF updates were enabled in commit `63ccfbd`, but this incorrect GF interval caused a quality regression. Change-Id: I615c3b493f4ce2127044f4e68d0bcb07d6b730c3	2010-09-09 13:18:54 -04:00
Fritz Koenig	6d90f867e4	Merge branch 'master' of git://review.webmproject.org/libvpx	2010-09-09 08:54:21 -07:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
Jim Bankoski	69ae8f475d	Skip unnecessary search of identical frames vp8_get_compressed_data() was defeating logic in encode_frame_to_datarate() that determined the reference buffers to search and forcing all frames to be eligible to search. In cases where buffers have identical contents, this is unnecessary extra work. Change-Id: I9e667ac39128ae32dc455a3db4c62e3efce6f114	2010-09-08 11:31:34 -04:00
Jim Bankoski	63ccfbd545	Enable ARFs for non-lagged compress ARFs were explicitly disabled except in lagged compress mode. New ARF logic allows for the ARF buffer to hold an older golden frame, which does not require lagged compress. Change-Id: I1dff82b6f53e8311f1e0514b1794ae05919d5f79	2010-09-08 11:26:13 -04:00
Fritz Koenig	3fb37162a8	Bilinear subpixel optimizations for ssse3. Used pmaddubsw for multiply and add of two filter taps at once for 16x16 and 8x8 blocks. Change-Id: Idccf2d6e094561624407b109fa7e80ba799355ea	2010-09-07 17:19:40 -07:00
Scott LaVarnway	0de458f6b9	Reduced the size of MB_MODE_INFO Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613	2010-09-03 16:43:23 -04:00
John Koleszar	b0519a26b1	Update CHANGELOG for v0.9.2 release Change-Id: I184e927987544e9f34f890249b589ea13a93a330	2010-09-02 14:56:47 -04:00
John Koleszar	e4b5002490	Update AUTHORS Change-Id: I0395ffa107651a773fd11d12682ab9372f76a90b	2010-09-02 13:41:03 -04:00
John Koleszar	4496db45e3	Whitespace: nuke CRLFs Change-Id: I8b9fdf9875a8fcff4cb49a3357ce44f18108c2e7	2010-09-02 13:33:01 -04:00
John Koleszar	daab4bcba6	Use native win32 timers on mingw Changed to use QueryPerformanceCounter on Windows rather than only when building with MSVC, so that MSVC can link libs built with MinGW. Fixes issue #149. Change-Id: Ie2dc7edc8f4d096cf95ec5ffb1ab00f2d67b3e7d	2010-09-02 12:03:51 -04:00
John Koleszar	d6ee72a7bf	Fix target detection on mingw32 gcc -dumpmachine returns only 'mingw32' Change-Id: I774d05a97c5131fc12009e436712c319e54490a5	2010-09-02 11:52:39 -04:00
John Koleszar	21039ce16e	Use -fno-common for mingw Fixes http://code.google.com/p/webm/issues/detail?id=112 Thanks to Ramiro Polla for the issue/fix. Change-Id: I7f7b547a4ea3270e183f59280510066cc29a619e	2010-09-02 11:52:38 -04:00
James Zern	76640f85da	encoder: remove postproc dependency Remove the dependency on postproc.c for the encoder in general, the only unchecked need for it is when CONFIG_PSNR is enabled. All other cases are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file will still be included. Additionally, when VP8_SET_POSTPROC is used with the encoder when post processing has been disabled an error will be returned. This addresses issue #153. Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089	2010-09-02 11:52:37 -04:00
John Koleszar	7a3e0a1d93	Merge "added separate rounding/zbin constants for 2nd order"	2010-09-02 08:42:29 -07:00
John Koleszar	9398be0f46	Merge "Disable frame dropping by default"	2010-09-02 08:41:46 -07:00
Yaowu Xu	fca129203a	added separate rounding/zbin constants for 2nd order This allows experiments of using different rounding and zerobin constants for 2nd order blocks. Change-Id: Idd829adba3edd1f713c66151a8d29bb245e33a71	2010-09-02 10:27:03 -04:00
John Koleszar	23216211bc	Disable frame dropping by default This is not the behavior that most users expect. Change-Id: I226126ea400c22cf1f7918e80ea7fe0771c569cb	2010-09-02 09:32:03 -04:00
Frank Galligan	d45e55015e	Fix rare deadlock before loop filter There was an extremely rare deadlock that happened when one thread was waiting to start the loop filter on frame n while the other threads were starting to work on frame n+1. Change-Id: Icc94f728b3b6663405435640d9a2996735ba19ef	2010-09-01 22:01:21 -04:00
Paul Wilkins	18c902f8a4	Merge "Improved Force Key Frame Behaviour"	2010-09-01 02:45:12 -07:00
Yunqing Wang	0e78efad0b	Replace sleep(0) calls in multi-threaded decoder This is a workaround for gLucid problem. Change-Id: I188a016a07e4c2ea212444c5a6284ff3c48a5caa	2010-08-31 20:37:11 -04:00
Paul Wilkins	c239a1b67c	Improved Force Key Frame Behaviour These changes improve the behaviour of the code with forced key frames sent in by a calling application. The sizing of the frames is still suboptimal for two pass in particular but the behaviour is much better than it was. Change-Id: I35fae610c67688ccc69d11f385e87dfc884e65a1	2010-08-31 14:32:40 -04:00
Johann	0b94f5d6e8	followup arm patch make the arm asm detokenizer work with the new structures Change-Id: I7cd92c2a018ec24032bb1cfd1bb9739bc84b444a	2010-08-31 11:41:10 -04:00
Scott LaVarnway	e85e631504	Changed above and left context data layout The main reason for the change was to reduce cycles in the token decoder. (~1.5% gain for 32 bit) This layout should be more cache friendly. As a result of this change, the encoder had to be updated. Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837 Note: dixie uses a similar layout	2010-08-31 11:24:30 -04:00
John Koleszar	aaad6d1b54	Merge "Fix harmless off-by-1 error."	2010-08-30 12:40:42 -07:00
John Koleszar	6f4d4ab5ac	Merge "Fix two-pass framrate for Y4M input."	2010-08-30 12:40:37 -07:00
John Koleszar	674e477b81	Merge "increase rate control buffer level precision"	2010-08-30 07:49:35 -07:00
Timothy B. Terriberry	7a8e0a2935	Fix harmless off-by-1 error. The memory being zeroed in vp8_update_mode_info_border() was just allocated with calloc, and so the entire function is actually redundant, but it should be made correct in case someone expects it to actually work in the future. Change-Id: If7a84e489157ab34ab77ec6e2fe034fb71cf8c79	2010-08-27 16:07:54 -07:00
Timothy B. Terriberry	e105e245ef	Fix two-pass framrate for Y4M input. The timebase was being set to the value in the Y4M file on each pass, but only doubled to account for the altref placement on the first pass. This avoids resetting it on the second pass. Change-Id: Ie342639bad1ffe9c2214fbbaaded72cfed835b42	2010-08-27 15:21:22 -07:00
Fritz Koenig	00358cb974	Merge "Allow --cpu= to work for x86."	2010-08-25 11:39:59 -07:00
Fritz Koenig	a790906c3b	Allow --cpu= to work for x86. --cpu was already implemented for most of our embedded platforms, this just extends it to x86. Corner case for Atom processor as it doesn't respond to the --march= option under icc. Change-Id: I2d57a7a6e9d0b55c0059e9bc46cfc9bf9468c185	2010-08-24 16:27:49 -07:00
Johann	5c244398e1	clean up compiler warnings did a test compile with clang and got rid of some warnings that have been annoying me for a while: vp8/decoder/detokenize.c: In function 'vp8_init_detokenizer': vp8/decoder/detokenize.c:121: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:122: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:123: warning: assignment from incompatible pointer type vp8/decoder/detokenize.c:124: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:125: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:128: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:129: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:130: warning: assignment discards qualifiers from pointer target type vp8/decoder/detokenize.c:131: warning: assignment discards qualifiers from pointer target type Change-Id: I78ddab176fe47cbeed30379709dc7bab01c0c2e4	2010-08-24 18:23:16 -04:00
Johann	d73217ab17	update structures mbmi and eob moved in previous commits Change-Id: I30a2eba36addf89ee50b406ad4afdd059a832711	2010-08-23 13:44:56 -04:00
Fritz Koenig	93c32a55c2	Rework idct calling structure. Moving the eob structure allows for a non-struct based function to handle decoding an entire mb of idct/dequant/recon data. This allows for SIMD functions to idct/dequant/recon multiple blocks at once. SSE2 implementation gives 3% gain on Atom. Change-Id: I8a8f3efd546ea4e0535f517d94f347cfb737c9c2	2010-08-23 08:58:54 -07:00
John Koleszar	8e7ebacb19	increase rate control buffer level precision The external API exposes the RC initial/optimal/full buffer level in milliseconds, but this value was truncated internally to seconds. This patch allows the use of the full precision during the conversion from time to bits. Change-Id: If8dd2a87614c05747f81432cbe75dd9e6ed2f04e	2010-08-20 11:04:48 -04:00
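
The precision loss described in `8e7ebacb19` can be illustrated with a minimal sketch. The function names and parameters below are hypothetical and are not taken from the libvpx source; the sketch only assumes that a buffer level given in milliseconds is converted to bits by multiplying by a bitrate in bits per second.

```c
#include <stdint.h>

/* Hypothetical helper: truncates the buffer level to whole seconds
 * before converting to bits, so any sub-second part is lost. */
static int64_t level_bits_truncated(int64_t level_ms, int64_t bitrate_bps)
{
    return (level_ms / 1000) * bitrate_bps;
}

/* Hypothetical helper: keeps millisecond precision by multiplying
 * first and dividing by 1000 last. */
static int64_t level_bits_precise(int64_t level_ms, int64_t bitrate_bps)
{
    return level_ms * bitrate_bps / 1000;
}
```

Under these assumptions, a 500 ms level at 256000 bits per second converts to 0 bits with the truncating version and to 128000 bits with the precise one.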


1358 Commits
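
As a rough illustration of the idea behind `236906863a`: comparing an unsigned value against a lower bound of zero is always true, and compilers may warn about it, so an upper-bound-only check is enough for unsigned parameters. The macro, struct, and field names below are hypothetical and are not the ones used in the patch.

```c
/* Hypothetical configuration struct with one unsigned and one signed field. */
struct cfg {
    unsigned int threads;
    int quality;
};

/* Two-sided check, appropriate for signed values. */
#define CHECK_RANGE(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

/* Upper-bound-only check for unsigned values: avoids the always-true
 * "unsigned >= 0" comparison that triggers compiler warnings. */
#define CHECK_HI(v, hi) ((v) <= (hi))

/* Returns 1 if every field is within its allowed range, 0 otherwise. */
static int cfg_valid(const struct cfg *c)
{
    return CHECK_HI(c->threads, 64u) && CHECK_RANGE(c->quality, 0, 63);
}
```

The limits 64 and 63 are arbitrary values chosen for the sketch.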