generic-library/vpx

Author	SHA1	Message	Date
Deb Mukherjee	08f6471890	Add 8x8 transform to experimental branch Please refer to previous commit messages for detailed info: https://on2-git.corp.google.com/g/#change,5940 https://on2-git.corp.google.com/g/#change,6045 Change-Id: I8b16992f2f69c5a808ad40a3e32ef589cce7c59d	2011-07-20 09:49:22 -07:00
Yunqing Wang	0d87098e08	Copy macroblock data to a buffer before encoding it I got this idea from Pascal (Thanks). Before encoding a macroblock, copy it to a 16x16 buffer, and then read source data from there instead. This will help keep the source data in cache, and help with the performance. Change-Id: Id05f4cb601299150511d59dcba0ae62c49b5b757	2011-06-23 13:54:02 -04:00
Paul Wilkins	4e81a68af7	Further activity masking changes: Some further re-structuring of activity masking code. Still has various experimental switches. Supports a metric based on intra encode. Experimental comparison against a fixed activity target rather than a frame average, for altering rd and zbin. Overall the SSIM performance is similar to TT's original code but there is a much smaller PSNR hit of circa 0.5% instead of 3.2% Change-Id: I0fd53b2dfb60620b3f74d7415e0b81c1ac58c39a	2011-06-08 16:03:37 +01:00
Johann	04edde2b11	Merge "neon fast quantize block pair"	2011-06-06 13:42:58 -07:00
Scott LaVarnway	773768ae27	Removed B_MODE_INFO Declared the bmi in BLOCKD as a union instead of B_MODE_INFO. Then removed B_MODE_INFO completely. Change-Id: Ieb7469899e265892c66f7aeac87b7f2bf38e7a67	2011-06-02 13:46:41 -04:00
Tero Rintaluoma	61f0c090df	neon fast quantize block pair vp8_fast_quantize_b_pair_neon function added to quantize two adjacent blocks at the same time to improve performance. - Additional 3-6% speedup compared to neon optimized fast quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16) Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e	2011-06-01 10:48:05 +03:00
Scott LaVarnway	cfab2caee1	Removed unused variable warnings Change-Id: I6e5e921f03dc15a72da89a457848d519647677a3	2011-05-24 15:17:03 -04:00
Scott LaVarnway	e11f21af9a	MODE_INFO size reduction Declared the bmi in MODE_INFO as a union instead of B_MODE_INFO. This reduced the memory footprint by 518,400 bytes for 1080 resolutions. The decoder performance improved by ~4% for the clip used and the encoder showed very small improvements. (0.5%) This reduction was first mentioned to me by John K. and in a later discussion by Yaowu. This is WIP. Change-Id: I8e175fdbc46d28c35277302a04bee4540efc8d29	2011-05-24 13:24:52 -04:00
John Koleszar	048497720c	Remove unused members of VP8_COMP Various members that were either completely unreferenced or written and not read. Change-Id: Ie41ebac0ff0364a76f287586e4fe09a68907806e	2011-05-19 15:49:09 -04:00
Paul Wilkins	ff52bf3691	Restructure of activity masking code. This commit restructures the mb activity masking code to better facilitate experimentation using different metrics etc. and also allows for adjustment of the zero bin either for encode only or both the encode and mode selection stages. It also uses information from the current frame rather than the previous frame and the default strength has been reduced. Change-Id: Id39b19eace37574dc429f25aae810c203709629b	2011-05-13 10:37:50 +01:00
Yaowu Xu	1bcf4e66bb	Merge "fix a bug related to gf_active_flags in multi-threaded encoder"	2011-05-10 19:59:52 -07:00
Yaowu Xu	89c6017cc0	fix a bug related to gf_active_flags in multi-threaded encoder Paul pointed out that the pointer to the gf_active_flags is not being properly incremented in the multithreaded encoder. This commit fixes the issue by making sure the gf_active_ptr points to the start of the next group of mb rows. Change-Id: I3246e657d23beabb614dfb880733a68a5fd7e34c	2011-05-06 09:00:44 -07:00
Aron Rosenberg	eeb8117303	Fix semaphore emulation on Windows The existing emulation of posix semaphores on Windows uses SetEvent() and WaitForSingleObject(), which implements a binary semaphore, not a counting semaphore as implemented by posix. This causes deadlock when used with the expected posix semantics. Instead, this patch uses the CreateSemaphore() and ReleaseSemaphore() calls (introduced in Windows 2000) which have the expected behavior. This patch also reverts commit `eb16f00`, which split a semaphore that was being used with counting semantics into two binary semaphores. That commit is unnecessary with corrected emulation. Change-Id: If400771536a27af4b0c3a31aa4c4e9ced89ce6a0	2011-05-06 00:13:59 -04:00
Yunqing Wang	eb16f00cf2	Fix rare hang in multi-thread encoder on Windows This patch fixes a rare hang in the multi-thread encoder that was only seen on Windows. Thanks for John's help in debugging the problem. More testing is needed. Change-Id: Idb11c6d344c2082362a032b34c5a602a1eea62fc	2011-05-05 10:42:29 -04:00
Yunqing Wang	aeb86d615c	Merge "Runtime detection of available processor cores."	2011-05-05 04:59:54 -07:00
Yunqing Wang	3d6815817c	Use full-pixel MV in mvsadcost calculation MV sad cost error is only used in full-pixel motion search, which only needs full-pixel resolution instead of quarter-pixel resolution. This change reduced the mvsadcost table size, and removed unnecessary parameter passing since this table is constant once it is generated. Change-Id: I9f931e55f6abc3c99011321f1dfb2f3562e6f6b0	2011-04-01 16:41:58 -04:00
Attila Nagy	297b27655e	Runtime detection of available processor cores. Detect the number of available cores and limit the thread allocation accordingly. On decoder side limit the number of threads to the max number of token partitions. Core detection works on Windows and Posix platforms, which define _SC_NPROCESSORS_ONLN or _SC_NPROC_ONLN. Change-Id: I76cbe37c18d3b8035e508b7a1795577674efc078	2011-03-31 10:23:01 +03:00
Attila Nagy	bfe803bda3	Fix multithreaded encoding for 1 MB wide frame Thread synchronization was not correct when frame width was 1 MB. Number of allocated encoding threads is limited by the sync_range. There is no point having more because each thread lags sync_range MBs behind the thread processing the row above. http://code.google.com/p/webm/issues/detail?id=302 Change-Id: Icaf67a883beecc5ebf2f11e9be47b6997fdf6f26	2011-03-18 12:35:30 +02:00
Attila Nagy	3ae2465788	Encoder loopfilter running in its own thread In multithreaded mode the loopfilter is running in its own thread (filter level calculation and frame filtering). Filtering is mostly done in parallel with the bitstream packing. Before starting the packing the loopfilter level has to be calculated. Also any needed reference frame copying is done in the filter thread. Currently the encoder will create n+1 threads, where n > 1 is the number of threads specified by application and 1 is the extra filter thread. With n = 1 the encoder runs in single thread mode. There will never be more than n threads running concurrently. Change-Id: I4fb29b559a40275d6d3babb8727245c40fba931b	2011-03-11 10:52:51 +02:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
Gaute Strokkenes	315e3c2518	Put more code under #if CONFIG_MULTITHREAD. Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b	2011-02-09 11:21:18 -05:00
Attila Nagy	385c2a76d1	Improved encoder threading Reduce the number of sync points by letting each thread continue immediately with a new MB row. Better multicore scaling, improves performance by 5-20% on ARM multicore. Change-Id: Ic97e4d1c4886a842c85dd3539a93cb217188ed1b	2011-02-01 12:17:58 +02:00
Scott LaVarnway	de4e8185e9	Fixed encoder crash when multi-threading is enabled. Happens in real-time mode. Will happen in good quality, speed 1. Change-Id: I3e5b68827b1a5798d0431b088a709256d1ce2c95	2010-12-29 16:41:22 -05:00
John Koleszar	b0da9b399d	Add psnr/ssim tuning option Add a new encoder control, VP8E_SET_TUNING, to allow the application to inform the encoder that the material will benefit from certain tuning. Expose this control as the --tune option to vpxenc. The args helper is expanded to support enumerated arguments by name or value. Two tunings are provided by this patch, PSNR (default) and SSIM. Activity masking is made dependent on setting --tune=ssim, as the current implementation hurts speed (10%) and PSNR (2.7% avg, 10% peak) too much for it to be a default yet. Change-Id: I110d969381c4805347ff5a0ffaf1a14ca1965257	2010-12-17 10:01:05 -05:00
Yaowu Xu	64f3d91579	fix a bug that "optimize" flag is not set for sub-threads The flag for quantization optimization was not properly propagated to mb row encoding threads. Change-Id: Ic561599c35acd94cd5698c9b314bccd596ac2deb	2010-12-14 10:12:21 -08:00
Yaowu Xu	97a86c5b13	fix a bug in multithreaded encoding with active_map enabled Added the initialization of the pointer to active map. Also added the same logic for cyclic refresh in mbrow encoding threads. Change-Id: Ic48d0849dc706b27fba72d07dcc498075725663d	2010-12-10 10:48:30 -08:00
Timothy B. Terriberry	8d0f7a01e6	Add simple version of activity masking. This uses MB variance to change the RDO weight for mode decision and quantization. Activity is normalized against the average for the frame, which is currently tracked using feed-forward statistics. This could also be used to adjust the quantizer for the entire frame, but that requires more extensive rate control changes. This does not yet attempt to adapt the quantizer within the frame, but the signaling cost means that will likely only be useful at very high rates. Change-Id: I26cd7c755cac3ff33cfe0688b1da50b2b87b9c93	2010-10-12 08:41:03 -04:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
Scott LaVarnway	0de458f6b9	Reduced the size of MB_MODE_INFO Moved partition_bmi and partition_count out of MB_MODE_INFO and placed into MACROBLOCK. Also reduced the size of other members of the MB_MODE_INFO struct. For 1080p, the memory was reduced by 1,209,516 bytes. The decoder performance appeared to improve by 3% for the clip used. Note: The main goal for this change is to improve the decoder performance. The encoder will be revisited at a later date for further structure cleanup. Change-Id: I4733621292ee9cc3fffa4046cb3fd4d99bd14613	2010-09-03 16:43:23 -04:00
Scott LaVarnway	e85e631504	Changed above and left context data layout The main reason for the change was to reduce cycles in the token decoder. (~1.5% gain for 32 bit) This layout should be more cache friendly. As a result of this change, the encoder had to be updated. Change-Id: Id5e804169d8889da0378b3a519ac04dabd28c837 Note: dixie uses a similar layout	2010-08-31 11:24:30 -04:00
Scott LaVarnway	9c7a0090e0	Removed unnecessary MB_MODE_INFO copies These copies occurred for each macroblock in the encoder and decoder. The temp MB_MODE_INFO mbmi was removed from MACROBLOCKD. As a result, a large number of compile errors had to be fixed. Change-Id: I4cf0ffae3ce244f6db04a4c217d52dd256382cf3	2010-08-12 16:25:43 -04:00
Scott LaVarnway	99f46d62d9	Moved gf_active code to encoder only The gf_active code is only used by the encoder, so it was moved from common and decoder. Change-Id: Iada15acd5b2b33ff70c34668ca87d4cfd0d05025	2010-08-11 11:54:25 -04:00
Fritz Koenig	0ce3901282	Swap alt/gold/new/last frame buffer ptrs instead of copying. At the end of the decode, frame buffers were being copied. The frames are not updated after the copy, they are just for reference on later frames. This change allows multiple references to the same frame buffer instead of copying it. Changes needed to be made to the encoder to handle this. The encoder is still doing frame buffer copies in similar places where pointer reference could be done. Change-Id: I7c38be4d23979cc49b5f17241ca3a78703803e66	2010-07-23 14:53:59 -04:00
Timothy B. Terriberry	e04e293522	Make the quantizer exact. This replaces the approximate division-by-multiplication in the quantizer with an exact one that costs just one add and one shift extra. The asm versions have not been updated in this patch, and thus have been disabled, since the new method requires different multipliers which are not compatible with the old method. Change-Id: I53ac887af0f969d906e464c88b1f4be69c6b1206	2010-07-23 08:48:01 -07:00
Yaowu Xu	d0dd01b8ce	Redo the forward 4x4 dct The new fdct lowers the round trip sum squared error for a 4x4 block ~0.12, or ~0.008/pixel. For reference, the old matrix multiply version has average round trip error 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79	2010-06-24 13:17:58 -07:00
John Koleszar	94c52e4da8	cosmetics: trim trailing whitespace When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d	2010-06-18 13:06:11 -04:00
Yaowu Xu	3225b893e8	minor cleanup of quantizer and fdct code Change-Id: I7ccc580410bea096a70dce0cc3d455348d4287c5	2010-06-08 15:13:50 -07:00
Yaowu Xu	854c007a77	Remove duplicate and unused functions Change-Id: I944035e720ef834561a9da0d723879a4f787312c	2010-06-07 07:41:07 -07:00
John Koleszar	09202d8071	LICENSE: update with latest text Change-Id: Ieebea089095d9073b3a94932791099f614ce120c	2010-06-04 16:19:40 -04:00
John Koleszar	0ea50ce9cb	Initial WebM release	2010-05-18 11:58:33 -04:00

40 Commits
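
Note on commit eeb8117303 (semaphore emulation on Windows): the message contrasts a binary semaphore built on an event with a true counting semaphore. The sketch below is only an illustration of that difference in Python, not the libvpx code. The class `BinaryEventSemaphore` and the helper `count_successful_waits` are hypothetical names introduced here, and the event-based class is assumed to behave like a single auto-reset event that remembers at most one pending post.

```python
import threading


class BinaryEventSemaphore:
    """Hypothetical event-based emulation: remembers at most one pending post."""

    def __init__(self):
        self._event = threading.Event()

    def post(self):
        # Setting an already-set event is a no-op, so repeated posts collapse.
        self._event.set()

    def try_wait(self):
        # Consume the single pending signal if there is one (non-blocking).
        if self._event.is_set():
            self._event.clear()
            return True
        return False


def count_successful_waits(post, try_wait, posts=3):
    """Post `posts` times, then count how many non-blocking waits succeed."""
    for _ in range(posts):
        post()
    successes = 0
    while try_wait():
        successes += 1
    return successes


binary = BinaryEventSemaphore()
counting = threading.Semaphore(0)  # real counting semaphore, initial count 0

binary_waits = count_successful_waits(binary.post, binary.try_wait)
counting_waits = count_successful_waits(
    counting.release, lambda: counting.acquire(blocking=False)
)

# The event-based version loses two of the three posts.
print(binary_waits, counting_waits)  # prints "1 3"
```

With blocking waits, the two lost posts in the event-based version would leave two waiters blocked forever, which matches the deadlock the commit message describes when counting semantics are expected.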