generic-library/vpx

Author	SHA1	Message	Date
Yaowu Xu	6035da5448	WebM Experimental Codec Branch Snapshot This is a code snapshot of experimental work currently ongoing for a next-generation codec. The codebase has been cut down considerably from the libvpx baseline. For example, we are currently only supporting VBR 2-pass rate control and have removed most of the code relating to coding speed, threading, error resilience, partitions and various other features. This is in part to make the codebase easier to work on and experiment with, but also because we want to have an open discussion about how the bitstream will be structured and partitioned and not have that conversation constrained by past work. Our basic working pattern has been to initially encapsulate experiments using configure options linked to #IF CONFIG_XXX statements in the code. Once experiments have matured and we are reasonably happy that they give benefit and can be merged without breaking other experiments, we remove the conditional compile statements and merge them in. Current changes include: * Temporal coding experiment for segments (though still only 4 max, it will likely be increased). * Segment feature experiment - to allow various bits of information to be coded at the segment level. Features tested so far include mode and reference frame information, limiting end of block offset and transform size, alongside Q and loop filter parameters, but this set is very fluid. * Support for 8x8 transform - 8x8 dct with 2nd order 2x2 haar is used in MBs using 16x16 prediction modes within inter frames. * Compound prediction (combination of signals from existing predictors to create a new predictor). * 8 tap interpolation filters and 1/8th pel motion vectors. * Loop filter modifications. * Various entropy modifications and changes to how entropy contexts and updates are handled. * Extended quantizer range matched to transform precision improvements. There are also ongoing further experiments that we hope to merge in the near future: For example, coding of motion and other aspects of the prediction signal to better support larger image formats, use of larger block sizes (e.g. 32x32 and up) and lossless non-transform based coding options (especially for key frames). It is our hope that we will be able to make regular updates and we will warmly welcome community contributions. Please be warned that, at this stage, the codebase is currently slower than VP8 stable branch as most new code has not been optimized, and even the 'C' has been deliberately written to be simple and obvious, not fast. The following graphs have the initial test results, numbers in the tables measure the compression improvement in terms of percentage. The build has the following optional experiments configured: --enable-experimental --enable-enhanced_interp --enable-uvintra --enable-high_precision_mv --enable-sixteenth_subpel_uv CIF Size clips: http://getwebm.org/tmp/cif/ HD size clips: http://getwebm.org/tmp/hd/ (stable_20120309 represents encoding results of WebM master branch build as of commit#7a15907) They were encoded using the following encode parameters: --good --cpu-used=0 -t 0 --lag-in-frames=25 --min-q=0 --max-q=63 --end-usage=0 --auto-alt-ref=1 -p 2 --pass=2 --kf-max-dist=9999 --kf-min-dist=0 --drop-frame=0 --static-thresh=0 --bias-pct=50 --minsection-pct=0 --maxsection-pct=800 --sharpness=0 --arnr-maxframes=7 --arnr-strength=3(for HD,6 for CIF) --arnr-type=3 Change-Id: I5c62ed09cfff5815a2bb34e7820d6a810c23183c	2012-03-15 07:36:47 -07:00
Scott LaVarnway	19987dcbfa	Faster vp8_default_coef_probs Copies from a generated table instead of building the default coeff probabilities during runtime. Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50	2011-08-16 16:21:21 -04:00
Yaowu Xu	57ad189129	changed configure option name to reduce confusion Renamed configure option "enable-psnr" to "enable-internal-stats" to better reflect the purpose of the option and eliminate the confusion reported in http://code.google.com/p/webm/issues/detail?id=35 Change-Id: If72df6fdb9f1e33dab1329240ba4d8911d2f1f7a	2011-04-29 09:39:05 -07:00
Johann	508ae1b3d5	keep values in registers during quantization add an sse4 quantizer so we can use pinsrw/pextrw and keep values in xmm registers instead of proxying through the stack. and as long as we're bumping up, use some ssse3 instructions in the EOB detection (see ssse3 fast quantizer) pick up about a percent on 32bit and about two on 64bit. Change-Id: If15abba0e8b037a1d231c0edf33501545c9d9363	2011-04-21 15:47:55 -04:00
John Koleszar	88841f1059	Refactor lookahead ring buffer This patch cleans up the source buffer storage and copy mechanism to allow access through a standard push/pop/peek interface. This approach also avoids an extra copy in the case where the source is not a multiple of 16, fixing issue #102. Change-Id: I05808c39f5743625cb4c7af54cc841b9b10fdbd9	2011-04-13 14:26:45 -04:00
Johann	4be062bbc3	add asm_enc_offsets.c for all targets now that we need asm_enc_offsets.c for x86 and arm and it is harmless to build it for other targets, add it unconditionally Change-Id: I320c5220afd94fee2b98bda9ff4e5e34c67062f3	2011-03-28 10:43:47 -04:00
John Koleszar	5db0eeea21	Only enable ssim_opt.asm on X86_64 Fix compiling on 32 bit x86. Change-Id: I6210573e1d9287ac49acbe3d7e5181e309316107	2011-03-11 11:27:08 -05:00
Jim Bankoski	3f6f7289aa	vp8cx- alternate ssim function with optimizations Change-Id: I91921b0a90dbaddc7010380b038955be347964b3	2011-03-11 08:51:21 -05:00
Yunqing Wang	244e2e1451	Write SSSE3 sub-pixel filter function 1. Process 16 pixels at one time instead of 8. 2. Add check for both xoffset =0 and yoffset=0, which happens during motion search. This change gave encoder 1%~3% performance gain. Change-Id: Idaa39506b48f4f8b2fbbeb45aae8226fa32afb3e	2011-03-08 13:29:01 -05:00
Attila Nagy	7af0d906e3	Remove temporal alt ref from realtime only build It is not used in realtime mode. Reduces memory footprint. Change-Id: I7f163225762368df5457cfd413050161d3704a3f	2011-02-22 12:53:32 +02:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
Gaute Strokkenes	315e3c2518	Put more code under #if CONFIG_MULTITHREAD. Change-Id: Icf4b692099d7d249fe3553852b1022b027b28e4b	2011-02-09 11:21:18 -05:00
Johann	8b0cf5f79d	x86 sse2 temporal_filter_apply count can be reduced to short because the max number of filtered frames is set to 15. the max value for any frame is 32 (modifier = 16, filter_weight = 2). 15*32 = 480 which requires 9 bits this function goes from about 7000 us / 1000 iterations for the C code to < 275 us / 1000 iterations for sse2 for block_size = 16 and from about 1800 us / 1000 iters to < 100 us / 1000 iters for block_size = 8 Change-Id: I64a32607f58a2d33c39286f468b04ccd457d9e6e	2011-01-06 14:00:30 -05:00
Scott LaVarnway	ff4a71f4c2	SSSE3 version of fast quantizer (test clip: tulip) For good quality mode with speed=1, this gave the encoder a small (2 - 3%) performance boost. Change-Id: I8a1d4269465944ac0819986c2f0be4b0a2ee0b35	2010-11-01 16:24:15 -04:00
Yunqing Wang	71ecb5d7d9	Full search SAD function optimization in SSE4.1 Use mpsadbw, and calculate 8 sad at once. Function list: vp8_sad16x16x8_sse4 vp8_sad16x8x8_sse4 vp8_sad8x16x8_sse4 vp8_sad8x8x8_sse4 vp8_sad4x4x8_sse4 (test clip: tulip) For best quality mode, this gave encoder a 5% performance boost. For good quality mode with speed=1, this gave encoder a 3% performance boost. Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134	2010-10-27 13:36:31 -04:00
Johann	e81e30c25d	isolate new temporal filtering code onyx_if is getting pretty big. split out the temporal code to make it easier to look at. Change-Id: I207c3a94c90e91b32e3ea5e1836a53b7a990fabd	2010-10-25 09:11:03 -04:00
Yunqing Wang	4db2076594	Add SSE2 subtract functions Instead of doing 8-bit data unpack and 16-bit subtraction, use psubb to do 16 8-bit subtractions and pcmpgtb to preserve the sign information. This does not bring noticable gain since these functions are not called frequently. Change-Id: I90a0dfaa3db9d422e4ada324076596ffb178548e	2010-10-18 14:15:15 -04:00
John Koleszar	c2140b8af1	Use WebM in copyright notice for consistency Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories. Fixes issue #97. Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba	2010-09-09 10:01:21 -04:00
James Zern	76640f85da	encoder: remove postproc dependency Remove the dependency on postproc.c for the encoder in general, the only unchecked need for it is when CONFIG_PSNR is enabled. All other cases are already wrapped in CONFIG_POSTPROC. In the CONFIG_PSNR case the file will still be included. Additionally, when VP8_SET_POSTPROC is used with the encoder when post processing has been disabled an error will be returned. This addresses issue #153. Change-Id: Ia6dfe20167f7077734a6058cbd1d794550346089	2010-09-02 11:52:37 -04:00
John Koleszar	80d3923a78	move segmentation_common to encoder vp8_update_gf_useage_maps() is only used by the encoder. This patch fixes the ability to build in decode-only or encode-only configurations. Change-Id: I3a5211428e539886ba998e09e8abd747ac55c9aa	2010-08-13 14:54:24 -04:00
John Koleszar	4d86ef3534	msvs: fix install of codec sources The libs.mk file must be installed for the vpx.vcproj file to be generated. It was being installed, but not in the src/ directory as expected. Also missed include files yasm.rules, quantize_x86.h Change-Id: Ic1a6f836e953bfc954d6e42a18c102a0114821eb	2010-07-22 18:33:25 -04:00
Scott LaVarnway	f1a3b1e0d9	Added first-pass sse2 version of Yaowu's new fdct. Change-Id: Ib479210067510162879c368428b92690591120b2	2010-06-24 16:40:56 -04:00
Yaowu Xu	d0dd01b8ce	Redo the forward 4x4 dct The new fdct lowers the round trip sum squared error for a 4x4 block ~0.12. or ~0.008/pixel. For reference, the old matrix multiply version has average round trip error 1.46 for a 4x4 block. Thanks to "derf" for his suggestions and references. Change-Id: I5559d1e81d333b319404ab16b336b739f87afc79	2010-06-24 13:17:58 -07:00
John Koleszar	94c52e4da8	cosmetics: trim trailing whitespace When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again. Change-Id: I236c05fade06589e417179c0444cb39b09e4200d	2010-06-18 13:06:11 -04:00
Scott LaVarnway	48c84d138f	sse2 version of vp8_regular_quantize_b Added sse2 version of vp8_regular_quantize_b which improved encode performance(for the clip used) by ~10% for 32 bit builds and ~3% for 64 bit builds. Also updated SHADOW_ARGS_TO_STACK to allow for more than 9 arguments. Change-Id: I62f78eabc8040b39f3ffdf21be175811e96b39af	2010-06-14 14:07:56 -04:00
John Koleszar	9099fc0d69	require --enable-psnr to build ssim ssim.c comiles in a huge (512M) amount of global scratch space. Allocating this data on the heap would be a better solution, but this file doesn't need to be built at all in most cases, so as a first pass, disable it except when doing opsnr.stt output (--enable-psnr). Change-Id: I320d812f6d652a12516a16b52295ebff20b5bd42	2010-06-11 13:05:08 -04:00
John Koleszar	7aa97a35b5	shared library support (.so) This patch adds support for building shared libraries when configured with the --enable-shared switch. Building DLLs would require more invasive changes to the sample utilities than I want to make in this patch, since on Windows you can't use the address of an imported symbol in a static initializer. The best way to work around this is proably to build the codec interface mapping table with an init() function, but dll support is of questionable value anyway, since most windows users will probably use a media framework lib like webmdshow, which links this library in staticly. Change-Id: Iafb48900549b0c6b67f4a05d3b790b2643d026f4	2010-06-05 16:47:23 -04:00
John Koleszar	09202d8071	LICENSE: update with latest text Change-Id: Ieebea089095d9073b3a94932791099f614ce120c	2010-06-04 16:19:40 -04:00
John Koleszar	b7492341ac	install includes in DIST_DIR/include/vpx, move vpx_codec/ to vpx/ This renames the vpx_codec/ directory to vpx/, to allow applications to more consistently reference these includes with the vpx/ prefix. This allows the includes to be installed in /usr/local/include/vpx rather than polluting the system includes directory with an excessive number of includes. Change-Id: I7b0652a20543d93f38f421c60b0bbccde4d61b4f	2010-05-24 20:27:42 -04:00
John Koleszar	0ea50ce9cb	Initial WebM release	2010-05-18 11:58:33 -04:00

30 Commits