generic-library/vpx

Author	SHA1	Message	Date
John Koleszar	bdd35c13cc	avoid resetting framerate during vpx_codec_enc_config_set() The calculated frame_rate is a state variable in the codec, and shouldn't be maintained in the configuration struct. Move it to the main part of cpi so that it isn't clobbered when the configuration struct is updated. The initial framerate estimate is moved from the vp8_cx_iface.c wrapper into the body of init_config() in onyx_if.c, so that it is only called once and not reset on every call to vp8_change_config(). Change-Id: I8d9a3d1283330d1ee297d07e9d78d1f2875f2465	2011-11-11 14:45:58 -08:00
Scott LaVarnway	df49c7c58d	SSE2 optimizations for vp8_build_intra_predictors_mby{,_s}() Ronald recently sent me this patch that he did in April. > From: Ronald S. Bultje <rbultje@google.com> > Date: Thu, 28 Apr 2011 17:30:15 -0700 > Subject: [PATCH] SSE2 optimizations for > vp8_build_intra_predictors_mby{,_s}(). HD decode tests have shown a performance boost up to 1.5%, depending on material. Patch set 3: Fixed encoder crash. Change-Id: Ie1fd1fa3dc750eec1a7a20bfa2decc079dcf48c8	2011-11-09 15:30:35 -05:00
Scott LaVarnway	9532bda0fb	Merge "Relocated idct/add calls for encoder"	2011-11-09 10:17:43 -08:00
Johann	ea2229bab6	Merge "ARMv6 optimized Intra4x4 prediction"	2011-11-09 09:36:33 -08:00
John Koleszar	82e8884ad8	Merge "Remove unused file recon.c"	2011-11-09 09:31:23 -08:00
Scott LaVarnway	861ed6a5c1	Relocated idct/add calls for encoder Call the idct/add after the tokenize. This is WIP with the goal of creating a common idct/add for the encoder and decoder. This move is necessary because the decoder's version of the idct clobbers qcoeff, which is used by the tokenize. Change-Id: I6b08d8e8397cd873647fa4fb9469884e3c876756	2011-11-09 10:41:05 -05:00
Tero Rintaluoma	5a2fd63a2a	ARMv6 optimized Intra4x4 prediction Added ARM optimized intra 4x4 prediction - 2x faster on Profiler compared to C-code compiled with -O3 - Function interface changed a little to improve BLOCKD structure access Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54	2011-11-09 09:13:51 +02:00
James Zern	9d60506130	threading: avoid defining _WIN32_WINNT The referenced function (SignalObjectAndWait) isn't used. Reduces the warnings with mingw32-w64 which defines this. Change-Id: I4ce592879ec9372bf196dac640204c4d370bd210	2011-11-08 18:50:45 -08:00
John Koleszar	f89e109f56	Remove unused file recon.c File not referenced from anywhere and no longer compiles. Change-Id: I38b11bd60db615c2c2c9d7ad35caba3a1adf1750	2011-11-08 15:54:56 -08:00
James Zern	f89ea3432f	fix file permissions all of googletest import (0ab00a22) was marked executable Change-Id: Id7b7ee03efc21ab998bb03349bd91644e8af25da	2011-11-04 18:50:35 -07:00
John Koleszar	7ca6c91732	Merge "Changing decoder input partition API to input fragments."	2011-11-04 09:36:27 -07:00
Scott LaVarnway	46639567a0	Merge "Change use of eob in the encoder"	2011-11-03 08:06:06 -07:00
Tero Rintaluoma	e4f2ec7a52	Change use of eob in the encoder Changed 'int eob' to 'char *eob' in BLOCKD so that both encoder and decoder will use eobs[25] array from MACROBLOCKD structure. In future, this will enable use of the decoder side IDCT in the encoder. Change-Id: I6e1c011628cb8864fd4a0b80f0279ce16a5ca978	2011-11-03 16:08:09 +02:00
Stefan Holmer	1427205215	Changing decoder input partition API to input fragments. Adding support for several partitions within one input fragment. This is necessary to fully support all possible packetization combinations in the VP8 RTP profile. Several partitions can be transmitted in the same packet, and they can only be split by reading the partition lengths from the bitstream. Change-Id: If7d7ea331cc78cb7efd74c4a976b720c9a655463	2011-11-01 14:44:37 -07:00
Scott LaVarnway	e0309e1509	Merge "Improved decode_split_mv()"	2011-10-28 09:27:17 -07:00
Scott LaVarnway	6064384d59	Improved decode_split_mv() Tests showed ~1.2% performance boost on the HD clip used. Performance will vary based on material. Change-Id: Icbcf1a828750d5b4ae5252bf596b3ef594042e8a	2011-10-27 11:26:30 -04:00
Scott LaVarnway	0db5599957	Merge "Improved mv_bias"	2011-10-27 06:14:00 -07:00
Scott LaVarnway	21970d1dc2	Improved mv_bias Small performance gains. Change-Id: I709b9390a8a27a70f5f23574313b8db85ac7f23d	2011-10-26 11:46:10 -04:00
Attila Nagy	de82809444	Reduce partial frame copy in encoder's pick_filter_level_fast The partial frame copy function used to copy an extra 8 lines above and below. The partial frame filtering can only modify 3 pixel rows above the partial frame. Reduce copy to bare minimum needed, which is 4 lines, so that partial filtering on copied frame is possible. Define the "magic" fraction number for partial filtering in loopfilter.h . Change-Id: I4791ffc541b6884b12759a0d0714a8faf16147ec	2011-10-26 15:25:07 +03:00
James Berry	bc7151131d	Fix: check cx_data buffer prior to write check to make sure that cx_data buffer has enough room before writting to it, prior behavior did not which could result in a crash. Change-Id: I3fab6f2bc4a96d7c675ea81acd39ece121738b28	2011-10-20 15:55:00 -04:00
Scott LaVarnway	ed9c66f584	Remove usage of predict buffer for decode Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6	2011-10-18 12:06:50 -04:00
Adrian Grange	04182a121a	Merge "Added rate-targeted temporal scalability"	2011-10-11 12:54:52 -07:00
Adrian Grange	217591fde5	Added rate-targeted temporal scalability Added the ability to create rate-targeted, temporally scalable, VP8 compatible bitstreams. The application vp8_scalable_patterns.c demonstrates how to use this capability. Users can create output bitstreams containing upto 5 temporally separable streams encoded as a single VP8 bitstream. (previously abandoned as: I92d1483e887adb274d07ce9e567e4d0314881b0a) Change-Id: I156250a3fe930be57c069d508c41b6a7a4ea8d6a	2011-10-11 12:49:12 -07:00
James Berry	05bde9d4a4	bug fix - starting/optimal/max and buffer_level changed from int to int64_t buffer_level in VP8_COMP and starting_buffer_level, optimal_buffer_level and maximum_buffer_size in VP8_CONFIG changed from int to int64_t to avoid potential crash issues for larger target bit rates. Change-Id: I0d5ab6c8a44c2fef51f30cd8df4bb4b739c5df26	2011-10-10 12:16:55 -04:00
Scott LaVarnway	af12c23e8e	Merge "Improved tokenize"	2011-10-04 09:57:42 -07:00
Johann	2aa408524c	Merge "Reduce computational complexity of generic C loop filter."	2011-09-30 16:17:56 -07:00
Scott LaVarnway	ab00d209bc	Improved tokenize For a realtime HD encodings, up to 1.6% gains seen. Change-Id: If45028e23db95124da63f9d38ffe06e05596cc6e	2011-09-30 12:49:46 -04:00
Johann	3556deaca3	combine loopfilter data access The data processed by the loopfilter overlaps. At the block level, this results in some redundant transforms. Grouping the filtering allows for a single 16x16 transpose (and inversion) instead of three 16x8 transposes (and three more inversions). This implementation is x86_64 only. We retain the previous implementation for x86. Improvements are obviously material dependant, but it seems to be ~%1 in tests here. Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007	2011-09-30 07:38:35 -07:00
Aaron Watry	69aa303d96	Reduce computational complexity of generic C loop filter. Change-Id: I1e7f9ed3cd907844a495b9e0073bc140b87e5c06	2011-09-29 17:25:48 -05:00
John Koleszar	6f9457ec12	Merge "clamp_mvs() using the wrong motion vector information"	2011-09-22 11:54:15 -07:00
Attila Nagy	1a7d25a484	Replace vpx_ports/config.h with vpx_config.h Just a clean-up. Change-Id: Iea5b6dc925dcfa7db548bc1ab1a13d26ed5a2c9a	2011-09-22 13:33:54 +03:00
Scott LaVarnway	c0ee870b0a	clamp_mvs() using the wrong motion vector information In the "Removed bmi copy to/from BLOCKD" commit, the copy to the bmi in BLOCKD was eliminated. The clamp_mvs() used the bmi in BLOCKD, which now contains incorrect values. This patch fixes this problem. Change-Id: I8eca1eaf4015052b0b63e90876f7ad321aba7cff	2011-09-16 11:03:53 -04:00
Scott LaVarnway	222c72e50f	Merge "Removed bmi copy to/from BLOCKD"	2011-08-31 06:57:20 -07:00
Scott LaVarnway	b870947d42	Removed bmi copy to/from BLOCKD for SPLITMV and B_PRED modes. Modified code to use the bmi found in mode_info_context instead of BLOCKD. On the decode side, the uvmvs are calculated only when required, instead of every macroblock. This is WIP. (bmi should eventually be removed from BLOCKD) Small performance gains noticed for RT encodes and decodes.(VGA) Change-Id: I2ed7f0fd5ca733655df684aa82da575c77a973e7	2011-08-24 14:42:26 -04:00
Fritz Koenig	112bd4e2b4	Fix naming of sse2 idct functions. Prepend idct function names with vp8_ so that under profiling they show up associated with libvpx. Change-Id: I4fe357b50236cb7730a4cc00164c0a3487a1d8b4	2011-08-24 10:25:32 -07:00
Scott LaVarnway	1de5da80c9	Merge "Faster vp8_default_coef_probs"	2011-08-24 07:52:10 -07:00
Johann	85358d04cd	Fix data accesses for simple loopfilters The data that the simple horizontal loopfilter reads is aligned, treat it accordingly. For the vertical, we only use the bottom 4 bytes, so don't read in 16 (and incur the penalty for unaligned access). This shows a small improvement on older processors which have a significant penalty for unaligned reads. postproc_mmx.c is unused Change-Id: I87b29bbc0c3b19ee1ca1de3c4f47332a53087b3d	2011-08-23 20:42:45 -04:00
Fritz Koenig	c5f890af2c	Use local labels for jumps/loops in x86 assembly. Prepend . to local labels in assembly code. This allows non unique labels within a file. Also makes profiling information more informative by keeping the function name with the loop name. Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f	2011-08-23 09:05:29 -07:00
John Koleszar	edec5eb5e7	Merge "Copy less when active map is in use"	2011-08-19 07:31:00 -07:00
Alpha Lam	4e8d35a461	Copy less when active map is in use When active map is specified and the current frame is not a key frame, golden frame nor a altref frame then copy only those active regions. This significantly reduces encoding time by as much as 19% on the test system where realtime encoding is used. This is particularly useful when the frame size is large (e.g. 2560x1600) and there's only a few action macroblocks. Change-Id: If394a813ec2df5a0201745d1348dbde4278f7ad4	2011-08-19 10:29:41 -04:00
Scott LaVarnway	19987dcbfa	Faster vp8_default_coef_probs Copies from a generated table instead of building the default coeff probabilities during runtime. Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50	2011-08-16 16:21:21 -04:00
John Koleszar	e96131705a	Revert "Improved 1-pass CBR rate control" This reverts commit b5ea2fbc2c1554769848774c836aad262af95072. Further testing showed noticable keyframe popping in some cases, reverting this for now to give time for a proper fix. Conflicts: vp8/encoder/onyx_if.c vp8/encoder/ratectrl.c Change-Id: I159f53d1bf0e24c035754ab3ded8ccfd58fd04af	2011-08-12 14:51:36 -04:00
Johann	30e5deae5d	update extend frame borders the neon code made several assumptions which were broken by a recent change: https://review.webmproject.org/2676 update the code with new assumptions and guard them with a compile time assert Change-Id: I32a8378030759966068f34618d7b4b1b02e101a0	2011-08-02 19:26:46 -04:00
John Koleszar	06c3d5bb9a	Fix building with --disable-postproc Change-Id: I7e6bc28e7974a376da747300744e0dd5dc1d21e9	2011-08-01 17:50:23 -04:00
James Zern	b45065d38b	cosmetics: consistently use [u]int64_t Removes mixed usage of (unsigned) long long and INT64. Fixes Issue #208. Change-Id: I220d3ed5ce4bb1280cd38bb3715f208ce23cf83a	2011-07-26 11:34:36 -07:00
Yunqing Wang	65dfcf4696	Use CONFIG_FAST_UNALIGNED consistently in codec CONFIG_FAST_UNALIGNED is enabled by default. Disable it if it is not supported by hardware. Change-Id: I7d6905ed79fed918bca074bd62820b0c929d81ab	2011-07-25 10:11:24 -04:00
Johann	773bcc300d	Merge "fix sharpness bug and clean up"	2011-07-22 09:34:55 -07:00
Johann	a04ed0e8f3	fix sharpness bug and clean up sharpness was not recalculated in vp8cx_pick_filter_level_fast remove last_filter_type. all values are calculated, don't need to update the lfi data when it changes. always use cm->sharpness_level. the extra indirection was annoying. don't track last frame_type or sharpness_level manually. frame type only matters for motion search and sharpness_level is taken care of in frame_init move function declarations to their proper header Change-Id: I7ef037bd4bf8cf5e37d2d36bd03b5e22a2ad91db	2011-07-22 12:33:57 -04:00
Yunqing Wang	829179e888	Merge "Preload reference area to an intermediate buffer in sub-pixel motion search"	2011-07-22 06:56:15 -07:00
Yunqing Wang	20bd1446c0	Preload reference area to an intermediate buffer in sub-pixel motion search In sub-pixel motion search, the search range is small(+/- 3 pixels). Preload whole search area from reference buffer into a 32-byte aligned buffer. Then in search, load reference data from this buffer instead. This keeps data in cache, and reduces the crossing cache- line penalty. For tulip clip, tests on Intel Core2 Quad machine(linux) showed encoder speed improvement: 3.4% at --rt --cpu-used =-4 2.8% at --rt --cpu-used =-3 2.3% at --rt --cpu-used =-2 2.2% at --rt --cpu-used =-1 Test on Atom notebook showed only 1.1% speed improvement(speed=-4). Test on Xeon machine also showed less improvement, since unaligned data access latency is greatly reduced in newer cores. Next, I will apply similar idea to other 2 sub-pixel search functions for encoding speed > 4. Make this change exclusively for x86 platforms. Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f	2011-07-22 09:28:06 -04:00

... 3 4 5 6 7 ...

463 Commits