generic-library/vpx

Author	SHA1	Message	Date
Jim Bankoski	818ee904a9	remove fdct invoke macros Remove the fdct invoke macro calls Change-Id: Ica2431c655819fa012133ee7abc75a16761e5fd6	2012-10-29 11:25:56 -07:00
Jim Bankoski	1838d87771	invoke macro removal encodemb Change-Id: I321280abcf48f3dc16e194d29bde2bd3baec6006	2012-10-29 12:36:50 +00:00
Jim Bankoski	118b2fe962	Remove variance vtable from rtcd Change-Id: Idd2722a538423b451e1e3495f89a7141480493d6	2012-10-21 20:47:57 -07:00
Jim Bankoski	7c15c18c5e	removed the recon rtcd invoke macro code (unrevert) This reinstates reverted commit `2113a83157` Change-Id: I9a9af13497d1e58d4f467e3e083fddf06b1b786c	2012-10-16 12:02:31 -07:00
Jim Bankoski	f9d5f86643	Revert "removed the recon. rtcd invoke macro code" This reverts commit `2113a83157`	2012-10-13 20:29:04 -07:00
Jim Bankoski	2113a83157	removed the recon. rtcd invoke macro code Code clean up - removed rtcd Change-Id: Id963ecf53c370b1d99484ef18d6befeed7e0c748	2012-10-13 18:49:44 -07:00
Deb Mukherjee	a7333b0a5b	Merge of the TX_16X16 experiment Change-Id: I22aa803ffff330622cdb77277e7b196a9766f882	2012-10-10 17:05:54 -07:00
Jingning Han	de6dfa6bb0	hybrid transform of 16x16 dimension Enable ADST/DCT of dimension 16x16 for I16X16 modes. This change provides benefits mostly for hd sequences. Set up the framework for selectable transform dimension. Also allowing quantization parameter threshold to control the use of hybrid transform (This is currently disabled by setting threshold always above the quantization parameter. Adaptive thresholding can be built upon this, which will further improve the coding performance.) The coding performance gains (with respect to the codec that has all other configuration settings turned on) are derf: 0.013 yt: 0.086 hd: 0.198 std-hd: 0.501 Change-Id: Ibb4263a61fc74e0b3c345f54d73e8c73552bf926	2012-08-30 16:52:25 -07:00
Ronald S. Bultje	5d4cffb35f	Superblock coding. This commit adds a pick_sb_mode() function which selects the best 32x32 superblock coding mode. Then it selects the best per-MB modes, compares the two and encodes that in the bitstream. The bitstream coding is rather simplistic right now. At the SB level, we code a bit to indicate whether this block uses SB-coding (32x32 prediction) or MB-coding (anything else), and then we follow with the actual modes. This could and should be modified in the future, but is omitted from this commit because it will likely involve reorganizing much more code rather than just adding SB coding, so it's better to let that be judged on its own merits. Gains on derf: about even, YT/HD: +0.75%, STD/HD: +1.5%. Change-Id: Iae313a7cbd8f75b3c66d04a68b991cb096eaaba6	2012-08-20 14:43:34 -07:00
Daniel Kang	fed8a1837f	16x16 DCT blocks. Set on all 16x16 intra/inter modes Features: - Butterfly fDCT/iDCT - Loop filter does not filter internal edges with 16x16 - Optimize coefficient function - Update coefficient probability function - RD - Entropy stats - 16x16 is a config option Have not tested with experiments. hd: 2.60% std-hd: 2.43% yt: 1.32% derf: 0.60% Change-Id: I96fb090517c30c5da84bad4fae602c3ec0c58b1c	2012-08-02 17:33:10 -07:00
John Koleszar	c6b9039fd9	Restyle code Approximate the Google style guide[1] so that there's a written document to follow and tools to check compliance[2]. [1]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml [2]: http://google-styleguide.googlecode.com/svn/trunk/cpplint/cpplint.py Change-Id: Idf40e3d8dddcc72150f6af127b13e5dab838685f	2012-07-17 11:46:03 -07:00
Paul Wilkins	c88d335f7d	Only support improved quant Deprecate fast quant and strict_quant code. Small effect on quality as fast was used in first pass but the effect is basically neutral across the derf set. The rationale here is to reduce the number of code paths for now to make experimentation easier. Optimized and fast code options can be re-introduced later along with other encode speed options. Change-Id: Ia30c5daf3dbc52e72c83b277a1d281e3c934cdad	2012-03-21 18:22:33 +00:00
Yaowu Xu	89ee68b1f7	Merge t8x8 experiments Change-Id: I8e9b6b154e1a0d0cb42d596366380d69c00ac15f	2012-03-01 12:59:11 -08:00
Paul Wilkins	79d330d7d5	Code simplification Removal of the pickinter.c and .h files and calls to this code. Removal of some code relating to real time and one pass settings though there is more to be done in this regard. However, vp8_set_speed_features() now only supports modes 0 and 1 and speeds up to 3 so rd should always be set. Change-Id: I62c0c1b6154ab499785baef310536080e87bc4d8	2012-02-16 17:21:20 +00:00
Paul Wilkins	9a8204d6ee	Simplification of experimental code base. Removed ~CONFIG_REALTIME_ONLY code. Change-Id: I5fafff29a08acd8928699f9ddce8744787024d8c	2012-02-14 09:03:56 +00:00
Paul Wilkins	3e9890a394	Merge Extended Q experiment. Merge the extended Q experiment as indicated by the Change-Id: I02d9e654fff9998cc7e9e2f1f5cd838dad8fb431	2012-02-09 17:22:34 +00:00
Paul Wilkins	01ce04bc06	Further segment feature extensions. This quite large check in includes the following: Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group. This is used as a basis for a simple segmentation for the normal frames in a gf/arf group. This code also uses satd functions from Yaowu. Adds functionality for coding the latest possible position of an EOB for blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct). Where the EOB position is 0 this acts like "skip" and the normal coding of skip at the per mb level is disabled. Added functions (seg_common.c) for setting and reading segment feature elements. These may want to be optimized away at some point but while the mechanism is in a state of flux they provide a single location for making changes and keep things a bit cleaner. This is still proof of concept code. Currently the tested feature set:- Quantizer, Loop Filter level, Reference frame, Prediction Mode, EOB end stop. TBD:- Add functions for setting and reading the feature data with range and validity checking. Handling of signed and unsigned feature data. At the moment all is assumed to be signed and a sign bit is coded but many cannot be negative. Correct handling of EOB feature with intra coded blocks. Testing/trapping of legal/illegal ref frame and mode combinations. Transform size switch plus merge and test with 8x8 DCT work Merge and test with Suman's Segmentation coding optimizations Change-Id: Iee12e83661c7abbd1e0ce6810915eb4ec35e2d8e	2011-10-24 15:52:18 +01:00
John Koleszar	67864c5f97	Merge remote branch 'internal/upstream' into HEAD	2011-08-24 00:05:05 -04:00
Fritz Koenig	694d4e7777	Reclassify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I48ad0218af0cc51c5078070a08511dee43ecfe09	2011-08-22 12:36:28 -07:00
Fritz Koenig	734b1b2041	Revert "Reclasify optimized ssim calculations as SSE2." This reverts commit `01376858cd`	2011-08-22 11:31:12 -07:00
Fritz Koenig	01376858cd	Reclasify optimized ssim calculations as SSE2. Calculations were incorrectly classified as either SSE3 or SSSE3. Only using SSE2 instructions. Cleanup function names and make non-RTCD code work as well. Change-Id: I29f5c2ead342b2086a468029c15e2c1d948b5d97	2011-08-19 08:51:27 -07:00
John Koleszar	664cd5ac91	Merge remote branch 'internal/upstream' into HEAD	2011-07-23 00:05:14 -04:00
Yunqing Wang	20bd1446c0	Preload reference area to an intermediate buffer in sub-pixel motion search In sub-pixel motion search, the search range is small (+/- 3 pixels). Preload whole search area from reference buffer into a 32-byte aligned buffer. Then in search, load reference data from this buffer instead. This keeps data in cache, and reduces the crossing cache-line penalty. For tulip clip, tests on Intel Core2 Quad machine (linux) showed encoder speed improvement: 3.4% at --rt --cpu-used=-4 2.8% at --rt --cpu-used=-3 2.3% at --rt --cpu-used=-2 2.2% at --rt --cpu-used=-1 Test on Atom notebook showed only 1.1% speed improvement (speed=-4). Test on Xeon machine also showed less improvement, since unaligned data access latency is greatly reduced in newer cores. Next, I will apply a similar idea to the other 2 sub-pixel search functions for encoding speed > 4. Make this change exclusively for x86 platforms. Change-Id: Ia7bb9f56169eac0f01009fe2b2f2ab5b61d2eb2f	2011-07-22 09:28:06 -04:00
Deb Mukherjee	08f6471890	Add 8x8 transform to experimental branch Please refer to previous commit messages for detailed info: https://on2-git.corp.google.com/g/#change,5940 https://on2-git.corp.google.com/g/#change,6045 Change-Id: I8b16992f2f69c5a808ad40a3e32ef589cce7c59d	2011-07-20 09:49:22 -07:00
John Koleszar	deb2e9cf62	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/encoder/encodeframe.c vp8/encoder/rdopt.c Change-Id: I183fd3ce9e94617ec888c9f891055b9f1f8ca6c5	2011-06-17 15:36:43 -04:00
Yaowu Xu	361717d2be	remove one set of 16x16 variance functions Calls to this set of functions are replaced by var16x16. Change-Id: I5ff1effc6c1358ea06cda1517b88ec28ef551b0d	2011-06-09 11:23:05 -07:00
John Koleszar	d13cfba344	Merge remote branch 'internal/upstream' into HEAD	2011-06-07 00:05:04 -04:00
Yaowu Xu	d4700731ca	remove redundant functions The encoder defined about 4 sets of similar functions to calculate sum, variance or sse or a combination of them. This commit removed one set of these functions, get8x8var and get16x16var, where calls to the latter function are replaced with var16x16 by using the fact on a 16x16 MB: variance == sse - sum*sum/256 Change-Id: I803eabd1fb3ab177780a40338cbd596dffaed267	2011-06-06 16:44:05 -07:00
Tero Rintaluoma	61f0c090df	neon fast quantize block pair vp8_fast_quantize_b_pair_neon function added to quantize two adjacent blocks at the same time to improve performance. - Additional 3-6% speedup compared to neon optimized fast quantizer (Tanya VGA@30fps, 1Mbps stream, cpu-used=-5..-16) Change-Id: I3fcbf141e5d05e9118c38ca37310458afbabaa4e	2011-06-01 10:48:05 +03:00
John Koleszar	27331e1377	Merge remote branch 'internal/upstream' into HEAD	2011-05-20 00:05:16 -04:00
John Koleszar	c684d5e5f2	Merge "changed configure option name to reduce confusion"	2011-05-19 11:17:08 -07:00
John Koleszar	65b1648f35	Merge remote branch 'internal/upstream' into HEAD	2011-05-11 00:05:07 -04:00
Yunqing Wang	cb7b1fb144	Use diamond search to replace full search in full-pixel refining search In NEWMV mode, currently, full search is used as the refining search after n-step search. Replacing it with an iterative diamond search of radius 1 largely reduced the computation complexity, but still maintained the same encoding quality since the refining search is done for every macroblock instead of only a small percentage of macroblocks while using full search. Tests on the test set showed a 3.4% encoding speed increase with no psnr & ssim loss. Change-Id: Ife907d7eb9544d15c34f17dc6e4cfd97cb743d41	2011-05-09 14:07:06 -04:00
Yaowu Xu	57ad189129	changed configure option name to reduce confusion Renamed configure option "enable-psnr" to "enable-internal-stats" to better reflect the purpose of the option and eliminate the confusion reported in http://code.google.com/p/webm/issues/detail?id=35 Change-Id: If72df6fdb9f1e33dab1329240ba4d8911d2f1f7a	2011-04-29 09:39:05 -07:00
John Koleszar	8b20b578bf	Merge remote branch 'internal/upstream' into HEAD	2011-04-13 00:05:07 -04:00
Yunqing Wang	4fd81a99f8	Set cpu_used range to [-16, 16] in real-time mode Remove encoding speed limitation in real-time mode. Change-Id: Ib5e35d8bb522b2a25f3e4ad5cfe2788ebebb3617	2011-04-11 15:55:04 -04:00
John Koleszar	51bcf621c1	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/decoder/decodemv.c vp8/decoder/onyxd_if.c vp8/encoder/ratectrl.c vp8/encoder/rdopt.c Change-Id: Ia1c1c5e589f4200822d12378c7749ba62bd17ae2	2011-03-23 00:27:52 -04:00
John Koleszar	429dc676b1	Increase static linkage, remove unused functions A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779	2011-03-17 20:53:47 -04:00
John Koleszar	ba83622a00	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/encoder/onyx_if.c Change-Id: Ieef9a58a2effdc68cf52bc5f14d90c31a1dbc13a	2011-03-14 08:53:02 -04:00
Jim Bankoski	3f6f7289aa	vp8cx- alternate ssim function with optimizations Change-Id: I91921b0a90dbaddc7010380b038955be347964b3	2011-03-11 08:51:21 -05:00
John Koleszar	ca29f6a7c4	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/vp8_cx_iface.c Change-Id: Iecfd4532ab1c722d10ecce8a5ec473e96093cf3b	2011-03-03 08:59:34 -05:00
Attila Nagy	7af0d906e3	Remove temporal alt ref from realtime only build It is not used in realtime mode. Reduces memory footprint. Change-Id: I7f163225762368df5457cfd413050161d3704a3f	2011-02-22 12:53:32 +02:00
John Koleszar	f13212b728	Merge remote branch 'internal/upstream' into HEAD	2011-02-18 00:05:13 -05:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
Yaowu Xu	5b42ae09ae	experiment extending the quantizer range Prior to this change, VP8 min quantizer is 4, which caps the highest quality around 51DB. This experimental change extends the min quantizer to 1, removes the cap and allows the highest quality to be around ~73DB, consistent with the fdct/idct round trip error. To test this change, at configure time use options: --enable-experimental --enable-extend_qrange The following is a brief log of changes in each of the patch sets patch set 1: In this commit, the quantization/dequantization constants are kept unchanged, instead scaling factor 4 is rolled into fdct/idct. Fixed Q0 encoding tests on mobile: Before: 9560.567kbps Overall PSNR:50.255DB VPXSSIM:98.288 Now: 18035.774kbps Overall PSNR:73.022DB VPXSSIM:99.991 patch set 2: regenerated dc/ac quantizer lookup tables based on the scaling factor rolled in the fdct/idct. Also slightly extended the range towards the high quantizer end. patch set 3: slightly tweaked the quantizer tables and generated bits_per_mb table based on Paul's suggestions. patch set 4: fix a typo in idct, re-calculated tables relating active max Q to active min Q patch set 5: added rdmult lookup table based on Q patch set 6: fix rdmult scale: dct coefficient has scaled up by 4 patch set 7: make transform coefficients to be within 16bits patch set 8: normalize 2nd order quantizers patch set 9: fix mis-spellings patch set 10: change the configure script and macros to allow experimental code to be enabled at configure time with --enable-extend_qrange patch set 11: rebase for merge Change-Id: Ib50641ddd44aba2a52ed890222c309faa31cc59c	2011-01-19 13:22:35 -08:00
Attila Nagy	cb791aaa2f	Fix encoder real-time only configuration. Remove allocation/deallocation of stats storage. Remove full search functions in machine specific encoder inits. Remove last pass validation in validate_config. Change-Id: I7f29be69273981a4fef6e80ecdb6217c68cbad4e	2011-01-18 08:19:21 -05:00
Johann	4b6219cb33	temporal filter naming changes be more consistent with the naming pattern, especially wrt rtcd Change-Id: I3df50686a09f1dab0a9620b5adbb8a1577b40f2f	2010-12-22 11:32:15 -05:00
Johann	092b5bef37	abstract apply_temporal_filter allow for optimized versions of apply_temporal_filter (now vp8_apply_temporal_filter_c) the function was previously declared as static and appears to have been inlined. with this change, that's no longer possible. performance takes a small hit. the declaration for vp8_cx_temp_filter_c was moved to onyx_if.c because of a circular dependency. for rtcd, temporal_filter.h holds the definition for the rtcd table, so it needs to be included by onyx_int.h. however, onyx_int.h holds the definition for VP8_COMP which is needed for the function prototype. blah. Change-Id: I499c055fdc652ac4659c21c5a55fe10ceb7e95e3	2010-12-22 11:31:54 -05:00
Yunqing Wang	71ecb5d7d9	Full search SAD function optimization in SSE4.1 Use mpsadbw, and calculate 8 sad at once. Function list: vp8_sad16x16x8_sse4 vp8_sad16x8x8_sse4 vp8_sad8x16x8_sse4 vp8_sad8x8x8_sse4 vp8_sad4x4x8_sse4 (test clip: tulip) For best quality mode, this gave encoder a 5% performance boost. For good quality mode with speed=1, this gave encoder a 3% performance boost. Change-Id: I083b5a39d39144f88dcbccbef95da6498e490134	2010-10-27 13:36:31 -04:00
John Koleszar	a0ae3682aa	Fix half-pixel variance RTCD functions This patch fixes the system dependent entries for the half-pixel variance functions in both the RTCD and non-RTCD cases: - The generic C versions of these functions are now correct. Before all three cases called the hv code. - Wire up the ARM functions in RTCD mode - Created stubs for x86 to call the optimized subpixel functions with the correct parameters, rather than falling back to C code. Change-Id: I1d937d074d929e0eb93aacb1232cc5e0ad1c6184	2010-10-27 13:00:30 -04:00


56 Commits