generic-library/vpx

Author	SHA1	Message	Date
James Yu	4a8336fa9d	VP8 for ARMv8 by using NEON intrinsics 11 Add mbloopfilter_neon.c - vp8_mbloop_filter_horizontal_edge_y_neon - vp8_mbloop_filter_horizontal_edge_uv_neon - vp8_mbloop_filter_vertical_edge_y_neon - vp8_mbloop_filter_vertical_edge_uv_neon Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:54:33 -07:00
James Yu	c500fc22c1	VP8 for ARMv8 by using NEON intrinsics 10 Add loopfiltersimpleverticaledge_neon.c - vp8_loop_filter_bvs_neon - vp8_loop_filter_mbvs_neon Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:11:00 -07:00
James Yu	55c95f2d2c	VP8 for ARMv8 by using NEON intrinsics 09 Add loopfiltersimplehorizontaledge_neon.c - vp8_loop_filter_bhs_neon - vp8_loop_filter_mbhs_neon Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:10:45 -07:00
James Yu	a5d79f43b9	VP8 for ARMv8 by using NEON intrinsics 08 Add loopfilter_neon.c - vp8_loop_filter_horizontal_edge_y_neon - vp8_loop_filter_horizontal_edge_uv_neon - vp8_loop_filter_vertical_edge_y_neon - vp8_loop_filter_vertical_edge_uv_neon Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 09:32:11 -07:00
James Yu	930557be10	VP8 for ARMv8 by using NEON intrinsics 07 Add iwalsh_neon.c - vp8_short_inv_walsh4x4_neon Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 09:24:54 -07:00
James Yu	81ad047ee5	VP8 for ARMv8 by using NEON intrinsics 06 Add idct_dequant_full_2x_neon.c - idct_dequant_full_2x_neon ==== Summary of apply VP8 decode patch series ==== Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core Toolchain: linaro-1.13.1-4.8-2014.01 Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure --target=armv7-linux-gcc --prefix=$HOME/out --enable-shared --cpu=cortex-a7 Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm NEON assembly 46.68 (fps) Apply patch 06 46.65, -0.03 Apply patch 07 46.86, +0.21 Apply patch 08 46.58, -0.28 Apply patch 09 46.57, -0.01 Apply patch 10 46.51, -0.06 Apply patch 11 46.13, -0.38 Apply patch 12 45.42, -0.71 Apply patch 13 46.06, +0.64 Apply patch 14 45.19, -0.87 Apply patch 15 45.93, +0.74 Apply patch 16 45.48, -0.45 Apply patch 17 45.84, +0.36 Apply patch 18 45.91, +0.07 <= With all NEON intrinsics patches Total -0.77 fps, 1.65% performance regression Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 11:57:47 +08:00
Yunqing Wang	096eaba728	Remove VP8 save_reg_neon function This patch did a cleanup following the commit "Save NEON registers in VP8 NEON functions". The pushing/popping of callee-saved NEON registers was moved into individual NEON functions. Therefore, we don't need to save those registers at the beginning of codec. The related code was removed. Change-Id: I5648166514fc9beffb780aa138495597731f49ea	2014-04-29 16:13:24 -07:00
James Zern	805078a1bf	build: convert rtcd.sh to perl significantly speeds up file generation. the goal of this change is to convert rtcd.sh to perl as directly as possible to allow for simple comparison. future changes can make it more perl-like. --- Linux [CREATE] vpx_scale_rtcd.h real 0m0.485s -> 0m0.022s [CREATE] vp8_rtcd.h real 0m4.619s -> 0m0.060s [CREATE] vp9_rtcd.h real 0m10.102s -> 0m0.087s Windows [CREATE] vpx_scale_rtcd.h real 0m8.360s -> 0m0.080s [CREATE] vp8_rtcd.h real 1m8.083s -> 0m0.160s [CREATE] vp9_rtcd.h real 2m6.489s -> 0m0.233s Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee	2014-03-03 14:47:11 -08:00
James Yu	fb5d281bb6	VP8 for ARMv8 by using NEON intrinsics 05 Add dequantizeb_neon.c - vp8_dequantize_b_loop_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.23 (fps) Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 10:16:00 +08:00
James Yu	28b2f82f97	VP8 for ARMv8 by using NEON intrinsics 04 Add dequant_idct_neon.c - vp8_dequant_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.22 (fps) Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:59:23 +08:00
James Yu	d749ab6221	VP8 for ARMv8 by using NEON intrinsics 03 Add dc_only_idct_add_neon.c - vp8_dc_only_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.24 (fps) Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:28:29 +08:00
James Yu	300a3bfc73	VP8 for ARMv8 by using NEON intrinsics 02 Add copymem_neon.c - vp8_copy_mem16x16_neon - vp8_copy_mem8x8_neon - vp8_copy_mem8x4_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.25 (fps) Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-23 22:56:53 +08:00
Johann	dadf350551	Apply neon flags to intrinsic files Filter out files ending in _neon.c and append .neon so the Android build system knows to apply -mfpu=neon Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305	2014-01-10 12:16:59 -08:00
James Yu	79395e16cf	VP8 for ARMv8 by using NEON intrinsics 01 Add bilinearpredict_neon_intrinsics.c - vp8_bilinear_predict4x4_neon - vp8_bilinear_predict8x4_neon - vp8_bilinear_predict8x8_neon - vp8_bilinear_predict16x16_neon Change-Id: I33dfa502881219841b442dda32b73220e51b716b Signed-off-by: James Yu <james.yu@linaro.org>	2014-01-09 09:56:22 -08:00
James Zern	f89335f7ca	remove unused VP8 com/dec asm offsets Change-Id: Ib3b26ee27f04b2dcbbd32b3127afb45e9f50cfcf	2013-07-09 14:33:49 -07:00
James Zern	08348d9cab	prefix vp8 asm_{com,dec,enc}_offsets files make them symmetrical with the generated output and their vp9 counterparts Change-Id: I72cc97c4d33d713dff620a6d7cc25955266216fc	2013-03-02 14:45:40 -08:00
John Koleszar	a9c7597adc	support building vp8 and vp9 into a single lib Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d	2012-11-15 10:46:17 -08:00
John Koleszar	7b8dfcb5a2	Rough merge of master into experimental Creates a merge between the master and experimental branches. Fixes a number of conflicts in the build system to allow either VP8 or VP9 to be built. Specifically either: $ configure --disable-vp9 $ configure --disable-vp8 --disable-unit-tests VP9 still exports its symbols and files as VP8, so that will be resolved in the next commit. Unit tests are broken in VP9, but this isn't a new issue. They are fixed upstream on origin/experimental as of this writing, but rebasing this merge proved difficult, so will tackle that in a second merge commit. Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21	2012-11-07 11:30:16 -08:00
Ronald S. Bultje	4b2c2b9aa4	Rename vp8/ codec directory to vp9/. Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4	2012-11-01 16:31:22 -07:00
Ronald S. Bultje	6c280c2299	Adjust style to match Google Coding Style a little more closely. Most of these were picked up by jenkins in the commit that changed the vp8 namespace to vp9 in common/. Change-Id: I5cbd56ffc753b92ef805133cda6acc1713a13878	2012-11-01 10:03:48 -07:00
Ronald S. Bultje	8166657109	Make implicit_segmentation-related code an experiment. This way, the code is not compiled in by default, thus decreasing overall binary size. Change-Id: I85cac8f5a22a51a7d99c820ef6d6ed179d4106a0	2012-10-29 21:15:42 -07:00
Scott LaVarnway	ce811f87c4	Faster 8t filtering Quickly modified the ssse3 sixtap filters to support eight taps. For the test clip used, a 23+% boost in decoder performance was seen. We can revisit later and improve further. Change-Id: I5f59860459e80d6fa23e6cc0fd91296a969f5240	2012-10-25 17:24:50 -07:00
Scott LaVarnway	9ba2efd034	Added sse2 intrinsic version of vp8_sad16x3 3.7% boost in decoder performance for the clip used. Change-Id: I74f28486a9352b472b36e21b5eaf30eff35e9199	2012-10-25 12:16:08 -07:00
Scott LaVarnway	d36ecb42da	Added rtcd support vp8_sad16x3 and vp8_sad3x16 Change-Id: I5bca7b7a4b230082d36ac6fb84db84137ad177d7	2012-10-22 13:45:42 -07:00
Scott LaVarnway	085433c2d0	sse2 intrinsic version of vp8_mbloop_filter_vertical_edge() First sse2 version of vp8_mbloop_filter_vertical_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 34+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I455b438bc8d8af76cf7533ac42eda5f689b21f7c	2012-10-19 15:52:12 -07:00
Scott LaVarnway	992b5e2d95	sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge() First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 31+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I03ed3a7182478bdd1f094644ff3e0442625600e7	2012-10-18 14:29:26 -07:00
Jim Bankoski	ffff213463	removed obsolete build dependency this commit fixes the build on windows with visual studio 2008. Change-Id: I0baa4044e9e54237da29f2e17332ea6f766dbbec	2012-10-17 09:22:05 -07:00
Paul Wilkins	2d60bee1fb	New Motion Reference Search Alternative strategy for finding a list of candidate motion vectors to use as reference values in mv coding and as nearest and near. Sort by sad in vp8_find_best_ref_mvs() rather than just pick the best. Allow 0,0 as a best ref option but not a nearest or near unless there are no alternatives. Encode/Decode verified on at least some clips. Some commented out experimental and stats code still in place. Gain over existing code averages about 1% on derf (all metrics) with improvement on all clips. Other test results pending. The entropy coding of the mode (nearest/near etc) still depends upon and requires the old "findnear" code so this needs looking at and may provide room for further gains. Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247	2012-08-24 18:08:21 +01:00
John Koleszar	b43ed7a5b1	Merge "remove rotation experiment" into experimental	2012-08-22 10:01:39 -07:00
Christian Duvivier	63ef9c40a4	SSE2 version of vectorized 8-tap filtering. About 20% overall encoder speedup (vs. about 30% for sse4 version). Change-Id: Ibf608a6a1bc94b14ec47e8046d3206b275b5a8bd	2012-08-21 15:26:14 -07:00
John Koleszar	5055a1610d	remove rotation experiment This is being reimplemented more generically in terms of affine transforms. Change-Id: I9300bfde5f8b93c708c64f59427087720f8ed782	2012-08-21 10:09:56 -07:00
Christian Duvivier	5a34e0eb89	First partial snapshot of vectorized 8-tap filtering. About 3.5x faster, 30% overall encoder speedup. Rest of optimizations will come soon (see TODO section in filter_sse4.c). Change-Id: If18108048bfd5345fc942e8574e4c7f58e0e86e0	2012-08-15 17:55:06 -07:00
Christian Duvivier	707b65bd16	Partial import of "New RTCD implementation" from master branch. Latest version of all scripts/makefile but rtcd_defs.sh is empty, all existing functions are still selected using the old/current way. Change-Id: Ib92946a48a31d6c8d1d7359eca524bc1d3e66174	2012-08-08 16:43:48 -07:00
Johann	aa165c8c5d	Update armv6 vp8_intra4x4_predict Change-Id: I52a3b0a4a42e5af91b987e19523df07c8f467847	2012-08-08 10:57:33 -07:00
Johann	a497cb59cd	Rename vp8_intra4x4_predict_d predict_d has become canonical. Remove previous helper function. Disable ARM assembly pending update. Change-Id: Idd84ac8a28f9b0221ea97904a77de1e705d06a7d	2012-08-01 11:17:57 -07:00
Dragan Mrdjan	07ff7fa811	VP8 optimizations for MIPS dspr2 Signed-off-by: Raghu Gandham <raghu@mips.com> Change-Id: I3a8bca425cd3dab746a6328c8fc8843c8e87aea6	2012-07-10 10:01:54 -07:00
Yaowu Xu	e9818bb697	changed the way that default probs for 8x8 is set. The commit changed how baseline 8x8 coefficient probabilities are initialized, to be consistent with the initialization of baseline 4x4 coefficient probabilities. The commit does not have any effect on compression. Change-Id: Ifb3902b5dc0b0c2e6dc3aa5d4a6589d528e58355	2012-05-23 10:09:41 -07:00
John Koleszar	2d225689d3	Move all tests to test/ directory Consolidate the unit tests under vp8/ to the test/ directory Change-Id: I6d6a0fb60f5e3874a4d6710e9e121dd3e81a93db	2012-05-22 15:00:10 -07:00
John Koleszar	e82d261d10	Build unit tests monolithically Rework unit tests to have a single executable rather than many, which should avoid pollution of the visual studio project namespace, improve build times, and make it easier to use the gtest test sharding system when we get these going on the continuous build cluster. Change-Id: If4c3e5d4b3515522869de6c89455c2a64697cca6	2012-05-22 14:37:30 -07:00
Scott LaVarnway	317d4244cb	Makes all mode token tables const part 2 (see Change I9b2ccc88: Makes all mode token tables const) Further remove runtime table initialization and use precalculated const data. Data footprint reduced by 4112 bytes. Change-Id: Ia3ae9fc19f77316b045cabff01f6e5f0876a86ab	2012-04-19 17:35:20 -04:00
Yaowu Xu	3f5feb7d13	fixed .mk files to reflect add/remove of a header file In a previous commit, the duplicate of headerfile defaultcoefcounts.h was identified. This commit updates the .mk file to ensure configure and make works properly for all platforms. Change-Id: I31a39c809a734ba438ee53db700f252e9a03eddd	2012-03-12 14:51:54 -07:00
Johann	fd903902ef	RFC: Reorganize MFQE loops Break MFQE code into its own file. It is currently only valid for 16x16 and 8x8 Y blocks. It also filters 4x4 U/V blocks. Refactor filtering and add associated assembly. Limited test cases show --mfqe introduces a penalty of ~20% with HD content. The assembly reduces the penalty to ~15% Change-Id: I4b8de6b5cdff5413037de5b6c42f437033ee55bf	2012-03-06 15:20:03 -08:00
Johann	e50f96a4a3	Move SAD and variance functions to common The MFQE function of the postprocessor depends on these Change-Id: I256a37c6de079fe92ce744b1f11e16526d06b50a	2012-03-05 16:50:33 -08:00
James Berry	0c1cec2205	Add unit tests for idctllm_test and idctllm_mmx add unit tests for vp8_short_idct4x4llm_c Change-Id: I472b7c0baa365ba25dc99a3f6efccc816d27c941	2012-02-21 14:52:36 -05:00
Makoto Kato	7989bb7fe7	Support Android x86 NDK build On Android NDK, rand() is inlined function. But, on our SSE optimization, we need symbol for rand() Change-Id: I42ab00e3255208ba95d7f9b9a8a3605ff58da8e1	2012-02-16 12:03:30 -08:00
Paul Wilkins	2615ca5d41	Removal of threading code. For the experimental branch we are trying to slim the codebase down removing features such as threading for now which complicate the process of development and testing. Change-Id: I657c0246aef4d1fa8c8ffc6a1adfeee45bce8e24	2012-02-10 16:23:59 +00:00
Paul Wilkins	b2f64dff7d	Added common prediction modules. This function adds the common prediction modules, some data structures and a config option but does not use them. It also corrects a bug in clearing down the MODE_INFO border and introduces a new element that indicates if an entry corresponds to an "in image" macro block or is part of the border. Change-Id: Ib69eec0876173ebe9d1de9df9537d0b2447702e0	2012-01-31 12:53:36 +00:00
John Koleszar	f103dcefaf	RTCD: add subpixel functions This commit continues the process of converting to the new RTCD system. Change-Id: I6c519ab61e4f4e0ebcc796f2df061f945c48cefe	2012-01-30 12:08:29 -08:00
John Koleszar	2a8f57f50d	RTCD: add postproc functions This commit continues the process of converting to the new RTCD system. Change-Id: If54eb5cb5d1b0cac6c4c0633a9e99c93ca860ba2	2012-01-30 12:08:29 -08:00
John Koleszar	fdb61a4531	RTCD: add recon functions This commit continues the process of converting to the new RTCD system. Change-Id: I9bfcf9bef65c3d4ba0fb9a3e1532bad1463a10d6	2012-01-30 12:08:28 -08:00
John Koleszar	ab77b4e898	RTCD: add remaining IDCT functions This commit continues the process of converting to the new RTCD system. Change-Id: I03c4dbf30dfd3558b0e256ff9d3ff4c012aadc80	2012-01-30 12:08:22 -08:00
John Koleszar	55f74c59c7	RTCD: add loopfilter functions This commit continues the process of converting to the new RTCD system. Change-Id: Ic8a4047d72ff3a54ec98977dd90e70c13213db71	2012-01-30 12:06:31 -08:00
John Koleszar	a910049aea	New RTCD implementation This is a proof of concept RTCD implementation to replace the current system of nested includes, prototypes, INVOKE macros, etc. Currently only the decoder specific functions are implemented in the new system. Additional functions will be added in subsequent commits. Overview: RTCD "functions" are implemented as either a global function pointer or a macro (when only one eligible specialization available). Functions which have RTCD specializations are listed using a simple DSL identifying the function's base name, its prototype, and the architecture extensions that specializations are available for. Advantages over the old system: - No INVOKE macros. A call to an RTCD function looks like an ordinary function call. - No need to pass vtables around. - If there is only one eligible function to call, the function is called directly, rather than indirecting through a function pointer. - Supports the notion of "required" extensions, so in combination with the above, on x86_64 if the best function available is sse2 or lower it will be called directly, since all x86_64 platforms implement sse2. - Elides all references to functions which will never be called, which could reduce binary size. For example if sse2 is required and there are both mmx and sse2 implementations of a certain function, the code will have no link time references to the mmx code. - Significantly easier to add a new function, just one file to edit. Disadvantages: - Requires global writable data (though this is not a new requirement) - 1 new generated source file. Change-Id: Iae6edab65315f79c168485c96872641c5aa09d55	2012-01-30 12:06:27 -08:00
Attila Nagy	294aa37745	Rename save_neon_reg.asm as save_reg_neon.asm Easier to filter out all NEON asm. Change-Id: I0022dae8321a9608e864b09d4181414c5fff4610	2012-01-26 09:44:00 +02:00
Jim Bankoski	91325b8fe7	vpn common -> implicit segmentation This introduces base functions for introducing implicit segmentation. The code that actually stores the results to the segment map isn't here yet. This just prints out the segmentation map results if you call it. Uses connected component labeling technique on mbmi info so that only if 2 mbs are horizontally or vertically touching do they get the same segment. vp8next - plumbing for rotation code to produce taps for rotation (tapify.py), code for predicting using rotation (predict_rotated.c), code for finding the best rotation find_rotation.c. didn't checkin code that uses this in the codec. still work in progress. Fixed copyright notice Change-Id: I450c13cfa41ab2fcb699f3897760370b4935fdf8	2012-01-24 11:20:13 -08:00
Fritz Koenig	892102842a	Disconnect ARM tgt_isa from dsp extensions A processor with ARMv7 instructions does not necessarily have NEON dsp extensions. This CL has the added side effect of allowing the ability to enable/disable the dsp extensions cleanly. Change-Id: Ie1e879b8fe131885bc3d4138a0acc9ffe73a36df	2012-01-20 10:38:15 -08:00
Scott LaVarnway	33d9ea5471	Merge "Remove useless g_common.h"	2012-01-03 09:48:35 -08:00
John Koleszar	f56918ba9c	Remove legacy integer types Remove BOOL, INTn, UINTn, etc, in favor of C99-style fixed width types. Change-Id: I396636212fb5edd6b347d43cc940186d8cd1e7b5	2011-12-22 09:58:40 -08:00
John Koleszar	0c2f8e77cc	Remove useless g_common.h This file declared a bunch of nonexistent, unreferenced global function pointers. Change-Id: Ic26bb8c7712deba754c49fc01f383b53afc9e728	2011-12-21 15:02:23 -08:00
John Koleszar	056bcc8771	remove armv6 files from armv5 build Make bilinearfilter_arm.c compiled only when HAVE_ARMV6, as its definitions are v6 only. This is normally not a problem for static builds as the file is elided at link time, but this was not being done properly for the --enable-shared --enable-pic build. Change-Id: Ic800a7cde751f74f22555c5b247f99f9df5e550d	2011-12-19 13:51:11 -08:00
Scott LaVarnway	a53d5a4c44	Moved dequant idct into common These functions are now used by the encoder. This is WIP with the goal of creating a common idct/add for the encoder and decoder. A boost of 1.8% was seen for the HD rt test clip used. [Tero] Added needed changes to ARM side. Change-Id: Ibbb8000be09034203d7adffc457d3c3f8b06a5bf	2011-12-15 14:23:41 -05:00
Johann	f2cd4ded22	Move shared data to shared location Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2 file is bad Moving towards allowing --disable-mmx Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742	2011-11-18 16:23:14 -08:00
Tero Rintaluoma	5a2fd63a2a	ARMv6 optimized Intra4x4 prediction Added ARM optimized intra 4x4 prediction - 2x faster on Profiler compared to C-code compiled with -O3 - Function interface changed a little to improve BLOCKD structure access Change-Id: I9bc2b723155943fe0cf03dd9ca5f1760f7a81f54	2011-11-09 09:13:51 +02:00
Paul Wilkins	01ce04bc06	Further segment feature extensions. This quite large check in includes the following: Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group. This is used as a basis for a simple segmentation for the normal frames in a gf/arf group. This code also uses satd functions from Yaowu. Adds functionality for coding the latest possible position of an EOB for blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct). Where the EOB position is 0 this acts like "skip" and the normal coding of skip at the per mb level is disabled. Added functions (seg_common.c) for setting and reading segment feature elements. These may want to be optimized away at some point but while the mechanism is in a state of flux they provide a single location for making changes and keep things a bit cleaner. This is still proof of concept code. Currently the tested feature set:- Quantizer, Loop Filter level, Reference frame, Prediction Mode, EOB end stop. TBD:- Add functions for setting and reading the feature data with range and validity checking. Handling of signed and unsigned feature data. At the moment all is assumed to be signed and a sign bit is coded but many cannot be negative. Correct handling of EOB feature with intra coded blocks. Testing/trapping of legal/illegal ref frame and mode combinations. Transform size switch plus merge and test with 8x8 DCT work Merge and test with Suman's Segmentation coding optimizations Change-Id: Iee12e83661c7abbd1e0ce6810915eb4ec35e2d8e	2011-10-24 15:52:18 +01:00
Scott LaVarnway	ed9c66f584	Remove usage of predict buffer for decode Instead of using the predict buffer, the decoder now writes the predictor into the recon buffer. For blocks with eob=0, unnecessary idcts can be eliminated. This gave a performance boost of ~1.8% for the HD clips used. Tero: Added needed changes to ARM side and scheduled some assembly code to prevent interlocks. Patch Set 6: Merged (I1bcdca7a95aacc3a181b9faa6b10e3a71ee24df3) into this commit because of similarities in the idct functions. Patch Set 7: EC bug fix. Change-Id: Ie31d90b5d3522e1108163f2ac491e455e3f955e6	2011-10-18 12:06:50 -04:00
Johann	3556deaca3	combine loopfilter data access The data processed by the loopfilter overlaps. At the block level, this results in some redundant transforms. Grouping the filtering allows for a single 16x16 transpose (and inversion) instead of three 16x8 transposes (and three more inversions). This implementation is x86_64 only. We retain the previous implementation for x86. Improvements are obviously material dependent, but it seems to be ~1% in tests here. Change-Id: I467b7ec3655be98fb5f1a94b5d145e5e5a660007	2011-09-30 07:38:35 -07:00
John Koleszar	4a6ac727fe	Install missing default_coef_probs.h Make sure that this header is listed as one of the sources, so that it will be installed if necessary. Change-Id: I2427e494488126b179151dc21043c1e2c8ba5991	2011-09-22 11:08:24 -04:00
John Koleszar	180b0306cc	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/common/defaultcoefcounts.h vp8/common/entropy.c vp8/encoder/bitstream.c Change-Id: Idd4990c80d5b5494ac036254694015fab449bc08	2011-08-25 08:36:19 -04:00
Scott LaVarnway	19987dcbfa	Faster vp8_default_coef_probs Copies from a generated table instead of building the default coeff probabilities during runtime. Change-Id: I4d9551ea3a2d7d4a4f7ce9eda006495221a8de50	2011-08-16 16:21:21 -04:00
John Koleszar	a16cd74ba1	Merge remote branch 'internal/upstream-experimental' into HEAD Conflicts: vp8/decoder/detokenize.c vp8/decoder/onyxd_if.c vp8/vp8_common.mk Change-Id: Ifca1108186a8bc715da86a44021ee2fa5550b5b8	2011-08-11 13:01:45 -04:00
James Berry	27ee521753	include asm_com/dec_offsets for make dist Change-Id: Ia1ad66066a24c01915cd9e3ff75c7e070cc984c8	2011-08-02 13:42:03 -04:00
Johann	3e8c6d3d35	include the arm header files in make dist Change-Id: Ibcf5b4b14153f65ce1b53c3bfba87ad2feb17bbd	2011-08-01 17:20:21 -04:00
John Koleszar	9dfd006017	Merge remote branch 'internal/upstream-experimental' into HEAD Conflicts: vp8/encoder/bitstream.c Change-Id: I44c00f98dcb99eb728ce4f5256aefb135a711a74	2011-06-30 08:46:49 -04:00
Stefan Holmer	4cb0ebe5b2	Adding support for independent partitions Adding support in the encoder for generating independent residual partitions by forcing equal probabilities over the prev coef entropy contexts. Change-Id: I402f5c353255f3ca20eae2620af739f6a498cd21	2011-06-28 11:10:17 -04:00
John Koleszar	f86e14d8dc	Merge remote branch 'internal/upstream' into HEAD	2011-06-28 00:05:04 -04:00
Attila Nagy	6f23f24afe	configuration, support disabling any subset of ARM arch Useful for leaving out any version specific asm files. Change-Id: I233514410eb9d7ca88d2d2c839673122c507fa99	2011-06-21 10:39:01 +03:00
John Koleszar	e1b90ce862	Merge remote branch 'internal/upstream' into HEAD	2011-04-28 00:05:07 -04:00
Ronald S. Bultje	1083fe4999	SSE2/SSSE3 optimizations for build_predictors_mbuv{,_s}(). decoding before 10.425 10.432 10.423 =10.426 after: 10.405 10.416 10.398 =10.406, 0.2% faster encoding before 14.252 14.331 14.250 14.223 14.241 14.220 14.221 =14.248 after 14.095 14.090 14.085 14.095 14.064 14.081 14.089 =14.086, 1.1% faster Change-Id: I483d3d8f0deda8ad434cea76e16028380722aee2	2011-04-27 11:31:27 -07:00
John Koleszar	51bcf621c1	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/decoder/decodemv.c vp8/decoder/onyxd_if.c vp8/encoder/ratectrl.c vp8/encoder/rdopt.c Change-Id: Ia1c1c5e589f4200822d12378c7749ba62bd17ae2	2011-03-23 00:27:52 -04:00
John Koleszar	429dc676b1	Increase static linkage, remove unused functions A large number of functions were defined with external linkage, even though they were only used from within one file. This patch changes their linkage to static and removes the vp8_ prefix from their names, which should make it more obvious to the reader that the function is contained within the current translation unit. Functions that were not referenced were removed. These symbols were identified by: $ nm -A libvpx.a | sort -k3 | uniq -c -f2 | grep ' [A-Z] ' | sort | grep '^ *1 ' Change-Id: I59609f58ab65312012c047036ae1e0634f795779	2011-03-17 20:53:47 -04:00
John Koleszar	820b2b927f	Merge remote branch 'internal/upstream' into HEAD	2011-03-10 00:05:04 -05:00
John Koleszar	5c24071504	Add missing filter.h to build system Missing file causes 'make dist' to not include a complete copy of the source. Change-Id: I3f55aeb5a86d0e81234e4e4588cb8086ba4cfc4a	2011-03-09 13:43:31 -05:00
John Koleszar	b21fe3b278	Merge remote branch 'internal/upstream' into HEAD	2011-02-19 00:05:44 -05:00
John Koleszar	c764c2a20f	Merge "clean up unused files"	2011-02-18 06:33:05 -08:00
John Koleszar	3ed8fe8778	remove unused vp8_predict_dc function Change-Id: I64fa47889c54cfed094a674c49ef0996d49bdd42	2011-02-18 09:12:20 -05:00
John Koleszar	cbf923b12c	clean up unused files Removed a number of files that were unused or little-used. Change-Id: If9ae5e5b11390077581a9a879e8a0defe709f5da	2011-02-18 09:09:49 -05:00
John Koleszar	f13212b728	Merge remote branch 'internal/upstream' into HEAD	2011-02-18 00:05:13 -05:00
John Koleszar	64aebb6c7a	Merge remote branch 'internal/upstream' into HEAD	2011-02-11 00:05:19 -05:00
John Koleszar	02321de0f2	Fix relative include paths Allow compiling without adding vp8/{common,encoder,decoder} to the include paths. Change-Id: Ifeb5dac351cdfadcd659736f5158b315a0030b6c	2011-02-10 15:09:44 -05:00
John Koleszar	ec3b8f1f32	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/decoder/onyxd_int.h Change-Id: Id9aa577f03e37b4f406ba3b593c3c4330812a49e	2011-02-10 14:26:40 -05:00
Tero Rintaluoma	cb14764fab	Adds armv6 optimized variance calculation Adds vp8_sub_pixel_variance16x16_armv6 function to encoder. Integrates ARMv6 optimized bilinear interpolations from vp8/common/arm/armv6 and adds new assembly file for variance16x16 calculation. - vp8_filter_block2d_bil_first_pass_armv6 (integrated) - vp8_filter_block2d_bil_second_pass_armv6 (integrated) - vp8_variance16x16_armv6 (new) - bilinearfilter_arm.h (new) Change-Id: I18a8331ce7d031ceedd6cd415ecacb0c8f3392db	2011-02-09 10:23:43 -05:00
John Koleszar	b2ad177942	Merge remote branch 'internal/upstream' into HEAD Conflicts: vp8/vp8_common.mk Change-Id: I2094ddf20834c0b7dfe912feac6a79500bb8cce2	2011-02-09 08:34:48 -05:00
Johann	e5aaac24bb	clean up bilinear filter make reference version of bilinear_filters short. use reference versions of bilinear_filters and sub_pel_filters when possible. recognize that Width was being passed into filter_block2d_bil_first_pass multiple times. ARM version had already fixed this. propagate to C. change references to src_pixels_per_line to src_pitch and standardize on src/dst (instead of input/output). recognize that first_pass is only run in the vertical and second_pass only horizontal. ARM version had already fixed this. propagate to C Change-Id: I292d376d239a9a7ca37ec2bf03cc0720606983e2	2011-02-08 17:42:54 -05:00
Johann	40dcae9c2e	clarify _offsets.asm differences it's difficult to mux the _offsets.c files because of header conflicts. make three instead, name them consistently and partition the contents to allow building them as required. Change-Id: I8f9768c09279f934f44b6c5b0ec363f7943bb796	2011-02-08 16:35:43 -05:00
Johann	3273c7b679	move one of the offset files common/arm/vpx_asm_offsets moves up a level. prepare for muxing with encoder/arm/vpx_vp8_enc_asm_offsets Change-Id: I89a04a5235447e66571995c9d9b4b6edcb038e24	2011-02-07 11:35:30 -05:00
John Koleszar	7211ac407b	Merge remote branch 'internal/upstream' into HEAD	2010-12-14 00:05:07 -05:00
John Koleszar	b1aa54ab26	remove unused temporal preproc code This code is unused, as the current preproc implementation uses the same spatial filter that postproc uses. Change-Id: Ia06d5664917d67283f279e2480016bebed602ea7	2010-12-13 16:47:59 -05:00
Jim Bankoski	b4a3602f66	changes to start experimenting with color segmentation prediction modes.	2010-11-16 14:38:40 -05:00
John Koleszar	d6c67f02c9	make vp8_recon16x16mb{,y} RTCD functions ARM NEON has a platform specific version of vp8_recon16x16mb, though it's just a stub to extract the various parameters from the MACROBLOCKD struct and pass them to vp8_recon16x16mb_neon(). Using that function's prototype directly will be a better long term solution, but it's quite an invasive change. Change-Id: I04273149e2ade34749e2d09e7edb0c396e1dd620	2010-10-26 13:23:36 -04:00
John Koleszar	19638c2309	arm: move unrolled loops back to generic code Some of the ARM functions differed from their generic counterparts only by unrolling their loops. Since this change may be useful on other platforms, or might even supersede the looped version in the generic case, move it back to the generic file. This code is left under #if ARCH_ARM for now, but it may be worth considering a different (possibly new) conditional for these. If it turns out that this should be runtime selectable, these functions will have to move to the RTCD infrastructure. Don't want to take that step at this time without more profile data. Change-Id: I4612fdbc606fbebba4971a690fb743ad184ff15f	2010-10-26 09:51:35 -04:00


166 Commits