generic-library/vpx

Author	SHA1	Message	Date
Scott LaVarnway	ec94967ffe	Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10"" This reverts commit `677fb5123e` Compiles with 4.6. Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819	2014-09-04 08:51:20 -07:00
Jia Jia	a51704d9c7	vp8 common: change 'HAVE_NEON_ASM' to 'HAVE_NEON' for compiling idct_blk_neon.c. Change-Id: Ib89107fb824b5fe58afef6841104d5a27b2e0f2d	2014-09-04 08:40:26 -07:00
Scott LaVarnway	dcbfacbb98	Neon version of vp8_build_intra_predictors_mby_s() and vp8_build_intra_predictors_mbuv_s(). This patch replaces the assembly version with an intrinsic version. On a Nexus 7, vpxenc (in realtime mode, speed -12) reported a performance improvement of ~2.6%. Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f	2014-09-03 13:41:27 -07:00
Scott LaVarnway	9293d267d2	VP8 for ARMv8 by using NEON intrinsics 17 Add vp8_subpixelvariance_neon.c - vp8_sub_pixel_variance16x16_neon_func - vp8_variance_halfpixvar16x16_h_neon - vp8_variance_halfpixvar16x16_v_neon - vp8_variance_halfpixvar16x16_hv_neon - vp8_sub_pixel_variance8x8_neon Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b Signed-off-by: James Yu <james.yu@linaro.org>	2014-09-03 13:33:44 -07:00
Johann	5b788c0cbe	Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit `81ad047ee5`. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.""	2014-09-03 13:27:11 -07:00
Scott LaVarnway	652ef29d09	Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08"" This reverts commit `928ff03889` Compiles with 4.6 now. Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1	2014-08-29 13:29:36 -07:00
Johann	911e96a4eb	Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit `81ad047ee5`. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb." This reverts commit `920f803f2e` Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737	2014-08-20 09:41:50 -07:00
Johann	79afb5eb41	Use lrand48 on Android When building x86 assembly use lrand48 instead of the undocumented inlined _rand function. Android now supports rand() https://android-review.googlesource.com/97731 but only for new versions. Original workaround: https://gerrit.chromium.org/gerrit/15744 Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a	2014-06-12 19:57:25 -07:00
Dmitry Kovalev	1f0b2f95af	Removing vp8/common/pragmas.h. Change-Id: I80630a7350e884ebc4fef73fb5b52ec25f908523	2014-05-23 13:03:15 -07:00
Deb Mukherjee	e272273443	Renames x86_64 specific asm files Renames all x86_64 specific assembly files to consistently end in _x86_64.asm. This will be useful for build systems to handle these files differently. All new 64-bit specific assembly files should use the new naming convention. Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536	2014-05-21 13:55:56 -07:00
Johann	f625b2ac93	Correct HAVE_NEON_ASM define These optimizations are currently disabled. Change-Id: I19c58c9cb82d017638b86196641b9e001dfa798b	2014-05-16 08:20:13 -07:00
Johann	920f803f2e	Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit `81ad047ee5`. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit `727af7cebe`. This exposes a bug in gcc 4.9 regarding register allocation. Will reland when 4.9 is fixed. Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d	2014-05-13 13:21:17 -07:00
Johann	ce23931a3f	Only build neon assembly for armv7 targets Allow selectively building just the intrinsics for armv8 Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477	2014-05-12 08:52:02 -07:00
Johann	677fb5123e	Revert "VP8 for ARMv8 by using NEON intrinsics 10" This reverts commit `c500fc22c1` There is an issue with gcc 4.6 in the Android NDK: loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon': loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints: Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f	2014-05-06 14:28:00 -07:00
Johann	928ff03889	Revert "VP8 for ARMv8 by using NEON intrinsics 08" This reverts commit `a5d79f43b9` There is an issue with gcc 4.6 in the Android NDK: loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon': loopfilter_neon.c:394:1: error: insn does not satisfy its constraints: Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52	2014-05-06 13:20:24 -07:00
James Yu	4ea9cf3e2d	VP8 for ARMv8 by using NEON intrinsics 16 Add variance_neon.c - vp8_variance16x16_neon - vp8_variance16x8_neon - vp8_variance8x16_neon - vp8_variance8x8_neon Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 19:07:40 -07:00
James Yu	727af7cebe	VP8 for ARMv8 by using NEON intrinsics 15 Add idct_dequant_0_2x_neon.c - idct_dequant_0_2x_neon Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 19:07:33 -07:00
James Yu	08e38f06db	VP8 for ARMv8 by using NEON intrinsics 14 Add sixtappredict_neon.c - vp8_sixtap_predict16x16_neon - vp8_sixtap_predict8x8_neon - vp8_sixtap_predict8x4_neon - vp8_sixtap_predict4x4_neon Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 19:07:12 -07:00
James Yu	18e9caad47	VP8 for ARMv8 by using NEON intrinsics 13 Add shortidct4x4llm_neon.c - vp8_short_idct4x4llm_neon Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 19:07:05 -07:00
James Yu	feaf766bd0	VP8 for ARMv8 by using NEON intrinsics 12 Add sad_neon.c - vp8_sad16x16_neon - vp8_sad16x8_neon - vp8_sad8x8_neon - vp8_sad8x16_neon - vp8_sad4x4_neon Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:54:39 -07:00
James Yu	4a8336fa9d	VP8 for ARMv8 by using NEON intrinsics 11 Add mbloopfilter_neon.c - vp8_mbloop_filter_horizontal_edge_y_neon - vp8_mbloop_filter_horizontal_edge_uv_neon - vp8_mbloop_filter_vertical_edge_y_neon - vp8_mbloop_filter_vertical_edge_uv_neon Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:54:33 -07:00
James Yu	c500fc22c1	VP8 for ARMv8 by using NEON intrinsics 10 Add loopfiltersimpleverticaledge_neon.c - vp8_loop_filter_bvs_neon - vp8_loop_filter_mbvs_neon Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:11:00 -07:00
James Yu	55c95f2d2c	VP8 for ARMv8 by using NEON intrinsics 09 Add loopfiltersimplehorizontaledge_neon.c - vp8_loop_filter_bhs_neon - vp8_loop_filter_mbhs_neon Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-03 04:10:45 -07:00
James Yu	a5d79f43b9	VP8 for ARMv8 by using NEON intrinsics 08 Add loopfilter_neon.c - vp8_loop_filter_horizontal_edge_y_neon - vp8_loop_filter_horizontal_edge_uv_neon - vp8_loop_filter_vertical_edge_y_neon - vp8_loop_filter_vertical_edge_uv_neon Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 09:32:11 -07:00
James Yu	930557be10	VP8 for ARMv8 by using NEON intrinsics 07 Add iwalsh_neon.c - vp8_short_inv_walsh4x4_neon Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 09:24:54 -07:00
James Yu	81ad047ee5	VP8 for ARMv8 by using NEON intrinsics 06 Add idct_dequant_full_2x_neon.c - idct_dequant_full_2x_neon ==== Summary of apply VP8 decode patch series ==== Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core Toolchain: linaro-1.13.1-4.8-2014.01 Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure --target=armv7-linux-gcc --prefix=$HOME/out --enable-shared --cpu=cortex-a7 Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm NEON assembly 46.68 (fps) Apply patch 06 46.65, -0.03 Apply patch 07 46.86, +0.21 Apply patch 08 46.58, -0.28 Apply patch 09 46.57, -0.01 Apply patch 10 46.51, -0.06 Apply patch 11 46.13, -0.38 Apply patch 12 45.42, -0.71 Apply patch 13 46.06, +0.64 Apply patch 14 45.19, -0.87 Apply patch 15 45.93, +0.74 Apply patch 16 45.48, -0.45 Apply patch 17 45.84, +0.36 Apply patch 18 45.91, +0.07 <= With all NEON intrinsics patches Total -0.77 fps, 1.65% performance regression Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7 Signed-off-by: James Yu <james.yu@linaro.org>	2014-05-02 11:57:47 +08:00
Yunqing Wang	096eaba728	Remove VP8 save_reg_neon function This patch did a cleanup following the commit "Save NEON registers in VP8 NEON functions". The pushing/popping of callee-saved NEON registers was moved into individual NEON functions. Therefore, we don't need to save those registers at the beginning of codec. The related code was removed. Change-Id: I5648166514fc9beffb780aa138495597731f49ea	2014-04-29 16:13:24 -07:00
James Zern	805078a1bf	build: convert rtcd.sh to perl significantly speeds up file generation. the goal of this change is to convert rtcd.sh to perl as directly as possible to allow for simple comparison. future changes can make it more perl-like. --- Linux [CREATE] vpx_scale_rtcd.h real 0m0.485s -> 0m0.022s [CREATE] vp8_rtcd.h real 0m4.619s -> 0m0.060s [CREATE] vp9_rtcd.h real 0m10.102s -> 0m0.087s Windows [CREATE] vpx_scale_rtcd.h real 0m8.360s -> 0m0.080s [CREATE] vp8_rtcd.h real 1m8.083s -> 0m0.160s [CREATE] vp9_rtcd.h real 2m6.489s -> 0m0.233s Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee	2014-03-03 14:47:11 -08:00
James Yu	fb5d281bb6	VP8 for ARMv8 by using NEON intrinsics 05 Add dequantizeb_neon.c - vp8_dequantize_b_loop_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.23 (fps) Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 10:16:00 +08:00
James Yu	28b2f82f97	VP8 for ARMv8 by using NEON intrinsics 04 Add dequant_idct_neon.c - vp8_dequant_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.22 (fps) Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:59:23 +08:00
James Yu	d749ab6221	VP8 for ARMv8 by using NEON intrinsics 03 Add dc_only_idct_add_neon.c - vp8_dc_only_idct_add_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.24 (fps) Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-26 09:28:29 +08:00
James Yu	300a3bfc73	VP8 for ARMv8 by using NEON intrinsics 02 Add copymem_neon.c - vp8_copy_mem16x16_neon - vp8_copy_mem8x8_neon - vp8_copy_mem8x4_neon vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm Before => After, 13.25 => 13.25 (fps) Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6 Signed-off-by: James Yu <james.yu@linaro.org>	2014-02-23 22:56:53 +08:00
Johann	dadf350551	Apply neon flags to intrinsic files Filter out files ending in _neon.c and append .neon so the Android build system knows to apply -mfpu=neon Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305	2014-01-10 12:16:59 -08:00
James Yu	79395e16cf	VP8 for ARMv8 by using NEON intrinsics 01 Add bilinearpredict_neon_intrinsics.c - vp8_bilinear_predict4x4_neon - vp8_bilinear_predict8x4_neon - vp8_bilinear_predict8x8_neon - vp8_bilinear_predict16x16_neon Change-Id: I33dfa502881219841b442dda32b73220e51b716b Signed-off-by: James Yu <james.yu@linaro.org>	2014-01-09 09:56:22 -08:00
James Zern	f89335f7ca	remove unused VP8 com/dec asm offsets Change-Id: Ib3b26ee27f04b2dcbbd32b3127afb45e9f50cfcf	2013-07-09 14:33:49 -07:00
James Zern	08348d9cab	prefix vp8 asm_{com,dec,enc}_offsets files make them symmetrical with the generated output and their vp9 counterparts Change-Id: I72cc97c4d33d713dff620a6d7cc25955266216fc	2013-03-02 14:45:40 -08:00
John Koleszar	a9c7597adc	support building vp8 and vp9 into a single lib Change-Id: Ib8f8a66c9fd31e508cdc9caa662192f38433aa3d	2012-11-15 10:46:17 -08:00
John Koleszar	7b8dfcb5a2	Rough merge of master into experimental Creates a merge between the master and experimental branches. Fixes a number of conflicts in the build system to allow either VP8 or VP9 to be built. Specifically either: $ configure --disable-vp9 $ configure --disable-vp8 --disable-unit-tests VP9 still exports its symbols and files as VP8, so that will be resolved in the next commit. Unit tests are broken in VP9, but this isn't a new issue. They are fixed upstream on origin/experimental as of this writing, but rebasing this merge proved difficult, so will tackle that in a second merge commit. Change-Id: I2b7d852c18efd58d1ebc621b8041fe0260442c21	2012-11-07 11:30:16 -08:00
Ronald S. Bultje	4b2c2b9aa4	Rename vp8/ codec directory to vp9/. Change-Id: Ic084c475844b24092a433ab88138cf58af3abbe4	2012-11-01 16:31:22 -07:00
Ronald S. Bultje	6c280c2299	Adjust style to match Google Coding Style a little more closely. Most of these were picked up by jenkins in the commit that changed the vp8 namespace to vp9 in common/. Change-Id: I5cbd56ffc753b92ef805133cda6acc1713a13878	2012-11-01 10:03:48 -07:00
Ronald S. Bultje	8166657109	Make implicit_segmentation-related code an experiment. This way, the code is not compiled in by default, thus decreasing overall binary size. Change-Id: I85cac8f5a22a51a7d99c820ef6d6ed179d4106a0	2012-10-29 21:15:42 -07:00
Scott LaVarnway	ce811f87c4	Faster 8t filtering Quickly modified the ssse3 sixtap filters to support eight taps. For the test clip used, a 23+% boost in decoder performance was seen. We can revisit later and improve further. Change-Id: I5f59860459e80d6fa23e6cc0fd91296a969f5240	2012-10-25 17:24:50 -07:00
Scott LaVarnway	9ba2efd034	Added sse2 intrinsic version of vp8_sad16x3 3.7% boost in decoder performance for the clip used. Change-Id: I74f28486a9352b472b36e21b5eaf30eff35e9199	2012-10-25 12:16:08 -07:00
Scott LaVarnway	d36ecb42da	Added rtcd support vp8_sad16x3 and vp8_sad3x16 Change-Id: I5bca7b7a4b230082d36ac6fb84db84137ad177d7	2012-10-22 13:45:42 -07:00
Scott LaVarnway	085433c2d0	sse2 intrinsic version of vp8_mbloop_filter_vertical_edge() First sse2 version of vp8_mbloop_filter_vertical_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 34+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I455b438bc8d8af76cf7533ac42eda5f689b21f7c	2012-10-19 15:52:12 -07:00
Scott LaVarnway	992b5e2d95	sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge() First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now, intrinsics are being used until the bitstream is finalized. This function will be revisited later for further performance improvements. For the test clip used, a 31+% decoder performance improvement was seen. This will vary depending on material. Change-Id: I03ed3a7182478bdd1f094644ff3e0442625600e7	2012-10-18 14:29:26 -07:00
Jim Bankoski	ffff213463	removed obsolete build dependency this commit fixes the build on Windows with Visual Studio 2008. Change-Id: I0baa4044e9e54237da29f2e17332ea6f766dbbec	2012-10-17 09:22:05 -07:00
Paul Wilkins	2d60bee1fb	New Motion Reference Search Alternative strategy for finding a list of candidate motion vectors to use as reference values in mv coding and as nearest and near. Sort by sad in vp8_find_best_ref_mvs() rather than just pick the best. Allow 0,0 as a best ref option but not a nearest or near unless there are no alternatives. Encode/Decode verified on at least some clips. Some commented out experimental and stats code still in place. Gain over existing code averages about 1% on derf (all metrics) with improvement on all clips. Other test results pending. The entropy coding of the mode (nearest/near etc) still depends upon and requires the old "findnear" code so this needs looking at and may provide room for further gains. Change-Id: I871d7cba1d1c379c4bad9bcccce1fb19c46b8247	2012-08-24 18:08:21 +01:00
John Koleszar	b43ed7a5b1	Merge "remove rotation experiment" into experimental	2012-08-22 10:01:39 -07:00
Christian Duvivier	63ef9c40a4	SSE2 version of vectorized 8-tap filtering. About 20% overall encoder speedup (vs. about 30% for sse4 version). Change-Id: Ibf608a6a1bc94b14ec47e8046d3206b275b5a8bd	2012-08-21 15:26:14 -07:00

136 Commits