Johann
6a82f0d7fb
Move sub pixel variance to vpx_dsp
...
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-07 15:51:04 -07:00
James Zern
dcf5b7cfdd
loopfiltersimpleverticaledge_neon: quiet uninit var warnings
...
take 2. localize the function parameter to actually remove the warning
Change-Id: I23c02061b5e21b0b75bd33c26062d1e531df7b92
2015-06-30 23:23:59 -07:00
James Zern
69c153c4e6
loopfiltersimpleverticaledge_neon: quiet uninit var warnings
...
the vector used in vld*_lane_* should be initialized before use
Change-Id: Idce95354737915f6fb4e6b5e8980a050e953036d
2015-06-25 20:39:21 -07:00
James Zern
f4d746a3c1
idct_dequant_0_2x_neon: quiet uninit var warnings
...
the vector used in vld*_lane_* should be initialized before use
Change-Id: I6b791088479fec3bc021ca75cc2af5adcc39d954
2015-06-25 20:29:35 -07:00
James Zern
4bd87a9b9e
vp8_subpixelvariance_neon: right size coeff table
...
only uint8 is required; each use only loads one value as a uint8
quiets a few type conversion warnings
Change-Id: I03dc0dc0eb01ac23a6e8673daa2b77c6c57bf1b0
2015-06-23 23:48:12 -07:00
Johann
c3bdffb0a5
Move variance functions to vpx_dsp
...
subpel functions will be moved in another patch.
Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-26 12:01:52 -07:00
James Zern
fd3658b0e4
replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED
...
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-05-07 11:55:08 -07:00
Johann
d5d9289800
Move shared SAD code to vpx_dsp
...
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-05-06 16:58:20 -07:00
Johann
eabb793f3b
Use correct buffer size in vp8 subpixel variance
...
In vp8_sub_pixel_variance8x8_neon the temp2 buffer is only initialized
to kHeight8 * kWidth8. However, in the case that xoffset != 0 and
yoffset == 0, var_filter_block2d_bil_w8 is called with output_width
kHeight8PlusOne.
Thanks to cmugurel for diagnosing and yulius for the patch.
Change-Id: Ib71ffd96ffad963c92b8b7ca23f303942785b8e0
https://code.google.com/p/webrtc/issues/detail?id=4190
2015-02-03 09:11:05 -08:00
Johann
f6be2f3c87
Clarify GCC version check
...
The version check was incorrectly matching some versions of clang
which reported as gcc 4.2
Change-Id: I686d3576e71883fe1463206b56ab5e2aa9bb68a8
2014-09-25 11:53:45 -07:00
Jia Jia
0ae866bd19
vp8/vp9: neon: msvc: move the 'ifdef _MSC_VER' bit to vpx_ports/mem.h.
...
fix compiling warning.
Change-Id: If8706a9046436f704c597e4275a6810c76ba7daa
2014-09-14 01:43:54 +08:00
James Zern
35fadf1d25
bilinearpredict_neon: fix type conversion warnings
...
make bifilter4_coeff[][] uint8_t, no values exceed this range and
they're loaded with vdup_n_u8().
Change-Id: I921983e9edd828d29820e40ac30a7801dbe0fb4f
2014-09-04 20:50:42 -07:00
James Zern
f61e00c79d
Merge "arm: Fix building vp8_subpixelvariance_neon.c with MSVC"
2014-09-04 11:00:53 -07:00
Scott LaVarnway
ec94967ffe
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10""
...
This reverts commit 677fb5123e
Compiles with 4.6.
Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04 08:51:20 -07:00
Martin Storsjo
0002da32e6
arm: Fix building vp8_subpixelvariance_neon.c with MSVC
...
Use the right return values - vget_low_s64 returns int64x1_t, not
a normal int64_t.
Also make __builtin_prefetch a no-op on MSVC for this file.
Change-Id: I4d2fce01d0ba106b98d3d53b137803119c2c2c08
2014-09-04 09:49:30 +03:00
Scott LaVarnway
dcbfacbb98
Neon version of vp8_build_intra_predictors_mby_s() and
...
vp8_build_intra_predictors_mbuv_s().
This patch replaces the assembly version with an intrinsic
version.
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~2.6%.
Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03 13:41:27 -07:00
Scott LaVarnway
9293d267d2
VP8 for ARMv8 by using NEON intrinsics 17
...
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03 13:33:44 -07:00
Johann
5b788c0cbe
Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee5
. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.""
2014-09-03 13:27:11 -07:00
Scott LaVarnway
652ef29d09
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08""
...
This reverts commit 928ff03889
Compiles with 4.6 now.
Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-29 13:29:36 -07:00
Johann
911e96a4eb
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee5
. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."
...
This reverts commit 920f803f2e
Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-08-20 09:41:50 -07:00
James Zern
74ed33cf6e
vp8_bilinear_predict4x4_neon: init src vectors
...
quiets uninitialized warnings on the first load.
Change-Id: I58a5af337087d96b4eaea8991a0f85c4ba58aebe
2014-07-11 00:05:25 -07:00
James Zern
5e30127c7a
vp8_sixtap_predict4x4_neon: init src vectors
...
quiets uninitialized warnings on the first load.
Change-Id: Ied9b03928537a9ed2cd414b9e8a0be00191b0f32
2014-07-10 23:48:47 -07:00
Johann
2f6f955a17
Remove intermediate step in vp8_dequantize_b
...
With the intrinsics it is no longer necessary to have a stub/helper
function.
Change-Id: I3695961c3c94f1bb750d3b7b29716e509ebba482
2014-05-14 12:24:18 -07:00
Johann
920f803f2e
Revert "VP8 for ARMv8 by using NEON intrinsics 06"
...
This reverts commit 81ad047ee5
.
Revert "VP8 for ARMv8 by using NEON intrinsics 15"
This reverts commit 727af7cebe
.
This exposes a bug in gcc 4.9 regarding register allocation. Will reland
when 4.9 is fixed.
Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-13 13:21:17 -07:00
Johann
4bffb75ba3
Merge "Revert "VP8 for ARMv8 by using NEON intrinsics 10""
2014-05-07 06:47:48 -07:00
Johann
3a695015ad
Merge "arm: Use a correct neon vector type for 64 bit integers"
2014-05-07 06:34:25 -07:00
Martin Storsjo
d5d82a5e1a
arm: Add a no-op define of __builtin_prefetch for MSVC
...
Both GCC and RVCT/ARMCC support __builtin_prefetch, but MSVC
doesn't.
Change-Id: I44e1eecead61bc88d8fdfd3fef03d76d4f5afe08
2014-05-07 10:43:24 +03:00
Martin Storsjo
82a83c4fe0
arm: Use a correct neon vector type for 64 bit integers
...
This fixes building with MSVC.
Change-Id: I763ba8855c8083d82c8b477d3a297e310e93a335
2014-05-07 10:22:40 +03:00
Johann
677fb5123e
Revert "VP8 for ARMv8 by using NEON intrinsics 10"
...
This reverts commit c500fc22c1
There is an issue with gcc 4.6 in the Android NDK:
loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06 14:28:00 -07:00
Johann
928ff03889
Revert "VP8 for ARMv8 by using NEON intrinsics 08"
...
This reverts commit a5d79f43b9
There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-06 13:20:24 -07:00
Johann
34843e9784
Merge "VP8 for ARMv8 by using NEON intrinsics 16"
2014-05-05 13:09:43 -07:00
Johann
ad42b09f2f
Merge "VP8 for ARMv8 by using NEON intrinsics 15"
2014-05-05 13:09:12 -07:00
Johann
1b7291d52c
Merge "VP8 for ARMv8 by using NEON intrinsics 14"
2014-05-05 07:08:08 -07:00
Johann
a7355f3bbb
Merge changes Iaf7d6b0a,Iece0bf56
...
* changes:
Use INLINE and include vpx_config.h instead of plain 'inline'
Use vreinterpret instead of casting neon vector types
2014-05-05 05:36:54 -07:00
Martin Storsjo
7afed9a1b6
Use INLINE and include vpx_config.h instead of plain 'inline'
...
This fixes compilation with MSVC.
Change-Id: Iaf7d6b0a0134968a6addf315fde6d852f298db8c
2014-05-04 22:42:13 +03:00
Martin Storsjo
dfb8fc917a
Use vreinterpret instead of casting neon vector types
...
MSVC doesn't support casting neon vector types but requires using
vreinterpret.
Change-Id: Iece0bf5632567efd7f37f527abea38afeab4926d
2014-05-04 22:40:57 +03:00
James Yu
4ea9cf3e2d
VP8 for ARMv8 by using NEON intrinsics 16
...
Add variance_neon.c
- vp8_variance16x16_neon
- vp8_variance16x8_neon
- vp8_variance8x16_neon
- vp8_variance8x8_neon
Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:40 -07:00
James Yu
727af7cebe
VP8 for ARMv8 by using NEON intrinsics 15
...
Add idct_dequant_0_2x_neon.c
- idct_dequant_0_2x_neon
Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:33 -07:00
James Yu
08e38f06db
VP8 for ARMv8 by using NEON intrinsics 14
...
Add sixtappredict_neon.c
- vp8_sixtap_predict16x16_neon
- vp8_sixtap_predict8x8_neon
- vp8_sixtap_predict8x4_neon
- vp8_sixtap_predict4x4_neon
Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:12 -07:00
James Yu
18e9caad47
VP8 for ARMv8 by using NEON intrinsics 13
...
Add shortidct4x4llm_neon.c
- vp8_short_idct4x4llm_neon
Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:05 -07:00
James Yu
feaf766bd0
VP8 for ARMv8 by using NEON intrinsics 12
...
Add sad_neon.c
- vp8_sad16x16_neon
- vp8_sad16x8_neon
- vp8_sad8x8_neon
- vp8_sad8x16_neon
- vp8_sad4x4_neon
Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:39 -07:00
James Yu
4a8336fa9d
VP8 for ARMv8 by using NEON intrinsics 11
...
Add mbloopfilter_neon.c
- vp8_mbloop_filter_horizontal_edge_y_neon
- vp8_mbloop_filter_horizontal_edge_uv_neon
- vp8_mbloop_filter_vertical_edge_y_neon
- vp8_mbloop_filter_vertical_edge_uv_neon
Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:33 -07:00
James Yu
c500fc22c1
VP8 for ARMv8 by using NEON intrinsics 10
...
Add loopfiltersimpleverticaledge_neon.c
- vp8_loop_filter_bvs_neon
- vp8_loop_filter_mbvs_neon
Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:11:00 -07:00
James Yu
55c95f2d2c
VP8 for ARMv8 by using NEON intrinsics 09
...
Add loopfiltersimplehorizontaledge_neon.c
- vp8_loop_filter_bhs_neon
- vp8_loop_filter_mbhs_neon
Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:10:45 -07:00
James Yu
a5d79f43b9
VP8 for ARMv8 by using NEON intrinsics 08
...
Add loopfilter_neon.c
- vp8_loop_filter_horizontal_edge_y_neon
- vp8_loop_filter_horizontal_edge_uv_neon
- vp8_loop_filter_vertical_edge_y_neon
- vp8_loop_filter_vertical_edge_uv_neon
Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:32:11 -07:00
James Yu
930557be10
VP8 for ARMv8 by using NEON intrinsics 07
...
Add iwalsh_neon.c
- vp8_short_inv_walsh4x4_neon
Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:24:54 -07:00
James Yu
81ad047ee5
VP8 for ARMv8 by using NEON intrinsics 06
...
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon
==== Summary of apply VP8 decode patch series ====
Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure
--target=armv7-linux-gcc --prefix=$HOME/out
--enable-shared --cpu=cortex-a7
Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm
NEON assembly 46.68 (fps)
Apply patch 06 46.65, -0.03
Apply patch 07 46.86, +0.21
Apply patch 08 46.58, -0.28
Apply patch 09 46.57, -0.01
Apply patch 10 46.51, -0.06
Apply patch 11 46.13, -0.38
Apply patch 12 45.42, -0.71
Apply patch 13 46.06, +0.64
Apply patch 14 45.19, -0.87
Apply patch 15 45.93, +0.74
Apply patch 16 45.48, -0.45
Apply patch 17 45.84, +0.36
Apply patch 18 45.91, +0.07 <= With all NEON intrinsics patches
Total -0.77 fps, 1.65% performance regression
Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 11:57:47 +08:00
Yunqing Wang
096eaba728
Remove VP8 save_reg_neon function
...
This patch did a cleanup following the commit "Save NEON registers
in VP8 NEON functions". The pushing/poping of callee-saved NEON
registers was moved into individual NEON functions. Therefore,
we don't need to save those registers at the beginning of codec.
The related code was removed.
Change-Id: I5648166514fc9beffb780aa138495597731f49ea
2014-04-29 16:13:24 -07:00
Yunqing Wang
33df6d1fc1
Save NEON registers in VP8 NEON functions
...
The recent compiler can generate optimized code that uses NEON registers
for various operations besides floating-point operations. Therefore,
only saving callee-saved registers d8 - d15 at the beginning of the
encoder/decoder is not enough anymore. This patch added register saving
code in VP8 NEON functions that use those registers.
Change-Id: Ie9e44f5188cf410990c8aaaac68faceee9dffd31
2014-04-28 14:51:53 -07:00
James Yu
fb5d281bb6
VP8 for ARMv8 by using NEON intrinsics 05
...
Add dequantizeb_neon.c
- vp8_dequantize_b_loop_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.23 (fps)
Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 10:16:00 +08:00