Johann
4e5e5fc52b
Rename vp8 loopfilter[_neon.c]
...
Avoid conflict with vpx_dsp version
Change-Id: I041b1532a9276400a5547de8dfed1de43ad4e83d
2015-08-18 11:47:00 -07:00
Johann
749c393c8d
Rename vp8 loopfilter_mmx.asm
...
Chromium puts all the yasm output in the same directory. Looking at ways
to improve this but in the meantime get rid of collisions.
Change-Id: I923c5231d14e895ab96521eb89807ede868a0753
2015-08-03 14:27:03 -07:00
Parag Salasakar
a5d9416fd7
mips msa vp8 post proc optimization
...
average improvement ~2x-4x
Change-Id: I93abc15389649c169bb8b69127c0b95407d34692
2015-07-29 09:40:26 +05:30
Parag Salasakar
5deb983744
mips msa vp8 filter by weight optimization
...
average improvement ~3x-5x
Change-Id: Ia808ae56b118e0e1b293901447aa5a0f597b405b
2015-07-28 08:16:34 +05:30
Parag Salasakar
af6733aec6
mips msa vp8 recon intra optimization
...
average improvement ~3x-5x
Change-Id: I73306863e9bf172d5adc06b8dd54e43985d1e063
2015-07-25 12:32:26 +05:30
Parag Salasakar
fb73ceae85
mips msa vp8 bilinear filter optimization
...
average improvement ~3x-4x
Change-Id: I8c0b3d5c86c9eb4f802b87c971864d2cfceeb7cc
2015-07-24 09:21:35 +05:30
Parag Salasakar
509fb0bc9d
mips msa vp8 copy mem optimization
...
average improvement ~2x-4x
Change-Id: I3af3ecced96c5b8e0cb811256e5089e28fe013a2
2015-07-23 10:29:40 +05:30
Parag Salasakar
55c0df5ef1
mips msa vp8 sixtap filter optimization
...
average improvement ~3x-5x
Change-Id: I5fd88cb088814be443d04be384b9fca99b22adef
2015-07-13 09:23:52 +05:30
Parag Salasakar
0ea2684c2c
mips msa vp8 loop filter optimization
...
average improvement ~2x-4x
Change-Id: I20c4f900ef95d99b18f9cf4db592cd352c2212eb
2015-07-08 12:41:00 +05:30
Johann
6a82f0d7fb
Move sub pixel variance to vpx_dsp
...
Change-Id: I66bf6720c396c89aa2d1fd26d5d52bf5d5e3dff1
2015-07-07 15:51:04 -07:00
Parag Salasakar
3d938d71b0
mips msa vp8 idct optimization
...
average improvement ~2x-5x
Change-Id: I19e82f78772993bcd67fcf975fe180232172f86d
2015-07-07 12:41:54 +05:30
Johann
907b33cdc4
Move vp8 variance files
...
There is a naming conflict in the chromium build system.
The rest of the variance functions will move to vpx_dsp soon.
Change-Id: Iff78da2aafb0d7380eda73e38d7dac72110a1e47
2015-06-18 16:42:28 -07:00
Johann
c3bdffb0a5
Move variance functions to vpx_dsp
...
subpel functions will be moved in another patch.
Change-Id: Idb2e049bad0b9b32ac42cc7731cd6903de2826ce
2015-05-26 12:01:52 -07:00
Johann
d5d9289800
Move shared SAD code to vpx_dsp
...
Create a new component, vpx_dsp, for code that can be shared
between codecs. Move the SAD code into the component.
This reduces the size of vpxenc/dec by 36k on x86_64 builds.
Change-Id: I73f837ddaecac6b350bf757af0cfe19c4ab9327a
2015-05-06 16:58:20 -07:00
Scott LaVarnway
ec94967ffe
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 10""
...
This reverts commit 677fb5123e
Compiles with 4.6.
Change-Id: I7f87048911b6bc28a61741d95501fa45ee97b819
2014-09-04 08:51:20 -07:00
Jia Jia
a51704d9c7
vp8 common: change 'HAVE_NEON_ASM' to 'HAVE_NEON' for compiling idct_blk_neon.c.
...
Change-Id: Ib89107fb824b5fe58afef6841104d5a27b2e0f2d
2014-09-04 08:40:26 -07:00
Scott LaVarnway
dcbfacbb98
Neon version of vp8_build_intra_predictors_mby_s() and
...
vp8_build_intra_predictors_mbuv_s().
This patch replaces the assembly version with an intrinsic
version.
On a Nexus 7, vpxenc (in realtime mode, speed -12)
reported a performance improvement of ~2.6%.
Change-Id: I9ef65bad929450c0215253fdae1c16c8b4a8f26f
2014-09-03 13:41:27 -07:00
Scott LaVarnway
9293d267d2
VP8 for ARMv8 by using NEON intrinsics 17
...
Add vp8_subpixelvariance_neon.c
- vp8_sub_pixel_variance16x16_neon_func
- vp8_variance_halfpixvar16x16_h_neon
- vp8_variance_halfpixvar16x16_v_neon
- vp8_variance_halfpixvar16x16_hv_neon
- vp8_sub_pixel_variance8x8_neon
Change-Id: I3e5d85b2eafc26be0eef6a777789b80e4579257b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-09-03 13:33:44 -07:00
Johann
5b788c0cbe
Merge "Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee5
. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb.""
2014-09-03 13:27:11 -07:00
Scott LaVarnway
652ef29d09
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 08""
...
This reverts commit 928ff03889
Compiles with 4.6 now.
Change-Id: Ib455da1098bb0e0623248be07579882a425fcbd1
2014-08-29 13:29:36 -07:00
Johann
911e96a4eb
Revert "Revert "VP8 for ARMv8 by using NEON intrinsics 06" This reverts commit 81ad047ee5
. Revert "VP8 for ARMv8 by using NEON intrinsics 15" This reverts commit 727af7cebe3698b8493ba6c1360b0a6606c310fb."
...
This reverts commit 920f803f2e
Change-Id: I410d9036214a1b18427cca70b4bc6d8239740737
2014-08-20 09:41:50 -07:00
Johann
79afb5eb41
Use lrand48 on Android
...
When building x86 assembly use lrand48 instead of the
undocumented inlined _rand function.
Android now supports rand()
https://android-review.googlesource.com/97731
but only for new versions. Original workaround:
https://gerrit.chromium.org/gerrit/15744
Change-Id: I130566837d5bfc9e54187ebe9807350d1a7dab2a
2014-06-12 19:57:25 -07:00
Dmitry Kovalev
1f0b2f95af
Removing vp8/common/pragmas.h.
...
Change-Id: I80630a7350e884ebc4fef73fb5b52ec25f908523
2014-05-23 13:03:15 -07:00
Deb Mukherjee
e272273443
Renames x86_64 specific asm files
...
Renames all x86_64 specific assembly files to consistently
end in _x86_64.asm. This will be useful for build systems to
handle these files differently.
All new 64-bit specific assembly files should use the new
naming convention.
Change-Id: I36c89584967c82ffc4088b1b5044ac15d2bb7536
2014-05-21 13:55:56 -07:00
Johann
f625b2ac93
Correct HAVE_NEON_ASM define
...
These optimizations are currently disabled.
Change-Id: I19c58c9cb82d017638b86196641b9e001dfa798b
2014-05-16 08:20:13 -07:00
Johann
920f803f2e
Revert "VP8 for ARMv8 by using NEON intrinsics 06"
...
This reverts commit 81ad047ee5
.
Revert "VP8 for ARMv8 by using NEON intrinsics 15"
This reverts commit 727af7cebe
.
This exposes a bug in gcc 4.9 regarding register allocation. Will reland
when 4.9 is fixed.
Change-Id: I2d8a04e4edde93719280e41550f4c0765608ec4d
2014-05-13 13:21:17 -07:00
Johann
ce23931a3f
Only build neon assembly for armv7 targets
...
Allow selectively building just the intrinsics for armv8
Change-Id: I2f29b2e4508b8b8e5649c2906b3159ad1d4ec477
2014-05-12 08:52:02 -07:00
Johann
677fb5123e
Revert "VP8 for ARMv8 by using NEON intrinsics 10"
...
This reverts commit c500fc22c1
There is an issue with gcc 4.6 in the Android NDK:
loopfiltersimpleverticaledge_neon.c: In function 'vp8_loop_filter_bvs_neon':
loopfiltersimpleverticaledge_neon.c:176:1: error: insn does not satisfy its constraints:
Change-Id: I95b6509d12f075890308914cc691b813d2e5cd9f
2014-05-06 14:28:00 -07:00
Johann
928ff03889
Revert "VP8 for ARMv8 by using NEON intrinsics 08"
...
This reverts commit a5d79f43b9
There is an issue with gcc 4.6 in the Android NDK:
loopfilter_neon.c: In function 'vp8_loop_filter_vertical_edge_y_neon':
loopfilter_neon.c:394:1: error: insn does not satisfy its constraints:
Change-Id: I2b8c6ee3fa595c152ac3a5c08dd79bd9770c7b52
2014-05-06 13:20:24 -07:00
James Yu
4ea9cf3e2d
VP8 for ARMv8 by using NEON intrinsics 16
...
Add variance_neon.c
- vp8_variance16x16_neon
- vp8_variance16x8_neon
- vp8_variance8x16_neon
- vp8_variance8x8_neon
Change-Id: Idfb9c96134a1c6a696a98ce68b4f7ed593a00660
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:40 -07:00
James Yu
727af7cebe
VP8 for ARMv8 by using NEON intrinsics 15
...
Add idct_dequant_0_2x_neon.c
- idct_dequant_0_2x_neon
Change-Id: I8e129172ef1b2517cf72ff267788921f1a792586
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:33 -07:00
James Yu
08e38f06db
VP8 for ARMv8 by using NEON intrinsics 14
...
Add sixtappredict_neon.c
- vp8_sixtap_predict16x16_neon
- vp8_sixtap_predict8x8_neon
- vp8_sixtap_predict8x4_neon
- vp8_sixtap_predict4x4_neon
Change-Id: I3b02fce48ae2e6c6099041ba5ddd7b090f1463b9
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:12 -07:00
James Yu
18e9caad47
VP8 for ARMv8 by using NEON intrinsics 13
...
Add shortidct4x4llm_neon.c
- vp8_short_idct4x4llm_neon
Change-Id: I5a734bbffca8dacf8633c2b0ff07b98aa2f438ba
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 19:07:05 -07:00
James Yu
feaf766bd0
VP8 for ARMv8 by using NEON intrinsics 12
...
Add sad_neon.c
- vp8_sad16x16_neon
- vp8_sad16x8_neon
- vp8_sad8x8_neon
- vp8_sad8x16_neon
- vp8_sad4x4_neon
Change-Id: I08eaae49ec03fb91b394354660a5df0367cea311
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:39 -07:00
James Yu
4a8336fa9d
VP8 for ARMv8 by using NEON intrinsics 11
...
Add mbloopfilter_neon.c
- vp8_mbloop_filter_horizontal_edge_y_neon
- vp8_mbloop_filter_horizontal_edge_uv_neon
- vp8_mbloop_filter_vertical_edge_y_neon
- vp8_mbloop_filter_vertical_edge_uv_neon
Change-Id: Ia9084e0892d4d49412d9cf2b165a0f719f2382d7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:54:33 -07:00
James Yu
c500fc22c1
VP8 for ARMv8 by using NEON intrinsics 10
...
Add loopfiltersimpleverticaledge_neon.c
- vp8_loop_filter_bvs_neon
- vp8_loop_filter_mbvs_neon
Change-Id: I7cf0a161ad4ae37c881b94cc0122f895d3baae79
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:11:00 -07:00
James Yu
55c95f2d2c
VP8 for ARMv8 by using NEON intrinsics 09
...
Add loopfiltersimplehorizontaledge_neon.c
- vp8_loop_filter_bhs_neon
- vp8_loop_filter_mbhs_neon
Change-Id: I77f9721b20585da8bf3869a3850ff0ae4b4bfeea
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-03 04:10:45 -07:00
James Yu
a5d79f43b9
VP8 for ARMv8 by using NEON intrinsics 08
...
Add loopfilter_neon.c
- vp8_loop_filter_horizontal_edge_y_neon
- vp8_loop_filter_horizontal_edge_uv_neon
- vp8_loop_filter_vertical_edge_y_neon
- vp8_loop_filter_vertical_edge_uv_neon
Change-Id: I50b57dedabd42d2a3c183c1738cc5346f0e71ed8
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:32:11 -07:00
James Yu
930557be10
VP8 for ARMv8 by using NEON intrinsics 07
...
Add iwalsh_neon.c
- vp8_short_inv_walsh4x4_neon
Change-Id: I8beda6ce11ad8ce9e80cc0a38d40161938359162
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 09:24:54 -07:00
James Yu
81ad047ee5
VP8 for ARMv8 by using NEON intrinsics 06
...
Add idct_dequant_full_2x_neon.c
- idct_dequant_full_2x_neon
==== Summary of apply VP8 decode patch series ====
Benchmark on Samsung Chromebook, Cortex-A15, 1.7GHz, Dual core
Toolchain: linaro-1.13.1-4.8-2014.01
Compile argument: CROSS=arm-linux-gnueabihf- ../libvpx/configure
--target=armv7-linux-gcc --prefix=$HOME/out
--enable-shared --cpu=cortex-a7
Test argument: vpxdec --summary --noblit ./tears_of_steel_1080p.webm
NEON assembly 46.68 (fps)
Apply patch 06 46.65, -0.03
Apply patch 07 46.86, +0.21
Apply patch 08 46.58, -0.28
Apply patch 09 46.57, -0.01
Apply patch 10 46.51, -0.06
Apply patch 11 46.13, -0.38
Apply patch 12 45.42, -0.71
Apply patch 13 46.06, +0.64
Apply patch 14 45.19, -0.87
Apply patch 15 45.93, +0.74
Apply patch 16 45.48, -0.45
Apply patch 17 45.84, +0.36
Apply patch 18 45.91, +0.07 <= With all NEON intrinsics patches
Total -0.77 fps, 1.65% performance regression
Change-Id: I77bfc9eaccfb97b8d401e949ceff8795e26ca6b7
Signed-off-by: James Yu <james.yu@linaro.org>
2014-05-02 11:57:47 +08:00
Yunqing Wang
096eaba728
Remove VP8 save_reg_neon function
...
This patch did a cleanup following the commit "Save NEON registers
in VP8 NEON functions". The pushing/poping of callee-saved NEON
registers was moved into individual NEON functions. Therefore,
we don't need to save those registers at the beginning of codec.
The related code was removed.
Change-Id: I5648166514fc9beffb780aa138495597731f49ea
2014-04-29 16:13:24 -07:00
James Zern
805078a1bf
build: convert rtcd.sh to perl
...
significantly speeds up file generation.
the goal of this change is to convert rtcd.sh to perl as directly as
possible to allow for simple comparison. future changes can make it more
perl-like.
---
Linux
[CREATE] vpx_scale_rtcd.h
real 0m0.485s -> 0m0.022s
[CREATE] vp8_rtcd.h
real 0m4.619s -> 0m0.060s
[CREATE] vp9_rtcd.h
real 0m10.102s -> 0m0.087s
Windows
[CREATE] vpx_scale_rtcd.h
real 0m8.360s -> 0m0.080s
[CREATE] vp8_rtcd.h
real 1m8.083s -> 0m0.160s
[CREATE] vp9_rtcd.h
real 2m6.489s -> 0m0.233s
Change-Id: Idfb71188206c91237d6a3c3a81dfe00d103f11ee
2014-03-03 14:47:11 -08:00
James Yu
fb5d281bb6
VP8 for ARMv8 by using NEON intrinsics 05
...
Add dequantizeb_neon.c
- vp8_dequantize_b_loop_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.23 (fps)
Change-Id: Iebe3b0c6ed2359c778b0570763c5681ae25fef0c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 10:16:00 +08:00
James Yu
28b2f82f97
VP8 for ARMv8 by using NEON intrinsics 04
...
Add dequant_idct_neon.c
- vp8_dequant_idct_add_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.22 (fps)
Change-Id: Id48f39e1da58dd3d8d37658e94989411997f4f7c
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 09:59:23 +08:00
James Yu
d749ab6221
VP8 for ARMv8 by using NEON intrinsics 03
...
Add dc_only_idct_add_neon.c
- vp8_dc_only_idct_add_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.24 (fps)
Change-Id: I5e9e277ec3a3ca67e13c8cc4c324a6fbe8a897fc
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-26 09:28:29 +08:00
James Yu
300a3bfc73
VP8 for ARMv8 by using NEON intrinsics 02
...
Add copymem_neon.c
- vp8_copy_mem16x16_neon
- vp8_copy_mem8x8_neon
- vp8_copy_mem8x4_neon
vpxdec --summary --noblit ../videos/tears_of_steel_1080p.webm
Before => After, 13.25 => 13.25 (fps)
Change-Id: Ib956b5a20522ff57dc8a580bf0aef7b252bddba6
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-23 22:56:53 +08:00
Johann
dadf350551
Apply neon flags to intrinsic files
...
Filter out files ending in _neon.c and append .neon so the Android build
system knows to apply -mfpu=neon
Change-Id: Ib67277e5920bfcaeda7c4aa16cd1001b11d59305
2014-01-10 12:16:59 -08:00
James Yu
79395e16cf
VP8 for ARMv8 by using NEON intrinsics 01
...
Add bilinearpredict_neon_intrinsics.c
- vp8_bilinear_predict4x4_neon
- vp8_bilinear_predict8x4_neon
- vp8_bilinear_predict8x8_neon
- vp8_bilinear_predict16x16_neon
Change-Id: I33dfa502881219841b442dda32b73220e51b716b
Signed-off-by: James Yu <james.yu@linaro.org>
2014-01-09 09:56:22 -08:00
James Zern
f89335f7ca
remove unused VP8 com/dec asm offsets
...
Change-Id: Ib3b26ee27f04b2dcbbd32b3127afb45e9f50cfcf
2013-07-09 14:33:49 -07:00
James Zern
08348d9cab
prefix vp8 asm_{com,dec,enc}_offsets files
...
make them symmetrical with the generated output and their vp9
counterparts
Change-Id: I72cc97c4d33d713dff620a6d7cc25955266216fc
2013-03-02 14:45:40 -08:00