James Zern
2d3d95f7ac
enable vpx_idct16x16_256_add_neon in hbd builds
...
BUG=webm:1294
Change-Id: Ib421c150b0d29dee0a81390a612bf01a4a28cff1
2016-12-06 18:32:21 -08:00
James Zern
8befcd0089
enable vpx_idct16x16_10_add_neon in hbd builds
...
BUG=webm:1294
Change-Id: Ibad079f25e673d4f5181961896a8a8333a51e825
2016-12-06 16:09:19 -08:00
James Zern
af9d7aa9fb
idct16x16,NEON: rm output_stride from pass1 fns
...
vpx_idct16x16_256_add_neon_pass1, vpx_idct16x16_10_add_neon:
this was a constant 8 in all cases, meaning the results are stored
contiguously; this allows the number of stores to be reduced.
Change-Id: I7858a0a15a284883ef45c13dfd97c308df9ea09e
2016-12-06 15:13:33 -08:00
James Zern
c6641782c3
idct16x16,NEON,cosmetics: normalize fn signatures
...
+ remove unused parameters from vpx_idct16x16_10_add_neon_pass2
Change-Id: Ie5912a4abdd308fab589380bca054a2e7234a2c4
2016-11-28 16:46:01 -08:00
James Zern
21a1abd8e3
enable vpx_idct32x32_135_add_neon in hbd builds
...
BUG=webm:1294
Change-Id: Ide6d3994fe01c4320c9d143e6d059b49568048e4
2016-11-23 19:59:43 -08:00
James Zern
568d4b1d63
idct_neon: rename load_tran_low_to_s16 -> ...s16q
...
BUG=webm:1294
Change-Id: I164cfcbe9bc4511d1d04af9206cf351a0ec2957b
2016-11-23 19:57:48 -08:00
James Zern
d757d7e998
Merge changes Icc4ead05,Ib019964b,I3b5fd3b3,Ieedadee2
...
* changes:
Update vpx_idct4x4_16_add_neon() to pass SingleExtremeCoeff test
Refine 8-bit 4x4 idct NEON intrinsics
Add idct speed test.
Update partial_idct_test.cc to support high bitdepth
2016-11-24 03:31:25 +00:00
Linfeng Zhang
05e2b5a59f
Merge "Add 32x32 d45 and 8x8, 16x16, 32x32 d135 NEON intra prediction"
2016-11-22 23:20:53 +00:00
Linfeng Zhang
6cc76ec73f
Update vpx_idct4x4_16_add_neon() to pass SingleExtremeCoeff test
...
Change-Id: Icc4ead05506797d12bf134e8790443676fef5c10
2016-11-22 11:35:05 -08:00
Linfeng Zhang
974e81d184
Refine 8-bit 4x4 idct NEON intrinsics
...
Change-Id: Ib019964bfcbce7aec57d8c3583127f9354d3c11f
2016-11-22 11:26:03 -08:00
James Zern
cbeae53e76
Merge "Clean horizontal intra prediction NEON optimization"
2016-11-19 01:29:37 +00:00
Linfeng Zhang
85c1ee434d
Add high bitdepth intra prediction NEON optimization (mode tm)
...
BUG=webm:1316
Change-Id: Ib014de06836ac12726f4a2c9f0833ec4eb4d233b
2016-11-15 14:19:46 -08:00
Linfeng Zhang
a3128ad33a
Add high bitdepth intra prediction NEON optimization (h and v)
...
BUG=webm:1316
Change-Id: I47eeac698a98a31d1af5f72441052302e9fa4f46
2016-11-12 12:00:19 -08:00
James Zern
80f6b243a7
Merge changes I339088b2,Iaade219e,If142afb1,I4257c4b3
...
* changes:
fdct8x8_test: add vpx_idct8x8_64_add_neon in hbd
fdct4x4_test: add vpx_idct4x4_16_add_neon in hbd
partial_idct_test,NEON: add missing idct variants
enable vpx_idct32x32_34_add_neon in hbd builds
2016-11-10 05:02:39 +00:00
Linfeng Zhang
40ab0424d4
Add high bitdepth intra prediction NEON optimization (mode d45 and d135)
...
BUG=webm:1316
Change-Id: I6a330874348df04df24a6d9efdc06f567e04bf8e
2016-11-09 12:04:04 -08:00
James Zern
738c8f23c6
enable vpx_idct32x32_34_add_neon in hbd builds
...
replace load_and_transpose_s16_8x8() in idct32_6_neon() with a separate
load_tran_low_to_s16() and transpose_s16_8x8(). the combined function is
used in idct32_8_neon() where the input is the correctly sized output
from the earlier stage.
BUG=webm:1294
Change-Id: I4257c4b3a421b2cf5d13651f966eee0680ef98a9
2016-11-08 17:03:36 -08:00
Johann
50b40f114c
Optimize idct32x32_135_add for NEON
...
BUG=webm:1295
Change-Id: I7f80ef4d29813fcb401fc6075babf19e3c195462
2016-11-08 22:06:07 +00:00
Linfeng Zhang
64a5a8fd6f
Merge "Add high bitdepth intra prediction NEON optimization (mode dc)"
2016-11-08 16:53:42 +00:00
Martin Storsjo
34c35b6fb6
Add a missing END directive in idct_neon.asm
...
This fixes building with MS armasm.
Change-Id: I2629eeed859b775ca667a65ba109f8d1bf7b0e03
2016-11-04 12:21:18 +02:00
Linfeng Zhang
1338c71dfb
Clean horizontal intra prediction NEON optimization
...
Change-Id: I1ef0a5b2655cbc7e1cc2a4a1a72e0eed9aa41f05
2016-11-02 11:43:45 -07:00
Linfeng Zhang
1868582e7d
Add 32x32 d45 and 8x8, 16x16, 32x32 d135 NEON intra prediction
...
Change-Id: I852616794244490123eb615ac750da50265f0fa5
2016-11-02 11:40:37 -07:00
Johann Koenig
5ac7a59a05
Merge "arm idct: move to-be-shared code to header"
2016-11-02 18:09:45 +00:00
Linfeng Zhang
3b74066b10
Add high bitdepth intra prediction NEON optimization (mode dc)
...
BUG=webm:1316
Change-Id: I984d6004ea2445e86f213fb6fa4d794a9955af8f
2016-11-01 17:07:36 -07:00
Johann
bf8ab194ee
arm idct: move to-be-shared code to header
...
Change-Id: I67458cd358b4dc4434bbdbfcdd571769561b619e
2016-11-01 15:43:56 -07:00
James Zern
1b275ab898
Merge "idct32x32_1_add_neon: clear a couple conv warnings"
2016-11-01 22:34:59 +00:00
James Zern
9de91855ef
Merge changes I08af3a54,If5959a25,I6763e62e
...
* changes:
build/make/Android.mk: s/armv8/arm64/
build/make/Android.mk: fix armeabi-v7a build
use .S suffix rather than .s for NEON asm
2016-11-01 21:43:13 +00:00
Linfeng Zhang
cc5f49767a
Refine 8-bit intra prediction NEON optimization (mode tm)
...
Change-Id: I98b9577ec51367df5e5d564bedf7c3ea0606de4c
2016-11-01 09:45:16 -07:00
James Zern
7625c803b3
idct32x32_1_add_neon: clear a couple conv warnings
...
int16_t -> uint8_t
Change-Id: I3c5e0985bc3584dce289c35b5973de24cdc73b76
2016-10-31 18:56:34 -07:00
James Zern
1ddb4c0362
use .S suffix rather than .s for NEON asm
...
for compatibility with other build systems
Change-Id: I6763e62e3126850ad4f8ad29e388b8dad0bbc4c3
2016-10-31 16:39:05 -07:00
James Zern
410d947c5f
Merge "idct,NEON: add a tran_low_t->s16 load adapter"
2016-10-31 21:59:12 +00:00
James Zern
3ae25974fd
idct,NEON: add a tran_low_t->s16 load adapter
...
enable idct4x4* and idct8x8*, which are compatible for 8-bit decodes in
high-bitdepth mode. the adapter narrows 32-bit input to 16; whether the
expansion can be avoided at all in this case remains a TODO. roughly
matches sse2.
BUG=webm:1294
Change-Id: I3ea94e5a2070dfd509b5de0c555aab4e1f4da036
2016-10-31 11:21:16 -07:00
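The narrowing step described in the commit above can be illustrated with a minimal scalar sketch. This is not the code from the commit: the function name load_tran_low_to_s16_sketch, the fixed count of eight coefficients, and the 32-bit tran_low_t typedef are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: in high-bitdepth builds tran_low_t is a 32-bit integer. */
typedef int32_t tran_low_t;

/* Hypothetical scalar stand-in for the load adapter: reads eight 32-bit
 * coefficients and narrows each one to 16 bits. The result is only
 * meaningful when every value already fits in int16_t, which the commit
 * message says is the case for 8-bit decodes. */
static void load_tran_low_to_s16_sketch(const tran_low_t *in, int16_t out[8]) {
  for (size_t i = 0; i < 8; ++i) {
    out[i] = (int16_t)in[i];
  }
}
```

A NEON version would presumably do the same work with two 128-bit loads followed by a narrowing move on each half, so the existing 16-bit idct code can consume the result unchanged.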
Linfeng Zhang
a347118f3c
Refine 8-bit intra prediction NEON optimization (mode h and v)
...
Change-Id: I45e1454c3a85e081bfa14386e0248f57e2a91854
2016-10-31 10:33:44 -07:00
Linfeng Zhang
4ae9f5c092
Refine 8-bit intra prediction NEON optimization (mode d45 and d135)
...
dst += stride behaves better with gcc/clang.
Unroll loops.
Change-Id: I83f85df2bc9f17c6159542f57680b509395db2b1
2016-10-27 14:24:50 -07:00
Linfeng Zhang
9c0680bd43
Merge "Refine 8-bit intra prediction NEON optimization (mode dc)"
2016-10-26 16:51:44 +00:00
Johann
9720b58aac
Optimize idct32x32_34_add for NEON
...
Approximately 3 times faster than the 1024 version which was used
previously.
BUG=webm:1295
Change-Id: Id15fb3d096029ec38ef01c53e5f6eb08254347c9
2016-10-25 15:43:58 -07:00
Linfeng Zhang
ce88b8f5c5
Refine 8-bit intra prediction NEON optimization (mode dc)
...
dst += stride behaves better with gcc/clang.
Expanding the inline function dc_SIZExSIZE() saves instructions for
vpx_dc_predictor_SIZExSIZE_neon().
Change-Id: Id0ccbd58b6a31df539141fd33bdf28633339150d
2016-10-24 13:18:51 -07:00
James Zern
2e6a1976a0
Merge "remove idct32x32*_add_neon.asm"
2016-10-22 02:29:56 +00:00
James Zern
5d91752a98
Merge "vpx_highbd_convolve_copy_neon: use multi reg loads"
2016-10-22 02:28:15 +00:00
James Zern
9dbb3ad396
remove idct32x32*_add_neon.asm
...
the intrinsics are neutral to ~20% faster on cros/android
devices when using gcc-4.9/clang-3.8.1 and gcc-4.9/clang-3.8.x from the
r13 ndk. neutral results typically came with gcc-4.9 while larger
positive gains were achieved with clang 3.8.x.
BUG=webm:1303
Change-Id: I4d31f9c017944681b881493525d4573a7a5b1e16
2016-10-20 19:47:14 -07:00
Urvang Joshi
e084e05484
Fix warnings reported by -Wshadow: Part1: vpx_dsp directory
...
While we are at it:
- Rename some variables to more meaningful names
- Reuse some common consts from a header instead of redefining them.
Change-Id: I75c4248cb75aa54c52111686f139b096dc119328
(cherry picked from aomedia 09eea21)
2016-10-17 19:25:19 -07:00
James Zern
68cd3052ca
vpx_highbd_convolve_copy_neon: use multi reg loads
...
for copy16/32/64
BUG=webm:1299
Change-Id: I5080d736bde7e487c80ef3d7024dda1e96a57eaf
2016-10-17 17:15:03 -07:00
Linfeng Zhang
9c8981c666
add vpx high bitdepth convolve8 NEON intrinsics optimization
...
BUG=webm:1299
Change-Id: I236bfa0441e357b6ff05add8269a2cfb543924d1
2016-10-17 15:23:54 -07:00
Linfeng Zhang
f910d14a1a
add vpx_highbd_convolve_{copy,avg}_neon()
...
BUG=webm:1299
Change-Id: Ib87ac466ada63251eb06ae2abd1e13e61e0d1538
2016-10-13 15:21:14 -07:00
James Zern
fd270437f0
cosmetics,*loopfilter_neon.c: s/tranpose/transpose/
...
Change-Id: I267d6a9d715ddb6110f0881c2e820c37fc673fe1
2016-10-12 16:12:56 -07:00
Linfeng Zhang
01454ec485
[vpx highbd lpf NEON 6/6] vertical 16
...
BUG=webm:1300
Change-Id: I29d0b482d66f05e278325ddebcf108fbf0b6e222
2016-10-11 22:59:19 -07:00
Linfeng Zhang
27479775c4
[vpx highbd lpf NEON 5/6] horizontal 16
...
BUG=webm:1300
Change-Id: I21da32d6cfb8a1a6f58bc9756d17f48f13a59a12
2016-10-11 22:59:19 -07:00
Linfeng Zhang
251cbfbec8
[vpx highbd lpf NEON 4/6] vertical 8
...
BUG=webm:1300
Change-Id: If06b12bc081bab60059b100414dd7018f83ac62d
2016-10-11 22:59:19 -07:00
Linfeng Zhang
96c7206ede
[vpx highbd lpf NEON 3/6] horizontal 8
...
BUG=webm:1300
Change-Id: Ica2379e294be60b7f80fcfcec110dca4c3b59d81
2016-10-12 00:48:31 +00:00
Linfeng Zhang
49aa9b1f12
[vpx highbd lpf NEON 2/6] vertical 4
...
BUG=webm:1300
Change-Id: Ia33a9f2d6c7e2e6b3497ad6f1a09439a85b33983
2016-10-06 14:22:26 -07:00
Linfeng Zhang
7aa27bd62f
[vpx highbd lpf NEON 1/6] horizontal 4
...
BUG=webm:1300
Change-Id: Idf441806e6bf397ff5ecd8776146b3f781f50c40
2016-10-06 14:03:04 -07:00