Linfeng Zhang
9c0680bd43
Merge "Refine 8-bit intra prediction NEON optimization (mode dc)"
2016-10-26 16:51:44 +00:00
Johann
9720b58aac
Optimize idct32x32_34_add for NEON
...
Approximately 3 times faster than the 1024 version which was used
previously.
BUG=webm:1295
Change-Id: Id15fb3d096029ec38ef01c53e5f6eb08254347c9
2016-10-25 15:43:58 -07:00
Linfeng Zhang
ce88b8f5c5
Refine 8-bit intra prediction NEON optimization (mode dc)
...
dst += stride behaves better with gcc/clang.
Expanding the inline function dc_SIZExSIZE() saves instructions in
vpx_dc_predictor_SIZExSIZE_neon().
Change-Id: Id0ccbd58b6a31df539141fd33bdf28633339150d
2016-10-24 13:18:51 -07:00
James Zern
2e6a1976a0
Merge "remove idct32x32*_add_neon.asm"
2016-10-22 02:29:56 +00:00
James Zern
5d91752a98
Merge "vpx_highbd_convolve_copy_neon: use multi reg loads"
2016-10-22 02:28:15 +00:00
James Zern
9dbb3ad396
remove idct32x32*_add_neon.asm
...
the intrinsics are neutral to ~20% faster on cros/android
devices when using gcc-4.9/clang-3.8.1 and gcc-4.9/clang-3.8.x from the
r13 ndk. neutral results typically came with gcc-4.9 while larger
positive gains were achieved with clang 3.8.x.
BUG=webm:1303
Change-Id: I4d31f9c017944681b881493525d4573a7a5b1e16
2016-10-20 19:47:14 -07:00
James Zern
a60dd5c83a
Merge "Fix warnings reported by -Wshadow: Part1: vpx_dsp directory"
2016-10-18 22:09:29 +00:00
Kaustubh Raste
8ff5af773a
Merge "Optimize sad_64width_x4d_msa function"
2016-10-18 07:46:02 +00:00
Kaustubh Raste
b7310e2aff
Optimize sad_64width_x4d_msa function
...
Reduced HADD_UH_U32 macro calls
Change-Id: Ie089b9a443de516646b46e8f72156aa826ca8cfa
2016-10-18 04:05:33 +00:00
Urvang Joshi
e084e05484
Fix warnings reported by -Wshadow: Part1: vpx_dsp directory
...
While we are at it:
- Rename some variables to more meaningful names
- Reuse some common consts from a header instead of redefining them.
Change-Id: I75c4248cb75aa54c52111686f139b096dc119328
(cherry picked from aomedia 09eea21)
2016-10-17 19:25:19 -07:00
James Zern
68cd3052ca
vpx_highbd_convolve_copy_neon: use multi reg loads
...
for copy16/32/64
BUG=webm:1299
Change-Id: I5080d736bde7e487c80ef3d7024dda1e96a57eaf
2016-10-17 17:15:03 -07:00
Linfeng Zhang
9c8981c666
add vpx high bitdepth convolve8 NEON intrinsics optimization
...
BUG=webm:1299
Change-Id: I236bfa0441e357b6ff05add8269a2cfb543924d1
2016-10-17 15:23:54 -07:00
Linfeng Zhang
f910d14a1a
add vpx_highbd_convolve_{copy,avg}_neon()
...
BUG=webm:1299
Change-Id: Ib87ac466ada63251eb06ae2abd1e13e61e0d1538
2016-10-13 15:21:14 -07:00
James Zern
1909270f65
Merge "cosmetics,*loopfilter_neon.c: s/tranpose/transpose/"
2016-10-13 07:12:51 +00:00
Kaustubh Raste
9e75c01353
Merge "Optimize vpx_mbpost_proc_across_ip_msa function"
2016-10-13 02:12:33 +00:00
Kaustubh Raste
99adf8b22e
Merge "Optimize vpx_get4x4sse_cs_msa function"
2016-10-13 02:12:00 +00:00
James Zern
fd270437f0
cosmetics,*loopfilter_neon.c: s/tranpose/transpose/
...
Change-Id: I267d6a9d715ddb6110f0881c2e820c37fc673fe1
2016-10-12 16:12:56 -07:00
Linfeng Zhang
01454ec485
[vpx highbd lpf NEON 6/6] vertical 16
...
BUG=webm:1300
Change-Id: I29d0b482d66f05e278325ddebcf108fbf0b6e222
2016-10-11 22:59:19 -07:00
Linfeng Zhang
27479775c4
[vpx highbd lpf NEON 5/6] horizontal 16
...
BUG=webm:1300
Change-Id: I21da32d6cfb8a1a6f58bc9756d17f48f13a59a12
2016-10-11 22:59:19 -07:00
Linfeng Zhang
251cbfbec8
[vpx highbd lpf NEON 4/6] vertical 8
...
BUG=webm:1300
Change-Id: If06b12bc081bab60059b100414dd7018f83ac62d
2016-10-11 22:59:19 -07:00
Linfeng Zhang
96c7206ede
[vpx highbd lpf NEON 3/6] horizontal 8
...
BUG=webm:1300
Change-Id: Ica2379e294be60b7f80fcfcec110dca4c3b59d81
2016-10-12 00:48:31 +00:00
Linfeng Zhang
57e4cbc632
Merge "[vpx highbd lpf NEON 2/6] vertical 4"
2016-10-10 16:57:55 +00:00
Linfeng Zhang
19046d9963
Merge "[vpx highbd lpf NEON 1/6] horizontal 4"
2016-10-10 16:56:23 +00:00
Kaustubh Raste
3da752fe00
Optimize vpx_mbpost_proc_across_ip_msa function
...
Removed HADD_SW_S32 calculation
Change-Id: I7384dc881451d197404d09beb7c27b222e1d6875
2016-10-10 18:03:28 +05:30
Kaustubh Raste
d05104b488
Optimize vpx_get4x4sse_cs_msa function
...
Reuse CALC_MSE_B macro
Change-Id: I39f0a92ac2dbb5fa8628df1a5d556cfdc42a3648
2016-10-10 16:31:57 +05:30
Kaustubh Raste
3c2f7eb339
Optimize vp9 loopfilter msa functions
...
Updated code to process in 8-bit, as saturation/clipping takes care of
overflow.
Removed unused macro.
Change-Id: I113df60286fb28b216df800d95b2d3695ef71440
2016-10-07 19:26:26 -07:00
Linfeng Zhang
49aa9b1f12
[vpx highbd lpf NEON 2/6] vertical 4
...
BUG=webm:1300
Change-Id: Ia33a9f2d6c7e2e6b3497ad6f1a09439a85b33983
2016-10-06 14:22:26 -07:00
Linfeng Zhang
7aa27bd62f
[vpx highbd lpf NEON 1/6] horizontal 4
...
BUG=webm:1300
Change-Id: Idf441806e6bf397ff5ecd8776146b3f781f50c40
2016-10-06 14:03:04 -07:00
James Zern
1e1caad165
vpx_dsp/idct*_neon.asm: simplify immediate loads
...
mov supports 0-65535
Change-Id: I019de0d784836d7bd60e6b36f2cdeefb541cb3fd
2016-10-05 14:28:32 -07:00
James Zern
a6be7ba1aa
enable idct*_1_add_neon in high-bitdepth builds
...
these are compatible as they only load one element of the input, so the
larger size of tran_low_t makes no difference in little-endian builds.
note the asm is incompatible with big-endian, but there are other points of
failure there, so currently it's considered unsupported.
BUG=webm:1294
Change-Id: Icd2665a0699bccae92d1bea43a95b0a83fb17028
2016-10-05 11:14:25 -07:00
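
The little-endian argument in the idct*_1_add_neon entry above can be illustrated with a small C sketch. It assumes the routine reads a 16-bit value from the start of the coefficient buffer and that the first coefficient fits in 16 bits; the commit message states only that one element is loaded. The helper names load_first_16 and is_little_endian are hypothetical and are not libvpx functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads the first 16 bits stored at p. */
static int16_t load_first_16(const void *p) {
  int16_t v;
  memcpy(&v, p, sizeof(v));
  return v;
}

/* Hypothetical helper: returns 1 when the least significant byte of a
 * multi-byte integer is stored at the lowest address. */
static int is_little_endian(void) {
  const uint16_t probe = 1;
  uint8_t first;
  memcpy(&first, &probe, 1);
  return first == 1;
}
```

On a little-endian target, load_first_16() returns the same value for an int16_t and for an int32_t holding the same small coefficient, because the low 16 bits sit at the same address. On a big-endian target the 16-bit read would see the high half of the 32-bit value instead.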
Angie Chiang
5d635365bb
Merge "Move highbd txfm input range check from 2d iht transform to 1d idct/iadst"
2016-10-04 16:57:37 +00:00
Kaustubh Raste
0a92dd7319
Merge "Fix vpx_plane_add_noise_msa functionality bit-mismatch"
2016-10-04 06:35:47 +00:00
Angie Chiang
5b073c695b
Move highbd txfm input range check from 2d iht transform to 1d idct/iadst
...
This change makes the highbd txfm input range check more comprehensive.
The 25-bit highbd input range is composed of
12 signal input bits + 7 bits for 2D forward transform amplification + 5 bits for
1D inverse transform amplification + 1 bit for contingency in rounding and quantizing.
BUG=https://bugs.chromium.org/p/webm/issues/detail?id=1286
BUG=https://bugs.chromium.org/p/chromium/issues/detail?id=651625
Change-Id: I04c0796edd7653f8d463fba5dc418132986131e7
2016-10-03 17:21:08 -07:00
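
The bit budget in the highbd txfm range check entry above adds up as 12 + 7 + 5 + 1 = 25. A minimal C sketch of that arithmetic follows. The enum names and the helper fits_highbd_input_range are hypothetical, and treating the 25 bits as a bound on the magnitude of a coefficient is an assumption; the commit message does not say how the check is written.

```c
#include <stdint.h>

/* Bit budget quoted in the commit message. The names are illustrative. */
enum {
  SIGNAL_BITS = 12,     /* signal input bits */
  FWD_2D_BITS = 7,      /* 2D forward transform amplification */
  INV_1D_BITS = 5,      /* 1D inverse transform amplification */
  CONTINGENCY_BITS = 1, /* rounding and quantizing contingency */
  HIGHBD_INPUT_BITS =
      SIGNAL_BITS + FWD_2D_BITS + INV_1D_BITS + CONTINGENCY_BITS
};

/* Hypothetical helper: returns 1 when the magnitude of v is below
 * 2^HIGHBD_INPUT_BITS (assumed interpretation of the range). */
static int fits_highbd_input_range(int64_t v) {
  const int64_t limit = (int64_t)1 << HIGHBD_INPUT_BITS;
  return v > -limit && v < limit;
}
```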
James Zern
c6bc7499d9
Merge "cosmetics,*_neon.c: rm redundant return from void fns"
2016-10-03 22:40:42 +00:00
Kaustubh Raste
6922fc8230
Fix vpx_plane_add_noise_msa functionality bit-mismatch
...
Change-Id: I04961afb592ae6a67fdcfd8c9066e920dd4b30e7
2016-10-03 18:15:59 +00:00
James Zern
50b9c467da
Merge "vpx_convolve8_neon,load/store*: correct param type"
2016-10-01 23:52:14 +00:00
James Zern
c449983c56
vpx_convolve8_neon,load/store*: correct param type
...
stride/pitch in convolve is expressed with a ptrdiff_t
Change-Id: Ia5a6732dc509f06ccf7035386fa8ae721b4b1a71
2016-10-01 11:03:29 -07:00
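
The reason a stride is carried in a ptrdiff_t, as the entry above notes for convolve, can be sketched in C: a stride is a signed pointer offset and may be negative, for example when walking an image bottom-up. The function copy_rows is a hypothetical helper written for this illustration and is not part of libvpx.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies h rows of w bytes each. Both strides are
 * signed byte offsets between consecutive rows, so either may be negative. */
static void copy_rows(const uint8_t *src, ptrdiff_t src_stride, uint8_t *dst,
                      ptrdiff_t dst_stride, int w, int h) {
  ptrdiff_t y;
  for (y = 0; y < h; ++y) {
    /* Index from the base pointers so that no pointer outside the
     * buffers is ever formed. */
    memcpy(dst + y * dst_stride, src + y * src_stride, (size_t)w);
  }
}
```

Calling copy_rows() with src pointing at the last row and a negative src_stride flips the rows vertically, which an unsigned stride type could not express.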
Martin Storsjo
9255328f27
Remove a stray END declaration in loopfilter_4_neon.asm
...
Change-Id: Ic8c359a5677f9c663787aac74f530e886163bc69
2016-10-01 14:12:42 +03:00
Linfeng Zhang
da14d23e44
Merge "Refactor vpx lpf NEON files (step 2/2)"
2016-10-01 00:07:51 +00:00
Linfeng Zhang
edbca72a53
Merge "Refactor vpx lpf NEON files (step 1/2)"
2016-10-01 00:07:31 +00:00
James Zern
db80c23fd4
cosmetics,*_neon.c: rm redundant return from void fns
...
+ a couple of 'break's after a return
Change-Id: Ia21f12ebcef98244feb923c17b689fc8115da015
2016-09-30 13:09:57 -07:00
James Zern
b6277a47c7
Merge changes from topic '8bit-hbd-idct'
...
* changes:
*idct*_neon.c: add missing rtcd include
idct,msa/neon: exclude idct files from hbd build
*rtcd_defs.pl: remove empty specialize calls
2016-09-30 19:36:08 +00:00
James Zern
1396d12103
*idct*_neon.c: add missing rtcd include
...
+ correct declarations as necessary
BUG=webm:1294
Change-Id: I719602df9a56e79188a78e7f8b31257c6d3cc11d
2016-09-30 11:41:26 -07:00
James Zern
b51c4df93a
idct,msa/neon: exclude idct files from hbd build
...
these functions are incompatible currently and unreferenced in rtcd,
exclude them from the build.
BUG=webm:1294
Change-Id: I7790c195a91e1b142f56c04d2a5e305d9133b896
2016-09-30 11:32:47 -07:00
Linfeng Zhang
ca2fe7a8c7
Refactor vpx lpf NEON files (step 2/2)
...
Change-Id: I0744407cd3361ff752bd7f6e654b70ab6b41a58f
2016-09-30 09:56:28 -07:00
Linfeng Zhang
4779f5308d
Refactor vpx lpf NEON files (step 1/2)
...
Change-Id: I4016d096d46ca691f3b17199b259b7231e983cfb
2016-09-30 09:48:54 -07:00
Linfeng Zhang
8c744fd978
Merge "Unify loopfilter function names"
2016-09-30 15:58:08 +00:00
Linfeng Zhang
c435b7fbdd
Merge "Refine vpx convolve8 NEON intrinsics optimization"
2016-09-30 15:56:31 +00:00
Linfeng Zhang
bde905cba1
Merge "Refine vpx_convolve_copy_neon() and vpx_convolve_avg_neon()"
2016-09-30 15:54:02 +00:00
James Zern
ed62d27c71
*rtcd_defs.pl: remove empty specialize calls
...
add_proto adds a 'c' specialization
Change-Id: I0ed0c2240d45264b0e0056ce7c8f63f4a00780bc
2016-09-29 20:38:26 -07:00