Johann
1d7ccd5325
Relocate memory operations for common code
...
With the sad functions, and hopefully the variance functions soon,
moving to the vpx_dsp location, place the defines used in the
reference C code in a common location.
Change-Id: I4c8ce7778eb38a0a3ee674d2f1c488eda01cfeca
2015-05-13 11:41:15 -07:00
James Zern
fd3658b0e4
replace DECLARE_ALIGNED_ARRAY w/DECLARE_ALIGNED
...
this macro was used inconsistently and only differs in behavior from
DECLARE_ALIGNED when an alignment attribute is unavailable. this macro
is used with calls to assembly, while generic c-code doesn't rely on it,
so in a c-only build without an alignment attribute the code will
function as expected.
Change-Id: Ie9d06d4028c0de17c63b3a27e6c1b0491cc4ea79
2015-05-07 11:55:08 -07:00
Johann
66b9933b8d
Rename neon convolve avg file
...
Some build systems use just the basename for object files.
Change-Id: I333e1107ee866f3906cc46476ef8d04c6200a8a0
2015-04-21 14:18:17 -07:00
Johann
377b6682f9
Disable vp9 _8_ loopfilters
...
Investigating https://code.google.com/p/chromium/issues/detail?id=443839
Change-Id: Ibb7485d835c5aa5e1d40f31715596ba8d208eedb
2015-01-06 19:26:11 -08:00
Johann
b1ba4cc394
Rearrange loopfilter functions
...
Separate functions and rename files. This will make it easier to disable
some functions later to help work around a compiler issue in chromium.
Change-Id: I7f30e109f77c4cd22e2eda7bd006672f090c1dc5
2015-01-06 19:26:11 -08:00
James Yu
aeeaa67987
VP9 common for ARMv8 by using NEON intrinsics 15
...
Re-write
- vp9_lpf_horizontal_4_dual_neon
in vp9_loopfilter_16_neon.c
Change-Id: Ie14f63d352f9564ad01db3939a61d91cf6d21a31
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-16 20:00:26 -08:00
Johann
ebc1951c7c
Merge "Use defines for inline and __builtin_prefetch"
2014-12-16 18:04:04 -08:00
Johann
2fdbf70d40
Use defines for inline and __builtin_prefetch
...
These were established for compatibility. Make sure to use them.
Most frequently they manifest as issues on Visual Studio builds.
Change-Id: I39d764d2eb341b999d7a6132cb44b2acfc511160
2014-12-16 15:21:19 -08:00
James Yu
aa8dd897c1
VP9 common for ARMv8 by using NEON intrinsics 16
...
Add vp9_reconintra_neon.c
- vp9_v_predictor_4x4_neon
- vp9_v_predictor_8x8_neon
- vp9_v_predictor_16x16_neon
- vp9_v_predictor_32x32_neon
- vp9_h_predictor_4x4_neon
- vp9_h_predictor_8x8_neon
- vp9_h_predictor_16x16_neon
- vp9_h_predictor_32x32_neon
- vp9_tm_predictor_4x4_neon
- vp9_tm_predictor_8x8_neon
- vp9_tm_predictor_16x16_neon
- vp9_tm_predictor_32x32_neon
Change-Id: Ib5d54a4766a1b5127169045659974f33aa98376d
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-16 12:57:52 -08:00
James Yu
ba05a4c640
VP9 common for ARMv8 by using NEON intrinsics 19
...
Delete vp9_dc_only_idct_add_neon.c
The function was merged with vp9_short_idct4x4_1_add (later
vp9_idct4x4_1_add) in d2de1ca and should have been deleted then.
Change-Id: Ie58ba3dd9dc7330a8f1238dd7dd71c9ed4639b94
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-16 11:14:12 -08:00
James Yu
4f856cd7fa
VP9 common for ARMv8 by using NEON intrinsics 06
...
Add vp9_iht8x8_add_neon.c
- vp9_iht8x8_64_add_neon
The assembly did not previously implement tx_type 0
BUG=716
Change-Id: Icfc99dd24f3d59047f9184a7d0c761ba7e3de934
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-15 12:18:06 -08:00
James Yu
6b71013277
VP9 common for ARMv8 by using NEON intrinsics 05
...
Add vp9_iht4x4_add_neon.c
- vp9_iht4x4_16_add_neon
The assembly did not previously implement tx_type 0
BUG=715
Change-Id: I60034d1568de034edba45c5cdd13f3d87dbc73b6
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-15 12:16:19 -08:00
James Yu
3f7c12dab9
VP9 common for ARMv8 by using NEON intrinsics 18
...
Add vp9_idct32x32_add_neon.c
- vp9_idct32x32_1024_add_neon
Change-Id: Ic598b772c28bd3487a8ead7a4598a66b25f9b00f
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 18:20:04 -08:00
James Yu
3cfed4bf76
VP9 common for ARMv8 by using NEON intrinsics 14
...
Add vp9_idct16x16_add_neon.c
- vp9_idct16x16_256_add_neon_pass1
- vp9_idct16x16_256_add_neon_pass2
- vp9_idct16x16_10_add_neon_pass1
- vp9_idct16x16_10_add_neon_pass2
Change-Id: I54d25b54a36f4371760f54e4036693aaea40a5de
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 18:19:54 -08:00
James Yu
ce76aeb00d
VP9 common for ARMv8 by using NEON intrinsics 13
...
Add vp9_idct8x8_add_neon.c
- vp9_idct8x8_64_add_neon
- vp9_idct8x8_10_add_neon
Change-Id: I6ee7b4496765aa36ed52990f2ef73e9f24459610
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 14:56:54 -08:00
James Yu
8c25f4af6a
VP9 common for ARMv8 by using NEON intrinsics 12
...
Add vp9_idct4x4_add_neon.c
- vp9_idct4x4_16_add_neon
Change-Id: I011a96b10f1992dbd52246019ce05bae7ca8ea4f
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 14:49:59 -08:00
James Yu
420f58f2d2
VP9 common for ARMv8 by using NEON intrinsics 11
...
Add vp9_idct16x16_1_add_neon.c
- vp9_idct16x16_1_add_neon
Change-Id: I7c6524024ad4cb4e66aa38f1c887e733503c39df
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 13:06:58 -08:00
James Yu
030ca4d0e5
VP9 common for ARMv8 by using NEON intrinsics 10
...
Add vp9_idct32x32_1_add_neon.c
- vp9_idct32x32_1_add_neon
Change-Id: If9ffe9a857228f5c67f61dc2b428b40965816eda
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 13:04:29 -08:00
James Yu
2772b45ac0
VP9 common for ARMv8 by using NEON intrinsics 09
...
Add vp9_idct8x8_1_add_neon.c
- vp9_idct8x8_1_add_neon
Change-Id: I9d23e01fa96013febbf64db6c76c6c955f14e3ff
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 12:52:33 -08:00
James Yu
9114f0afdb
VP9 common for ARMv8 by using NEON intrinsics 08
...
Add vp9_idct4x4_1_add_neon.c
- vp9_idct4x4_1_add_neon
Change-Id: Ieab9af107dbd07a4f9503bc945890c90faccb8ac
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-10 12:49:28 -08:00
James Yu
01fc6f51e0
VP9 common for ARMv8 by using NEON intrinsics 07
...
Add vp9_convolve8_neon.c
- vp9_convolve8_horiz_neon
- vp9_convolve8_vert_neon
Change-Id: I0bdd99ff72d275223fe211ac7243c25a5a60cf87
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:03:07 -08:00
James Yu
893534a996
VP9 common for ARMv8 by using NEON intrinsics 04
...
Add vp9_convolve8_avg_neon.c
- vp9_convolve8_avg_horiz_neon
- vp9_convolve8_avg_vert_neon
Change-Id: I617971e37b02186fec5aca181f4f9622050ea2df
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:03:07 -08:00
James Yu
d12757f5c6
VP9 common for ARMv8 by using NEON intrinsics 03
...
Add vp9_copy_neon.c
- vp9_convolve_copy_neon
Change-Id: I291fc5423d06240876411bbceab03eae5ef585be
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 20:02:46 -08:00
Scott LaVarnway
617382a2e3
VP9 common for ARMv8 by using NEON intrinsics 02
...
Add vp9_avg_neon.c
- vp9_convolve_avg_neon
Change-Id: Id2c9d5bcfa37cff1a16417aba1656ff07bdf10fd
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 19:00:21 -08:00
James Yu
5b098b1825
VP9 common for ARMv8 by using NEON intrinsics 01
...
Add vp9_loopfilter_neon.c
- vp9_lpf_horizontal_4_neon
- vp9_lpf_vertical_4_neon
- vp9_lpf_horizontal_8_neon
- vp9_lpf_vertical_8_neon
Change-Id: I97a0d7b399a431c21ee77396be3d5f5a1f7ebccb
Signed-off-by: James Yu <james.yu@linaro.org>
2014-12-09 12:26:56 -08:00
Frank Galligan
95a568b3a8
Fix Neon convolve profiling
...
When profiling, gprof can't distinguish between matching labels in
different files.
Change-Id: I56770df212ed314a0d8568071fa8157624ef1e8f
2014-10-22 10:51:53 -07:00
Johann
1fc2b0fd00
Merge "Include type defines"
2014-06-20 11:29:19 -07:00
Johann
d658216276
Don't return value for void functions
...
Clears "warning: 'return' with a value, in function returning void"
Change-Id: I93972610d67e243ec772a1021d2fdfcfc689c8c2
2014-06-20 11:26:44 -07:00
Johann
baef0b89da
Include type defines
...
Clears error: unknown type name 'uint8_t'
Change-Id: I9b6eff66a5c69bc24aeaeb5ade29255a164ef0e2
2014-06-20 11:26:13 -07:00
Jingning Han
41a350a83d
Change eob threshold for partial inverse 8x8 2D-DCT to 12
...
The scanning order has the first 12 coefficients of the 8x8 2D-DCT
sitting in the top-left 4x4 block. Hence the partial inverse 8x8
2D-DCT can handle cases with eob below 12.
The overall runtime of the inverse 8x8 2D-DCT unit is reduced from
166 cycles (using SSE2) to 150 cycles (using SSSE3).
Change-Id: I4514f9748042809ac84df4c14382c00f313f1cd2
2014-05-08 09:48:58 -07:00
hkuang
edcbbf2ee3
Merge "Fix a bug in neon code that did not save and restore q4-q7 registers."
2014-02-28 09:48:26 -08:00
hkuang
f3d8e315ac
Fix a bug in neon code that did not save and restore q4-q7 registers.
...
Change-Id: Ie21b5ae89100389b80f919710839084f935a8545
2014-02-27 14:06:52 -08:00
James Yu
e486488ce8
Replace vqshrun with vqmovun when the shift is #0
...
Change-Id: Ifabb8c7ec0c327fea9d6739cab10addb060ff435
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-14 21:03:40 -08:00
Johann
4378503665
Merge "Remove redundant arm neon instructions."
2014-02-14 20:02:51 -08:00
Yaowu Xu
ecf392a155
Merge "minor spelling cleanup in comments"
2014-02-14 14:29:35 -08:00
Frank Galligan
b41acbf9bb
Fix neon wide loopfilter for filter8 only branch
...
The current code removed the check to only perform the filter8.
Change-Id: Ie54e19a77745042a5660eab986d9ef1c42e82410
2014-02-12 18:36:17 -08:00
Andrew Russell
549c31f8ae
minor spelling cleanup in comments
...
Change-Id: Ia91c6c406273345b08505097ffe1af3896980f06
2014-02-12 16:32:51 -08:00
James Yu
619f29cdb0
Remove redundant arm neon instructions.
...
Change-Id: I1fabad59747eb5f68c64275a36c3a1d94daf32a3
Signed-off-by: James Yu <james.yu@linaro.org>
2014-02-11 21:19:12 -08:00
Martin Storsjo
03bc491721
arm: Consistently use braces around doubleword arguments to vld
...
This isn't strictly necessary, but makes the file more consistent
with the other arm assembly source files.
Change-Id: I245c9677d89e0ab3f31991e473764858af35b180
2014-02-05 13:24:25 +02:00
Martin Storsjo
c2bb1aa544
arm: Use {} around quadword arguments to vld
...
This fixes building for iOS.
Change-Id: Ice082648c02a3faf93891f7ddc122875e2bdc9cb
2014-02-05 13:24:17 +02:00
Dmitry Kovalev
c49b08c9a1
Removing "_short" suffix from arm transform file names.
...
Change-Id: Iefe118f61a335e88821a21a9f50fb919212c1507
2014-01-31 17:19:02 -08:00
hkuang
770454f3a8
Add vp9_tm_predictor_32x32 neon implementation
...
which is 7.8 times faster than C.
Change-Id: I858ef4ec09202a07d445da8db702783d6d9d7321
2014-01-27 16:01:07 -08:00
hkuang
05d2081d38
Fix the vp9_tm_predictor_8x8_neon.
...
Change-Id: I832cf83871044bfee7b7e57dbd31bae05cbd53e9
2014-01-27 10:17:20 -08:00
Frank Galligan
183361dadb
Merge "Optimize vp9_tm_predictor_8x8_neon function"
2014-01-24 16:21:56 -08:00
Frank Galligan
56a8a0b54b
Optimize vp9_tm_predictor_8x8_neon function
...
Change-Id: Ia12aae491202098ff66366145aa0c3da38dc97e5
2014-01-24 11:07:14 -08:00
hkuang
3633ffcbf7
Add vp9_tm_predictor_16x16 neon implementation
...
which is 3.5 times faster than C.
Change-Id: I24439ba7a2971829c11620f34848facf2c916678
2014-01-24 10:22:58 -08:00
hkuang
97826df96b
Add tm_predictor_8x8 neon implementation.
...
Change-Id: I76c2720546b737cb63018a8ab6a3ff62a291786d
2014-01-22 13:43:20 -08:00
hkuang
2a2d8c140f
Merge "Add vp9_tm_predictor_4x4 neon implementation"
2014-01-16 10:18:12 -08:00
hkuang
f2ef389256
Add vp9_tm_predictor_4x4 neon implementation
...
Change-Id: I10c423bde7ea5a3bac9f14f35c73b6bc31c8f3e3
2014-01-15 11:51:36 -08:00
hkuang
5be0ed30dc
Merge "Add initial intra frame neon optimization. 1~2% gain."
2014-01-08 14:41:43 -08:00