Kaustubh Raste
a38e9f412d
Merge "Fix SingleLargeCoeff idct test"
2016-11-19 03:37:29 +00:00
James Zern
cbeae53e76
Merge "Clean horizontal intra prediction NEON optimization"
2016-11-19 01:29:37 +00:00
Jerome Jiang
de5fd00ec5
Change *_xmm to *_sse2 in deblocker assembly functions.
...
Some cosmetic changes because xmm is an anachronism.
Change-Id: I436a5b78a3c52776c20d6640939311f2a84a9bc7
2016-11-17 23:38:04 +00:00
Kaustubh Raste
c56e5dd620
Fix SingleLargeCoeff idct test
...
Updated idct code to handle single large coefficient (-32768)
Change-Id: Ia13ab1ab434a9a1b9954a5914088977a88841cc7
2016-11-17 11:41:07 +00:00
Jerome Jiang
5d48663e04
Merge "Change C and msa to match results from sse2."
2016-11-17 05:16:27 +00:00
Jerome Jiang
cb1b1b8fef
Change C and msa to match results from sse2.
...
Re-enable the tests to check CvsAssembly.
BUG=webm:1321
Change-Id: Id7f7d74b06c469fb6c8f5d04e91359e9cd9097a6
2016-11-16 17:05:26 -08:00
Linfeng Zhang
85c1ee434d
Add high bitdepth intra prediction NEON optimization (mode tm)
...
BUG=webm:1316
Change-Id: Ib014de06836ac12726f4a2c9f0833ec4eb4d233b
2016-11-15 14:19:46 -08:00
Linfeng Zhang
a3128ad33a
Add high bitdepth intra prediction NEON optimization (h and v)
...
BUG=webm:1316
Change-Id: I47eeac698a98a31d1af5f72441052302e9fa4f46
2016-11-12 12:00:19 -08:00
James Zern
80f6b243a7
Merge changes I339088b2,Iaade219e,If142afb1,I4257c4b3
...
* changes:
fdct8x8_test: add vpx_idct8x8_64_add_neon in hbd
fdct4x4_test: add vpx_idct4x4_16_add_neon in hbd
partial_idct_test,NEON: add missing idct variants
enable vpx_idct32x32_34_add_neon in hbd builds
2016-11-10 05:02:39 +00:00
Linfeng Zhang
40ab0424d4
Add high bitdepth intra prediction NEON optimization (mode d45 and d135)
...
BUG=webm:1316
Change-Id: I6a330874348df04df24a6d9efdc06f567e04bf8e
2016-11-09 12:04:04 -08:00
James Zern
738c8f23c6
enable vpx_idct32x32_34_add_neon in hbd builds
...
replace load_and_transpose_s16_8x8() in idct32_6_neon() with a separate
load_tran_low_to_s16() and transpose_s16_8x8(). the combined function is
used in idct32_8_neon() where the input is the correctly sized output
from the earlier stage.
BUG=webm:1294
Change-Id: I4257c4b3a421b2cf5d13651f966eee0680ef98a9
2016-11-08 17:03:36 -08:00
Johann
50b40f114c
Optimize idct32x32_135_add for NEON
...
BUG=webm:1295
Change-Id: I7f80ef4d29813fcb401fc6075babf19e3c195462
2016-11-08 22:06:07 +00:00
Linfeng Zhang
64a5a8fd6f
Merge "Add high bitdepth intra prediction NEON optimization (mode dc)"
2016-11-08 16:53:42 +00:00
Linfeng Zhang
d545c19afa
Rename vpx_highbd_idct8x8_10{*}() to vpx_highbd_idct8x8_12{*}()
...
Also update its trigger threshold from 10 to 12.
Change-Id: Ib8dddd87a5a22a12ca66e7084d342fbb027b0a2f
2016-11-07 09:07:55 -08:00
Linfeng Zhang
a9874961f0
Merge "Replace highbd_dct_const_round_shift with dct_const_round_shift"
2016-11-07 16:55:01 +00:00
Johann
e10c95dc83
Update vp9_fdct8x8_quant_ssse3 for highbitdepth
...
Borrow transition functions from fdct.h nee vpx_quantize_b_sse2
BUG=webm:1304
Change-Id: I9c88c3eec3ff8bb461411d98c26c3c236ea28ef1
2016-11-05 01:23:07 +00:00
Linfeng Zhang
04c3bf3c85
Replace highbd_dct_const_round_shift with dct_const_round_shift
...
They are identical.
Change-Id: I1ccaf03c81c3cbf88e82d77ffeb8204f5b063c61
2016-11-04 16:15:02 -07:00
Linfeng Zhang
32326c2f13
Merge "Cosmetics of inv_txfm.c"
2016-11-04 22:40:03 +00:00
Johann Koenig
900ec31bea
Merge "Extract high bit depth helper functions"
2016-11-04 21:03:17 +00:00
Linfeng Zhang
b68d8107cb
Cosmetics of inv_txfm.c
...
Unify code of 8-bit and high bitdepth.
Change-Id: I3fe441577af0249030ca3a1ef769eb9030711434
2016-11-04 13:24:41 -07:00
Johann
cf35ffc025
Extract high bit depth helper functions
...
These can be used in the vp9 fdct as well.
Change-Id: I4f3875e0cba1b8cad209c3a0581e121deba7675e
2016-11-04 18:13:51 +00:00
Martin Storsjo
34c35b6fb6
Add a missing END directive in idct_neon.asm
...
This fixes building with MS armasm.
Change-Id: I2629eeed859b775ca667a65ba109f8d1bf7b0e03
2016-11-04 12:21:18 +02:00
Linfeng Zhang
1338c71dfb
Clean horizontal intra prediction NEON optimization
...
Change-Id: I1ef0a5b2655cbc7e1cc2a4a1a72e0eed9aa41f05
2016-11-02 11:43:45 -07:00
Johann Koenig
5ac7a59a05
Merge "arm idct: move to-be-shared code to header"
2016-11-02 18:09:45 +00:00
Linfeng Zhang
3b74066b10
Add high bitdepth intra prediction NEON optimization (mode dc)
...
BUG=webm:1316
Change-Id: I984d6004ea2445e86f213fb6fa4d794a9955af8f
2016-11-01 17:07:36 -07:00
Johann
bf8ab194ee
arm idct: move to-be-shared code to header
...
Change-Id: I67458cd358b4dc4434bbdbfcdd571769561b619e
2016-11-01 15:43:56 -07:00
James Zern
1b275ab898
Merge "idct32x32_1_add_neon: clear a couple conv warnings"
2016-11-01 22:34:59 +00:00
James Zern
9de91855ef
Merge changes I08af3a54,If5959a25,I6763e62e
...
* changes:
build/make/Android.mk: s/armv8/arm64/
build/make/Android.mk: fix armeabi-v7a build
use .S suffix rather than .s for NEON asm
2016-11-01 21:43:13 +00:00
Linfeng Zhang
cc5f49767a
Refine 8-bit intra prediction NEON optimization (mode tm)
...
Change-Id: I98b9577ec51367df5e5d564bedf7c3ea0606de4c
2016-11-01 09:45:16 -07:00
James Zern
7625c803b3
idct32x32_1_add_neon: clear a couple conv warnings
...
int16_t -> uint8_t
Change-Id: I3c5e0985bc3584dce289c35b5973de24cdc73b76
2016-10-31 18:56:34 -07:00
James Zern
1ddb4c0362
use .S suffix rather than .s for NEON asm
...
for compatibility with other build systems
Change-Id: I6763e62e3126850ad4f8ad29e388b8dad0bbc4c3
2016-10-31 16:39:05 -07:00
James Zern
410d947c5f
Merge "idct,NEON: add a tran_low_t->s16 load adapter"
2016-10-31 21:59:12 +00:00
James Zern
3ae25974fd
idct,NEON: add a tran_low_t->s16 load adapter
...
enable idct4x4* and idct8x8* which are compatible for 8-bit decodes in
high-bitdepth mode. the adapter narrows 32-bit input to 16, whether the
expansion can be avoided at all in this case remains a TODO. roughly
matches sse2.
BUG=webm:1294
Change-Id: I3ea94e5a2070dfd509b5de0c555aab4e1f4da036
2016-10-31 11:21:16 -07:00
Linfeng Zhang
a347118f3c
Refine 8-bit intra prediction NEON optimization (mode h and v)
...
Change-Id: I45e1454c3a85e081bfa14386e0248f57e2a91854
2016-10-31 10:33:44 -07:00
Linfeng Zhang
4ae9f5c092
Refine 8-bit intra prediction NEON optimization (mode d45 and d135)
...
dst += stride behaving better with gcc/clang.
Unroll loops.
Change-Id: I83f85df2bc9f17c6159542f57680b509395db2b1
2016-10-27 14:24:50 -07:00
Linfeng Zhang
9c0680bd43
Merge "Refine 8-bit intra prediction NEON optimization (mode dc)"
2016-10-26 16:51:44 +00:00
Johann
9720b58aac
Optimize idct32x32_34_add for NEON
...
Approximately 3 times faster than the 1024 version which was used
previously.
BUG=webm:1295
Change-Id: Id15fb3d096029ec38ef01c53e5f6eb08254347c9
2016-10-25 15:43:58 -07:00
Linfeng Zhang
ce88b8f5c5
Refine 8-bit intra prediction NEON optimization (mode dc)
...
dst += stride behaving better with gcc/clang
Expanding inline function dc_SIZExSIZE() save intructions for
vpx_dc_predictor_SIZExSIZE_neon().
Change-Id: Id0ccbd58b6a31df539141fd33bdf28633339150d
2016-10-24 13:18:51 -07:00
James Zern
2e6a1976a0
Merge "remove idct32x32*_add_neon.asm"
2016-10-22 02:29:56 +00:00
James Zern
5d91752a98
Merge "vpx_highbd_convolve_copy_neon: use multi reg loads"
2016-10-22 02:28:15 +00:00
James Zern
9dbb3ad396
remove idct32x32*_add_neon.asm
...
the intrinsics are neutral to ~20% faster on cros/android
devices when using gcc-4.9/clang-3.8.1 and gcc-4.9/clang-3.8.x from the
r13 ndk. neutral results typically came with gcc-4.9 while larger
positive gains were achieved with clang 3.8.x.
BUG=webm:1303
Change-Id: I4d31f9c017944681b881493525d4573a7a5b1e16
2016-10-20 19:47:14 -07:00
James Zern
a60dd5c83a
Merge "Fix warnings reported by -Wshadow: Part1: vpx_dsp directory"
2016-10-18 22:09:29 +00:00
Kaustubh Raste
8ff5af773a
Merge "Optimize sad_64width_x4d_msa function"
2016-10-18 07:46:02 +00:00
Kaustubh Raste
b7310e2aff
Optimize sad_64width_x4d_msa function
...
Reduced HADD_UH_U32 macro calls
Change-Id: Ie089b9a443de516646b46e8f72156aa826ca8cfa
2016-10-18 04:05:33 +00:00
Urvang Joshi
e084e05484
Fix warnings reported by -Wshadow: Part1: vpx_dsp directory
...
While we are at it:
- Rename some variables to more meaningful names
- Reuse some common consts from a header instead of redefining them.
Change-Id: I75c4248cb75aa54c52111686f139b096dc119328
(cherry picked from aomedia 09eea21)
2016-10-17 19:25:19 -07:00
James Zern
68cd3052ca
vpx_highbd_convolve_copy_neon: use multi reg loads
...
for copy16/32/64
BUG=webm:1299
Change-Id: I5080d736bde7e487c80ef3d7024dda1e96a57eaf
2016-10-17 17:15:03 -07:00
Linfeng Zhang
9c8981c666
add vpx high bitdepth convolve8 NEON intrinsics optimization
...
BUG=webm:1299
Change-Id: I236bfa0441e357b6ff05add8269a2cfb543924d1
2016-10-17 15:23:54 -07:00
Linfeng Zhang
f910d14a1a
add vpx_highbd_convolve_{copy,avg}_neon()
...
BUG=webm:1299
Change-Id: Ib87ac466ada63251eb06ae2abd1e13e61e0d1538
2016-10-13 15:21:14 -07:00
James Zern
1909270f65
Merge "cosmetics,*loopfilter_neon.c: s/tranpose/transpose/"
2016-10-13 07:12:51 +00:00
Kaustubh Raste
9e75c01353
Merge "Optimize vpx_mbpost_proc_across_ip_msa function"
2016-10-13 02:12:33 +00:00