Commit Graph

19 Commits

Author SHA1 Message Date
Martin Storsjö
0995390c4a Remove apple specific versions of arm macros with arguments
The apple assembler for arm can handle the gnu binutils style
macros just fine these days, so there is no need to duplicate all
of these macros in two syntaxes, when the new one works fine in all cases.

We already require a new enough assembler to support the gnu binutils
style features since we use the .rept directive in a few places.
2015-03-27 11:11:45 +02:00
Martin Storsjö
cdce1b73ca Remove a stray, unused parameter to a gnu binutils style macro in arm assembly 2015-03-27 11:11:23 +02:00
Martin Storsjö
0b0884874d Remove superfluous .text directives at the start of arm assembly files
This directive can be set by the common include header that is
included by all files anyway.
2015-03-27 10:46:34 +02:00
Martin Storsjö
5a78735802 Force armv7/neon within the arm assembly header file
This avoids having to add extra compiler flags to be able to build
them.
2015-02-02 00:53:39 +02:00
zhiliang wang
01b74ea7c1 Add asm code for NoneZeroCount and refine related code 2015-01-04 16:39:17 +08:00
ruil2
ed341048de refine common moudle for part of intra prediction function 2014-09-25 14:03:11 +08:00
Martin Storsjö
f69c9074e7 Add ifdef HAVE_NEON around the contents of arm_arch_common_macro.S
This file is built on its own from within the xcode projects,
even though it isn't necessary. Previously its contents was just
empty, but now a .syntax unified was added, which failed the build
when building for arm64.

Make this file a no-op, just like the other arm assembly source files,
unless HAVE_NEON is defined.
2014-08-15 08:49:31 +03:00
Martin Storsjö
38d2d64ede Explicitly add .syntax unified when building for iOS
This is the default when building with the clang built-in assembler,
but not if using the external assembler - thus always specify it,
for clarity.

Also use the three-operand for of a sub instruction in BS_NZC_CHECK.
The same is already done in the gnu version of the macro.

This fixes building most of the arm assembly with Apple's external
assembler. While this isn't a necessary goal in itself, there's no
harm in doing this either.
2014-08-08 14:09:37 +03:00
Martin Storsjö
57f6bcc4b0 Convert all tabs to spaces in assembly sources, unify indentation
Previously the assembly sources had mixed indentation consisting
of both spaces and tabs, making it quite hard to read unless
the right tab size was used in the editor.

Tabs have been interpreted as 4 spaces in most cases, matching
the surrounding code.
2014-06-01 01:35:43 +03:00
Martin Storsjö
ac03b8b503 Avoid unnecessary tabs in macro declarations 2014-06-01 01:13:01 +03:00
Martin Storsjö
932a38abc0 Reformat the copyright header of deblocking_neon.S
This makes it identical to the ones in the other files.
2014-05-31 13:44:21 +03:00
dongzhang
218adc7e29 Fix a bug in deblocking for neon 32 bit arm implementation 2014-05-09 14:06:16 +08:00
Martin Storsjö
23f57adaea Do full register loads instead of single-lane loads in DeblockLumaEq4H_neon
Instead of loading the registers one lane at a time, load full
registers and then transpose them.

This is faster, reducing the runtime for the function from about
506 cycles to 434 cycles (tested on a Cortex A8).

This also avoids an issue which seems like a cpu bug, present
on Sony Xperia T (cpu implementer 0x51 architecture 7 variant 0x1
part 0x04d). On such a device, it seemed like the "vswp q9, q10"
could start executing before the previous
vld4.u8 {d20[x],d21[x],d22[x],d23[x]}, [r3], r1
had finished and written back their result. Changing the
"vswp q9, q10" into "vswp q10, q9", or into separate
"vswp d18, d20; vswp d19, d21" (or the other way around) seemed to
avoid the issue. This happened occasionally (a couple times per
100000 invocations or so).
2014-04-28 10:12:16 +03:00
Licai Guo
3f2ea77908 Merge pull request #719 from dongzha/MC
Modify ARM32 Neon code for Expand Chroma Picture, when UVWidth%16==8.
2014-04-21 14:38:51 +08:00
Licai Guo
2f8c539e60 Merge pull request #707 from dongzha/FixIssueMcNEON
Fix potential issue for neon implement on encoder mode decision.
2014-04-17 17:26:25 +08:00
dongzhang
a4f59bc0d7 Modify ARM32 Neon code for Expand Chroma Picture, when UVWidth%16==8. 2014-04-17 15:58:30 +08:00
Licai Guo
c8e1a41c29 Move copy_mb neon code to common folder 2014-04-17 10:06:48 +08:00
Dong Zhang
8a4300be50 Fix potential issue for neon implement on encoder mode decision.
Error happens when ME_REFINE_BUF_STRIDE is not equal to 32.
2014-04-13 19:41:29 -07:00
Licai Guo
e39de8d404 reoranize common to inc/src/x86/arm 2014-03-18 19:41:32 -07:00