Ported from the ARM NEON implementation, with vector_dmul_scalar added.
The functions are between 1.5 and 5 times faster than the C
implementations when built with Apple's clang-503.0.19 on an A7.
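
For reference, a minimal scalar sketch of the operation the added
function vectorizes. It assumes vector_dmul_scalar multiplies each
double in a source buffer by one scalar. The name
vector_dmul_scalar_ref and its signature are hypothetical and are not
taken from this commit.

```c
/* Hypothetical scalar reference, assuming the semantics
 * dst[i] = src[i] * mul for len doubles. Not the code in this commit. */
static void vector_dmul_scalar_ref(double *dst, const double *src,
                                   double mul, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src[i] * mul;
}
```

A NEON version would typically process two doubles per 128-bit
register, which is where a speedup over a loop like this could come
from.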
* commit '9c029f67ca82147ddfa83a1546ee1e109e11fbd4':
aarch64: use EXTERN_ASM consistently for exported symbols
Merged-by: Michael Niedermayer <michaelni@gmx.at>
NEON and VFP are currently mandatory for all ARMv8 profiles. Both are
still handled as extensions as far as cpuflags are concerned. This is
consistent with x86_64, where SSE2 is always available but is still
handled as an extension.
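
A minimal sketch of what handling a mandatory feature as an extension
can look like. All names and bit values here (SKETCH_CPU_FLAG_*,
sketch_get_cpu_flags, sketch_have_neon) are hypothetical and only
illustrate the idea; they are not the project's actual identifiers.

```c
/* Hypothetical flag bits; names and values are illustrative only. */
enum {
    SKETCH_CPU_FLAG_ARMV8 = 1 << 0,
    SKETCH_CPU_FLAG_VFP   = 1 << 1,
    SKETCH_CPU_FLAG_NEON  = 1 << 2,
};

/* On an ARMv8 target every bit is reported, since VFP and NEON are
 * mandatory, but each feature still gets its own flag. */
static int sketch_get_cpu_flags(void)
{
    return SKETCH_CPU_FLAG_ARMV8 | SKETCH_CPU_FLAG_VFP | SKETCH_CPU_FLAG_NEON;
}

/* Callers test the extension bit rather than assuming it from the
 * architecture, so the flag can still be masked off individually. */
static int sketch_have_neon(int flags)
{
    return (flags & SKETCH_CPU_FLAG_NEON) != 0;
}
```

Keeping a separate bit per feature means a caller can mask NEON out
(for example, to exercise the C fallbacks) the same way it would for
an optional extension.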