This memcpy code uses NEON/VFP to achieve very good performance
on ARMv7-A processors. It is specifically tuned for A15 but should
provide good performance on A9 also. It is equivalent to the code
in cortex-strings rev 116.
This patch is a follow up the existing gerrit change:
I7f6f77995f3ca903ad9c66d14261441667a2a935
But this version includes a tweak for performance on misaligned
buffers.
Change-Id: I285abac0068f8ae29a1cbf7862ea8590aadaf0a7
Signed-off-by: Will Newton <will.newton@linaro.org>