While ARMv7 in general is capable of unaligned access, not all instructions
actually are. And trouble is that compiler doesn't seem to differentiate
those capable and incapable of unaligned access. Side effect is that kernel
goes into endless loop retrying same instruction triggering unaligned trap.
Problem was observed in xts128.c and ccm128.c modules. It's possible to
resolve it by using (volatile u32*) casts, but letting STRICT_ALIGNMENT
be feels more appropriate.
(cherry picked from commit 3bdd80521a81d50ade4214053cd9b293f920a77b)
Submitted by: "Noszticzius, Istvan" <inoszticzius@rightnow.com>
Don't clear the output buffer: ciphers should correctly the same input
and output buffers.
rely on indirect call to block functions, they are not as fast as
non-consolidated routines. However, performance loss(*) is within
measurement error and consolidation advantages are considered to
outweigh it.
(*) actually one can observe performance *improvement* on e.g.
CBC benchmarks thanks to optimization, which also becomes
shared among ciphers.