This addresses
- request for improvement for faster key setup in RT#3576;
- clearing registers and stack in RT#3554 (this is more of a gesture to
see if there will be some traction from compiler side);
- more commentary around input parameters handling and stack layout
(desired when RT#3553 was reviewed);
- minor size and single block performance optimization (was lying around);
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit 23f6eec71dbd472044db7dc854599f1de14a1f48)
ARM has optimized Cortex-A5x pipeline to favour pairs of complementary
AES instructions. While modified code improves performance of post-r0p0
Cortex-A53 performance by >40% (for CBC decrypt and CTR), it hurts
original r0p0. We favour later revisions, because one can't prevent
future from coming. Improvement on post-r0p0 Cortex-A57 exceeds 50%,
while new code is not slower on r0p0, or Apple A7 for that matter.
[Update even SHA results for latest Cortex-A53.]
Reviewed-by: Richard Levitte <levitte@openssl.org>
(cherry picked from commit 94376cccb4ed5b376220bffe0739140ea9dad8c8)
Td4 and Te4 are arrays of u8. A u8 << int promotes the u8 to an int first then shifts.
If the mathematical result of a shift (as modelled by lhs * 2^{rhs}) is not representable
in an integer, behaviour is undefined. In other words, you can't shift into the sign bit
of a signed integer. Fix this by casting to u32 whenever we're shifting left by 24.
(For consistency, cast other shifts, too.)
Caught by -fsanitize=shift
Submitted by Nick Lewycky (Google)
Reviewed-by: Andy Polyakov <appro@openssl.org>
(cherry picked from commit 8b37e5c14f0eddb10c7f91ef91004622d90ef361)
indent will not alter them when reformatting comments
(cherry picked from commit 1d97c8435171a7af575f73c526d79e1ef0ee5960)
Conflicts:
crypto/bn/bn_lcl.h
crypto/bn/bn_prime.c
crypto/engine/eng_all.c
crypto/rc4/rc4_utl.c
crypto/sha/sha.h
ssl/kssl.c
ssl/t1_lib.c
Reviewed-by: Tim Hudson <tjh@openssl.org>
This facilitates "universal" builds, ones that target multiple
architectures, e.g. ARMv5 through ARMv7. See commentary in
Configure for details.
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(cherry picked from commit c1669e1c205dc8e695fb0c10a655f434e758b9f7)
Improve CBC decrypt and CTR by ~13/16%, which adds up to ~25/33%
improvement over "pre-Silvermont" version. [Add performance table to
aesni-x86.pl].
(cherry picked from commit 5599c7331b90d9d29c9914c2a95c16d91485415a)
Add support for key wrap algorithms via EVP interface.
Generalise AES wrap algorithm and add to modes, making existing
AES wrap algorithm a special case.
Move test code to evptests.txt
(cherry picked from commit 97cf1f6c2854a3a955fd7dd3a1f113deba00c9ef)
Conflicts:
CHANGES