202 Commits

Author SHA1 Message Date
Andy Polyakov
5dc52b919b PPC assembly pack: relax 64-bit requirement for little-endian support. 2014-01-07 22:44:21 +01:00
Andy Polyakov
1fb83a3bc2 aes/asm/vpaes-ppc.pl: add little-endian support. 2014-01-07 16:48:04 +01:00
Andy Polyakov
25f7117f0e aesni-sha1-x86_64.pl: refine Atom-specific optimization.
(and update performance data, and fix typo)
2014-01-04 17:13:57 +01:00
Andy Polyakov
2f3af3dc36 aesni-sha1-x86_64.pl: add stiched decrypt procedure,
but keep it disabled, too little gain... Add some Atom-specific
optimization.
2014-01-03 21:40:08 +01:00
Andy Polyakov
a61e51220f aes/asm/vpaes-ppc.pl: comply with ABI. 2013-12-04 21:46:40 +01:00
Andy Polyakov
89bb96e51d vpaes-ppc.pl: fix bug in IV handling and comply with ABI. 2013-11-29 14:40:51 +01:00
Andy Polyakov
b5c54c914f Add Vector Permutation AES for PPC. 2013-11-27 22:32:56 +01:00
Andy Polyakov
c944f81703 aes/asm/aes-ppc.pl: add little-endian support.
Submitted by: Marcelo Cerri
2013-10-31 11:41:26 +01:00
Andy Polyakov
76c15d790e PPC assembly pack: make new .size directives profiler-friendly.
Suggested by: Anton Blanchard
2013-10-15 23:40:12 +02:00
Andy Polyakov
d6019e1654 PPC assembly pack: add .size directives. 2013-10-15 00:14:39 +02:00
Andy Polyakov
7e1e3334f6 aes/asm/bsaes-x86_64.pl: fix Windows-specific bug in XTS.
PR: 3139
2013-10-12 21:37:55 +02:00
Andy Polyakov
6f6a613032 aes/asm/bsaes-*.pl: improve decrypt performance.
Improve decrypt performance by 10-20% depending on platform. Thanks
to Jussi Kivilinna for providing valuable hint. Also thanks to Ard
Biesheuvel.
2013-10-03 23:08:31 +02:00
Andy Polyakov
b783858654 x86_64 assembly pack: add multi-block AES-NI, SHA1 and SHA256. 2013-10-03 00:18:58 +02:00
Andy Polyakov
066caf0551 aes/asm/*-armv*.pl: compensate for inconsistencies in tool-chains.
Suggested by: Ard Biesheuvel
2013-10-01 20:33:06 +02:00
Andy Polyakov
e0202d946d aes-armv4.pl, bsaes-armv7.pl: add Linux kernel and Thumb2 support.
Submitted by: Ard Biesheuvel
2013-09-20 13:22:57 +02:00
Andy Polyakov
612f4e2384 bsaes-armv7.pl: remove partial register operations in CTR subroutine. 2013-09-15 19:47:51 +02:00
Andy Polyakov
29f41e8a80 bsaes-armv7.pl: remove byte order dependency and minor optimization. 2013-09-15 19:44:43 +02:00
Ard Biesheuvel
a2ea9f3ecc Added support for ARM/NEON based bit sliced AES in XTS mode
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2013-09-15 19:37:16 +02:00
Andy Polyakov
42386fdb62 aesni-sha256-x86_64.pl: fix typo in Windows SEH. 2013-06-30 23:06:28 +02:00
Andy Polyakov
02450ec69d PA-RISC assembler pack: switch to bve in 64-bit builds.
PR: 3074
2013-06-18 10:37:00 +02:00
Andy Polyakov
3b848d3401 aesni-sha1-x86_64.pl: update performance data. 2013-06-10 22:35:22 +02:00
Andy Polyakov
42b9a4177b aesni-sha256-x86_64.pl: harmonize with latest sha512-x86_64.pl. 2013-06-10 22:34:06 +02:00
Andy Polyakov
36df342f9b aesni-x86_64.pl: optimize XTS.
PR: 3042
2013-05-25 19:23:09 +02:00
Andy Polyakov
4df2280b4f aesni-sha1-x86_64.pl: Atom-specific optimization. 2013-05-25 19:08:39 +02:00
Andy Polyakov
988d11b641 vpaes-x86[_64].pl: minor Atom-specific optimization. 2013-05-25 18:57:03 +02:00
Andy Polyakov
8a97a33063 Add AES-SHA256 stitch. 2013-05-13 22:49:58 +02:00
Andy Polyakov
cd54249c21 aesni-x86_64.pl: minor CTR performance improvement. 2013-05-13 15:49:03 +02:00
Andy Polyakov
9575d1a91a bsaes-armv7.pl: add bsaes_cbc_encrypt and bsaes_ctr32_encrypt_blocks.
Submitted by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Contributor claims ~50% improvement in CTR and ~9% in CBC decrypt
on Cortex-A15.
2013-04-23 17:52:14 +02:00
Andy Polyakov
75fe422323 bsaes-armv7.pl: take it into build loop. 2013-04-23 17:49:54 +02:00
Andy Polyakov
73325b221c aesni-x86_64.pl: optimize CBC decrypt.
Give CBC decrypt approximately same treatment as to CTR and collect 25%.
2013-04-04 15:56:23 +02:00
Andy Polyakov
b4a9d5bfe8 aesni-x86_64.pl: fix typo and optimize small block performance. 2013-03-29 18:54:24 +01:00
Andy Polyakov
6c79faaa9d aesni-x86_64.pl: optimize CTR even further.
Based on suggestions from Shay Gueron and Vlad Krasnov.
PR: 3021
2013-03-26 14:29:18 +01:00
Andy Polyakov
1bc4d009e1 aesni-x86_64.pl: optimize CTR even further. 2013-03-19 20:03:02 +01:00
Andy Polyakov
7c9e81be40 [aesni-]sha1-x86_64.pl: code refresh. 2013-02-14 16:14:02 +01:00
Andy Polyakov
46bf83f07a x86_64 assembly pack: make Windows build more robust.
PR: 2963 and a number of others
2013-01-22 22:27:28 +01:00
Andy Polyakov
8df400cf8d aes-s390x.pl: fix XTS bugs in z196-specific code path. 2012-12-05 17:44:45 +00:00
Andy Polyakov
9282c33596 aesni-x86_64.pl: CTR face lift, +25% on Bulldozer. 2012-12-01 18:20:39 +00:00
Andy Polyakov
c3cddeaec8 aes-s390x.pl: harmonize software-only code path [and minor optimization]. 2012-12-01 11:06:19 +00:00
Andy Polyakov
904732f68b C64x+ assembly pack: improve EABI support. 2012-11-28 13:19:10 +00:00
Andy Polyakov
cd68694646 AES for SPARC T4: add XTS, reorder subroutines to improve TLB locality. 2012-11-24 21:55:23 +00:00
Andy Polyakov
98dc178494 aes-x86_64.pl: Atom-specific optimizations, +10%.
vpaes-x86_64.pl: minor performance squeeze.
2012-11-12 17:52:41 +00:00
Andy Polyakov
89f1eb8213 aes-586.pl: Atom-specific optimization, +44/29%, minor improvement on others.
vpaes-x86.pl: minor performance squeeze.
2012-11-12 17:50:19 +00:00
Andy Polyakov
8ed11a815e [aes|cmll]t4-sparcv9.pl: unify argument handling. 2012-10-25 12:03:41 +00:00
Andy Polyakov
eec82a0e1f [aes|cmll]t4-sparcv9.pl: addendum to previous sparcv9_modes.pl commit. 2012-10-14 14:42:27 +00:00
Andy Polyakov
54a1f4480e aest4-sparcv9.pl: split it to AES-specific and reusable part. 2012-10-11 18:30:35 +00:00
Andy Polyakov
c5f6da54fc Add SPARC T4 AES support.
Submitted by: David Miller
2012-10-06 18:08:09 +00:00
Andy Polyakov
5cc2159526 MIPS assembly pack: add support for SmartMIPS ASE. 2012-09-18 12:52:23 +00:00
Andy Polyakov
8df5518bd9 MIPS assembly pack: add MIPS[32|64]R2 code. 2012-09-15 11:18:20 +00:00
Andy Polyakov
9b222748e7 aes-mips.pl: interleave load and integer instructions for better performance. 2012-09-15 11:15:02 +00:00
Andy Polyakov
e7db9896bb bsaes-armv7.pl: closest shave. While 0.3 cpb improvement on S4 appears
insignificant, it's actually 4 cycles less for 14 instructions sequence!
2012-09-07 12:29:18 +00:00