Andy Polyakov
c8ec4a1b0b
Final (for this commit series) optimized version and with commentary section.
2007-12-29 20:30:09 +00:00
Andy Polyakov
699e1a3a82
This is also informational commit exposing loop modulo scheduling "factor."
2007-12-29 20:28:01 +00:00
Andy Polyakov
64214a2183
New Montgomery multiplication module, ppc64-mont.pl. Reference, non-optimized
...
implementation. This is essentially informational commit.
2007-12-29 20:26:46 +00:00
Andy Polyakov
7722e53f12
Yet another ARM update. It appears to be more appropriate to make
...
developers responsible for -march choice.
2007-09-27 16:27:03 +00:00
Andy Polyakov
673c55a2fe
Latest bn_mont.c modification broke ECDSA test. I've got math wrong, which
...
is fixed now.
2007-06-29 13:10:19 +00:00
Andy Polyakov
5b89f78a89
Typo in x86_64-mont.pl.
...
PR: 1549
2007-06-21 11:38:52 +00:00
Andy Polyakov
1c7f8707fd
bn_asm for s390x.
2007-06-20 14:10:16 +00:00
Andy Polyakov
2329694222
SPARC Solaris and Linux assemblers treat .align directive differently.
...
PR: 1547
2007-06-20 12:24:22 +00:00
Andy Polyakov
7d9cf7c0bb
Eliminate conditional final subtraction in Montgomery assembler modules.
2007-06-17 17:10:03 +00:00
Andy Polyakov
a2a54ffc5f
s390x assembler pack.
2007-04-30 08:42:54 +00:00
Andy Polyakov
8b71d35458
nasm fixes.
2007-03-20 08:55:58 +00:00
Andy Polyakov
760e353528
sparcv9a-mont was modified to handle 32-bit aligned input, but check
...
for 64-bit alignment was not removed.
2007-03-20 08:54:51 +00:00
Andy Polyakov
64aecc6720
Make armv4t-mont module backward binary compatible with armv4 and rename it
...
accordingly.
2007-01-17 20:12:41 +00:00
Andy Polyakov
43b8fe1cd0
Montgomery multiplication for ARMv4.
2007-01-11 21:43:25 +00:00
Andy Polyakov
8876e58f34
Montgomery multiplication for MIPS III/IV. Not engaged.
2006-12-29 11:09:33 +00:00
Andy Polyakov
7321a84d4c
Minor clean-up in crypto/bn/asm.
2006-12-29 11:05:20 +00:00
Andy Polyakov
4cfe3df1f5
Minor performance improvements to x86-mont.pl.
2006-12-28 12:43:16 +00:00
Andy Polyakov
8f2d60ec26
Fix for "strange errors" exposed by ccgost engine. The fix is
...
two extra insructions in sqradd loop at line #503 .
2006-12-27 10:59:51 +00:00
Andy Polyakov
1702c8c4bf
x86-mont.pl sse2 tune-up and integer-only squaring procedure.
2006-12-22 15:28:07 +00:00
Andy Polyakov
87d3af6475
Eliminate 64-bit alignment limitation in sparcv9a-mont.
2006-12-08 15:18:41 +00:00
Andy Polyakov
98939a05b6
alpha-mont.pl: gcc portability fix and make-rule.
2006-12-08 14:18:58 +00:00
Andy Polyakov
d28134b8f3
Minor, +10%, tune-up for x86_64-mont.pl.
2006-12-08 10:13:51 +00:00
Andy Polyakov
8583eba015
Montgomery multiplication routine for Alpha.
2006-12-08 10:12:56 +00:00
Andy Polyakov
73b979e601
Clarify HAL SPARC64 support situation in sparcv9a-mont.pl.
2006-11-28 11:07:36 +00:00
Andy Polyakov
ebae8092cb
Minor optimizations based on intruction level profiler feedback.
2006-11-28 10:34:51 +00:00
Andy Polyakov
2e21922eb6
Modulo-schedule loops in sparcv9a-mont.pl. Overall improvement factor
...
over 0.9.8 is up to 3x on USI&II cores and up to 80% - on USIII&IV.
2006-11-28 07:24:26 +00:00
Andy Polyakov
1c3d2b94be
This is "informational" commit. Its mere purpose is to expose "modulo
...
factor" in inner loops.
2006-11-28 07:20:36 +00:00
Andy Polyakov
48d2335d73
Non-SSE2 path to bn_mul_mont. But it's disabled, because it currently
...
doesn't give performance improvement.
2006-11-27 14:59:35 +00:00
Andy Polyakov
31439046e0
bn/asm/ppc.pl to use ppc-xlate.pl.
2006-10-17 14:37:07 +00:00
Andy Polyakov
cecfdbf72d
VIA-specific Montgomery multiplication routine.
2006-10-17 07:04:48 +00:00
Andy Polyakov
8ea975d070
+20% tune-up for Power5.
2006-08-09 15:40:30 +00:00
Andy Polyakov
c8a0d0aaf9
Engage assembler in solaris64-x86_64-cc.
2006-07-31 22:28:40 +00:00
Andy Polyakov
67d990904e
Futher minor PPC assembler update.
2006-05-04 21:30:41 +00:00
Andy Polyakov
c09a0318b7
Minor PPC assembler updates.
2006-05-03 14:07:34 +00:00
Andy Polyakov
2c5d4daac5
Yet another "teaser" Montgomery multiplication module, for PowerPC.
2006-04-30 21:15:29 +00:00
Andy Polyakov
7a5dbeb782
Minor sparcv9 clean-ups.
2005-12-27 21:27:39 +00:00
Andy Polyakov
3b4a0225e2
As SPARCV9 CPU flavor is [expected to be] detected at run-time, we can
...
afford to relax SPARCV9/8+ compiler command line and produce "unversal"
binaries as we used to.
2005-12-19 09:10:06 +00:00
Andy Polyakov
a00e414faf
Unify sparcv9 assembler naming and build rules among 32- and 64-bit builds.
...
Engage run-time switch between bn_mul_mont_fpu and bn_mul_mont_int.
2005-12-16 17:39:57 +00:00
Andy Polyakov
68ea60683a
Add IALU-only bn_mul_mont for SPARCv9. See commentary section for details.
2005-12-15 22:43:33 +00:00
Andy Polyakov
6df8c74d5b
Switch 64-bit sparcv9 platforms from bn(64,64) to bn(64,32). This doesn't
...
have impact on performance, because amount of multiplications does not
increase with this switch, not on sparcv9 that is. On the contrary, it
actually improves performance, because it spares a load of instructions
used to chase carries. Not to mention that BN assembler modules can be
shared more freely between 32- and 64-bit builts.
2005-12-15 22:40:58 +00:00
Andy Polyakov
07645deeb8
Apply "better safe than sorry" approach after addressing sporadic SEGV in
...
bn_sub_words to the rest of the sparcv8plus.S.
2005-11-15 08:02:10 +00:00
Andy Polyakov
c52c82ffc1
Attempt to resolve sporadic SEGV crashes in bn_sub_words in OpenSSH. I'm
...
baffled why it crashes and does it sporadically...
2005-11-11 20:07:07 +00:00
Andy Polyakov
a4d729f31d
Clarify binary compatibility with HAL/Fujitsu SPARC64 family.
2005-10-25 15:39:47 +00:00
Andy Polyakov
aa2be094ae
Add support for 32-bit ABI to sparcv9a-mont.pl module.
2005-10-22 18:16:09 +00:00
Andy Polyakov
4d524040bc
Change bn_mul_mont declaration and BN_MONT_CTX. Update CHANGES.
2005-10-22 17:57:18 +00:00
Andy Polyakov
bcb43bb358
Yet another "teaser" Montgomery multiply module, for UltraSPARC. It's not
...
integrated yet, but it's tested and benchmarked [see commentary section
for further details].
2005-10-19 07:12:06 +00:00
Andy Polyakov
34736de4c0
Flip saved argument block and tp [required for non-SSE2 path].
2005-10-14 16:05:21 +00:00
Andy Polyakov
5f50d597f2
Make sure x86-mont.pl returns zero even if compiled with no-sse2.
2005-10-14 15:24:06 +00:00
Andy Polyakov
35593b33f4
Add timestamp to x86-mont.pl.
2005-10-09 10:26:56 +00:00
Andy Polyakov
54f3d200d3
Throw in bn/asm/x86-mont.pl Montgomery multiplication "teaser".
2005-10-09 09:53:58 +00:00