These new files will not be included literally in OpenSSL, but I intend
to integrate most of their contents. Most file names will change,
and when the integration is done, the superfluous files will be deleted.
Submitted by: Lenka Fibikova <fibikova@exp-math.uni-essen.de>
I'm a little bit nervous about bn_div_words, as I don't know what it's
supposed to return on overflow. For now, I trust the rest of the
system to give it numbers that will not cause any overflow...
BN_mul() correctly constified, avoids two realloc()'s that aren't
really necessary and saves memory to boot. This required a small
change in bn_mul_part_recursive() and the addition of variants of
bn_cmp_words(), bn_add_words() and bn_sub_words() that can take arrays
with differing sizes.
The test results show a performance that very closely matches the
original code from before my constification. This may seem like a
very small win from a performance point of view, but if one remembers
that the variants of bn_cmp_words(), bn_add_words() and bn_sub_words()
are not at all optimized for the moment (and there's no corresponding
assembler code), and that their use may be just as non-optimal, I'm
pretty confident there are possibilities...
This code needs reviewing!
two functions that did expansion on in parameters (BN_mul() and
BN_sqr()). The problem was solved by making bn_dup_expand() which is
a mix of bn_expand2() and BN_dup().
BN_mod_mul_montgomery, which calls bn_sqr_recursive
without much preparation.
bn_sqr_recursive requires the length of its argument to be
a power of 2, which is not always the case here.
There's no reason for not using BN_sqr -- if a simpler
approach to squaring made sense, then why not change
BN_sqr? (Using BN_sqr should also speed up DH where g is chosen
such that it becomes small [e.g., 2] when converted
to Montgomery representation.)
Case closed :-)
Also, "make update" has added some missing functions to libeay.num,
updated the TABLE for the alpha changes, and updated thousands of
dependancies that have changed from recent commits.
because we're only handling words anyway) in BN_mod_exp_mont_word
making it a little faster for very small exponents,
and adjust the performance gain estimate in CHANGES according
to slightly more thorough measurements.
(15% faster than BN_mod_exp_mont for "large" base,
20% faster than BN_mod_exp_mont for small base.)