diff --git a/crypto/bn/asm/ia64.S b/crypto/bn/asm/ia64.S new file mode 100644 index 000000000..ae5606631 --- /dev/null +++ b/crypto/bn/asm/ia64.S @@ -0,0 +1,1498 @@ +.explicit +.text +.ident "ia64.S, Version 1.1" +.ident "IA-64 ISA artwork by Andy Polyakov " + +// +// ==================================================================== +// Written by Andy Polyakov for the OpenSSL +// project. +// +// Rights for redistribution and usage in source and binary forms are +// granted according to the OpenSSL license. Warranty of any kind is +// disclaimed. +// ==================================================================== +// + +// Q. How much faster does it get? +// A. Here is the output from 'openssl speed rsa dsa' for vanilla +// 0.9.6a compiled with gcc version 2.96 20000731 (Red Hat +// Linux 7.1 2.96-81): +// +// sign verify sign/s verify/s +// rsa 512 bits 0.0036s 0.0003s 275.3 2999.2 +// rsa 1024 bits 0.0203s 0.0011s 49.3 894.1 +// rsa 2048 bits 0.1331s 0.0040s 7.5 250.9 +// rsa 4096 bits 0.9270s 0.0147s 1.1 68.1 +// sign verify sign/s verify/s +// dsa 512 bits 0.0035s 0.0043s 288.3 234.8 +// dsa 1024 bits 0.0111s 0.0135s 90.0 74.2 +// +// And here is similar output but for this assembler +// implementation:-) +// +// sign verify sign/s verify/s +// rsa 512 bits 0.0021s 0.0001s 549.4 9638.5 +// rsa 1024 bits 0.0055s 0.0002s 183.8 4481.1 +// rsa 2048 bits 0.0244s 0.0006s 41.4 1726.3 +// rsa 4096 bits 0.1295s 0.0018s 7.7 561.5 +// sign verify sign/s verify/s +// dsa 512 bits 0.0012s 0.0013s 891.9 756.6 +// dsa 1024 bits 0.0023s 0.0028s 440.4 376.2 +// +// Yes, you may argue that it's not fair comparison as it's +// possible to craft the C implementation with BN_UMULT_HIGH +// inline assembler macro. But of course! Here is the output +// with the macro: +// +// sign verify sign/s verify/s +// rsa 512 bits 0.0020s 0.0002s 495.0 6561.0 +// rsa 1024 bits 0.0086s 0.0004s 116.2 2235.7 +// rsa 2048 bits 0.0519s 0.0015s 19.3 667.3 +// rsa 4096 bits 0.3464s 0.0053s 2.9 187.7 +// sign verify sign/s verify/s +// dsa 512 bits 0.0016s 0.0020s 613.1 510.5 +// dsa 1024 bits 0.0045s 0.0054s 221.0 183.9 +// +// My code is still way faster, huh:-) And I believe that even +// higher performance can be achieved. Note that as keys get +// longer, performance gain is larger. Why? According to the +// profiler there is another player in the field, namely +// BN_from_montgomery consuming larger and larger portion of CPU +// time as keysize decreases. I therefore consider putting effort +// to assembler implementation of the following routine: +// +// void bn_mul_add_mont (BN_ULONG *rp,BN_ULONG *np,int nl,BN_ULONG n0) +// { +// int i,j; +// BN_ULONG v; +// +// for (i=0; i + + int SSL_want(SSL *ssl); + int SSL_want_nothing(SSL *ssl); + int SSL_want_read(SSL *ssl); + int SSL_want_write(SSL *ssl); + int SSL_want_x509_lookup(SSL *ssl); + +=head1 DESCRIPTION + +SSL_want() returns state information for the SSL object B. + +The other SSL_want_*() calls are shortcuts for the possible states returned +by SSL_want(). + +=head1 NOTES + +SSL_want() examines the internal state information of the SSL object. Its +return values are similar to that of L. +Unlike L, which also evaluates the +error queue, the results are obtained by examining an internal state flag +only. The information must therefore only be used for normal operation under +non-blocking I/O. Error conditions are not handled and must be treated +using L. + +The result returned by SSL_want() should always be consistent with +the result of L. + +=head1 RETURN VALUES + +The following return values can currently occur for SSL_want(): + +=over 4 + +=item SSL_NOTHING + +There is no data to be written or to be read. + +=item SSL_WRITING + +There are data in the SSL buffer that must be written to the underlying +B layer in order to complete the actual SSL_*() operation. +A call to L should return +SSL_ERROR_WANT_WRITE. + +=item SSL_READING + +More data must be read from the underlying B layer in order to +complete the actual SSL_*() operation. +A call to L should return +SSL_ERROR_WANT_READ. + +=item SSL_X509_LOOKUP + +The operation did not complete because an application callback set by +SSL_CTX_set_client_cert_cb() has asked to be called again. +A call to L should return +SSL_ERROR_WANT_X509_LOOKUP. + +=back + +SSL_want_nothing(), SSL_want_read(), SSL_want_write(), SSL_want_x509_lookup() +return 1, when the corresponding condition is true or 0 otherwise. + +=head1 SEE ALSO + +L, L, L + +=cut