x86[_64]cpuid.pl: handle new extensions.

Andy Polyakov
2011-05-16 20:35:11 +00:00
parent a3e07010b4
commit b906422149
3 changed files with 132 additions and 55 deletions


@@ -2,7 +2,7 @@
=head1 NAME
OPENSSL_ia32cap - finding the IA-32 processor capabilities
OPENSSL_ia32cap - the IA-32 processor capabilities vector
=head1 SYNOPSIS
@@ -18,30 +18,52 @@ input value (see Intel Application Note #241618). Naturally it's
meaningful on x86 and x86_64 platforms only. The variable is normally
set up automatically upon toolkit initialization, but can be
manipulated afterwards to modify crypto library behaviour. For the
moment of this writing seven bits are significant, namely:
moment of this writing the following bits are significant:
1. bit #4 denoting presence of Time-Stamp Counter.
2. bit #20, reserved by Intel, is used to choose among RC4 code
paths;
3. bit #23 denoting MMX support;
4. bit #25 denoting SSE support;
5. bit #26 denoting SSE2 support;
6. bit #28 denoting Hyperthreading, which is used to distinguish
cores with shared cache;
7. bit #30, reserved by Intel, is used to choose among RC4 code
paths;
8. bit #57 denoting Intel AES instruction set extension;
=item bit #4 denoting presence of Time-Stamp Counter;
=item bit #19 denoting availability of CLFLUSH instruction;
=item bit #20, reserved by Intel, is used to choose among RC4 code paths;
=item bit #23 denoting MMX support;
=item bit #24, FXSR bit, denoting availability of XMM registers;
=item bit #25 denoting SSE support;
=item bit #26 denoting SSE2 support;
=item bit #28 denoting Hyperthreading, which is used to distinguish
cores with shared cache;
=item bit #30, reserved by Intel, is used to choose among RC4 code
paths;
=item bit #33 denoting availability of PCLMULQDQ instruction;
=item bit #41 denoting SSSE3, Supplemental SSE3, support;
=item bit #43 denoting AMD XOP support (forced to zero on Intel);
=item bit #57 denoting AES-NI instruction set extension;
=item bit #59, OSXSAVE bit, denoting availability of YMM registers;
=item bit #60 denoting AVX extension;
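The bit list above can be read as a lookup table over a 64-bit value. The following is a minimal sketch, not OpenSSL code: the names `IA32CAP_BITS` and `decode_ia32cap` are hypothetical, and the labels are taken from the list as documented here.

```python
# Bit positions as given in the list above.
# The table and function names are illustrative, not OpenSSL identifiers.
IA32CAP_BITS = {
    4: "Time-Stamp Counter",
    19: "CLFLUSH",
    20: "RC4 code path selector (reserved by Intel)",
    23: "MMX",
    24: "FXSR (XMM registers)",
    25: "SSE",
    26: "SSE2",
    28: "Hyperthreading (shared cache)",
    30: "RC4 code path selector (reserved by Intel)",
    33: "PCLMULQDQ",
    41: "SSSE3",
    43: "AMD XOP",
    57: "AES-NI",
    59: "OSXSAVE (YMM registers)",
    60: "AVX",
}


def decode_ia32cap(value):
    """Return (bit, label) pairs for every documented bit set in value."""
    return [
        (bit, label)
        for bit, label in sorted(IA32CAP_BITS.items())
        if (value >> bit) & 1
    ]


if __name__ == "__main__":
    # Decode a sample 32-bit mask and list the documented bits it sets.
    for bit, label in decode_ia32cap(0x16980010):
        print(f"bit #{bit}: {label}")
```

Decoding a mask this way makes it easy to check which capabilities a given OPENSSL_ia32cap setting leaves enabled before passing it to an application.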
For example, clearing bit #26 at run-time disables high-performance
SSE2 code present in the crypto library. You might have to do this if
target OpenSSL application is executed on SSE2 capable CPU, but under
control of OS which does not support SSE2 extensions. Even though you
can manipulate the value programmatically, you most likely will find it
more appropriate to set up an environment variable with the same name
prior starting target application, e.g. on Intel P4 processor 'env
OPENSSL_ia32cap=0x12900010 apps/openssl', to achieve same effect
without modifying the application source code. Alternatively you can
reconfigure the toolkit with no-sse2 option and recompile.
SSE2 code present in the crypto library, while clearing bit #24
disables SSE2 code operating on the 128-bit XMM register bank. You might
have to do the latter if the target OpenSSL application is executed on an
SSE2 capable CPU, but under control of an OS that does not enable XMM
registers. Even though you can manipulate the value programmatically,
you most likely will find it more appropriate to set up an environment
variable with the same name prior to starting the target application, e.g.
on Intel P4 processor 'env OPENSSL_ia32cap=0x16980010 apps/openssl', to
achieve the same effect without modifying the application source code.
Alternatively you can reconfigure the toolkit with the no-sse2 option and
recompile.
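As a rough illustration of how such an environment value could be derived, the sketch below clears individual bits from a mask. The helper name `clear_bits` and the starting value 0x17980010 are assumptions made for this example (a value chosen so that bits #24 and #26 are both set); a real CPU may report a different value.

```python
def clear_bits(value, *bits):
    """Return value with each of the given bit positions cleared."""
    for bit in bits:
        value &= ~(1 << bit)
    return value


if __name__ == "__main__":
    # Hypothetical starting mask with bits #24 (FXSR) and #26 (SSE2) set.
    detected = 0x17980010

    # Clearing bit #24 disables SSE2 code that uses XMM registers.
    no_xmm = clear_bits(detected, 24)
    print(f"env OPENSSL_ia32cap=0x{no_xmm:x} apps/openssl")

    # Clearing bit #26 disables the SSE2 code paths altogether.
    no_sse2 = clear_bits(detected, 26)
    print(f"env OPENSSL_ia32cap=0x{no_sse2:x} apps/openssl")
```

Under that assumed starting value, clearing bit #24 yields 0x16980010, the value quoted in the paragraph above.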
Less intuitive is clearing bit #28. The truth is that it's not copied
from CPUID output verbatim, but is adjusted to reflect whether or not
@@ -49,4 +71,3 @@ the data cache is actually shared between logical cores. This in turn
affects the decision on whether or not expensive countermeasures
against cache-timing attacks are applied, most notably in AES assembler
module.
=cut