Additional description of the engine code and why it was designed the way it was.

2000-08-01 17:29:22 +00:00
parent acab12fbe3
commit 28b3b4c6e6
1 changed files with 244 additions and 0 deletions
--- a/crypto/engine/README
+++ b/crypto/engine/README
@@ -32,3 +32,247 @@ NOTES, THOUGHTS, and EVERYTHING
    the engine code is hidden from code outside the crypto/engine/
    directory so change shouldn't be too viral. More important though
    is how things should evolve ... this needs thought and discussion.
+
+
+-----------------------------------==*==-----------------------------------
+
+More notes 2000-08-01
+---------------------
+
+Geoff Thorpe, who designed the engine part, wrote a pretty good description
+of the thoughts he had when he built it, good enough to include verbatim here
+(with his permission)					-- Richard Levitte
+
+
+Date: Tue, 1 Aug 2000 16:54:08 +0100 (BST)
+From: Geoff Thorpe
+Subject: Re: The thoughts to merge BRANCH_engine into the main trunk are
+ emerging
+
+Hi there,
+
+I'm going to try and do some justice to this, but I'm a little short on
+time and there is an endless amount that could be discussed on this
+subject. sigh ... please bear with me :-)
+
+> The changes in BRANCH_engine dig deep into the core of OpenSSL, for example
+> into the RSA and RAND routines, adding a level of indirection which is needed
+> to keep the abstraction, as far as I understand.  It would be a good thing if
+> those who do play with those things took a look at the changes that have been
+> done in the branch and say out loud how much (or hopefully little) we've made
+> fools of ourselves.
+
+The point here is that the code that has emerged in the BRANCH_engine
+branch was based on some initial requirements of mine that I went in and
+addressed, and Richard has picked up the ball and run with it too. It
+would be really useful to get some review of the approach we've taken, but
+first I think I need to describe as best I can the reasons behind what has
+been done so far, in particular what issues we have tried to address when
+doing this, and what issues we have intentionally (or necessarily) tried
+to avoid.
+
+methods, engines, and evps
+--------------------------
+
+There has been some discussion, particularly with Steve, about where this
+ENGINE stuff might fit into the conceptual picture as/when we start to
+abstract algorithms a little bit to make the library more extensible. In
+particular, it would be desirable to have algorithms (symmetric, hash, pkc,
+etc) abstracted in some way that allows them to be just objects sitting in
+a list (or database) ... it'll just happen that the "DSA" object doesn't
+support encryption whereas the "RSA" object does. This requires a lot of
+consideration to begin to know how to tackle it; in particular how
+encapsulated should these things be? If the objects also understand their
+own ASN1 encodings and what-not, then it would for example be possible to
+add support for elliptic-curve DSA in as a new algorithm and automatically
+have ECC-DSA certificates supported in SSL applications. Possible, but not
+easy. :-)
+
+Whatever, it seems that the way to go (if I've grok'd Steve's comments on
+this in the past) is to amalgamate these things in EVP as is already done
+(I think) for ciphers or hashes (Steve, please correct/elaborate). I
+certainly think something should be done in this direction because right
+now we have different source directories, types, functions, and methods
+for each algorithm - even when conceptually they are very much different
+feathers of the same bird. (This is certainly all true for the public-key
+stuff, and may be partially true for the other parts.)
+
+ENGINE was *not* conceived as a way of solving this, far from it. Nor was
+it conceived as a way of replacing the various "***_METHOD"s. It was
+conceived as an abstraction of a sort of "virtual crypto device". If we
+lived in a world where "EVP_ALGO"s (or something like them) encapsulated
+particular algorithms like RSA,DSA,MD5,RC4,etc, and "***_METHOD"s
+encapsulated interfaces to algorithms (eg. some algo's might support a
+PKC_METHOD, a HASH_METHOD, or a CIPHER_METHOD, who knows?), then I would
+think that ENGINE would encapsulate an implementation of arbitrarily many
+of those algorithms - perhaps as alternatives to existing algorithms
+and/or perhaps as new previously unimplemented algorithms. An ENGINE could
+be used to contain an alternative software implementation, a wrapper for a
+hardware acceleration and/or key-management unit, a comms-wrapper for
+distributing cryptographic operations to remote machines, or any other
+"devices" your imagination can dream up.
+
+However, what has been done in the ENGINE branch so far is nothing more
+than starting to get our toes wet. I had a couple of self-imposed
+requirements when putting the initial abstraction together, and I may have
+already posed these in one form or another on the list, but briefly:
+
+   (i) only bother with public key algorithms for now, and maybe RAND too
+       (motivated by the need to get hardware support going and the fact
+       this was a comparatively easy subset to address to begin with).
+
+  (ii) don't change (if at all possible) the existing crypto code, ie. the
+       implementations, the way the ***_METHODs work, etc.
+
+ (iii) ensure that if no function from the ENGINE code is ever called then
+       things work the way they always did, and there is no memory
+       allocation (otherwise the failure to cleanup would be a problem -
+       this is part of the reason no STACKs were used, the other part of
+       the reason being I found them inappropriate).
+
+  (iv) ensure that all the built-in crypto was encapsulated by one of
+       these "ENGINE"s and that this engine was automatically selected as
+       the default.
+
+   (v) provide the minimum hooking possible in the existing crypto code
+       so that global functions (eg. RSA_public_encrypt) do not need any
+       extra parameter, yet will use whatever the current default ENGINE
+       for that RSA key is, and that the default can be set "per-key"
+       and globally (new keys will assume the global default, and keys
+       without their own default will be operated on using the global
+       default). NB: Try and make (v) conflict as little as possible with
+       (ii). :-)
+
+  (vi) wrap the ENGINE code up in duct tape so you can't even see the
+       corners. Ie. expose no structures at all, just black-box pointers.
+
+ (vii) maintain internally a list of ENGINEs on which a calling
+       application can iterate, interrogate, etc. Allow a calling
+       application to hook in new ENGINEs, remove ENGINEs from the list,
+       and enforce uniqueness within the global list of each ENGINE's
+       "unique id".
+
+(viii) keep reference counts for everything - eg. this includes storing a
+       reference inside each RSA structure to the ENGINE that it uses.
+       This is freed when the RSA structure is destroyed, or has its
+       ENGINE explicitly changed. The net effect needs to be that at any
+       time, it is deterministic to know whether an ENGINE is in use or
+       can be safely removed (or unloaded in the case of the other type
+       of reference) without invalidating function pointers that may or
+       may not be used inadvertently in the future. This was actually
+       one of the biggest problems to overcome in the existing OpenSSL
+       code - implementations had always been assumed to be ever-present,
+       so there was no trivial way to get round this.
+
+  (ix) distinguish between structural references and functional
+       references.
+
+A *little* detail
+-----------------
+
+While my mind is on it, I'll illustrate the bit in item (ix). This idea
+turned out to be very handy - the ENGINEs themselves need to be operated
+on and manipulated simply as objects without necessarily trying to
+"enable" them for use. Eg. most host machines will not have the necessary
+hardware or software to support all the engines one might compile into
+OpenSSL, yet it needs to be possible to iterate across the ENGINEs,
+querying their names, properties, etc - all happening in a thread-safe
+manner that uses reference counts (if you imagine two threads iterating
+through a list and one thread removing the ENGINE the other is currently
+looking at - you can see the gotcha waiting to happen). For all of this,
+*structural references* are used and operate much like the other reference
+counts in OpenSSL.
+
+The other kind of reference count is for *functional* references - these
+indicate a reference on which the caller can actually assume the
+particular ENGINE to be initialised and usable to perform the operations
+it implements. Any increment or decrement of the functional reference
+count automatically invokes a corresponding change in the structural
+reference count, as it is fairly obvious that a functional reference is a
+restricted case of a structural reference. So struct_ref >= funct_ref at
+all times. NB: functional references are usually obtained by a call to
+ENGINE_init(), but can also be created implicitly by calls that require a
+new functional reference to be created, eg. ENGINE_set_default(). Either
+way the only time the underlying ENGINE's "init" function is really called
+is when the (functional) reference count increases to 1, similarly the
+underlying "finish" handler is only called as the count goes down to 0.
+The effect of this, for example, is that if you set the default ENGINE for
+RSA operations to be "cswift", then its functional reference count will
+already be at least 1 so the CryptoSwift shared-library and the card will
+stay loaded and initialised until such time as all RSA keys using the
+cswift ENGINE are changed or destroyed and the default ENGINE for RSA
+operations has been changed. This prevents repeated thrashing of init and
+finish handling if the count keeps getting down as far as zero.
+
+Otherwise, the way the ENGINE code has been put together I think pretty
+much reflects the above points. The reason for the ENGINE structure having
+individual RSA_METHOD, DSA_METHOD, etc pointers is simply that it was the
+easiest way to go about things for now, to hook it all into the raw
+RSA,DSA,etc code, and I was trying to keep the structure invisible
+anyway so that the way this is internally managed could be easily changed
+later on when we start to work out what's to be done about these other
+abstractions.
+
+Down the line, if some EVP-based technique emerges for adequately
+encapsulating algorithms and all their various bits and pieces, then I can
+imagine that "ENGINE" would turn into a reference-counting database of
+these EVP things, of which the default "openssl" ENGINE would be the
+library's own object database of pre-built software implemented algorithms
+(and such). It would also be cool to see the idea of "METHOD"s detached
+from the algorithms themselves ... so RSA, DSA, ElGamal, etc can all
+expose essentially the same METHOD (aka interface), which would include
+any querying/flagging stuff to identify what the algorithm can/can't do,
+its name, and other stuff like max/min block sizes, key sizes, etc. This
+would result in ENGINE similarly detaching its internal database of
+algorithm implementations from the function definitions that return
+interfaces to them. I think ...
+
+As for DSOs etc. Well the DSO code is pretty handy (but could be made much
+more so) for loading vendor's driver-libraries and talking to them in some
+generic way, but right now there are still big problems associated with
+actually putting OpenSSL code (ie. new ENGINEs, or anything else for that
+matter) in dynamically loadable libraries. These problems won't go away in
+a hurry so I don't think we should expect to have any kind of
+shared-library extensions any time soon - but solving the problems is a
+good thing to aim for, and would as a side-effect probably help make
+OpenSSL more usable as a shared-library itself (looking at the things
+needed to do this will show you why).
+
+One of the problems is that if you look at any of the ENGINE
+implementations, eg. hw_cswift.c or hw_ncipher.c, you'll see how it needs
+a variety of functionality and definitions from various areas of OpenSSL,
+including crypto/bn/, crypto/err/, crypto/ itself (locking for example),
+crypto/dso/, crypto/engine/, crypto/rsa, etc etc etc. So if similar code
+were to be suctioned off into shared libraries, the shared libraries would
+either have to duplicate all the definitions and code and avoid loader
+conflicts, or OpenSSL would have to somehow expose all that functionality
+to the shared-library. If this isn't a big enough problem, the issue of
+binary compatibility will be - anyone writing Apache modules can tell you
+that (Ralf? Ben? :-). However, I don't think OpenSSL would need to be
+quite so forgiving as Apache should be, so OpenSSL could simply tell its
+version to the DSO and leave the DSO with the problem of deciding whether
+to proceed or bail out for fear of binary incompatibilities.
+
+Certainly one thing that would go a long way to addressing this is to
+embark on a bit of an opaqueness mission. I've set the ENGINE code up with
+this in mind - it's so draconian that even to declare your own ENGINE, you
+have to get the engine code to create the underlying ENGINE structure, and
+then feed in the new ENGINE's function/method pointers through various
+"set" functions. The more of the code that takes on such a black-box
+approach, the more of the code that will be (a) easy to expose to shared
+libraries that need it, and (b) easy to expose to applications wanting to
+use OpenSSL itself as a shared-library. From my own explorations in
+OpenSSL, the biggest leviathan I've seen that is a problem in this respect
+is the BIGNUM code. Trying to "expose" the bignum code through any kind of
+organised "METHODs", let alone do all the necessary bignum operations
+solely through functions rather than direct access to the structures and
+macros, will be a massive pain in the "r"s.
+
+Anyway, I'm done for now - hope it was readable. Thoughts?
+
+Cheers,
+Geoff
+
+
+-----------------------------------==*==-----------------------------------
+
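
The split between structural and functional references described under
"A *little* detail" can be sketched roughly as below. This is only a toy
model under stated assumptions, not the code from crypto/engine/: the
toy_engine type and every toy_* function are hypothetical names made up for
illustration, and the "init" and "finish" handlers are reduced to counters.
It shows the two properties the text states: every functional reference also
counts as a structural one (so struct_ref >= funct_ref), and the underlying
init/finish handlers only run on the 0 -> 1 and 1 -> 0 transitions of the
functional count.

```c
#include <assert.h>

/* Hypothetical toy model; none of these names exist in OpenSSL. */
typedef struct toy_engine {
    int struct_ref;   /* structural refs: the object may be inspected */
    int funct_ref;    /* functional refs: the object is initialised and usable */
    int init_calls;   /* stand-in for the underlying "init" handler */
    int finish_calls; /* stand-in for the underlying "finish" handler */
} toy_engine;

/* Take a structural reference (e.g. while iterating over a list). */
static void toy_struct_up(toy_engine *e)
{
    e->struct_ref++;
}

/* Drop a structural reference. */
static void toy_struct_down(toy_engine *e)
{
    e->struct_ref--;
    assert(e->struct_ref >= e->funct_ref);
}

/* Take a functional reference; this implies a structural one as well. */
static void toy_funct_up(toy_engine *e)
{
    if (e->funct_ref == 0)
        e->init_calls++;      /* "init" runs only when the count reaches 1 */
    e->funct_ref++;
    toy_struct_up(e);
    assert(e->struct_ref >= e->funct_ref);
}

/* Drop a functional reference and the structural one that came with it. */
static void toy_funct_down(toy_engine *e)
{
    e->funct_ref--;
    if (e->funct_ref == 0)
        e->finish_calls++;    /* "finish" runs only when the count hits 0 */
    toy_struct_down(e);
}
```

In this model, a list entry would hold one structural reference, while a
global default and each key using the engine would each hold a functional
one, so the handlers are not thrashed as individual keys come and go.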