VW 7.4 Spoilers - ASN.1

#smalltalk

(Republished from Cincom Smalltalk Tech Tips)

I’ve mentioned previously that we have spent a lot of time on ASN.1 in this release cycle, so I better say something about it. However this article won’t be an introduction to ASN.1, I want to focus on the improvements in our implementation, but there are some easy introductions available and even free books for the gory details.

I figured that the best way to demonstrate the framework is to show how it’s used in an application, and our most interesting application so far is the X.509 framework, so let’s take a look at that. The X.509 framework is structurally fairly simple, you have a hierarchy of X509Objects representing various components of an X.509 Certificate and few supporting classes like X509Registry or specific exception classes. The job of the ASN.1 framework is to turn X509Object instances into DER encoded bytes and back. To be able to do that the framework needs a structural description of the encoded bytes. It needs to know that an encoded certificate starts with an encoded TBSCertificate, then an identifier of the algorithm used to sign the contents of the TBSCertificate (TBS stands for ’to be signed’ here) and finally the bytes of the signature itself. ASN.1 describes this structure using a C-ish notation. The Certificate definition looks as follows:

	Certificate  ::=  SEQUENCE  {
		tbsCertificate       TBSCertificate,
		signatureAlgorithm   AlgorithmIdentifier,
		signatureValue       BIT STRING  }

A SEQUENCE is like a struct in C and the elements show name first and type second. Our framework represents this information with a structure of ASN1.Type objects, closely following the ASN.1 expressions. In this case it would be something like

	(SEQUENCE name: #Certificate)
		addElement: #tbsCertificate type: TBSCertificate;
		addElement: #signatureAlgorithm type: AlgorithmIdentifier;
		addElement: #signatureValue type: BIT_STRING;
		yourself

This expression will not work correctly though, because the #type: arguments would have to be other ASN1.Type objects. If you build the type objects in the right order, making sure all the component types are created before the containing types, you might be able build the structure by hand, however that would be very inconvenient to maintain. It is OK for relatively simple structures, like

"	RSAPublicKey ::= SEQUENCE {
		modulus           INTEGER,  -- n
		publicExponent    INTEGER   -- e }
"
	(SEQUENCE name: #RSAPublicKey)
		addElement: #modulus type: INTEGER;
		addElement: #publicExponent type: INTEGER;
		yourself

For the more complex cases the framework provides a more convenient mechanism, a Module. An ASN.1 Module is a container for a set of related Type definitions and consequently provides a context for lookup of types by name. As soon as you put a SEQUENCE into a Module, you can define its elements using type names instead of full instances and it also takes care of resolving forward references, i.e. you can define types in any order you wish, their mutual references will be resolved properly as the type definitions get added. The X509 framework maintains its module in a shared class variable X509Object.ASN1Module. Using a module the type definition for Certificate can look as follows:

	module := Module new: #X509.
	tCertificate :=
		(module SEQUENCE: #Certificate)
			addElement: #tbsCertificate type: #TBSCertificate;
			addElement: #signatureAlgorithm type: #AlgorithmIdentifier;
			addElement: #signatureValue type: #BIT_STRING;
			yourself

Once we have the type definitions in place the marshaling framework knows enough about the encoded bytes, however in order to be able to map objects to bytes, it needs to know how the types correspond to classes. For most of the “simple” types there’s predefined correspondence, i.e. BOOLEAN maps to Booleans, INTEGER to Integers, etc. However SEQUENCE and SET types default to instances of ASN1.Struct which is kind of like a Dictionary but with few convenience gimmicks, like you can use the element names as accessors and such. But we don’t want a Struct instance for Certificates, we want instances of Certificate class. That’s why the SEQUENCE and SET types have an optional ‘mapping’ attribute. You can tell it to map to a given Smalltalk class. It is responsibility of the developer to make sure that the class provides all the expected accessor methods. The Certificate class does that of course, so all that needs to be done is to tell the type about it:

	tCertificate mapping: Certificate

The last type feature to discuss is encoding retention. X.509 has a fairly tortured history and the practical outcome of it is the rule to never ever re-encode a certificate. Therefore it is desirable for a certificate imported from outside to retain its DER encoding, in case it needs to be exported again. The retained encoding also serves as a cache, so writing out an object with retained encoding can simply dump the retained bits, instead of going through the encoding process. Any Type can be told to retain its encoding. An encoding is captured in an instance of ASN1.Encoding pointing to the relevant bytes. The framework will pass the Encoding to the corresponding object using #_encoding:type: message. The default implementation in Object will wrap the object in a TypeWrapper which has a slot to capture the encoding, however it is expected that most applications will simply allocate a slot directly in the objects that will retain encoding and therefore will most likely override the method to store the Encoding there (see Certificate»#_encoding:type: for example). Note also that the encoding retention behavior was factored out of the marshaling machinery into a standalone EncodingPolicy object and is therefore completely customizable. An interesting side-effect of this is that with customized EncodingPolicy you get a chance to intervene with the marshaling at interesting points in the process. One possible exploitation of this, that was very handy for us while trying to figure out bugs in the marshaling process was the PrettyPrinter policy which produces a map of the bytes on a text stream while marshaling. Here’s an example. Let’s take something simpler than Certificate, for example the Name used for issuer or subject fields of Certificate. The full ASN.1 definition of Name is somewhat complex, but this would be the part relevant to our example:

"	Name ::= CHOICE { RDNSequence }
	RDNSequence ::= SEQUENCE OF RelativeDistinguishedName
	RelativeDistinguishedName ::= SET OF AttributeTypeAndValue
	AttributeTypeAndValue ::= SEQUENCE {
		type     AttributeType,
		value    AttributeValue }
	AttributeType ::= OBJECT IDENTIFIER
	AttributeValue ::= ANY DEFINED BY AttributeType
"
	(module CHOICE: #Name)
		addElement: nil type: #RDNSequence;
		retainEncoding: true.
	module SEQUENCE: #RDNSequence OF: #RelativeDistinguishedName.
	module SET: #RelativeDistinguishedName OF: #AttributeTypeAndValue.
	(module SEQUENCE: #AttributeTypeAndValue)
		addElement: #type type: #AttributeType;
		addElement: #value type: #AttributeValue.
	module OBJECT_IDENTIFIER: #AttributeType.
	module ANY: #AttributeValue.

Basically a Name is somewhat nested collection of attributes, where attribute has a type and a value. Now let’s unmarshal an encoded Name using the PrettyPrinter policy.

 	bytes :=   16r3068310B3009060355040613025553311330110603550407130A43696E63696E6E61746931173015060355040A130E43696E636F6D2053797374656D7331193017060355040B131043696E636F6D20536D616C6C74616C6B3110300E0603550403130754657374204341 asBigEndianByteArray.
	marshaler := DERStream with: bytes.
	output := String new writeStream.
	marshaler encodingPolicy: (PrettyPrinter on: output).
	marshaler reset.
	name := marshaler unmarshalObjectType: module Name.
	output contents

If all goes well the result of the above code should look as follows:

0	Name
0		RDNSequence
2			RelativeDistinguishedName
4				AttributeTypeAndValue
6					AttributeType
11					ObjectIdentifier(2.5.4.6)
11					AttributeValue
15					'US'
15				AttributeTypeAndValue {type ObjectIdentifier(2.5.4.6), value 'US'}
15			OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.6), value 'US'})
15			RelativeDistinguishedName
17				AttributeTypeAndValue
19					AttributeType
24					ObjectIdentifier(2.5.4.7)
24					AttributeValue
36					'Cincinnati'
36				AttributeTypeAndValue {type ObjectIdentifier(2.5.4.7), value 'Cincinnati'}
36			OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.7), value 'Cincinnati'})
36			RelativeDistinguishedName
38				AttributeTypeAndValue
40					AttributeType
45					ObjectIdentifier(2.5.4.10)
45					AttributeValue
61					'Cincom Systems'
61				AttributeTypeAndValue {type ObjectIdentifier(2.5.4.10), value 'Cincom Systems'}
61			OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.10), value 'Cincom Systems'})
61			RelativeDistinguishedName
63				AttributeTypeAndValue
65					AttributeType
70					ObjectIdentifier(2.5.4.11)
70					AttributeValue
88					'Cincom Smalltalk'
88				AttributeTypeAndValue {type ObjectIdentifier(2.5.4.11), value 'Cincom Smalltalk'}
88			OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.11), value 'Cincom Smalltalk'})
88			RelativeDistinguishedName
90				AttributeTypeAndValue
92					AttributeType
97					ObjectIdentifier(2.5.4.3)
97					AttributeValue
106					'Test CA'
106				AttributeTypeAndValue {type ObjectIdentifier(2.5.4.3), value 'Test CA'}
106			OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.3), value 'Test CA'})
106		OrderedCollection (OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.6), value 'US'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.7), value 'Cincinnati'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.10), value 'Cincom Systems'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.11), value 'Cincom Smalltalk'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.3), value 'Test CA'}))
106	Name<RDNSequence:OrderedCollection (OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.6), value 'US'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.7), value 'Cincinnati'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.10), value 'Cincom Systems'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.11), value 'Cincom Smalltalk'}) OrderedCollection (AttributeTypeAndValue {type ObjectIdentifier(2.5.4.3), value 'Test CA'}))>

The numbers at the beginning of each line show offsets into the source bytes. Each unmarshaled entity has two entries in the output, its type at the offset where it starts and the value printString at the offset where it ends. Simple types will have 2 entries next to each other and constructed types will have their elements indented.

There’s much more to talk about in ASN.1, things like tagging, sub-typing, constraints, etc. But this post is again getting long, so I’ll pick this up some other time. If the release is out before I get to it, you can find more about ASN.1 in the release notes and in the shiny new ASN.1 chapter of the SecurityGuide.pdf. Until then, thanks for reading this far and Happy Holidays!