1. Overview of pepsy system
This section describes how the various parts fit
together to make the system work. The principle behind
pepsy is fairly simple. The ASN.1 is summarised as tables
of integers. These tables are read by driver routines which
encode or decode data to or from the internal format that
the ISODE OSI implementation uses. In ISODE, specific
functions are generated for each ASN.1 type defined. In
contrast, pepsy merely generates a table of data, which is
far smaller.
As there is a great deal of effort invested in the
ISODE interface to the encoding/decoding routines, pepsy
automatically provides macros which map the original
functions into the appropriate function call of a driver.
This allows existing code that uses posy to switch to the
pepsy system with no changes to the code, provided no
function pointers to the original ISODE functions are used.
Even when function pointers are used, the changes are very
simple and take only a few hours to implement.
1.1. Brief description of the use of the pepsy system.
1.1.1. Outline of the files produced under the pepsy system.
The pepsy system consists of a program which translates
ASN.1 modules into a set of tables, called posy at the
moment, and a library of driver routines, called
libpepsy.a. Running this posy program on the ASN.1 file
will produce several files. If the name of the ASN.1 module
is MODULE, the following files are generated:
MODULE-types.h
This file contains C structure definitions. The user of
the library provides data as a linked list of these C
data structures and expects to receive data back as a
similar linked list. These data structures are exactly
the same as those produced by the original ISODE posy,
so that existing software written for the old posy
program needs no change. For details on the C data
structure types generated, see the documentation of the
original posy program in volume 4, Chapter 5 of the ISODE
manuals.
MODULE_tables.c
This file contains the tables generated by the new posy
program. These tables consist of three parts. The
first contains the summary of the ASN.1 types. Each
type is summarised as an array of a primitive type:
tpe for encoding and decoding, and ptpe for printing.
As implied, there is one array for each type for each
of encoding, decoding and printing, as
January 23, 1990
specified when posy is run. The next part contains up
to three tables of pointers to these arrays. Each of
the three different types of arrays, encoding, decoding
and printing, has its own table of pointers. Finally
there is the module type definition, which contains
pointers to these tables and some other useful
information about the module such as its name. This
module type structure, which is typedefed to modtyp, is
the only piece of data which is global. All the rest of
the data is static and is only addressable via the
modtyp data structure. This provides a kind of object
oriented approach to handling the tables. Once you are
passed a pointer to an ASN.1 module's modtyp structure
you can encode, decode and print any of its types by
calling the appropriate libpepsy.a routine with its type
number.
MODULE_pre_defs.h
This file contains a #define for each of the ASN.1
types, mapping a symbol to its type number, which is
used when calling a libpepsy.a routine. Each symbol is
_Ztype-nameMODULE, where type-name is the name of the
type with dashes (-) turned into underscores (_) and
MODULE is the name of the module. For example, the ASN.1
universal type GraphicString would have the #define
symbol _ZGraphicStringUNIV. The _Z is prepended to try
to make the symbols unique. This file also contains an
extern declaration for the modtyp data for its module.
MODULE_defs.h
This file contains macros for all the encoding,
decoding and printing functions that the pepy program
would have generated for these ASN.1 types. This means
that much of the code that uses the routines generated
by running the old posy program and then running pepy
on its augmented ASN.1 output can be recompiled
unchanged. If the code used pointers to these functions
it is necessary to change it to pass around the type
numbers instead and to call the appropriate libpepsy.a
library routine with the type number. As pointers to
the printing routines in ISODE are passed as arguments,
a #define is provided to turn the argument into the
pair of arguments, type number and pointer to modtyp
structure, which are needed to allow the diagnostic
printing code to work with no change for the current
ISODE stack. This file also contains a #include of the
MODULE_pre_defs.h file.
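The naming rule for the symbols in MODULE_pre_defs.h can be
sketched in C as follows. This only illustrates the rule as
described above; the helper name pre_def_symbol is
hypothetical and is not part of posy or libpepsy.a.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: build the #define symbol for a type, i.e.
 * "_Z" + type name (dashes turned into underscores) + module name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int pre_def_symbol(const char *type_name, const char *module,
                   char *buf, size_t buflen)
{
    size_t need, i, n = 0;

    assert(type_name != NULL && module != NULL && buf != NULL);
    need = 2 + strlen(type_name) + strlen(module) + 1;
    if (buflen < need)
        return -1;
    buf[n++] = '_';
    buf[n++] = 'Z';
    for (i = 0; type_name[i] != '\0'; i++)
        buf[n++] = (type_name[i] == '-') ? '_' : type_name[i];
    strcpy(buf + n, module);    /* also writes the terminating NUL */
    return 0;
}
```

Called with the type name GraphicString and the module name
UNIV, this produces the symbol given in the text.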
As the MODULE-types.h file #includes the MODULE_defs.h
file, no further #includes need to be added to the files
using the encoding/decoding/printing functions. This means
that code written to use the posy/pepy system may need no
change at all, and the only effort required is to change
the Makefile to use the pepsy system. If code changes are
required, this would most likely be because function
pointers are used to reference the functions generated by
posy. If only the pepy system was used (not posy then pepy),
with code placed inside action statements, then quite a
large amount of work may be needed to change over to the new
system, depending on how large and complex the pepy module
is.
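To make the macro mapping concrete, here is a minimal sketch
of what the generated headers might contain for a module
called Test with one type Example1. Everything in it is an
assumption made for illustration: the PE and modtyp types are
placeholders, enc_f is a stub with a guessed argument list,
and the names _ZTest_mod and encode_Test_Example1 are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for this sketch only; the real types live in ISODE. */
typedef struct { int unused; } *PE;             /* placeholder for PE */
typedef struct { const char *md_name; } modtyp; /* placeholder modtyp */

#define OK      0
#define NOTOK (-1)

int last_typ = -1;              /* records the last type number seen */

/* Stub driver: type number and modtyp first, then the old arguments. */
int enc_f(int typ, modtyp *mod, PE *pe, int explicit_tag, int len,
          char *buffer, char *parm)
{
    (void)pe; (void)explicit_tag; (void)len; (void)buffer;
    assert(mod != NULL);
    if (parm == NULL)
        return NOTOK;
    last_typ = typ;
    return OK;
}

/* What a MODULE_pre_defs.h might provide (names are assumptions). */
#define _ZExample1Test 0
modtyp _ZTest_mod = { "Test" };

/* What a MODULE_defs.h might provide: the old call becomes a driver call. */
#define encode_Test_Example1(pe, top, len, buffer, parm) \
    enc_f(_ZExample1Test, &_ZTest_mod, pe, top, len, buffer, (char *)(parm))

/* Existing caller, unchanged: it still "calls" the old function name. */
int send_example(void *data)
{
    PE pe = NULL;

    return encode_Test_Example1(&pe, 1, 0, NULL, data);
}
```

Because encode_Test_Example1 is now a macro with arguments,
its address cannot be taken, which is why code that passes
function pointers around has to pass type numbers instead.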
1.1.2. Outline of the pepsy library.
enc.c
This contains the routines that encode data from the C
data structures into ISODE's PElement linked list data
structure, which it uses for all presentation data. The
most important function to pepsy users is enc_f, which
is called to encode a particular type. It is passed the
type number and a pointer to the modtyp structure for
that module, followed by the rest of the arguments which
are passed to an encode function generated by the
posy/pepy system. See the documentation in Volume 4,
"The Applications Cookbook", Section 6.4, called "Pepy
Environment". Most of these latter arguments are ignored;
only parm and pe are used.
Contrary to what the ISODE documentation says, these
ignored parameters are hardly ever used by existing code.
We have not found a single case where they are used for
encoding a named type, which is all that the user can
reference anyway, so we don't see any problems with ignoring
these other parameters. Hopefully one day they can be thrown
away entirely; until then they are still passed to the
encoding function.
The rest of the functions are mostly recursive routines
which encode a particular type of table entry. For example
SEQUENCE is encoded by en_seq, which may call itself or
others to encode the types from which it is built up. The
function en_type builds up a simple type, en_obj encodes
a new type (object), and so on with other functions. There
are also a few utility routines in the file, such as same,
which determines whether a value is the same as the default
value.
dec.c
This file contains the decoding routines that translate
presentation data into the C data structures defined in
MODULE-types.h. It is very much like the file enc.c
except that the routines do the reverse tasks. The
routines are structured in a very similar way. We have
dec_f, which is called by the user to decode a type and,
like enc_f, takes the same arguments as the decoding
functions generated by posy with two additions, the
type number and a pointer to the modtyp structure for
that module. Likewise the other functions are very
much like those of enc.c.
prnt.c
This file contains the routines that print the
presentation data in a format similar to that generated
by pepy's printing functions. Its main function,
prnt_f, takes the same arguments as the printing
function generated by pepy as well as the now familiar
type number and modtyp pointer. The functions are
modelled on the decoding routines, as they have a similar
job to do. The only difference is that instead of storing
the decoded data into a C data structure it is nicely
printed out.
fr.c
This file contains code to free the data structures
defined in MODULE-types.h. If the -f flag is given when
generating the types file it also includes macros in
the types file which replace the freeing functions
generated by ISODE's posy. The function that the user
calls is fre_obj, which takes a pointer to the data
structure, its decoding table entry and a pointer to the
modtyp structure for the module. The freeing is based on
the decoding routines, except that instead of decoding
all it does is free each part of the data structure,
which might involve recursive calls, and then free the
data structure itself at the end.
util.c
This contains the utility routines used by more than
one of the above files. These are mostly diagnostic
routines at the moment; more general routines could be
included here. At the moment, if there is an error
which it can't recover from it just prints out a message
on standard error and calls exit. This is not perfect
and is something that will need work.
main.c
This contains code to perform a series of tests on the
pepsy library, which is a useful check to see whether
any of the routines has been broken by any changes
made. It basically loops through a whole series of
test cases. Each test case is encoded from some built
in test data and then decoded and checked to see if the
data has changed in the transfer. If it is compiled
with -DPRNT=1 the encoded data is also printed out to
check the printing routines, which generates a vast
amount of output. Finally the free routines are used
to free the allocated data. Although it cannot
directly check the free routines to see if they work,
it can be used with a malloc tracing package to check
that the routines work.
test_table.h
This contains the test cases that the main.c program
runs. Each entry in the table corresponds to a type. One
of the fields is a count of how many times that type is
to be tested, to try out the different possible data
values it might have.
pep.h and pepdefs.h
These files contain the definition of types used for
the tables that drive the encoding/decoding/printing
routines. All the constants used in that table are
defined here via #defines. The modtyp structure is
defined in pepdefs.h.
t1.py and t2.py
These are test ASN.1 modules that are used by the main.c
routines to check the pepsy library. The file t1.py
contains the majority of the different types, with a few
in a different module provided in t2.py. This allows the
testing of the code for handling ASN.1 external
references, i.e. references to types defined in other,
external, modules.
1.1.3. New files in the pepy directory
etabs.c, dtabs.c and ptabs.c
These files contain the code to generate the
encoding/decoding/printing tables. The main routine in
etabs.c is tenc_typ, which is called on each ASN.1 type
to generate an array of entries which describe how to
encode that type. See the details section for more
information about how the table entries function.
Similarly dtabs.c contains the routine tdec_typ, which
is called on each type to generate its decoding table
entries. Likewise the tprnt_typ routine generates the
arrays of table entries for the printing tables. This
function is in ptabs.c.
dfns.c
This file contains miscellaneous string handling
routines and hash table routines that don't really belong
anywhere else. Some of the routines could be cleaned
up, in that they tend not to free memory they use.
mine.h
This file contains the definitions for the hash
table(s) that are used to keep track of the ASN.1
types. This could probably be done without a hash
table; should anyone want to clean this up, feel
welcome. The lookup function is in dfns.c.
pass2.h
This file has most of the #defines for the table
generating program. Most of the prefixes and suffixes of
function names and file names are defined here so that,
hopefully, the names can be changed by merely changing
the definition. This contains most of the important
definitions needed by the changes made to the posy
program to generate tables.
posy.h
This contains the definition of a symbol which is now
needed outside of the main routine and the yacc
file. By putting it here we can include it in any file
that needs to know it without putting it in any that
doesn't need it and without including all the other
definitions that occur in pepy.h.
The structure and meaning of the tables generated from the ASN.1 grammar
Each collection of ASN.1 grammar is called a module.
(See ASN.1.) Each ASN.1 module is completely specified in
the program by a single C structure of type modtyp and the
data which it references. See the pepdefs.h file in the
pepsy directory. For each ASN.1 module there are three
tables that are generated from the ASN.1 grammar. These
initialised arrays, which we call tables, are the encoding,
decoding and printing tables. Each of these tables is
referenced through a different pointer of the modtyp
structure.
Each of these pointers references an array of pointers,
one pointer for each ASN.1 type defined in the module. The
position of one of these pointers is the unique type number
we give to its corresponding type. The pointer references
an array of type tpe or ptpe, depending on whether it is an
entry in the decoding/encoding tables or the printing
tables respectively. See pep.h in the pepsy directory. This
array actually contains the necessary information to
encode/decode/print that ASN.1 type. So given the modtyp
structure of an ASN.1 module and a type number you can
call a routine to encode, decode or print that type.
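The two levels of indirection (modtyp, then a table of
pointers indexed by type number, then the array of entries)
can be sketched as below. The structure modtyp_sketch and
its field names are assumptions made for this sketch, as are
the numeric values of PE_START and PE_END; the real
definitions are in pepdefs.h and pep.h.

```c
#include <assert.h>
#include <stddef.h>

enum { PE_START = 1, PE_END = 2 };      /* illustrative values only */

/* One table entry: four integer fields, as described in the text. */
typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } tpe;

/* Hypothetical stand-in for modtyp, showing only the encoding table. */
typedef struct {
    const char        *md_name;     /* module name */
    int                md_ntypes;   /* number of types in the module */
    const tpe *const  *md_etab;     /* encoding arrays, by type number */
} modtyp_sketch;

/* Return the encoding array for a type number, or NULL if out of range. */
const tpe *encoding_table(const modtyp_sketch *mod, int typ)
{
    assert(mod != NULL);
    if (typ < 0 || typ >= mod->md_ntypes || mod->md_etab == NULL)
        return NULL;
    return mod->md_etab[typ];
}

/* A tiny module with two (empty) types, numbered 0 and 1. */
const tpe et_A[] = { { PE_START, 0, 0, 0 }, { PE_END, 0, 0, 0 } };
const tpe et_B[] = { { PE_START, 0, 0, 0 }, { PE_END, 0, 0, 0 } };
const tpe *const demo_etab[] = { et_A, et_B };
const modtyp_sketch demo_mod = { "Demo", 2, demo_etab };
```

The position of a pointer in demo_etab is the type number,
so type 1 of the Demo module is found at demo_etab[1].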
The rest of this document assumes a good knowledge of
ASN.1 notation, so go read a copy if you haven't already.
From here on I shall mention only tpe, and this means tpe in
the case of encoding or decoding and ptpe in the case of
printing, unless otherwise stated. Each type is represented
by an array of tpe (or ptpe for printing). The basic
element consists of four integer fields. The printing table
is the same with an additional char pointer field which
contains the name corresponding to that entry in the ASN.1
grammar. The first field specifies the type of the entry and
determines how the rest are interpreted. The possible types
are listed in pepsy/pep.h. Each type is an array which
starts with an entry of type PE_START and ends with one of
type PE_END. Each primitive type requires one entry to
specify it, apart from the possible PE_START and PE_END used
to specify the start and end of the type. Constructed types
are represented by a list of entries terminated by an entry
of type PE_END. As ASN.1 types can be nested inside each
other, so will the representation in tpe entries be nested.
For example the ASN.1 type definition:
Example1 ::=
SEQUENCE {
    seq1 SEQUENCE {
        an-i INTEGER,
        an-ostring OCTET STRING
    },
    a-bool [0] IMPLICIT BOOLEAN
}
will generate an encoding array:
static tpe et_Example1Test[] = {
    { PE_START, 0, 0, 0 },
    { SEQ_START, 0, 16, FL_UNIVERSAL },
    { SEQ_START, OFFSET(struct type_Test_Example1, seq1), 16, FL_UNIVERSAL },
    { INTEGER, OFFSET(struct element_Test_0, an__i), 2, FL_UNIVERSAL },
    { OCTETSTRING, OFFSET(struct element_Test_0, an__ostring), 4, FL_UNIVERSAL },
    { PE_END, 0, 0, 0 },
    { BOOLEAN, OFFSET(struct type_Test_Example1, a__bool), 0, FL_CONTEXT },
    { PE_END, 0, 0, 0 },
    { PE_END, 0, 0, 0 }
};
Here the second last PE_END matches and closes off the
first SEQ_START. The entries which correspond to the other
primitive types are pretty obvious, with the INTEGER entry
corresponding to the primitive INTEGER. For fields that
generate data the general interpretation of the other three
fields is offset, tag and flags/class respectively.
offset
The second field gives the offset in a C data structure
needed to reference the data that corresponds to
this table entry. Each ASN.1 type has C structure
types generated as described in the ISODE manuals,
volume 4, "The Applications Cookbook", Section 5.2, "POSY
Environment". As this offset may have to be determined
in a compiler dependent manner, a C preprocessor macro
is used to hide the actual details.
tag
This is the tag associated with the ASN.1 type for that
entry. Notice that in the example the [0] IMPLICIT,
which changes the tag associated with the BOOLEAN entry,
actually has the correct tag of 0 in the table.
Likewise SEQUENCE has the correct tag of 16 in its
SEQ_START entry, and so on for the others.
flags/class
This contains the ASN.1 class associated with the
entry's type. That is UNIVERSAL for all except the
BOOLEAN type, which is CONTEXT class. This fourth field
can also contain flags that specify if the type is
OPTIONAL or DEFAULT. There is plenty of room here as
there are only four possible classes.
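The example table and the way PE_END entries pair up with
SEQ_START entries can be tried out with the following
sketch. The enum values, the field name pe_offset, the two
stand-in C structures and the function matching_end are all
assumptions for illustration; OFFSET is written here with the
standard offsetof macro, which may not be how pepsy defines
it.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative entry kinds and classes; the real ones are in pep.h. */
enum { PE_START, PE_END, SEQ_START, INTEGER, OCTETSTRING, BOOLEAN };
enum { FL_UNIVERSAL = 0, FL_CONTEXT = 2 };

typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } tpe;

/* Stand-ins for the C types posy would generate for Example1. */
struct element_Test_0 { int an__i; void *an__ostring; };
struct type_Test_Example1 { struct element_Test_0 *seq1; char a__bool; };

#define OFFSET(t, f) ((int)offsetof(t, f))

const tpe et_Example1Test[] = {
    { PE_START, 0, 0, 0 },
    { SEQ_START, 0, 16, FL_UNIVERSAL },
    { SEQ_START, OFFSET(struct type_Test_Example1, seq1), 16, FL_UNIVERSAL },
    { INTEGER, OFFSET(struct element_Test_0, an__i), 2, FL_UNIVERSAL },
    { OCTETSTRING, OFFSET(struct element_Test_0, an__ostring), 4, FL_UNIVERSAL },
    { PE_END, 0, 0, 0 },
    { BOOLEAN, OFFSET(struct type_Test_Example1, a__bool), 0, FL_CONTEXT },
    { PE_END, 0, 0, 0 },
    { PE_END, 0, 0, 0 }
};

/* Index of the PE_END closing the SEQ_START at index start, or -1. */
int matching_end(const tpe *tab, int n, int start)
{
    int depth = 0, i;

    assert(tab != NULL);
    if (start < 0 || start >= n || tab[start].pe_type != SEQ_START)
        return -1;
    for (i = start; i < n; i++) {
        if (tab[i].pe_type == SEQ_START)
            depth++;                    /* one level deeper */
        else if (tab[i].pe_type == PE_END && --depth == 0)
            return i;                   /* closed the one we started at */
    }
    return -1;
}
```

Counting entries from zero, the first SEQ_START is entry 1
and its matching PE_END is entry 7, the second last PE_END.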
Now that you have some idea of how these arrays are
arranged for a type definition I will proceed to go through
the possible types of entries and describe what they do and
how they work. These values are defined in pepsy/pep.h.
Those entries with a value below TYPE_DATA are entries that
don't correspond to data to be encoded/decoded and are for
other bookkeeping purposes.
PE_START and PE_END
As explained above, PE_START marks the beginning of an
ASN.1 type's array. It probably isn't necessary, but
the size of the tables is so small that it isn't much of
an overhead to keep around for cosmetic reasons. The
entry type PE_END is necessary to mark the end of a
compound type as well as the end of an ASN.1 data type.
XOBJECT and UCODE
These are obsolete types and probably should be
removed. They were to allow C code written directly by
the user to be incorporated into the encoding/decoding,
but it was found unnecessary. Perhaps some brave soul
would like to use them in an attempt to implement a
similar system based on pepy, which is what we first
attempted to do until we found this to be much easier.
MALLOC
This entry only occurs in the decoding tables. It
specifies how much space to malloc for the C structure
it is just inside of. For instance in the
example above the decoding table has the following
entry:
{ MALLOC, 0, sizeof (struct type_Test_Example1), 0 },
just after the first SEQ_START entry. It tells the decoder
to malloc a struct type_Test_Example1 structure to
hold the data from the sequence when it is decoded.
SCTRL
This entry is used in handling the ASN.1 CHOICE type.
The C type generated for an ASN.1 CHOICE type is a
structure with an offset field in it and a union of all
the C types present in the CHOICE. Each ASN.1 type in the
CHOICE of types has a C type definition generated for
it. The union is of all these types, which is quite a
logical way to implement a CHOICE type. The offset
field specifies which possibility of interpreting the
union should be used (which member should be selected).
As such it needs to be read by the encoding routines
when encoding the data from the C data structures and
to be set by the decoding routines when decoding
the data into the C data structures. There is one such
entry for each CHOICE type to specify where the offset
field is.
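As a rough sketch of the idea, the structure below imitates
the shape described for a CHOICE (a selector field plus a
union), and the two helpers read and set the selector through
a byte offset, as a table driven routine would after finding
it in the SCTRL entry. The structure, the constants and the
helper names are hypothetical; the real structures are those
generated by posy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C type for a CHOICE of an INTEGER or a string. */
struct choice_sketch {
    int offset;                         /* which member of un is in use */
    union {
        int         number;
        const char *text;
    } un;
};
enum { CHOICE_NUMBER = 1, CHOICE_TEXT = 2 };

/* Encoding side: read the selector found at sctrl_offset bytes in. */
int get_selector(const void *parm, size_t sctrl_offset)
{
    int sel;

    assert(parm != NULL);
    memcpy(&sel, (const char *)parm + sctrl_offset, sizeof sel);
    return sel;
}

/* Decoding side: record which member was decoded. */
void set_selector(void *parm, size_t sctrl_offset, int sel)
{
    assert(parm != NULL);
    memcpy((char *)parm + sctrl_offset, &sel, sizeof sel);
}
```

The helpers never name the structure type; all they need is
the offset, which is what the SCTRL entry supplies.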
CH_ACT
Another redundant entry type. I think this was also
used in code to handle C statements or actions
specified by the user. It probably should be removed.
OPTL
This is used to handle the optionals field that is
generated by posy when optional types that are not
implemented by pointers are present in the ASN.1 type. For
example if an ASN.1 type has an optional integer field,
how does the encoding routine determine if the integer
is to be present or not? If it was implemented as a
pointer it could use a NULL (zero) pointer to mean that
the type was not present, because NULL is guaranteed
never to occur as a legal pointer to a real object. But
all the possible values for an integer could be legally
passed, so instead, for these types which are not
pointers and are optional, a bit map is allocated in the
structure. For each non-pointer optional type a bit from
the bit map is allocated.
If that bit is set the corresponding type is present,
and it is not present if the bit is not set. Each bit has a
#define generated for it. The bit map is merely an integer
field called "optionals", limiting the maximum number of
such optionals to 32 on Sun machines and 16 on some others.
(An array of char, as in BSD fd_sets, would have avoided all
such limits, not that this limit is expected to be exceeded
very often!) Like the SCTRL entry, this entry merely serves
to specify where this field is so it can be tested and set
by the encoding and decoding routines respectively.
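The bit map handling can be sketched like this. The
structure opt_sketch, its bit #defines and the two helper
functions are invented for the example; only the field name
optionals comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure with two optional, non-pointer integers. */
struct opt_sketch {
    int optionals;                  /* one bit per optional field */
#define opt_sketch_count (1 << 0)
#define opt_sketch_limit (1 << 1)
    int count;
    int limit;
};

/* Encoding side: is the field for this bit present? */
int opt_test(const void *parm, size_t optl_offset, int bit)
{
    int map;

    assert(parm != NULL);
    memcpy(&map, (const char *)parm + optl_offset, sizeof map);
    return (map & bit) != 0;
}

/* Decoding side: note that the field for this bit was decoded. */
void opt_set(void *parm, size_t optl_offset, int bit)
{
    int map;

    assert(parm != NULL);
    memcpy(&map, (const char *)parm + optl_offset, sizeof map);
    map |= bit;
    memcpy((char *)parm + optl_offset, &map, sizeof map);
}
```

As with SCTRL, the only thing the helpers need from the
table is the offset of the optionals field.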
ANY and CONS_ANY
The C type corresponding to the entry is a PE pointer.
To conform with pepy the tag and class of this entry
are ignored, which may or may not be the most sensible
thing. CONS_ANY is a redundant symbol which means
the same thing but is not used. This should be cleaned
up and removed.
INTEGER, BOOLEAN, BITSTRING, OCTETSTRING and OBJID
These are just as described in the first article. See
the ISODE manual to find out which C data type is
allocated to implement each of them. The offset field
says where to find this data type within the current
structure.
SET_START, SETOF_START, SEQ_START and SEQOF_START
These compound entries differ from the above in that
they group all the following entries together up to the
matching PE_END. The entries with OF in them
correspond to the ASN.1 types which have OF in them,
e.g. SET OF. Allowing the OF items to have an
arbitrary number of entries is excessive flexibility, as
they can only have one type by the ASN.1 grammar rules.
The C data type corresponding to them is either a
structure, if it is the first such type in the array, or a
pointer to a structure if it isn't. This complicates the
processing of these structures a little but not greatly.
The OF types differ in one other important way: they may
occur zero, one or more times, with no upper bound. To cope
with this the C data type is a linked list structure.
The pointer to the data structure determines whether or
not there is another occurrence of the type; if it is
NULL there isn't. Thus each data structure has this
pointer to the next occurrence. The offset of this
pointer is placed in the PE_END entry, where it can
conveniently be used to determine whether or not to make
another pass through the table entries.
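The loop over the occurrences of an OF type can be sketched
as follows. The list structure and the function name are
hypothetical, and the sketch assumes that all object pointers
share one representation, an assumption the table driven
approach itself relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C type for SEQUENCE OF INTEGER: a linked list. */
struct int_list {
    int              value;
    struct int_list *next;      /* NULL when there is no further element */
};

/*
 * Count occurrences by following the pointer found next_offset bytes
 * into each element, the way a driver could use the offset stored in
 * the PE_END entry to decide whether to make another pass.
 */
int count_occurrences(const void *first, size_t next_offset)
{
    const void *cur = first;
    int n = 0;

    while (cur != NULL) {
        const void *next;

        n++;
        memcpy(&next, (const char *)cur + next_offset, sizeof next);
        cur = next;
    }
    return n;
}
```

A NULL first pointer gives zero occurrences, which matches
the "zero, one or more times" rule for the OF types.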
OBJECT
When one type references another it generates an
OBJECT entry. This specifies the type number of the
referenced type, which is held in the 3rd field of the
tpe structure, pe_tag. The 2nd field still gives the
offset in the C data structure which specifies where
the user's data for that type is to be found. Usually
this is a pointer to the C data structure for that type.
T_NULL
This entry means the ASN.1 primitive type NULL. It
doesn't have any body and consequently has no offset, as
it cannot carry data directly. Only its absence or
presence can mean anything, so if it is optional it sets
or clears a bit in the bit map as described earlier for
the OPTL entry.
T_OID
This used to be used for Object Identifiers and is now
unused; it should be got rid of.
OBJID
This corresponds to the Object Identifier ASN.1
primitive type. It is implemented the same as other
primitive types like INTEGER and OCTET STRING.
ETAG
This entry gives the explicit tag of the following
entry. The usual fields which define class and tag are
the only ones which have meaning in this entry. By
concatenating successive ETAG entries it is possible to
build up an unlimited number of explicit tags, although
this hasn't been tested yet.
IMP_OBJ
If a type has an implicit tag, usually all we have to do
is set its tag and class appropriately in its entry.
This works for all but one important case, the
reference of another type. This is messy because we can't
alter the definition of the type without wrecking it
for the other uses. So what we do for encoding is
build the type normally and then, after it is built,
change its tag and class to be the values we want.
Similarly for decoding we match the tag and class up
and then decode the body of the type. We can't use an
OBJECT entry for this because, among other reasons, the
3rd field is already used to store the type number. (The
fourth needs to be free to contain flags such as DEFAULT
and OPTIONAL.) So a new entry type is used, IMP_OBJ, to
hold the tag and class. It must be followed by an
OBJECT entry which is used to handle the type as
normal; the IMP_OBJ entry gives the tag and class to be
used. Like the ETAG entry, the IMP_OBJ affects the
entry that follows it.
EXTOBJ and EXTMOD
These handle external type references. This is just
like a normal (internal?) type reference except we must
now specify which module as well as the type.
Because there are no more free fields in the
OBJECT type we need two entries to hold all the
information we need. The EXTOBJ occurs first and holds the
type number, the offset into the C data structure
and the flags, exactly as for an OBJECT entry. The
next entry, which must be an EXTMOD, contains a pointer
to the modtyp structure for its module. As with a normal
OBJECT entry, to handle the case of an implicit tag an
IMP_OBJ entry would occur before these two entries,
giving the class and tag. Likewise it could have
an explicit tag, in which case the two entries would be
preceded by an ETAG entry.
DFLT_F and DFLT_B
When a type has a default value, to handle decoding and
encoding properly you need to know its value. As there
is no space to store the value in most entries we
allocate a whole entry to specify the value. When encoding
it is convenient to have the default occur before the
entry it refers to. This allows a single check to
handle all the default encoding. All it has to do is
check whether the value is the same as the default value
and if so not bother encoding the next type. On the other
hand, when decoding it is more convenient to have the
entry after the one it refers to. In this case we need
to determine that it is missing before we use the
default value to determine the value to pass to the
user. To handle this we have entries of both types.
DFLT_F contains the default value for the following
entry (F = Front) and DFLT_B contains that for the
entry before it (B = Back). Consequently DFLT_F entries
are only used in the encoding tables and DFLT_B entries
are only used in the decoding (and printing) tables.
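The single check that DFLT_F allows when encoding might look
something like the sketch below, which only counts how many
INTEGER entries would be encoded. It is a toy: the enum
values, the field name pe_offset and the function name are
assumptions, and it stores the default value in the third
field of the DFLT_F entry purely for illustration, since the
text does not say which field the real tables use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PE_END, INTEGER, DFLT_F };       /* illustrative values only */
typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } tpe;

struct three_ints { int a, b, c; };
#define OFFSET(t, f) ((int)offsetof(t, f))

/* a DEFAULT 10, b DEFAULT 20, c has no default. */
const tpe dflt_table[] = {
    { DFLT_F,  0, 10, 0 },
    { INTEGER, OFFSET(struct three_ints, a), 2, 0 },
    { DFLT_F,  0, 20, 0 },
    { INTEGER, OFFSET(struct three_ints, b), 2, 0 },
    { INTEGER, OFFSET(struct three_ints, c), 2, 0 },
    { PE_END,  0, 0, 0 }
};

/* Count the INTEGER entries that would actually be encoded. */
int count_encoded(const tpe *p, const void *parm)
{
    int n = 0;

    assert(p != NULL && parm != NULL);
    for (; p->pe_type != PE_END; p++) {
        if (p->pe_type == DFLT_F) {
            int v;

            /* Look ahead at the value the next entry refers to. */
            memcpy(&v, (const char *)parm + p[1].pe_offset, sizeof v);
            if (v == p->pe_tag)
                p++;            /* same as default: skip the next entry */
            continue;
        }
        n++;                    /* this entry would be encoded */
    }
    return n;
}
```

The check happens in one place, before the entry it guards,
which is the convenience the text describes for encoding.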
S-Types
These types are entries for the same ASN.1 type as the
entry type formed by removing the starting `S'. The
above forms would do to handle ASN.1, but we also have
to be compatible with the C data structures generated
by posy. The implementors decided to optimise the C
data structures generated a little, which means we have
to have all these S type entries. If a type was a single
field, in most cases they produced a #define which
eliminates the need to have a whole structure just for
that type. In all the places where this type is used the
field of the C structure is changed from a pointer to a
field which holds the value directly in the structure.
See the ISODE reference given above for more details.
We handle this by generating the same tables that would
be generated without the optimisation, except that the
optimised types use the S-type of entries instead of the
normal ones. For example an optimised OCTET STRING would
have the type field of its entry as SOCTETSTRING instead of
OCTETSTRING. The only difference in how an S type and its
corresponding normal type are handled is how they find the
C data structure for that entry. That difference is that
there is no indirection through pointers.
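The difference in how the data is found can be sketched as
below for an integer. The structure and the function
locate_int are hypothetical, and the sketch glosses over
whatever else the real routines do; it only shows the one
extra pointer dereference that the normal entry needs and the
S type entry does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure holding the same kind of value both ways. */
struct holder_sketch {
    int *count_ptr;             /* normal entry: field is a pointer */
    int  count_direct;          /* S type entry: value held directly */
};

/* Find the integer an entry refers to, given the entry's offset. */
const int *locate_int(const void *parm, size_t offset, int is_s_type)
{
    const char *field;
    const int *p;

    assert(parm != NULL);
    field = (const char *)parm + offset;
    if (is_s_type)
        return (const int *)(const void *)field;    /* no indirection */
    memcpy(&p, field, sizeof p);    /* follow the pointer in the field */
    return p;
}
```

Everything after locating the data can then be shared
between the two kinds of entry.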
Flags field
Besides encoding the class, the pe_flags field also
contains a few possible flags. The main one is FL_OPTIONAL,
which means the ASN.1 type corresponding to this entry
is OPTIONAL. Consequently when encoding it has to
determine if the type is present in the user data,
possibly using the bit map as described under the OPTL
entry. Likewise when decoding it may have to set a bit
in the bit map appropriately. The other flag at the
moment is FL_DEFAULT, which means the entry corresponds
to an ASN.1 DEFAULT type. This bit is still needed as
not all types have DFLT_* entries implemented for them
at the moment. In particular compound value things
like SEQUENCE and SET can't have their default value
specified. This is consistent with ISODE; in fact
implementing that may even break existing ISODE code.
A last flag, FL_IMPLICIT, is obsolete and not used
anywhere.
1.2. Walk through of pepsy library routines.
Here we walk through all the pepsy library routines at
least briefly. If any new routines are added or a routine
changed this documentation is the most likely part that will
need changing. First we give some theory as to how the task
has been broken into routines, then describe each
function in detail. We assume you are familiar with ISODE's
PE data structure manipulation routines. If not, they are
documented in the ISODE manuals, Volume one, chapter 5,
"Encoding of Data-Structures" (it actually covers decoding
as well).
1.2.1. Overview of pepsy library
Each separate task is put into a different file. So
all the encoding stuff is in enc.c, all the decoding stuff
is in dec.c, printing stuff in prnt.c and freeing stuff in
fre.c. Actually it breaks down a little in practice; some
of the routines for moving around the tables are used in
both enc.c and dec.c, for example. Probably they should be
defined in util.c so that linking one of the files from the
library doesn't force linking any other except util.o.
There is a common structure to each of the major files
as well. There is a main routine which the user calls to
obtain the services provided by that file's routines. As
all the files revolve about processing the table entries
their structure is based on running through the table
entries.
We shall call each array of entries a table or an
object. There is a routine, usually with a name ending in
_obj, which is designed to process an object. For example
en_obj is the routine called to generate an encoded object.
Then there are routines to call on each compound type, such
as en_seq to encode a SEQUENCE. Finally all the primitives
are handled by one function whose name ends in _type. This
lets each routine concentrate on handling the features
particular to its type and call the appropriate routine to
handle each type it finds within its compound type.
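The three layers described above can be sketched roughly as
follows. This is only an illustration: the sk_ names, the
entry codes and the idea of writing a trace string instead
of building a PE are all assumptions made for the sketch and
are not taken from the pepsy sources.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical entry codes; the real pepsy tables use their own. */
enum sk_kind { SK_START, SK_END, SK_SEQ_START, SK_INT, SK_BOOL };

struct sk_entry { enum sk_kind kind; };

static const struct sk_entry *sk_seq(const struct sk_entry *p, char *out);

/* The "_type" layer: one function covers every primitive type. */
static const struct sk_entry *sk_type(const struct sk_entry *p, char *out)
{
    strcat(out, p->kind == SK_INT ? "int " : "bool ");
    return p + 1;
}

/* Hand one entry to the compound or the primitive handler. */
static const struct sk_entry *sk_one(const struct sk_entry *p, char *out)
{
    return p->kind == SK_SEQ_START ? sk_seq(p, out) : sk_type(p, out);
}

/* The compound layer: walk the members up to the matching SK_END. */
static const struct sk_entry *sk_seq(const struct sk_entry *p, char *out)
{
    strcat(out, "seq( ");
    for (p++; p->kind != SK_END; )
        p = sk_one(p, out);
    strcat(out, ") ");
    return p + 1;
}

/* The "_obj" layer: entry point for a whole table.
 * The caller must supply a buffer large enough for the trace.
 * Returns 0 on success and -1 if the table has no start marker. */
int sk_obj(const struct sk_entry *table, char *out)
{
    const struct sk_entry *p = table;

    assert(table != NULL && out != NULL);
    out[0] = '\0';
    if (p->kind != SK_START)
        return -1;
    for (p++; p->kind != SK_END; )
        p = sk_one(p, out);
    return 0;
}
```

Each handler returns the entry following the part of the
table it consumed, so a compound handler never needs to know
how long its members are.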
Most of these table processing routines have just three
arguments, which are called parm, p and mod. The parm is a
char * in the encoding routines and a char ** in the
decoding routines. When encoding it points to the user's C
structure that the data to be encoded is taken from. When
decoding it is the address of a pointer which is made to
point to the C structure filled with the decoded data. The
freeing routines, which are based on the decoding routines,
take a char **, while the printing routines don't look at
the user's data and so don't have such a pointer. The p
points to the table entry we are currently processing and
the mod argument points to the modtyp structure for the
module we are currently processing.
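The difference between the char * and char ** forms of parm
can be shown with a small sketch. The structure and the
sk_ functions here are hypothetical stand-ins for the
encoding, decoding and freeing routines, not the real ones.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user structure. */
struct sk_user { int value; };

/* Encode style: parm points at the user's structure, read only. */
int sk_read(const char *parm)
{
    assert(parm != NULL);
    return ((const struct sk_user *)parm)->value;
}

/* Decode style: parm is the address of a pointer, which is made
 * to point at a newly allocated, filled-in structure.
 * Returns 0 on success and -1 if allocation fails. */
int sk_fill(char **parm, int v)
{
    struct sk_user *u = malloc(sizeof *u);

    if (u == NULL)
        return -1;
    u->value = v;
    *parm = (char *)u;
    return 0;
}

/* Free style: same char ** shape as decoding. */
void sk_free(char **parm)
{
    free(*parm);
    *parm = NULL;
}
```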
All these processing routines return a PE type, which
is defined in ISODE's file _\bh/_\bp_\bs_\ba_\bp._\bh, and are meant to return
zero if they have an error, although not all of them do. In
fact the error handling needs some work and has not been
tested very well. Generally it tries to print out the table
entry where something went wrong and the name of the
function it was in. It then sometimes calls exit, which may
not be very pleasant for the user.
_\b1._\b2._\b2. _\bT_\bh_\be _\be_\bn_\bc_\bo_\bd_\bi_\bn_\bg _\br_\bo_\bu_\bt_\bi_\bn_\be_\bs - _\be_\bn_\bc._\bc
enc_f This is the routine made available to the user for
the encoding routines. It is fairly simple as it
leaves all the hard things up to other routines. All
it does is use the type number and modtyp pointer to
get a pointer to the table for encoding that type.
It first does a consistency check, making sure the
first entry in the table is a PE_start. Then it calls
the table or object encoding routine, en_obj, on that
object. Note that it returns an integer (OK or
NOTOK) instead of a PE pointer. This is to be
consistent with ISODE functions.
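A much reduced sketch of this entry point is given below. It
assumes a module structure holding an array of tables and a
PE that is just one integer; the real enc_f takes more
arguments and the real modtyp and PE are richer. All sk_
names and the numeric codes are invented for the sketch.

```c
#include <assert.h>
#include <string.h>

#define SK_OK      0
#define SK_NOTOK (-1)

/* Hypothetical entry codes. */
enum { SK_PE_START = 1, SK_PE_END, SK_INTEGER };

struct sk_entry  { int pe_type; };
struct sk_pe     { int value; };
struct sk_modtyp { const struct sk_entry *const *tables; int ntables; };

/* Object encoder: walks the entries after the start marker and
 * returns the PE it filled in, or NULL on an unknown entry. */
static struct sk_pe *sk_en_obj(const char *parm, const struct sk_entry *p,
                               struct sk_pe *pe)
{
    for (p++; p->pe_type != SK_PE_END; p++) {
        if (p->pe_type != SK_INTEGER)
            return NULL;
        memcpy(&pe->value, parm, sizeof pe->value);
    }
    return pe;
}

/* User entry point: find the table for type number typ, check its
 * first entry, then leave the real work to sk_en_obj. Returns an
 * integer status rather than the PE pointer. */
int sk_enc_f(int typ, const struct sk_modtyp *mod, struct sk_pe *pe,
             const char *parm)
{
    const struct sk_entry *p;

    assert(mod != NULL && pe != NULL && parm != NULL);
    if (typ < 0 || typ >= mod->ntables)
        return SK_NOTOK;
    p = mod->tables[typ];
    if (p->pe_type != SK_PE_START)
        return SK_NOTOK;
    return sk_en_obj(parm, p, pe) != NULL ? SK_OK : SK_NOTOK;
}
```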
en_obj We loop through the entries until we come to the end
of the table and then we return the PE we have built up
from the user's data which is pointed to by parm. In
looping through each entry we call the appropriate rou-
tine to encode its data. The default case is handled
by calling en_type which takes care of all the primi-
tive types.
The macro NEXT_TPE sets its argument to point to the
next type in the table, counting compound types as one type.
Thus if NEXT_TPE is called on a SET_START it will skip all
the entries up to and including the matching PE_END. As
many objects consist of one compound type and its components
the main loop will only be run through once. Even when the
object is not based on a compound type it will then consist
of one simple type which is processed by en_type, again
probably going through the loop only once. In fact the only
way it can go through the loop more than once is to process
entries that are subsidiary to the main type, e.g. ETAG entries
and things like that. To double check this is the case
there is some code that looks for the processing of more
than one data generating entry.
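The skipping behaviour described for NEXT_TPE can be
sketched as a function. Note the assumptions: the real
NEXT_TPE is a macro that updates its argument, whereas this
hypothetical sk_next returns the new pointer, and the entry
codes are invented. The table is assumed to be well formed,
with every compound start matched by an end entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry codes. */
enum { SK_PE_START, SK_PE_END, SK_SEQ_START, SK_SET_START, SK_INTEGER };

struct sk_entry { int pe_type; };

static int sk_is_compound(int t)
{
    return t == SK_SEQ_START || t == SK_SET_START;
}

/* Return the entry after the type that starts at p, counting a
 * compound type and everything inside it as a single type. */
const struct sk_entry *sk_next(const struct sk_entry *p)
{
    int depth;

    assert(p != NULL);
    if (!sk_is_compound(p->pe_type))
        return p + 1;

    /* Skip up to and including the matching end entry. */
    depth = 1;
    for (p++; depth > 0; p++) {
        if (sk_is_compound(p->pe_type))
            depth++;
        else if (p->pe_type == SK_PE_END)
            depth--;
    }
    return p;
}
```

The depth counter is what lets a nested compound type inside
the one being skipped be passed over without stopping at its
end entry.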
Much of that testing could probably be eliminated with
no loss. Similarly, perhaps the IMP_OBJ and ETAG could be
handled by the default action of calling en_type. As these
routines have evolved through many changes, there are things
like that which really need to be looked at closely before
trying. The comment /*SUPRESS 288*/ tells the Saber C
debugging tool that we use to suppress warning 288.
en_type
This is one of the longest functions as it has so many
cases to handle. It is again structured as a loop over
the types until PE_END, but it actually returns as soon
as it has encoded the next type. We can now look at
the encoding of the primitive ASN.1 types in detail.
DFLT_F Because we have arranged, for encoding tables,
to precede the entry with a DFLT_F entry, we can
neatly handle all the default cases. All we do is
check if the parameter passed in the user data, in
parm, is the same as the default value specified in the
DFLT_F entry. The function same performs this check.
If it is the same we don't encode anything and just
return; otherwise we continue on and encode it.
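A minimal sketch of the default check for an integer field
follows. It assumes, purely for illustration, that the
DFLT_F entry keeps the default value in pe_ucode and that
the following entry keeps the field's byte offset there; the
real layout of the entries and the real signature of same
may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry codes. */
enum { SK_DFLT_F, SK_INTEGER };

/* pe_ucode holds the default value in a DFLT_F entry and the
 * field's byte offset in an INTEGER entry (an assumption). */
struct sk_entry { int pe_type; int pe_ucode; };

struct sk_user { int version; };

/* Stand-in for same(): is the user's value equal to the default? */
static int sk_same(const struct sk_entry *field,
                   const struct sk_entry *dflt, const char *parm)
{
    int v;

    memcpy(&v, parm + field->pe_ucode, sizeof v);
    return v == dflt->pe_ucode;
}

/* Returns 1 if a value was written to *out, or 0 if it was
 * left out because it equals the default. */
int sk_en_dflt(const char *parm, const struct sk_entry *p, int *out)
{
    assert(parm != NULL && p != NULL && out != NULL);
    if (p->pe_type == SK_DFLT_F) {
        if (sk_same(p + 1, p, parm))
            return 0;
        p++;                    /* move on to the field entry */
    }
    memcpy(out, parm + p->pe_ucode, sizeof *out);
    return 1;
}
```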
ETAG To handle explicit tags we merely allocate a PE with
the right tag and call en_etype to encode its contents,
which are in the following entries. The switch on the
pe_ucode field used to make a difference but now it is
meaningless and should be cleaned up.
SEQ_START, SEQOF_START, SET_START, SETOF_START
We merely call the appropriate function to handle them.
Note one _\bi_\bm_\bp_\bo_\br_\bt_\ba_\bn_\bt difference in the way they are
called here from that in en_obj: the parm argument is
used as a base to index off and fetch a new pointer to
pass to the next function. This seemingly bizarre action
is quite straightforward when seen written as it normally
is in C, "parm->offset", where the field offset is
a pointer which lies p->pe_ucode bytes from the start of
the structure.
This is the magic of how we access all the different
fields of the C data structures with the one piece of code.
It is also perhaps the most critical dependency of the whole
system on the implementation of the C language. As the GNU
C compiler supports this feature, the code can be compiled
on most machines. But any porters should pay attention to
this to ensure that their compiler is happy generating these
offsets and compiling these casts properly.
The reason why this is different from the calls in
en_obj is that this is not the first compound type in the
table. The first and only the first does not have an offset
and does not need to be indirected through any pointers.
All the compound types inside this type will have as their
field a pointer which points to a structure. From here on
we shall say _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn to mean adding the pe_ucode
field to the pointer to the structure and using the result
to reference a pointer. Whether to use _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn or not is a
very important matter that really needs to be understood in
order to understand how the routines are structured.
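Indirection can be sketched as below. The structures and
sk_indirect are hypothetical. The text describes doing this
with a cast; the sketch copies the pointer out with memcpy
instead, which avoids relying on the compiler's treatment of
the cast but still assumes that all object pointers share
one representation, as they do on common machines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user structures: the outer one holds a pointer
 * to the inner one, as a nested compound type would. */
struct sk_inner { int n; };
struct sk_outer { int id; struct sk_inner *inner; };

/* pe_ucode is the byte offset of the pointer field. */
struct sk_entry { int pe_type; int pe_ucode; };

/* Indirection: step pe_ucode bytes into the structure at parm
 * and fetch the pointer stored there. */
char *sk_indirect(const char *parm, const struct sk_entry *p)
{
    void *field;

    assert(parm != NULL && p != NULL);
    memcpy(&field, parm + p->pe_ucode, sizeof field);
    return field;
}
```

Without indirection the routine would simply be handed parm
itself, which is what happens for the first compound type in
a table.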
IMP_OBJ
Here we have to handle the case where we can encode the
object then have to change its tag and class after
encoding. At the end of this entry this is done very
simply by assigning the right values to the appropriate
fields after the object has been built. This means
that if the intermediate form is altered this piece of
code may have to be altered as well. There seems to be
no better way of handling this.
The complication in handling this field is the handling
of all the possible types of object. If it is an external
object we have to perform a call to enc_f with all the right
arguments, whereas a normal OBJECT, the last else branch,
requires a normal call to en_obj. Note the case of SOBJECT
is the same as OBJECT _\be_\bx_\bc_\be_\bp_\bt _\bt_\bh_\be_\br_\be _\bi_\bs _\bn_\bo _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn.
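The retagging step for IMP_OBJ can be sketched as follows.
The PE here is a three field stand-in and the entry layout
is invented for the sketch; only the idea of encoding first
and then overwriting the class and tag comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* ASN.1 class numbers and the universal tag for INTEGER. */
enum { SK_CLASS_UNIV = 0, SK_CLASS_CONT = 2 };
enum { SK_TAG_INTEGER = 2 };

/* Hypothetical, much reduced presentation element. */
struct sk_pe { int pe_class; int pe_id; int value; };

/* Hypothetical IMP_OBJ entry carrying the implicit class and tag. */
struct sk_entry { int pe_class; int pe_tag; };

/* Stand-in for encoding the object with its own universal tag. */
static void sk_en_obj(struct sk_pe *pe, int value)
{
    pe->pe_class = SK_CLASS_UNIV;
    pe->pe_id = SK_TAG_INTEGER;
    pe->value = value;
}

/* Encode the object normally, then overwrite its class and tag
 * with the ones the implicit tagging calls for. */
void sk_en_imp_obj(struct sk_pe *pe, const struct sk_entry *p, int value)
{
    assert(pe != NULL && p != NULL);
    sk_en_obj(pe, value);
    pe->pe_class = p->pe_class;
    pe->pe_id = p->pe_tag;
}
```

Because the fields are assigned directly, code of this shape
is tied to the layout of the intermediate form.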
SOBJECT and OBJECT
Here is the code that handles the two cases separately.
It is exactly as in the IMP_OBJ case except separated
out. Note the only difference between the two cases is
the lack of indirection in the SOBJECT case.
CHOICE_START
This is handled exactly like all other compound types,
such as SEQ_START and OBJECT: we call the appropriate
routine with indirection. From reading the ISODE manuals,
the ASN.1 CHOICE type is handled by a structure of its
own like the other compound types.
EXTOBJ and SEXTOBJ