usr/src/contrib/isode/pepsy/doc/DESCRIPTION



1.  Overview of pepsy system

     This section describes how the various parts fit
together to make the system work.  The principle behind
pepsy is fairly simple.  The ASN.1 is summarised as tables
of integers.  These tables are read by driver routines which
encode or decode data to or from the internal format that
the ISODE OSI implementation uses.  In ISODE, specific
functions are generated for each ASN.1 type defined; in
contrast, pepsy merely generates a new table of data, which
is far smaller.

     As there is a great deal of effort invested in the
ISODE interface to the encoding/decoding routines, pepsy
automatically provides macros which map the original
functions into the appropriate function call of a driver.
This allows existing code that uses posy to switch to the
pepsy system with no changes, provided no function pointers
to the original ISODE functions are used.  Even when
function pointers are used, the changes are very simple and
take only a few hours to implement.

1.1.  Brief description of the use of the pepsy system.

1.1.1.  Outline of the files produced under the pepsy system.

     The pepsy system consists of a program, called posy at
the moment, which translates ASN.1 modules into a set of
tables, and a library of driver routines, called libpepsy.a.
Running this posy program on the ASN.1 file will produce
several files.  If the name of the ASN.1 module is MODULE,
the following files are generated:

MODULE-types.h
     which contains C structure definitions.  The user of
     the library provides data as a linked list of these C
     data structures and expects to receive data back as a
     similar linked list.  These data structures are exactly
     the same as those produced by the original ISODE posy,
     so that existing software written for the old posy
     program needs no change.  For details on the C data
     structure types generated, see the documentation of the
     original posy program in volume 4, Chapter 5 of the
     ISODE manuals.

MODULE_tables.c
     This file contains the tables generated by the new posy
     program.  These tables consist of three parts.  The
     first contains the summary of the ASN.1 types.  Each
     type is summarised as an array of a primitive type,
     struct pte for encoding and decoding, and struct ptpe
     for printing.  As implied, there is one array for each
     type for each of encoding, decoding and printing, as


                      January 23, 1990


                           - 2 -


     specified when posy is run.  The next part contains up
     to three tables of pointers to these arrays.  Each of
     the three different types of arrays, encoding, decoding
     and printing, has its own table of pointers.  Finally
     there is the module type definition, which contains
     pointers to these tables and some other useful
     information about the module, such as its name.  This
     module type structure, which is typedefed to modtyp, is
     the only piece of data which is global; all the rest of
     the data is static and is only addressable via the
     modtyp data structure.  This provides a kind of object
     oriented approach to handling the tables.  Once you are
     passed a pointer to an ASN.1 module's modtyp structure
     you can encode, decode and print any of its types by
     calling the appropriate libpepsy.a routine with its
     type number.

MODULE_pre_defs.h
     This file contains a #define which maps a symbol for
     each of the ASN.1 types to its type number, which is
     used when calling a libpepsy.a routine.  Each symbol is
     _Ztype-nameMODULE where type-name is the name of the
     type with dashes (-) turned into underscores (_) and
     MODULE is the name of the module.  For example, the
     ASN.1 universal type GraphicString would have the
     #define symbol _ZGraphicStringUNIV.  The _Z is
     prepended to try to make the symbols unique.  This file
     also contains an extern declaration for the modtyp data
     for its module.

MODULE_defs.h
     This file contains macros for all the encoding,
     decoding and printing functions that the pepy program
     would have generated for these ASN.1 types.  This
     allows much of the code that uses the routines
     generated by running the old posy program and then
     running pepy on its augmented ASN.1 output to be
     recompiled unchanged.  If the code used pointers to
     these functions, it is necessary to change it to pass
     around the type numbers instead and to call the
     appropriate libpepsy.a library routine with the type
     number.  As pointers to the printing routines in ISODE
     are passed as arguments, a #define is provided to turn
     the argument into the pair of arguments, type number
     and pointer to modtyp structure, which are needed.
     This allows the diagnostic printing code to work with
     no change for the current ISODE stack.  This file also
     contains a #include of the MODULE_pre_defs.h file.
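
The symbol naming rule described for MODULE_pre_defs.h can be sketched in C as follows.  The helper name make_type_symbol and its buffer handling are assumptions made for illustration and are not part of pepsy; only the _Z prefix, the dash to underscore mapping and the module suffix come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the #define symbol for an ASN.1 type:
 *   "_Z" + type name with '-' replaced by '_' + module name.
 * Hypothetical helper, not pepsy code.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int make_type_symbol(const char *type_name, const char *module,
                     char *out, size_t outlen)
{
    size_t n, need, i;

    assert(type_name != NULL && module != NULL && out != NULL);
    n = strlen(type_name);
    need = 2 + n + strlen(module) + 1;      /* prefix, name, module, NUL */
    if (need > outlen)
        return -1;

    out[0] = '_';
    out[1] = 'Z';
    for (i = 0; i < n; i++)
        out[2 + i] = (type_name[i] == '-') ? '_' : type_name[i];
    strcpy(out + 2 + n, module);
    return 0;
}
```

Called with "GraphicString" and "UNIV" this fills the buffer with _ZGraphicStringUNIV, the symbol quoted in the text.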

     As the MODULE-types.h file #includes the MODULE_defs.h
file, no further #includes need to be added to the files
using the encoding/decoding/printing functions.  This means
that code written to use the posy/pepy system may need no
change at all, and the only effort required is to change the
Makefile to use the pepsy system.  If code changes are
required, it is most likely because function pointers are
used to reference the functions generated by posy.  If only
the pepy system was used (not posy then pepy), with code
placed inside action statements, then quite a large amount
of work may be needed to change over to the new system,
depending on how large and complex the pepy module is.

1.1.2.  Outline of the pepsy library.

enc.c
     This contains the routines that encode data from the C
     data structures into ISODE's PElement linked list data
     structure, which it uses for all presentation data.
     The most important function for pepsy users is enc_f,
     which is called to encode a particular type.  It is
     passed the type number and a pointer to the modtyp
     structure for that module, followed by the arguments
     which are passed to an encode function generated by the
     posy/pepy system.  See the documentation in Volume 4,
     "The Applications Cookbook", Section 6.4 called "Pepy
     Environment".  Most of these latter arguments are
     ignored; only parm and pe are used.

     Contrary to what the ISODE documentation says, these
ignored parameters are hardly ever used by existing code.
We have not found a single case where they are used for
encoding a named type, which is all that the user can
reference anyway, so we don't see any problems with ignoring
these other parameters.  Hopefully one day they can be
thrown away entirely; until then they are still passed to
the encoding function.
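
To make the calling convention concrete, here is a minimal sketch of how a macro could forward an old style encode call to a driver that takes the type number and modtyp pointer first.  Everything prefixed demo_ or DEMO_, the numeric value of the type number, and the exact parameter list are stand-ins invented for this sketch; the real declarations live in the ISODE and pepsy headers and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations; the real ones come from the ISODE/pepsy headers. */
typedef struct demo_modtyp { const char *md_name; } demo_modtyp;
typedef struct demo_pe *DEMO_PE;          /* placeholder for ISODE's PE    */

#define _ZGraphicStringUNIV 3             /* type number, value invented   */
static demo_modtyp demo_UNIV_mod = { "UNIV" };

static int last_type_number = -1;         /* records what the driver saw   */

/* Stand-in for the encoding driver: type number and module come first,
   followed by the arguments of the old generated function. */
int demo_enc_f(int typ, demo_modtyp *mod, DEMO_PE *pe,
               int explicit_tag, int len, char *buffer, char *parm)
{
    assert(mod != NULL);
    (void) pe; (void) explicit_tag; (void) len; (void) buffer; (void) parm;
    last_type_number = typ;
    return 0;
}

/* The kind of macro a generated defs file could supply so that old call
   sites keep compiling unchanged. */
#define encode_UNIV_GraphicString(pe, top, len, buffer, parm) \
    demo_enc_f(_ZGraphicStringUNIV, &demo_UNIV_mod, pe, top, len, buffer, \
               (char *) (parm))
```

A macro of this kind only helps where the function is called by name; code that takes the address of the old function has nothing to point at, which is why such code must be changed to pass type numbers around.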

     The rest of the functions are mostly recursive routines
which encode a particular type of table entry.  For example
SEQUENCE is encoded by en_seq, which may call itself or
others to encode the types from which it is built up.  The
function en_type builds up a simple type, en_obj encodes a
new type (object), and so on with other functions.  There
are also a few utility routines in the file, such as same,
which determines whether the value is the same as the
default value.

dec.c
     This file contains the decoding routines that translate
     presentation data into the C data structures defined in
     MODULE-types.h.  It is very much like the file enc.c
     except that the routines do the reverse tasks.  The
     routines are structured in a very similar way.  We have
     dec_f, which is called by the user to decode a type
     and, like enc_f, takes the same arguments as the
     decoding functions generated by posy with two
     additions, the type number and a pointer to the modtyp
     structure for that module.  Likewise the other
     functions are very much like those of enc.c.

prnt.c
     This file contains the routines that print the
     presentation data in a format similar to that generated
     by pepy's printing functions.  Its main function,
     prnt_f, takes the same arguments as the printing
     function generated by pepy as well as the now familiar
     type number and modtyp pointer.  The functions are
     modeled on the decoding routines as they have a similar
     job to do.  The only difference is that instead of
     storing the decoded data into a C data structure it is
     nicely printed out.

fr.c This file contains code to free the data structures
     defined in MODULE-types.h.  If the -f flag is given
     when generating the types file, the types file also
     includes macros which replace the freeing functions
     generated by ISODE's posy.  The function that the user
     calls is fre_obj, which takes a pointer to the data
     structure, its decoding table entry and a pointer to
     the modtyp structure for the module.  The freeing is
     based on the decoding routines, except that instead of
     decoding all it does is free each part of the data
     structure, which might involve recursive calls, and
     then free the data structure itself at the end.

util.c
     This contains the utility routines used by more than
     one of the above files.  These are mostly diagnostic
     routines at the moment; more general routines could be
     included here.  At the moment, if there is an error
     which it can't recover from, it just prints out a
     message on standard error and calls exit.  This is not
     perfect and is something that will need work.

main.c
     This contains code to perform a series of tests on the
     pepsy library, which is a useful check to see whether
     any of the routines has been broken by any changes
     made.  It basically loops through a whole series of
     test cases.  Each test case is encoded from some built
     in test data and then decoded and checked to see if the
     data has changed in the transfer.  If it is compiled
     with -DPRNT=1 the encoded data is also printed out to
     check the printing routines, which generates a vast
     amount of output.  Finally the free routines are used
     to free the allocated data.  Although the program
     cannot directly check whether the free routines work,
     it can be used with a malloc tracing package to check
     that they do.

test_table.h
     This contains the test cases that the main.c program
     runs.  Each entry in the table corresponds to a type.
     One of the fields is a count of how many times that
     type is to be tested, to try out the different possible
     data values it might have.

pep.h and pepdefs.h
     These files contain the definition of  types  used  for
     the  tables  that  drive the encoding/decoding/printing
     routines.  All the constants used  in  that  table  are
     defined  here  via  #defines.   The modtyp structure is
     defined in _\bp_\be_\bp_\bd_\be_\bf_\bs._\bh.

t1.py and t2.py
     These are test ASN.1 modules that are used by the
     main.c routines to check the pepsy library.  The file
     t1.py contains the majority of the different types,
     with a few from a different module provided in t2.py.
     This allows the testing of the code for handling ASN.1
     external references, i.e. references to types defined
     in other, external, modules.

1.1.3.  New files in the pepy directory

etabs.c, dtabs.c and ptabs.c
     These files contain the code to generate the
     encoding/decoding/printing tables.  The main routine in
     etabs.c is tenc_typ, which is called on each ASN.1 type
     to generate an array of entries which describe how to
     encode that type.  See the details section for more
     information about how the table entries function.
     Similarly dtabs.c contains the routine tdec_typ, which
     is called on each type to generate its decoding table
     entries.  Likewise the tprnt_typ routine, which is in
     ptabs.c, generates the arrays of table entries for the
     printing tables.

dfns.c
     This file contains miscellaneous string handling
     routines and hash table routines that don't really
     belong anywhere else.  Some of the routines could be
     cleaned up, in that they tend not to free memory they
     use.

mine.h
     This file contains the definitions for the hash
     table(s) that are used to keep track of the ASN.1
     types.  This could probably be done without a hash
     table; should anyone want to clean this up, feel
     welcome.  The lookup function is in dfns.c.

pass2.h
     This file has most of the #defines for the table
     generating program.  Most of the prefixes and suffixes
     of function names and file names are defined here so
     that, hopefully, the names can be changed by merely
     changing the definition.  This contains most of the
     important definitions needed by the changes made to the
     posy program to generate tables.

posy.h
     This contains the definition of a symbol which is now
     needed outside of the main routine and the yacc file.
     By putting it here we can include it in any file that
     needs to know it, without putting it in any that
     doesn't need it and without including all the other
     definitions that occur in pepy.h.

The structure and meaning of the tables generated from the ASN.1 grammar

     Each collection of ASN.1 grammar is called a module.
(See ASN.1 )  Each ASN.1 module is completely specified in
the program by a single C structure of type modtyp and the
data which it references.  See the pepdefs.h file in the
pepsy directory.  For each ASN.1 module there are three
tables that are generated from the ASN.1 grammar.  These
initialised arrays, which we call tables, are the encoding,
decoding and printing tables.  Each of these tables is
referenced through a different pointer of the modtyp
structure.

     Each of these pointers references an array of pointers,
one pointer for each ASN.1 type defined in the module.  The
position of one of these pointers is the unique type number
we give to its corresponding type.  The pointer references
an array of type tpe or ptpe, depending on whether it is an
entry in the decoding/encoding tables or the printing tables
respectively.  See pep.h in the pepsy directory.  This array
actually contains the necessary information to
encode/decode/print that ASN.1 type.  So given the modtyp
structure of an ASN.1 module and a type number you can
call a routine to encode, decode or print that type.
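
The two levels of indirection just described can be sketched as below.  The structure layouts and all names beginning with demo_ or md_ are simplified assumptions made for this sketch; the authoritative definitions are those in pep.h and pepdefs.h.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the table entry types. */
typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } demo_tpe;
typedef struct { int pe_type, pe_offset, pe_tag, pe_flags;
                 const char *pe_name; } demo_ptpe;

/* Simplified stand-in for the per-module structure. */
typedef struct {
    const char  *md_name;     /* module name                            */
    int          md_ntypes;   /* number of types defined in the module  */
    demo_tpe   **md_etab;     /* encoding table, indexed by type number */
    demo_tpe   **md_dtab;     /* decoding table, indexed by type number */
    demo_ptpe  **md_ptab;     /* printing table, indexed by type number */
} demo_modtyp;

/* Return the encoding array for a type number, or NULL if the number is
   out of range or the module has no encoding table. */
demo_tpe *demo_enc_table(const demo_modtyp *mod, int typ)
{
    assert(mod != NULL);
    if (typ < 0 || typ >= mod->md_ntypes || mod->md_etab == NULL)
        return NULL;
    return mod->md_etab[typ];
}
```

Because the type number is simply an index into an array of pointers, the lookup costs a single array access whichever of the three tables is used.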

     The rest of this document assumes a good knowledge of
ASN.1 notation, so go read a copy if you haven't already.
From here on I shall mention only tpe, and this means tpe in
the case of encoding or decoding and ptpe in the case of
printing, unless otherwise stated.  Each type is represented
by an array of tpe (or ptpe for printing).  The basic
element consists of four integer fields; the printing table
is the same with an additional char pointer field which
contains the name corresponding to that entry in the ASN.1
grammar.  The first field specifies the type of the entry
and determines how the rest are interpreted.  The possible
types are listed in pepsy/pep.h.  Each type is an array
which starts with an entry of type PE_START and ends with
one of type PE_END.  Each primitive type requires one entry
to specify it, apart from the possible PE_START and PE_END
used to specify the start and end of the type.  Constructed
types are represented by a list of entries terminated by an
entry of type PE_END.  As ASN.1 types can be nested inside
one another, so will the representation in tpe entries be
nested.  For example the ASN.1 type definition:
           Example1 ::=
                   SEQUENCE {
                       seq1 SEQUENCE {
                                an-i INTEGER,
                                an-ostring OCTET STRING
                            },
                       a-bool IMPLICIT [0] BOOLEAN
                   }
     will generate an encoding array:
static tpe et_Example1Test[] = {
        { PE_START, 0, 0, 0 },
        { SEQ_START, 0, 16, FL_UNIVERSAL },
        { SEQ_START, OFFSET(struct type_Test_Example1, seq1), 16, FL_UNIVERSAL },
        { INTEGER, OFFSET(struct element_Test_0, an__i), 2, FL_UNIVERSAL },
        { OCTETSTRING, OFFSET(struct element_Test_0, an__ostring), 4, FL_UNIVERSAL },
        { PE_END, 0, 0, 0 },
        { BOOLEAN, OFFSET(struct type_Test_Example1, a__bool), 0, FL_CONTEXT },
        { PE_END, 0, 0, 0 },
        { PE_END, 0, 0, 0 }
        };


     Here the second last PE_END matches and closes off the
first SEQ_START.  The entries which correspond to the other
primitive types are pretty obvious, with the INTEGER entry
corresponding to the primitive INTEGER.  For entries that
generate data, the general interpretation of the other three
fields is offset, tag and flags/class respectively.
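
The way PE_END entries pair up with the entries that open a group can be sketched with a small depth counter.  The enum values, the demo_tpe layout and the function name are invented for this sketch, and only SEQ_START and PE_START are treated as opening entries here; a real driver would treat the SET and OF variants the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Entry kinds; numeric values are invented for this sketch. */
enum { D_PE_START, D_PE_END, D_SEQ_START,
       D_INTEGER, D_OCTETSTRING, D_BOOLEAN };

typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } demo_tpe;

/* Given the index of an opening entry, return the index of the PE_END
   that closes it, or -1 if the table ends first. */
int demo_matching_end(const demo_tpe *tab, int ntab, int start)
{
    int depth = 0, i;

    assert(tab != NULL && start >= 0 && start < ntab);
    for (i = start; i < ntab; i++) {
        if (tab[i].pe_type == D_SEQ_START || tab[i].pe_type == D_PE_START)
            depth++;                       /* one level deeper        */
        else if (tab[i].pe_type == D_PE_END && --depth == 0)
            return i;                      /* back at starting level  */
    }
    return -1;
}
```

Applied to a table shaped like the Example1 array, the outer SEQ_START at index 1 pairs with the PE_END at index 7, which is the second last PE_END as stated above.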

offset
     The second field gives the offset in a C data structure
     needed to reference the data that corresponds to this
     table entry.  Each ASN.1 type has C structure types
     generated as described in the ISODE manuals, volume 4
     "The Applications Cookbook", Section 5.2, "POSY
     Environment".  As this offset may have to be determined
     in a compiler dependent manner, a C preprocessor macro
     is used to hide the actual details.

tag  This is the tag associated with the ASN.1 type for that
     entry.  Notice that in the example the IMPLICIT [0],
     which changes the tag associated with the BOOLEAN
     entry, means the entry has the correct tag of 0 in the
     table.  Likewise SEQUENCE has the correct tag of 16 in
     its SEQ_START entry, and so on for the others.

flags/class
     This contains the ASN.1 class associated with the
     entry's type.  In the example that is UNIVERSAL for all
     except the BOOLEAN type, which is CONTEXT class.  This
     fourth field can also contain flags that specify if the
     type is OPTIONAL or DEFAULT.  There is plenty of room
     here as there are only four possible classes.
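
As an illustration of the offset field, the sketch below uses the standard offsetof macro as one portable way to compute the offset, and then reaches a member of the user's structure through it.  DEMO_OFFSET, struct demo_example and demo_get_int are names invented here; pepsy's own OFFSET macro may be defined differently.

```c
#include <assert.h>
#include <stddef.h>

/* One portable way to define such a macro; an assumption of this sketch. */
#define DEMO_OFFSET(type, member) ((int) offsetof(type, member))

struct demo_example { int an_i; int a_bool; };

typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } demo_tpe;

/* Fetch the int that a table entry refers to inside the user's structure. */
int demo_get_int(const void *parm, const demo_tpe *p)
{
    assert(parm != NULL && p != NULL);
    return *(const int *) ((const char *) parm + p->pe_offset);
}
```

Keeping the computation behind a macro means the generated tables contain no layout knowledge of their own; the compiler that builds the tables fills in the numbers.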

     Now that you have some idea of how these arrays are
arranged for a type definition I will proceed to go through
the possible types of entries and describe what they do and
how they work.  These values are defined in pepsy/pep.h.
Those entries with a value below TYPE_DATA are entries that
don't correspond to data to be encoded/decoded and are for
other bookkeeping purposes.

PE_START and PE_END
     As explained above, PE_START marks the beginning of an
     ASN.1 type's array.  It probably isn't necessary, but
     the size of the tables is so small that it isn't much
     of an overhead to keep around for cosmetic reasons.
     The entry type PE_END is necessary to mark the end of
     some compound types as well as the end of the ASN.1
     data type.

XOBJECT and UCODE
     These are obsolete types and probably should be
     removed.  They were to allow C code written directly by
     the user to be incorporated into the encoding/decoding,
     but it was found unnecessary.  Perhaps some brave soul
     would like to use them in an attempt to implement a
     similar system based on pepy, which is what we first
     attempted to do until we found this to be much easier.

MALLOC
     This entry only occurs in the decoding tables.  It
     specifies how much space to malloc for the current C
     structure it is just inside of.  For instance in the
     example above the decoding table has the following
     entry:

      { MALLOC, 0, sizeof (struct type_Test_Example1), 0 },

     just after the first SEQ_START entry.  It tells the
     decoder to malloc a struct type_Test_Example1 structure
     to hold the data from the sequence when it is decoded.
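
A decoder could act on such an entry roughly as follows.  This is only a sketch: the demo_ names are invented, and the use of calloc to obtain zeroed storage is a choice made here rather than something the description above specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } demo_tpe;

struct demo_example1 { void *seq1; char a_bool; };

/* Allocate zeroed storage of the size recorded in the third field of a
   MALLOC-style entry.  Returns NULL if the allocation fails. */
void *demo_alloc_for_entry(const demo_tpe *entry)
{
    assert(entry != NULL && entry->pe_tag > 0);
    return calloc(1, (size_t) entry->pe_tag);
}
```

The caller owns the returned block and is expected to release it later, which is the job of the freeing routines in fr.c.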

SCTRL
     This entry is used in handling the ASN.1 CHOICE type.
     The C type generated for an ASN.1 CHOICE type is a
     structure with an offset field in it and a union of all
     the C types present in the CHOICE.  Each ASN.1 type in
     the CHOICE of types has a C type definition generated
     for it.  The union is of all these types, which is
     quite a logical way to implement a CHOICE type.  The
     offset field specifies which possibility of
     interpreting the union should be used (which member
     should be selected).  As such it needs to be read by
     the encoding routines when encoding the data from the C
     data structures and to be set by the decoding routines
     when decoding the data into the C data structures.
     There is one such entry for each CHOICE type to specify
     where the offset field is.
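
The shape of such a CHOICE structure, and the way a table entry can locate its selector, might look like this.  The structure, the selector constants and the two helper functions are all invented for illustration; see the ISODE manuals for what posy really generates.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CHOICE_INT   1     /* selector values, invented */
#define DEMO_CHOICE_BOOL  2

/* Shape of a C type for a CHOICE of INTEGER and BOOLEAN. */
struct demo_choice {
    int offset;                 /* selects the live union member */
    union {
        int  an_int;
        char a_bool;
    } un;
};

typedef struct { int pe_type, pe_offset, pe_tag, pe_flags; } demo_tpe;

/* Read the selector through the offset held in the SCTRL-style entry,
   as an encoder would before choosing which alternative to encode. */
int demo_read_selector(const void *parm, const demo_tpe *sctrl)
{
    assert(parm != NULL && sctrl != NULL);
    return *(const int *) ((const char *) parm + sctrl->pe_offset);
}

/* Set the selector, as a decoder would after recognising an alternative. */
void demo_write_selector(void *parm, const demo_tpe *sctrl, int which)
{
    assert(parm != NULL && sctrl != NULL);
    *(int *) ((char *) parm + sctrl->pe_offset) = which;
}
```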

CH_ACT
     Another redundant entry type.  I think this was also
     used in code to handle C statements or actions
     specified by the user.  It probably should be removed.

OPTL This is used to handle the optionals field that is
     generated by posy when optional types that are not
     implemented by pointers are present in the ASN.1 type.
     For example, if an ASN.1 type has an optional integer
     field, how does the encoding routine determine if the
     integer is to be present or not?  If it was implemented
     as a pointer it could use a NULL (zero) pointer to mean
     that the type was not present, because NULL is
     guaranteed never to occur as a legal pointer to a real
     object.  But all the possible values for an integer
     could be legally passed, so instead, for these types
     which are not pointers and are optional, a bit map is
     allocated in the structure.  Each non pointer optional
     type is allocated a bit from the bit map.

     If that bit is set the corresponding type is present,
and it is not present if the bit is not set.  Each bit has a
#define generated for it.  The bit map is merely an integer
field called "optionals", limiting the maximum number of
such optionals to 32 on Sun machines and 16 on some others.
(An array of char, as in BSD fd_sets, would have avoided all
such limits, not that this limit is expected to be exceeded
very often!)  Like the SCTRL entry this entry merely serves
to specify where this field is so it can be tested and set
by the encoding and decoding routines respectively.
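
A small sketch of the bit map scheme follows.  The structure, the bit names and the helper functions are invented for this example; only the idea of an integer field called optionals with one #define per bit comes from the text.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_OPT_AGE     (1 << 0)   /* one bit per optional member */
#define DEMO_OPT_ACTIVE  (1 << 1)

/* A structure with two optional members that are not pointers. */
struct demo_opts {
    int  optionals;                 /* the bit map */
    int  age;
    char active;
};

/* Encoder side: test whether an optional member is present. */
int demo_is_present(const struct demo_opts *p, int bit)
{
    assert(p != NULL);
    return (p->optionals & bit) != 0;
}

/* Decoder side: record that an optional member was found. */
void demo_set_present(struct demo_opts *p, int bit)
{
    assert(p != NULL);
    p->optionals |= bit;
}
```

Note that a value of zero in age says nothing about presence; only the bit does, which is exactly the ambiguity the bit map is there to resolve.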

ANY and CONS_ANY
     The C type corresponding to the entry is a PE pointer.
     To conform with pepy the tag and class of this entry
     are ignored, which may or may not be the most sensible
     thing.  CONS_ANY is a redundant symbol which means the
     same thing but is not used.  This should be cleaned up
     and removed.

INTEGER, BOOLEAN, BITSTRING, OCTETSTRING and OBJID
     These are just as described in the first article.  See
     the ISODE manual to find out which C data type is
     allocated to implement each of them.  The offset field
     says where to find this data type within the current
     structure.

SET_START, SETOF_START, SEQ_START and SEQOF_START
     These compound entries differ from the above in that
     they group all the following entries together up to the
     matching PE_END.  The entries with OF in them
     correspond to the ASN.1 types which have OF in them,
     e.g. SET OF.  Allowing the OF items to have an
     arbitrary number of entries is excessive flexibility;
     they can only have one type by the ASN.1 grammar rules.
     The C data type corresponding to them is either a
     structure, if it is the first such type in the array,
     or a pointer to a structure if it isn't.  This
     complicates the processing of these structures a little
     but not greatly.  The OF types differ in one other
     important way: they may occur zero, one or more times,
     with no upper bound.  To cope with this the C data type
     is a linked list structure.  The pointer to the data
     structure determines whether or not there is another
     occurrence of the type; if it is NULL there isn't.
     Thus each data structure has this pointer to the next
     occurrence.  The offset of this pointer is placed in
     the PE_END entry, where it can conveniently be used to
     determine whether or not to make another pass through
     the table entries.
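
The loop that an OF type implies can be sketched as below: the driver is given only the offset of the next pointer and keeps going until that pointer is NULL.  The list structure and function name are invented, and the sketch assumes, as is true on common platforms, that all object pointers share one representation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shape of a C type for "SEQUENCE OF INTEGER"; names invented. */
struct demo_intlist {
    int                  element;
    struct demo_intlist *next;      /* NULL marks the last occurrence */
};

/* Count occurrences by following the pointer found at next_offset, the
   way a driver could use an offset recorded with the closing PE_END. */
int demo_count_occurrences(const void *first, size_t next_offset)
{
    const char *p = first;
    int n = 0;

    while (p != NULL) {
        void *next;

        n++;                                        /* one more pass   */
        memcpy(&next, p + next_offset, sizeof next);
        p = next;                                   /* NULL ends loop  */
    }
    return n;
}
```

An empty list is simply a NULL first pointer, which matches the statement that an OF type may occur zero times.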

OBJECT
     When one type references another it generates an OBJECT
     entry.  This specifies the type number of the
     referenced type, which is held in the 3rd field of the
     tpe structure, pe_tag.  The 2nd field still gives the
     offset in the C data structure which specifies where
     the user's data for that type is to be found.  Usually
     this is a pointer to the C data structure for that
     type.

T_NULL
     This entry means the ASN.1 primitive type NULL.  It
     doesn't have any body and consequently has no offset,
     as it cannot carry data directly.  Only its absence or
     presence can mean anything, so if it is optional it
     sets or clears a bit in the bit map as described
     earlier for the OPTL entry.

T_OID
     This used to be used for Object Identifiers and is now
     unused; it should be got rid of.

OBJID
     This corresponds to the Object Identifier ASN.1
     primitive type.  It is implemented in the same way as
     other primitive types like INTEGER and OCTET STRING.

ETAG This entry gives the explicit tag of the following
     entry.  The usual fields which define class and tag are
     the only ones which have meaning in this entry.  By
     concatenating successive ETAG entries it is possible to
     build up an unlimited number of explicit tags, although
     this hasn't been tested yet.

IMP_OBJ
     If a type has an implicit tag, usually all we have to
     do is set its tag and class appropriately in its entry.
     This works for all but one important case, the
     reference of another type.  This is messy because we
     can't alter the definition of the type without wrecking
     it for the other uses.  So what we do for encoding is
     build the type normally and then, after it is built,
     change its tag and class to be the values we want.
     Similarly for decoding we match the tag and class up
     and then decode the body of the type.  We can't use an
     OBJECT entry for this because, among other reasons, the
     3rd field is already used to store the type number.
     (The fourth needs to be free to contain flags such as
     DEFAULT and OPTIONAL.)  So a new entry type, IMP_OBJ,
     is used to hold the tag and class.  It must be followed
     by an OBJECT entry, which is used to handle the type as
     normal; the IMP_OBJ entry gives the tag and class to be
     used.  Like the ETAG entry the IMP_OBJ affects the
     entry that follows it.
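
The build then retag approach for encoding can be illustrated with a toy element that carries only a class and a tag.  The demo_elem type, the class constants and both functions are invented stand-ins; the real code operates on ISODE presentation elements.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for an encoded element: class and tag only. */
typedef struct { int cls; int tag; } demo_elem;

enum { DEMO_UNIVERSAL = 0, DEMO_CONTEXT = 2 };

/* Encodes the referenced type with its own class and tag (here a
   SEQUENCE, universal tag 16), as an OBJECT entry alone would. */
void demo_encode_object(demo_elem *out)
{
    assert(out != NULL);
    out->cls = DEMO_UNIVERSAL;
    out->tag = 16;
}

/* IMP_OBJ-style handling: build the type normally, then overwrite the
   class and tag with the values held in the preceding entry. */
void demo_encode_imp_obj(demo_elem *out, int cls, int tag)
{
    demo_encode_object(out);
    out->cls = cls;
    out->tag = tag;
}
```

The referenced type's own table is never modified, so every other use of that type still encodes with its original tag.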

EXTOBJ and EXTMOD
     These handle external type references.  This is just
     like a normal (internal?) type reference except we must
     now specify which module as well as the type.
     Similarly, because there are no more free fields in the
     OBJECT entry, we need two entries to hold all the
     information we need.  The EXTOBJ occurs first and holds
     the type number, the offset into the C data structure
     and the flags, exactly as for an OBJECT entry.  The
     next entry, which must be an EXTMOD, contains a pointer
     to the modtyp structure for its module.  As with a
     normal OBJECT entry, to handle the case of an implicit
     tag an IMP_OBJ entry, which gives the class and tag,
     would occur before these two entries.  Likewise it
     could have an explicit tag, in which case the two
     entries would be preceded by an ETAG entry.

DFLT_F and DFLT_B
     When a type has a default value, to handle decoding and
     encoding properly you need to know its value.  As there
     is no space to store the value in most entries we
     allocate a whole entry to specify the value.  When
     encoding it is convenient to have the default occur
     before the entry it refers to.  This allows a single
     check to handle all the default encoding.  All it has
     to do is check whether the value is the same as the
     default value and if so not bother encoding the next
     type.  On the other hand, when decoding it is more
     convenient to have the entry after the one it refers
     to.  In this case we need to determine that the value
     is missing before we use the default value to determine
     the value to pass to the user.  To handle this we have
     entries of both types.  DFLT_F contains the default
     value for the following entry (F = Front) and DFLT_B
     contains that for the entry before it (B = Back).
     Consequently DFLT_F entries are only used in the
     encoding tables and DFLT_B entries are only used in the
     decoding (and printing) tables.
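
The two decisions can be reduced to a pair of tiny functions for an integer valued type.  Both function names are invented, and a NULL pointer is used here merely to represent an element that was absent from the received data.

```c
#include <assert.h>
#include <stddef.h>

/* Encoding side: a DFLT_F-style entry precedes the value's entry, so one
   comparison decides whether the next entry is encoded at all. */
int demo_should_encode(int value, int dflt)
{
    return value != dflt;
}

/* Decoding side: a DFLT_B-style entry follows the value's entry, so the
   default is applied only once the element is known to be absent. */
int demo_decoded_value(const int *decoded, int dflt)
{
    return decoded != NULL ? *decoded : dflt;
}
```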

S-Types
     These types are entries for the same ASN.1 type as the
     entry type formed by removing the starting `S'.  The
     above forms would do to handle ASN.1, but we also have
     to be compatible with the C data structures generated
     by posy.  The implementors decided to optimise the C
     data structures generated a little, which means we have
     to have all these S type entries.  If a type was a
     single field, in most cases they produced a #define
     which eliminates the need to have a whole structure
     just for that type.  In all the places where this type
     is used the field of the C structure is changed from a
     pointer to a field which holds the value directly in
     the structure.  See the ISODE reference given above for
     more details.

     We handle this by generating the same tables that would


                      January 23, 1990


                           - 12 -


be generated without the optimisation, except that the
optimised types use the S-type entries instead of the normal
ones.  For example an optimised OCTET STRING would have the
type field of its entry as SOCTETSTRING instead of
OCTETSTRING.  The only difference in how an S-type entry and
its corresponding normal entry are handled is how they find
the C data structure for that entry: for the S-type there is
no indirection through pointers.

Flags field
     Besides encoding the class, the pe_flags field also
     contains a few possible flags.  The main one is
     FL_OPTIONAL, which means the ASN.1 type corresponding
     to this entry is OPTIONAL.  Consequently when encoding
     the routine has to determine if the type is present in
     the user data, possibly using the bit map as described
     under the OPTL entry.  Likewise when decoding it may
     have to set a bit in the bit map appropriately.  The
     other flag at the moment is FL_DEFAULT, which means the
     entry corresponds to an ASN.1 DEFAULT type.  This bit
     is still needed as not all types have DFLT_* entries
     implemented for them at the moment.  In particular
     compound values like SEQUENCE and SET can't have their
     default value specified.  This is consistent with
     ISODE; in fact implementing that may even break
     existing ISODE code.  The last flag, FL_IMPLICIT, is
     obsolete and not used anywhere.


_\b1._\b2.  _\bW_\ba_\bl_\bk _\bt_\bh_\br_\bo_\bu_\bg_\bh _\bo_\bf _\bp_\be_\bp_\bs_\by _\bl_\bi_\bb_\br_\ba_\br_\by _\br_\bo_\bu_\bt_\bi_\bn_\be_\bs.

     Here we walk through all the pepsy library routines at
least briefly.  If any new routines are added or a routine
is changed this documentation is the most likely part that
will need changing.  First we give some theory as to how the
task has been broken into routines, then describe each
function in detail.  We assume you are familiar with ISODE's
PE data structure manipulation routines.  If not, they are
documented in the ISODE manuals, Volume one, chapter 5,
"Encoding of Data-Structures" (it actually covers decoding
as well).

_\b1._\b2._\b1.  _\bO_\bv_\be_\br_\bv_\bi_\be_\bw _\bo_\bf _\bp_\be_\bp_\bs_\by _\bl_\bi_\bb_\br_\ba_\br_\by

     Each separate task is put into a different file.  So
all the encoding stuff is in _\be_\bn_\bc._\bc, all the decoding stuff
is in _\bd_\be_\bc._\bc, printing stuff in _\bp_\br_\bn_\bt._\bc and freeing stuff in
_\bf_\br_\be._\bc.  Actually it breaks down a little in practice: some
of the routines for moving around the tables are used in
both _\be_\bn_\bc._\bc and _\bd_\be_\bc._\bc, for example.  Probably they should
be defined in _\bu_\bt_\bi_\bl._\bc so that linking one of the files from
the library doesn't force linking any other except _\bu_\bt_\bi_\bl._\bo.

     There is a common structure to each of the major  files
as  well.   There  is a main routine which the user calls to
obtain the services provided by that  file's  routines.   As
all  the  files  revolve  about processing the table entries
their structure  is  based  on  running  through  the  table
entries.

     We shall call each array of entries a table or an
object.  There is a routine, usually with a name ending in
_obj, which is designed to process an object.  For example
en_obj is the routine called to generate an encoded object.
Then there are routines to call on each compound type, such
as en_seq for encoding a SEQUENCE.  Finally all the
primitives are handled by one function whose name ends in
_type.  This lets each routine concentrate on handling the
features particular to its type and call the appropriate
routine to handle each type it finds within its compound
type.
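
     The division of labour between  the  three  kinds  of
routine can be sketched as follows.  This is only an illus-
tration under simplifying assumptions: the entry kinds, the
struct entry layout and the names sk_obj, sk_seq and sk_type
are hypothetical, and instead of building a PE from user
data the sketch just writes a short trace string showing the
order in which entries are visited.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified entry kinds; the real tables hold more. */
enum kind { K_START, K_INT, K_BOOL, K_SEQ_START, K_END };

struct entry { enum kind kind; };

/* Primitive types: emit one letter per entry. */
static void sk_type(const struct entry *p, char **out)
{
    *(*out)++ = (p->kind == K_INT) ? 'i' : 'b';
}

/* Compound type: p is at K_SEQ_START; returns the matching K_END. */
static const struct entry *sk_seq(const struct entry *p, char **out)
{
    *(*out)++ = '{';
    for (p++; p->kind != K_END; p++) {
        if (p->kind == K_SEQ_START)
            p = sk_seq(p, out);     /* nested compound type */
        else
            sk_type(p, out);
    }
    *(*out)++ = '}';
    return p;
}

/* Object: p is at K_START; writes a NUL terminated trace to out. */
void sk_obj(const struct entry *p, char *out)
{
    assert(p != NULL && p->kind == K_START);
    for (p++; p->kind != K_END; p++) {
        if (p->kind == K_SEQ_START)
            p = sk_seq(p, &out);
        else
            sk_type(p, &out);
    }
    *out = '\0';
}
```

Each routine only knows about its own level: sk_obj and
sk_seq hand every primitive to sk_type and every nested com-
pound type back to sk_seq.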

     Most of these table processing routines have just three
arguments, called parm, p and mod.  The parm argument is a
char * in the encoding routines and a char ** in the
decoding routines.  When encoding it points to the user's C
structure from which the data to be encoded is taken.  When
decoding it is the address of a pointer which is made to
point to the C structure filled with the decoded data.  The
freeing routines, which are based on the decoding routines,
have a char **, while the printing routines don't look at
the user's data and so don't have such a pointer.  The p
argument points to the table entry currently being processed
and the mod argument points to the modtyp structure for the
module currently being processed.

     All these processing routines return a PE type, which
is defined in ISODE's file _\bh/_\bp_\bs_\ba_\bp._\bh, and are meant to
return zero if they have an error, but do not always do so.
In fact the error handling needs some work and has not been
tested very well.  Generally it tries to print out the table
entry where something went wrong and the name of the
function it was in.  It then sometimes does an exit, which
may not be very pleasant for the user.

_\b1._\b2._\b2.  _\bT_\bh_\be _\be_\bn_\bc_\bo_\bd_\bi_\bn_\bg _\br_\bo_\bu_\bt_\bi_\bn_\be_\bs - _\be_\bn_\bc._\bc

enc_f
     This is the routine made available to the user for
     encoding.  It is fairly simple as it leaves all the
     hard things up to other routines.  All it does is use
     the type number and modtyp pointer to get a pointer to
     the table for encoding that type.  Then it calls the
     table or object encoding routine, en_obj, on that
     object.  Before doing so it does a consistency check,
     making sure the first entry in the table is a PE_START.
     Note that it returns an integer (OK or NOTOK) instead
     of a PE pointer.  This is to be consistent with ISODE
     functions.
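
     A minimal sketch of this lookup and consistency check
is given below.  The struct module is a hypothetical stand-in
for the modtyp structure, the names sk_enc_f, SK_OK and
SK_NOTOK are invented for the illustration, and the call to
en_obj is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified entry kinds and return codes. */
enum kind { K_PE_START, K_INTEGER, K_PE_END };
enum { SK_OK = 0, SK_NOTOK = -1 };

struct entry { enum kind kind; };

/* Hypothetical stand-in for modtyp: one encoding table per type. */
struct module {
    const struct entry *const *enc_tables;
    int                        ntypes;
};

/* Example data: a well formed table and one without its start entry. */
const struct entry good_tbl[] = { { K_PE_START }, { K_INTEGER }, { K_PE_END } };
const struct entry bad_tbl[]  = { { K_INTEGER }, { K_PE_END } };
const struct entry *const tables[] = { good_tbl, bad_tbl };
const struct module example_mod = { tables, 2 };

/* Find the table for typenum and check that it starts properly. */
int sk_enc_f(int typenum, const struct module *mod)
{
    const struct entry *p;

    if (mod == NULL || typenum < 0 || typenum >= mod->ntypes)
        return SK_NOTOK;
    p = mod->enc_tables[typenum];
    if (p == NULL || p->kind != K_PE_START)
        return SK_NOTOK;        /* consistency check failed */
    /* the object encoding routine would be called on p here */
    return SK_OK;
}
```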


en_obj
     We loop through the entries until we come to the end of
     the table and then we return the PE we have built up
     from the user's data, which is pointed to by parm.  In
     looping through each entry we call the appropriate
     routine to encode its data.  The default case is
     handled by calling en_type, which takes care of all the
     primitive types.

     The macro NEXT_TPE sets its argument to point to the
next type in the table, counting compound types as one type.
Thus if NEXT_TPE is called on a SET_START it will skip all
the entries up to and including the matching PE_END.  As
many objects consist of one compound type and its components
the main loop will only be run through once.  Even when the
object is not based on a compound type it will then consist
of one simple type which is processed by en_type, again
probably going through the loop only once.  In fact the only
way it can go through the loop more than once is to process
entries that are subsidiary to the main type, e.g. ETAG
entries and things like that.  To double check that this is
the case there is some code that looks for the processing of
more than one data generating entry.
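
     The skipping behaviour described for NEXT_TPE  can  be
sketched with a depth counter.  This is an assumption about
one way to do it, not the actual macro: next_type is a hypo-
thetical function that returns the new position rather than
modifying its argument, and the entry kinds are simplified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified entry kinds. */
enum kind { K_INT, K_SEQ_START, K_SET_START, K_END };

struct entry { enum kind kind; };

static int is_start(enum kind k)
{
    return k == K_SEQ_START || k == K_SET_START;
}

/* Return the entry following the type at p.  A compound type counts
   as one type, so everything up to and including its matching K_END
   is skipped.  p must not itself point at a K_END. */
const struct entry *next_type(const struct entry *p)
{
    int depth = 0;

    assert(p != NULL && p->kind != K_END);
    do {
        if (is_start(p->kind))
            depth++;            /* entering a compound type */
        else if (p->kind == K_END)
            depth--;            /* leaving a compound type */
        p++;
    } while (depth > 0);
    return p;
}
```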

     Much of that testing could probably be eliminated with
no loss.  Similarly perhaps the IMP_OBJ and ETAG could be
handled by the default action of calling en_type.  As these
routines have evolved through many changes there are things
like that which really need to be looked at closely before
trying.  The comment /*SUPRESS 288*/ tells the Saber C
debugging tool that we use to suppress warning 288.

en_type
     This is one of the longest functions as it has so many
     cases to handle.  It is again structured as a loop over
     the types until PE_END, but it actually returns as soon
     as it has encoded the next type.  We can now look at
     the encoding of the primitive ASN.1 types in detail.

DFLT_F
     Because we have arranged that, in the encoding tables,
     the entry is preceded by a DFLT_F entry, we can neatly
     handle all the default cases.  All we do is check
     whether the parameter passed in the user data, in parm,
     is the same as the default value specified in the
     DFLT_F entry.  The function same performs this check.
     If it is the same we don't encode anything and just
     return, otherwise we continue on and encode it.

ETAG To handle explicit tags we merely allocate a PE with
     the right tag and call en_etype to encode its contents,
     which are in the following entries.  The switch on the
     pe_ucode field used to make a difference but now it is
     meaningless and should be cleaned up.

SEQ_START, SEQOF_START, SET_START, SETOF_START
     We merely call the appropriate function to handle them.
     Note one _\bi_\bm_\bp_\bo_\br_\bt_\ba_\bn_\bt difference in the way they are
     called here from that in en_obj: the parm argument is
     used as a base to index off and fetch a new pointer to
     pass to the next function.  This seemingly bizarre
     action is quite straightforward when seen written as
     it normally is in C, "parm->offset", where the field
     offset is a pointer which lies p->pe_ucode bytes from
     the start of the structure.

     This is the magic of how we access all the different
fields of the C data structures with the one piece of code.
It is also perhaps the most critical dependency of the whole
system on the implementation of the C language.  As the GNU
C compiler supports this feature it is compilable on most
machines.  But any porters should pay attention to this to
ensure that their compiler is happy generating these offsets
and compiling these casts properly.

     The reason why this is different from the calls in
en_obj is that this is not the first compound type in the
table.  The first, and only the first, does not have an
offset and does not need to be indirected through any
pointers.  All the compound types inside this type will have
as their field a pointer which points to a structure.  From
here on we shall say _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn to mean adding the
pe_ucode field to the pointer to the structure and using the
result to reference a pointer.  Whether to use _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn
or not is a very important matter that really needs to be
understood to understand how the routines are structured.
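
     Indirection as defined here can be written out  in  a
few lines of C.  The structures parent and child, the cut
down struct entry and the function indirect are hypothetical
and only serve to show the technique; the sketch also assumes,
as the library itself must, that a pointer to a structure has
the same representation as a generic pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user structures in the style of generated C types. */
struct child  { int n; };
struct parent { int id; struct child *kid; };

/* Hypothetical cut down table entry: only the offset matters here. */
struct entry { size_t pe_ucode; };

/* The generic equivalent of "parm->kid": read the pointer stored
   pe_ucode bytes from the start of the structure parm points to.
   Assumes all structure pointers share the representation of void *. */
char *indirect(char *parm, const struct entry *p)
{
    void *field;

    assert(parm != NULL && p != NULL);
    memcpy(&field, parm + p->pe_ucode, sizeof field);
    return field;
}
```

The same function serves every pointer field of every struc-
ture, as only the offset stored in the table entry changes.
A table entry for the kid field above would be filled in with
offsetof(struct parent, kid).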

IMP_OBJ
     Here we have to handle the case where we encode the
     object and then have to change its tag and class after
     encoding.  This is done very simply at the end of the
     code for this entry by assigning the right values to
     the appropriate fields after the object has been built.
     This means that if the intermediate form is altered
     this piece of code may have to be altered as well.
     There seems to be no better way of handling this.
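
     The idea of retagging after encoding can be sketched as
below.  The structure sk_pe is a hypothetical stand-in for
ISODE's presentation element with invented field names, and
sk_en_obj stands in for the object encoding routine; only the
order of operations, build first and overwrite class and tag
afterwards, is taken from the text.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ISODE's presentation element. */
struct sk_pe {
    int cls;        /* 0 = universal, 2 = context specific */
    int tag;
    int value;
};

/* Stand-in for the object encoder: builds a universal INTEGER,
   which has tag number 2. */
static struct sk_pe *sk_en_obj(int value)
{
    struct sk_pe *pe = malloc(sizeof *pe);

    if (pe == NULL)
        return NULL;
    pe->cls = 0;
    pe->tag = 2;
    pe->value = value;
    return pe;
}

/* Implicit tagging: encode as usual, then overwrite class and tag.
   The caller owns the result and releases it with free(). */
struct sk_pe *sk_imp_obj(int value, int cls, int tag)
{
    struct sk_pe *pe = sk_en_obj(value);

    if (pe == NULL)
        return NULL;
    pe->cls = cls;
    pe->tag = tag;
    return pe;
}
```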

     The complication in handling this field is the handling
of all the possible types of object.  If it is an external
object we have to perform a call to enc_f with all the right
arguments, whereas a normal OBJECT, the last else branch,
requires a normal call to en_obj.  Note the case of SOBJECT
is the same as OBJECT _\be_\bx_\bc_\be_\bp_\bt _\bt_\bh_\be_\br_\be _\bi_\bs _\bn_\bo _\bi_\bn_\bd_\bi_\br_\be_\bc_\bt_\bi_\bo_\bn.

SOBJECT and OBJECT
     Here is the code that handles the two cases separately.
     It is exactly as in the IMP_OBJ case except separated
     out.  Note the only difference between the two cases is
     the lack of indirection in the SOBJECT case.


CHOICE_START
     This is handled exactly like all other compound types,
     such as SEQ_START and OBJECT: we call the appropriate
     routine with indirection.  From reading the ISODE
     manuals, it appears that the ASN.1 CHOICE type is
     handled by a structure of its own like the other
     compound types.

EXTOBJ and SEXTOBJ

