usr/lib/lisp/manual/ch9.r



                         CHAPTER  9


                     Arrays and Vectors


     Arrays and vectors are two means of  expressing  aggre-
gate  data objects in FRANZ LISP.  Vectors may be thought of
as sequences of data.  They are intended as  a  vehicle  for
user-defined  data  types.   This  use  of  vectors is still
experimental and subject to  revision.   As  a  simple  data
structure,  they  are similar to hunks and strings.  Vectors
are used to implement closures, and are useful  to  communi-
cate with foreign functions.  Both of these topics were dis-
cussed in Chapter 8.  Later in this chapter, we describe the
current  implementation of vectors, and will advise the user
what is most likely to change.

     Arrays in FRANZ LISP provide a programmable data struc-
ture  access  mechanism.   One  possible  use for FRANZ LISP
arrays is to implement Maclisp style arrays which are simple
vectors of fixnums, flonums or general lisp values.  This is
described in more detail in 9.3 but first we  will  describe
how array references are handled by the lisp system.

     The structure of an array object is given in 1.3.10 and
reproduced here for your convenience.


\e8_______________________________________________________________
  Subpart name     Get value   Set value          Type

\e8_______________________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b_______________________________________________________________
 access function   getaccess   putaccess      binary, list
                                                or symbol
\e8_______________________________________________________________
    auxiliary       getaux      putaux           lispval
\e8_______________________________________________________________
      data         arrayref     replace    block of contiguous
                                  set            lispval
\e8_______________________________________________________________
     length        getlength   putlength         fixnum
\e8_______________________________________________________________
      delta        getdelta    putdelta          fixnum
\e8_______________________________________________________________
\e7\b|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                            |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                        |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                                              |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


   9.1.  general arrays   Suppose the evaluator is  told  to
      evaluate (_\bf_\bo_\bo _\ba _\bb) and the function cell of the symbol
      foo contains an  array  object  (which  we  will  call
      foo_arr_obj).   First  the evaluator will evaluate and
      stack the values of _\ba and _\bb.  Next it will  stack  the


\e9Arrays and Vectors                                       9-1


Arrays and Vectors                                       9-2


      array  object  foo_arr_obj.   Finally it will call the
      access function of foo_arr_obj.  The  access  function
      should  be  a  lexpr[] or a symbol whose function cell
      contains a lexpr.  The access function is  responsible
      for  locating  and  returning  a value from the array.
      The array access function is  free  to  interpret  the
      arguments  as it wishes.  The Maclisp compatible array
      access function which  is  provided  in  the  standard
      FRANZ  LISP  system  interprets  the arguments as sub-
      scripts in the same way as languages like Fortran  and
      Pascal.

           The array access function  will  also  be  called
      upon  to  store  elements  in the array.  For example,
      (_\bs_\bt_\bo_\br_\be (_\bf_\bo_\bo _\ba _\bb) _\bc) will automatically expand to  (foo
      c  a  b)  and when the evaluator is called to evaluate
      this, it will evaluate the arguments _\bc, _\bb and _\ba.  Then
      it will stack the array object (which is stored in the
      function cell of foo) and call the array access  func-
      tion  with  (now)  four  arguments.   The array access
      function must be able to tell this is a  store  opera-
      tion,  which it can do by checking the number of argu-
      ments it has been given (a  lexpr  can  do  this  very
      easily).


   9.2.  subparts of an array object   An array  is  created
      by allocating an array object with _\bm_\ba_\br_\br_\ba_\by and  filling
      in the fields.  Certain lisp functions  interpret  the
      values  of the subparts of the array object in special
      ways as described  in  the  following  text.   Placing
      illegal  values  in  these subparts may cause the lisp
      system to fail.


      9.2.1.  access function   The purpose  of  the  access
         function has been described above.  The contents of
         the access function should be  a  lexpr,  either  a
         binary  (compiled  function) or a list (interpreted
         function).  It may also be a symbol whose  function
         cell  contains a function definition.  This subpart
         is used by _\be_\bv_\ba_\bl, _\bf_\bu_\bn_\bc_\ba_\bl_\bl, and _\ba_\bp_\bp_\bl_\by when evaluating
         array references.


____________________
\e9   []A lexpr is a function which accepts any number of argu-
ments which are evaluated before the function is called.


\e9                                      Printed: July 21, 1983


Arrays and Vectors                                       9-3


      9.2.2.  auxiliary   This can be used for any  purpose.
         If  it is a list and the first element of that list
         is the symbol unmarked_array then the data  subpart
         will  not  be marked by the garbage collector (this
         is used in the Maclisp compatible array package and
         has  the  potential  for  causing strange errors if
         used incorrectly).


      9.2.3.  data   This is either nil or points to a block
         of  data  space  allocated  by  _\bs_\be_\bg_\bm_\be_\bn_\bt  or  _\bs_\bm_\ba_\bl_\bl-
         _\bs_\be_\bg_\bm_\be_\bn_\bt.


      9.2.4.  length   This is a fixnum whose value  is  the
         number of elements in the data block.  This is used
         by the garbage collector and by _\ba_\br_\br_\ba_\by_\br_\be_\bf to  deter-
         mine if your index is in bounds.


      9.2.5.  delta   This is a fixnum whose  value  is  the
         number  of bytes in each element of the data block.
         This will be four for an array of fixnums or  value
         cells,  and eight for an array of flonums.  This is
         used by the garbage collector and _\ba_\br_\br_\ba_\by_\br_\be_\bf as well.


   9.3.  The Maclisp compatible array package

           A Maclisp style array is similar to what is known
      as  arrays  in other languages: a block of homogeneous
      data elements which is indexed by one or more integers
      called  subscripts.  The data elements can be all fix-
      nums, flonums or general lisp objects.   An  array  is
      created  by  a  call  to the function _\ba_\br_\br_\ba_\by or *_\ba_\br_\br_\ba_\by.
      The only difference is that *_\ba_\br_\br_\ba_\by evaluates its argu-
      ments.   This call: (_\ba_\br_\br_\ba_\by _\bf_\bo_\bo _\bt _\b3 _\b5) sets up an array
      called foo of dimensions 3 by 5.  The  subscripts  are
      zero  based.  The first element is (_\bf_\bo_\bo _\b0 _\b0), the next
      is (_\bf_\bo_\bo _\b0 _\b1) and so on up to (_\bf_\bo_\bo _\b2 _\b4).  The  t  indi-
      cates  a  general  lisp  object array which means each
      element of foo can be any type.  Each element  can  be
      any  type  since  all that is stored in the array is a
      pointer to a  lisp  object,  not  the  object  itself.
      _\bA_\br_\br_\ba_\by  does  this  by  allocating an array object with
      _\bm_\ba_\br_\br_\ba_\by and then allocating a segment of 15 consecutive
      value  cells  with _\bs_\bm_\ba_\bl_\bl-_\bs_\be_\bg_\bm_\be_\bn_\bt and storing a pointer
      to that segment in  the  data  subpart  of  the  array
      object.   The  length  and  delta subpart of the array


                                      Printed: July 21, 1983


Arrays and Vectors                                       9-4


      object are filled in (with 15 and 4 respectively)  and
      the  access  function  subpart  is set to point to the
      appropriate  array  access  function.   In  this  case
      there is a special access function for two dimensional
      value cell arrays called arrac-twoD, and  this  access
      function  is  used.   The  auxiliary subpart is set to
      (t 3 5) which describes the  type  of  array  and  the
      bounds of the subscripts. Finally this array object is
      placed in the function cell of the  symbol  foo.   Now
      when (_\bf_\bo_\bo _\b1 _\b3) is evaluated, the array access function
      is invoked with three arguments: 1, 3  and  the  array
      object.   From the auxiliary field of the array object
      it gets a description of  the  particular  array.   It
      then determines which element (_\bf_\bo_\bo _\b1 _\b3) refers to  and
      uses arrayref to extract that element.  Since this  is
      an  array  of  value cells, what arrayref returns is a
      value cell whose value is what we want, so we evaluate
      the   value  cell  and  return  it  as  the  value  of
      (_\bf_\bo_\bo _\b1 _\b3).

           In Maclisp the call (_\ba_\br_\br_\ba_\by _\bf_\bo_\bo _\bf_\bi_\bx_\bn_\bu_\bm _\b2_\b5) returns
      an  array  whose  data  object is a block of 25 memory
      words.  When fixnums are stored  in  this  array,  the
      actual  numbers  are stored instead of pointers to the
      numbers as is done  in  general  lisp  object  arrays.
      This  is  efficient  under  Maclisp but inefficient in
      FRANZ LISP since every time  a  value  was  referenced
      from an array it had to be copied and a pointer to the
      copy returned to prevent aliasing[].  Thus  t,  fixnum
      and  flonum  arrays  are  all  implemented in the same
      manner.  This should not affect the  compatibility  of
      Maclisp  and  FRANZ  LISP.  If there is an application
      where a block of fixnums or flonums is required,  then
      the  exact  same effect of fixnum and flonum arrays in
      Maclisp can be  achieved  by  using  fixnum-block  and
      flonum-block  arrays.  Such arrays are required if you
      want to pass a large number of arguments to a  Fortran
      or C coded function and then get answers back.

           The Maclisp compatible array package is just  one
      example  of  how  a general array scheme can be imple-
      mented.  Another type of  array  you  could  implement
      would  be  hashed  arrays.   The  subscript  could  be
____________________
\e9   []Aliasing is when  two  variables  are  share  the  same
storage  location.   For  example  if  the copying mentioned
weren't done then after (_\bs_\be_\bt_\bq _\bx (_\bf_\bo_\bo _\b2)) was done, the value
of x and (foo 2) would share the same location.  Then should
the value of (foo 2) change, x's value would change as well.
This  is  considered  dangerous and as a result pointers are
never returned into the data space of arrays.


\e9                                      Printed: July 21, 1983


Arrays and Vectors                                       9-5


      anything, not just  a  number.   The  access  function
      would  hash the subscript and use the result to select
      an array element.  With the generality of arrays  also
      comes  extra cost; if you just want a simple aggregate
      of (less than 128) general lisp objects you  would  be
      wise to look into using hunks.


   9.4.   vectors    Vectors  were  invented  to   fix   two
      shortcommings with hunks.  They can be longer than 128
      elements.  They also have a tag associated with  them,
      which is intended to say, for example, "Think of me as
      an _\bB_\bl_\bo_\bb_\bi_\bt."  Thus a vector is an arbitrary sized  hunk
      with a property list.

           Continuing the example, the lisp kernel  may  not
      know how to print out or evaluate _\bb_\bl_\bo_\bb_\bi_\bt_\bs, but this is
      information which will be common to all  _\bb_\bl_\bo_\bb_\bi_\bt_\bs.   On
      the  other hand, for each individual blobits there are
      particulars  which  are  likely  to  change,  (height,
      weight,  eye-color).  This is the part that would pre-
      viously have been stored in the individual entries  in
      the hunk, and are stored in the data slots of the vec-
      tor.  Once again we summarize the structure of a  vec-
      tor in tabular form:


\e8         ________________________________________________
          Subpart name   Get value   Set value    Type

\e8         ________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b________________________________________________
            datum[_\bi]       vref        vset      lispval
\e8         ________________________________________________
            property       vprop     vsetprop    lispval
                                     vputprop
\e8         ________________________________________________
              size         vsize         -       fixnum
\e8         ________________________________________________
\e7        |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                      |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                  |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                              |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                                        |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


      Vectors are created specifying size and optional  fill
      value using the function (_\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br  'x_size ['g_fill
      ['g_prop]]), or by  initial  values:  (_\bv_\be_\bc_\bt_\bo_\br  ['g_val
      ...]).


   9.5.  anatomy  of  vectors    There  are  some  technical
      details about vectors, that the user should know:


\e9                                      Printed: July 21, 1983


Arrays and Vectors                                       9-6


      9.5.1.  size   The user is not free to alter this.  It
         is noted when the vector is created, and is used by
         the garbage collector.  The garbage collector  will
         coallesce  two free vectors, which are neighbors in
         the heap.  Internally, this is kept as  the  number
         of  bytes of data.  Thus, a vector created by (_\bv_\be_\bc_\b-
         _\bt_\bo_\br 'foo), has a size of 4.


      9.5.2.  property   Currently, we expect  the  property
         to  be either a symbol, or a list whose first entry
         is a symbol.  The symbols fclosure  and  structure-
         value-argument  are  magic,  and  their  effect  is
         described in Chapter  8.   If  the  property  is  a
         (non-null)  symbol,  the vector will be printed out
         as <symbol>[<size>]. Another case is  if  the  pro-
         perty  is  actually  a (disembodied) property-list,
         which contains a value  for  the  indicator  print.
         The value is taken to be a Lisp function, which the
         printer will invoke with two arguments:  the vector
         and the current output port.  Otherwise, the vector
         will be printed as vector[<size>].  We  have  vague
         (as  yet unimplemented) ideas about similar mechan-
         isms for evaluation  properties.   Users  are  cau-
         tioned  against  putting anything other than nil in
         the property entry of a vector.


      9.5.3.  internal order   In memory, vectors start with
         a  longword containing the size (which is immediate
         data within the vector).  The next cell contains  a
         pointer  to  the property.  Any remaining cells (if
         any) are for data.  Vectors are handled differently
         from  any  other  object  in  FRANZ LISP, in that a
         pointer to a vector is pointer to  the  first  data
         cell,  i.e.  a pointer to the _\bt_\bh_\bi_\br_\bd longword of the
         structure.  This was done for  efficiency  in  com-
         piled   code  and  for  uniformity  in  referencing
         immediate-vectors  (described  below).   The   user
         should  never return a pointer to any other part of
         a vector, as this may cause the  garbage  collector
         to follow an invalid pointer.


   9.6.  immediate-vectors   Immediate-vectors  are  similar
      to  vectors.   They  differ,  in  that binary data are
      stored in space directly within the vector.  Thus  the
      garbage  collector will preserve the vector itself (if
      used), and will only traverse the property cell.   The
      data  may  be  referenced as longwords, shortwords, or


                                      Printed: July 21, 1983


Arrays and Vectors                                       9-7


      even bytes.   Shorts  and  bytes  are  returned  sign-
      extended.   The  compiler  open-codes such references,
      and will avoid  boxing  the  resulting  integer  data,
      where  possible.   Thus, immediate vectors may be used
      for efficiently processing character data.   They  are
      also  useful in storing results from functions written
      in other languages.


\e8        __________________________________________________
         Subpart name   Get value    Set value     Type

\e8        __________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b__________________________________________________
           datum[_\bi]     vrefi-byte   vseti-byte   fixnum
                        vrefi-word   vseti-word   fixnum
                        vrefi-long   vseti-long   fixnum
\e8        __________________________________________________
           property       vprop       vsetprop    lispval
                                      vputprop
\e8        __________________________________________________
             size         vsize          -        fixnum
                        vsize-byte                fixnum
                        vsize-word                fixnum
\e8        __________________________________________________
\e7       |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                     |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                  |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                               |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


                                                         |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|


      To create immediate vectors specifying size  and  fill
      data,  you  can  use  the  functions _\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bb_\by_\bt_\be,
      _\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bw_\bo_\br_\bd, or _\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bl_\bo_\bn_\bg.  You  can  also
      use   the  functions  _\bv_\be_\bc_\bt_\bo_\br_\bi-_\bb_\by_\bt_\be,  _\bv_\be_\bc_\bt_\bo_\br_\bi-_\bw_\bo_\br_\bd,  or
      _\bv_\be_\bc_\bt_\bo_\br_\bi-_\bl_\bo_\bn_\bg.  All of these functions are described in
      chapter 2.


\e9                                      Printed: July 21, 1983