BSD 4_3 development
[unix-history] / usr / lib / lisp / manual / ch9.r
CHAPTER 9
Arrays and Vectors
Arrays and vectors are two means of expressing aggre-
gate data objects in FRANZ LISP. Vectors may be thought of
as sequences of data. They are intended as a vehicle for
user-defined data types. This use of vectors is still
experimental and subject to revision. As a simple data
structure, they are similar to hunks and strings. Vectors
are used to implement closures, and are useful to communi-
cate with foreign functions. Both of these topics were dis-
cussed in Chapter 8. Later in this chapter, we describe the
current implementation of vectors, and will advise the user
what is most likely to change.
Arrays in FRANZ LISP provide a programmable data struc-
ture access mechanism. One possible use for FRANZ LISP
arrays is to implement Maclisp style arrays which are simple
vectors of fixnums, flonums or general lisp values. This is
described in more detail in 9.3 but first we will describe
how array references are handled by the lisp system.
The structure of an array object is given in 1.3.10 and
reproduced here for your convenience.
\e8_______________________________________________________________
Subpart name Get value Set value Type
\e8_______________________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b_______________________________________________________________
access function getaccess putaccess binary, list
or symbol
\e8_______________________________________________________________
auxiliary getaux putaux lispval
\e8_______________________________________________________________
data arrayref replace block of contiguous
set lispval
\e8_______________________________________________________________
length getlength putlength fixnum
\e8_______________________________________________________________
delta getdelta putdelta fixnum
\e8_______________________________________________________________
\e7\b|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
9.1. general arrays Suppose the evaluator is told to
evaluate (_\bf_\bo_\bo _\ba _\bb) and the function cell of the symbol
foo contains an array object (which we will call
foo_arr_obj). First the evaluator will evaluate and
stack the values of _\ba and _\bb. Next it will stack the
\e9Arrays and Vectors 9-1
Arrays and Vectors 9-2
array object foo_arr_obj. Finally it will call the
access function of foo_arr_obj. The access function
should be a lexpr[] or a symbol whose function cell
contains a lexpr. The access function is responsible
for locating and returning a value from the array.
The array access function is free to interpret the
arguments as it wishes. The Maclisp compatible array
access function which is provided in the standard
FRANZ LISP system interprets the arguments as sub-
scripts in the same way as languages like Fortran and
Pascal.
The array access function will also be called
upon to store elements in the array. For example,
(_\bs_\bt_\bo_\br_\be (_\bf_\bo_\bo _\ba _\bb) _\bc) will automatically expand to (foo
c a b) and when the evaluator is called to evaluate
this, it will evaluate the arguments _\bc, _\bb and _\ba. Then
it will stack the array object (which is stored in the
function cell of foo) and call the array access func-
tion with (now) four arguments. The array access
function must be able to tell this is a store opera-
tion, which it can do by checking the number of argu-
ments it has been given (a lexpr can do this very
easily).
9.2. subparts of an array object An array is created
by allocating an array object with _\bm_\ba_\br_\br_\ba_\by and filling
in the fields. Certain lisp functions interpret the
values of the subparts of the array object in special
ways as described in the following text. Placing
illegal values in these subparts may cause the lisp
system to fail.
9.2.1. access function The purpose of the access
function has been described above. The contents of
the access function should be a lexpr, either a
binary (compiled function) or a list (interpreted
function). It may also be a symbol whose function
cell contains a function definition. This subpart
is used by _\be_\bv_\ba_\bl, _\bf_\bu_\bn_\bc_\ba_\bl_\bl, and _\ba_\bp_\bp_\bl_\by when evaluating
array references.
____________________
\e9 []A lexpr is a function which accepts any number of argu-
ments which are evaluated before the function is called.
\e9 Printed: July 21, 1983
Arrays and Vectors 9-3
9.2.2. auxiliary This can be used for any purpose.
If it is a list and the first element of that list
is the symbol unmarked_array then the data subpart
will not be marked by the garbage collector (this
is used in the Maclisp compatible array package and
has the potential for causing strange errors if
used incorrectly).
9.2.3. data This is either nil or points to a block
of data space allocated by _\bs_\be_\bg_\bm_\be_\bn_\bt or _\bs_\bm_\ba_\bl_\bl-
_\bs_\be_\bg_\bm_\be_\bn_\bt.
9.2.4. length This is a fixnum whose value is the
number of elements in the data block. This is used
by the garbage collector and by _\ba_\br_\br_\ba_\by_\br_\be_\bf to deter-
mine if your index is in bounds.
9.2.5. delta This is a fixnum whose value is the
number of bytes in each element of the data block.
This will be four for an array of fixnums or value
cells, and eight for an array of flonums. This is
used by the garbage collector and _\ba_\br_\br_\ba_\by_\br_\be_\bf as well.
9.3. The Maclisp compatible array package
A Maclisp style array is similar to what is known
as arrays in other languages: a block of homogeneous
data elements which is indexed by one or more integers
called subscripts. The data elements can be all fix-
nums, flonums or general lisp objects. An array is
created by a call to the function _\ba_\br_\br_\ba_\by or *_\ba_\br_\br_\ba_\by.
The only difference is that *_\ba_\br_\br_\ba_\by evaluates its argu-
ments. This call: (_\ba_\br_\br_\ba_\by _\bf_\bo_\bo _\bt _\b3 _\b5) sets up an array
called foo of dimensions 3 by 5. The subscripts are
zero based. The first element is (_\bf_\bo_\bo _\b0 _\b0), the next
is (_\bf_\bo_\bo _\b0 _\b1) and so on up to (_\bf_\bo_\bo _\b2 _\b4). The t indi-
cates a general lisp object array which means each
element of foo can be any type. Each element can be
any type since all that is stored in the array is a
pointer to a lisp object, not the object itself.
_\bA_\br_\br_\ba_\by does this by allocating an array object with
_\bm_\ba_\br_\br_\ba_\by and then allocating a segment of 15 consecutive
value cells with _\bs_\bm_\ba_\bl_\bl-_\bs_\be_\bg_\bm_\be_\bn_\bt and storing a pointer
to that segment in the data subpart of the array
object. The length and delta subpart of the array
Printed: July 21, 1983
Arrays and Vectors 9-4
object are filled in (with 15 and 4 respectively) and
the access function subpart is set to point to the
appropriate array access function. In this case
there is a special access function for two dimensional
value cell arrays called arrac-twoD, and this access
function is used. The auxiliary subpart is set to
(t 3 5) which describes the type of array and the
bounds of the subscripts. Finally this array object is
placed in the function cell of the symbol foo. Now
when (_\bf_\bo_\bo _\b1 _\b3) is evaluated, the array access function
is invoked with three arguments: 1, 3 and the array
object. From the auxiliary field of the array object
it gets a description of the particular array. It
then determines which element (_\bf_\bo_\bo _\b1 _\b3) refers to and
uses arrayref to extract that element. Since this is
an array of value cells, what arrayref returns is a
value cell whose value is what we want, so we evaluate
the value cell and return it as the value of
(_\bf_\bo_\bo _\b1 _\b3).
In Maclisp the call (_\ba_\br_\br_\ba_\by _\bf_\bo_\bo _\bf_\bi_\bx_\bn_\bu_\bm _\b2_\b5) returns
an array whose data object is a block of 25 memory
words. When fixnums are stored in this array, the
actual numbers are stored instead of pointers to the
numbers as is done in general lisp object arrays.
This is efficient under Maclisp but inefficient in
FRANZ LISP since every time a value was referenced
from an array it had to be copied and a pointer to the
copy returned to prevent aliasing[]. Thus t, fixnum
and flonum arrays are all implemented in the same
manner. This should not affect the compatibility of
Maclisp and FRANZ LISP. If there is an application
where a block of fixnums or flonums is required, then
the exact same effect of fixnum and flonum arrays in
Maclisp can be achieved by using fixnum-block and
flonum-block arrays. Such arrays are required if you
want to pass a large number of arguments to a Fortran
or C coded function and then get answers back.
The Maclisp compatible array package is just one
example of how a general array scheme can be imple-
mented. Another type of array you could implement
would be hashed arrays. The subscript could be
____________________
\e9 []Aliasing is when two variables are share the same
storage location. For example if the copying mentioned
weren't done then after (_\bs_\be_\bt_\bq _\bx (_\bf_\bo_\bo _\b2)) was done, the value
of x and (foo 2) would share the same location. Then should
the value of (foo 2) change, x's value would change as well.
This is considered dangerous and as a result pointers are
never returned into the data space of arrays.
\e9 Printed: July 21, 1983
Arrays and Vectors 9-5
anything, not just a number. The access function
would hash the subscript and use the result to select
an array element. With the generality of arrays also
comes extra cost; if you just want a simple aggregate
of (less than 128) general lisp objects you would be
wise to look into using hunks.
9.4. vectors Vectors were invented to fix two
shortcommings with hunks. They can be longer than 128
elements. They also have a tag associated with them,
which is intended to say, for example, "Think of me as
an _\bB_\bl_\bo_\bb_\bi_\bt." Thus a vector is an arbitrary sized hunk
with a property list.
Continuing the example, the lisp kernel may not
know how to print out or evaluate _\bb_\bl_\bo_\bb_\bi_\bt_\bs, but this is
information which will be common to all _\bb_\bl_\bo_\bb_\bi_\bt_\bs. On
the other hand, for each individual blobits there are
particulars which are likely to change, (height,
weight, eye-color). This is the part that would pre-
viously have been stored in the individual entries in
the hunk, and are stored in the data slots of the vec-
tor. Once again we summarize the structure of a vec-
tor in tabular form:
\e8 ________________________________________________
Subpart name Get value Set value Type
\e8 ________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b________________________________________________
datum[_\bi] vref vset lispval
\e8 ________________________________________________
property vprop vsetprop lispval
vputprop
\e8 ________________________________________________
size vsize - fixnum
\e8 ________________________________________________
\e7 |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
Vectors are created specifying size and optional fill
value using the function (_\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br 'x_size ['g_fill
['g_prop]]), or by initial values: (_\bv_\be_\bc_\bt_\bo_\br ['g_val
...]).
9.5. anatomy of vectors There are some technical
details about vectors, that the user should know:
\e9 Printed: July 21, 1983
Arrays and Vectors 9-6
9.5.1. size The user is not free to alter this. It
is noted when the vector is created, and is used by
the garbage collector. The garbage collector will
coallesce two free vectors, which are neighbors in
the heap. Internally, this is kept as the number
of bytes of data. Thus, a vector created by (_\bv_\be_\bc_\b-
_\bt_\bo_\br 'foo), has a size of 4.
9.5.2. property Currently, we expect the property
to be either a symbol, or a list whose first entry
is a symbol. The symbols fclosure and structure-
value-argument are magic, and their effect is
described in Chapter 8. If the property is a
(non-null) symbol, the vector will be printed out
as <symbol>[<size>]. Another case is if the pro-
perty is actually a (disembodied) property-list,
which contains a value for the indicator print.
The value is taken to be a Lisp function, which the
printer will invoke with two arguments: the vector
and the current output port. Otherwise, the vector
will be printed as vector[<size>]. We have vague
(as yet unimplemented) ideas about similar mechan-
isms for evaluation properties. Users are cau-
tioned against putting anything other than nil in
the property entry of a vector.
9.5.3. internal order In memory, vectors start with
a longword containing the size (which is immediate
data within the vector). The next cell contains a
pointer to the property. Any remaining cells (if
any) are for data. Vectors are handled differently
from any other object in FRANZ LISP, in that a
pointer to a vector is pointer to the first data
cell, i.e. a pointer to the _\bt_\bh_\bi_\br_\bd longword of the
structure. This was done for efficiency in com-
piled code and for uniformity in referencing
immediate-vectors (described below). The user
should never return a pointer to any other part of
a vector, as this may cause the garbage collector
to follow an invalid pointer.
9.6. immediate-vectors Immediate-vectors are similar
to vectors. They differ, in that binary data are
stored in space directly within the vector. Thus the
garbage collector will preserve the vector itself (if
used), and will only traverse the property cell. The
data may be referenced as longwords, shortwords, or
Printed: July 21, 1983
Arrays and Vectors 9-7
even bytes. Shorts and bytes are returned sign-
extended. The compiler open-codes such references,
and will avoid boxing the resulting integer data,
where possible. Thus, immediate vectors may be used
for efficiently processing character data. They are
also useful in storing results from functions written
in other languages.
\e8 __________________________________________________
Subpart name Get value Set value Type
\e8 __________________________________________________\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b__________________________________________________
datum[_\bi] vrefi-byte vseti-byte fixnum
vrefi-word vseti-word fixnum
vrefi-long vseti-long fixnum
\e8 __________________________________________________
property vprop vsetprop lispval
vputprop
\e8 __________________________________________________
size vsize - fixnum
vsize-byte fixnum
vsize-word fixnum
\e8 __________________________________________________
\e7 |\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|\b\e7|
To create immediate vectors specifying size and fill
data, you can use the functions _\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bb_\by_\bt_\be,
_\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bw_\bo_\br_\bd, or _\bn_\be_\bw-_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bl_\bo_\bn_\bg. You can also
use the functions _\bv_\be_\bc_\bt_\bo_\br_\bi-_\bb_\by_\bt_\be, _\bv_\be_\bc_\bt_\bo_\br_\bi-_\bw_\bo_\br_\bd, or
_\bv_\be_\bc_\bt_\bo_\br_\bi-_\bl_\bo_\bn_\bg. All of these functions are described in
chapter 2.
\e9 Printed: July 21, 1983