fix font switch
[unix-history] / usr / src / old / lisp / PSD.doc / ch9.n
CommitLineData
1a897ec1
NC
1.\" Copyright (c) 1980 Regents of the University of California.
2.\" All rights reserved. The Berkeley software License Agreement
3.\" specifies the terms and conditions for redistribution.
4.\"
b93d171f 5.\" @(#)ch9.n 6.1 (Berkeley) %G%
1a897ec1 6.\"
b93d171f
NC
7." $Header: ch9.n 1.4 83/07/21 21:08:57 sklower Exp $
8.Lc Arrays\ and\ Vectors 9
9.pp
10Arrays and vectors are two means of expressing aggregate
11data objects in
12.Fr .
13Vectors may be thought of as sequences of data.
14They are intended as a vehicle for user-defined
15data types.
16This use of vectors is still experimental and subject
17to revision.
18As a simple data structure, they are similar to
19hunks and strings.
20Vectors are used to implement closures,
21and are useful to communicate with foreign functions.
22Both of these topics were discussed in Chapter 8.
23Later in this chapter,
24we describe the current implementation of vectors, and will
25advise the user what is most likely to change.
26.pp
27Arrays in
28.Fr
29provide a programmable data structure access mechanism.
30One possible use for
31.Fr
32arrays is to implement Maclisp style arrays which are simple vectors
33of fixnums, flonums or general lisp values.
34This is described in more detail in \(sc9.3 but first
35we will describe how array references are handled by
36the lisp system.
37.pp
38The structure of an array object is given in \(sc1.3.10 and reproduced here
39for your convenience.
40.sp 1v
41.TS
42box center ;
43c | c | c | c .
44Subpart name Get value Set value Type
45
46=
47access function getaccess putaccess binary, list
48 or symbol
49_
50auxiliary getaux putaux lispval
51_
52data arrayref replace block of contiguous
53 set lispval
54_
55length getlength putlength fixnum
56_
57delta getdelta putdelta fixnum
58.TE
59.sh 2 "general arrays" \n(ch 1
60Suppose the evaluator is told to evaluate \fI(foo\ a\ b)\fP
61and the function cell of the symbol foo contains an array object
62(which we will call foo_arr_obj).
63First the evaluator will evaluate and stack the values of
64.i a
65and
66.i b .
67Next it will stack the array object foo_arr_obj.
68Finally it will call the access function of foo_arr_obj.
69The access function should be a lexpr\*[\(dg\*]
70or a symbol whose
71function cell contains a lexpr.
72.(f
73\*[\(dg\*]A lexpr is a function which accepts any number of arguments
74which are evaluated before the function is called.
75.)f
76The access function is responsible for locating and returning
77a value from the array.
78The array access function is free to interpret the arguments as it wishes.
79The Maclisp compatible array access function which is provided
80in the standard
81.Fr
82system interprets the arguments as subscripts in the same way as
83languages like Fortran and Pascal.
84.pp
85The array access function will also be called upon to store elements in
86the array.
87For example, \fI(store\ (foo\ a\ b)\ c)\fP
88will automatically expand to (foo c a b) and when the evaluator is called
89to evaluate this, it will evaluate the arguments
90.i c ,
91.i b
92and
93.i a .
94Then it will
95stack the array object (which is stored
96in the function cell of foo) and call the array access function
97with (now) four arguments.
98The array access function must be able to tell this is a store operation,
99which it can do by checking the number of arguments it has been
100given (a lexpr can do this very easily).
101.sh 2 "subparts of an array object"
102An array is created by allocating an
103array object with
104.i marray
105and filling in the fields.
106Certain lisp functions interpret the values of the subparts
107of the array object in special
108ways as described in the following text.
109Placing illegal values in these subparts may cause
110the lisp system to fail.
111.sh 3 "access function"
112The purpose of the access function has been described above.
113The contents of the access function should be a lexpr,
114either a binary (compiled function) or a list (interpreted function).
115It may also be a symbol whose function cell contains a function
116definition.
117This subpart
118is used by
119.i eval ,
120.i funcall ,
121and
122.i apply
123when evaluating array references.
124.sh 3 auxiliary
125This can be used for any purpose. If it is a list and the first element
126of that list is the symbol unmarked_array then the data subpart will
127not be marked by the garbage collector (this is used in the Maclisp
128compatible array package and has the potential for causing strange errors
129if used incorrectly).
130.sh 3 data
131This is either nil or points to a block of data space allocated by
132.i segment
133or
134.i small-segment.
135.sh 3 length
136This is a fixnum whose value is the number of elements in the
137data block. This is used by the garbage collector and by
138.i arrayref
139to determine if your index is in bounds.
140.sh 3 delta
141This is a fixnum whose value is the number of bytes in each element of
142the data block.
143This will be four for an array of fixnums or value cells, and eight
144for an array of flonums.
145This is used by the garbage collector and
146.i arrayref
147as well.
148.sh 2 "The Maclisp compatible array package"
149.pp
150A Maclisp style array is similar to what is known as arrays in other
151languages: a block of homogeneous data elements which
152is indexed by one or more integers called subscripts.
153The data elements can be all fixnums, flonums or general lisp objects.
154An array is created by a call to the function
155.i array
156or \fI*array\fP.
157The only difference is that
158.i *array
159evaluates its arguments.
160This call:
161.i "(array foo t 3 5)"
162sets up an array called foo of dimensions 3 by 5.
163The subscripts are zero based.
164The first element is \fI(foo\ 0\ 0)\fP, the next is \fI(foo\ 0\ 1)\fP
165and so on up to \fI(foo\ 2\ 4)\fP.
166The t indicates a general lisp object array which means each element of
167foo can be any type.
168Each element can be any type since all that is stored in the array is
169a pointer to a lisp object, not the object itself.
170.i Array
171does this by allocating an array object
172with
173.i marray
174and then allocating a segment of 15 consecutive value cells with
175.i small-segment
176and storing a pointer to that segment in the data subpart of the array
177object.
178The length and delta subpart of the array object are filled in (with 15
179and 4 respectively) and the access function subpart is set to point to
180the appropriate array access function.
181In this case there is a special access function for two dimensional
182value cell arrays called arrac-twoD, and this access function is used.
183The auxiliary subpart is set to (t\ 3\ 5) which describes the type of array
184and the bounds of the subscripts.
185Finally this array object is placed in the function cell of the symbol foo.
186Now when
187.i "(foo 1 3)"
188is evaluated, the array access function is invoked with three arguments:
1891, 3 and the array object. From the auxiliary field of the
190array object it gets a description of the particular array.
191It then determines which element \fI(foo\ 1\ 3)\fP refers to and
192uses arrayref to extract that element.
193Since this is an array of value cells, what arrayref returns is a
194value cell whose value is what we want, so we evaluate the value cell
195and return it as the value of \fI(foo\ 1\ 3)\fP.
196.pp
197In Maclisp the call \fI(array\ foo\ fixnum\ 25)\fP
198returns an array whose data object is a block of 25 memory words.
199When fixnums are stored in this array, the actual numbers are
200stored instead of pointers to the numbers as is done in general lisp
201object arrays.
202This is efficient under Maclisp but inefficient in
203.Fr
204since every time a value was referenced from an array it had to be copied
205and a pointer to the copy returned to prevent aliasing\*[\(dg\*].
206.(f
207\*[\(dg\*]Aliasing is when two variables are share the same storage location.
208For example if the copying mentioned weren't done then after
209\fI(setq\ x\ (foo\ 2))\fP was done, the value of x and
210(foo\ 2) would share the same
211location.
212Then should the value of (foo\ 2) change, x's value would change as well.
213This is considered dangerous and as a result pointers are never returned
214into the data space of arrays.
215.)f
216Thus t, fixnum and flonum arrays are all implemented in the same
217manner.
218This should not affect the compatibility of Maclisp
219and
220.Fr .
221If there is an application where a block of fixnums or flonums is required,
222then the exact same effect of fixnum and flonum arrays in Maclisp
223can be achieved by using fixnum-block and flonum-block arrays.
224Such arrays are required if you want to pass a large number of arguments to a
225Fortran or C coded function and then get answers back.
226.pp
227The Maclisp compatible array package is
228just one example of how a general array scheme can be implemented.
229Another type of array you could implement would be hashed arrays.
230The subscript could be anything, not just a number.
231The access function would hash the subscript and use the result to
232select an array element.
233With the generality of arrays also comes extra cost; if you just
234want a simple aggregate of (less than 128) general lisp objects
235you would be wise to look into using hunks.
236.sh 2 vectors
237Vectors were invented to fix two shortcommings with hunks.
238They can be longer than 128 elements. They also have a
239tag associated with them, which is intended to say, for example,
240"Think of me as an \fIBlobit\fP." Thus a \fBvector\fP
241is an arbitrary sized hunk with a property list.
242.pp
243Continuing the example,
244the lisp kernel may not know how to print out
245or evaluate \fIblobits\fP, but this is information which will
246be common to all \fIblobits\fP. On the other hand, for each
247individual blobits there are particulars which are likely to change,
248(height, weight, eye-color). This is the part that would
249previously have been stored in the individual entries in the hunk,
250and are stored in the data slots of the vector.
251Once again we summarize the structure of a vector in tabular form:
252.sp 1v
253.TS
254box center ;
255c | c | c | c .
256Subpart name Get value Set value Type
257
258=
259datum[\fIi\fP] vref vset lispval
260_
261property vprop vsetprop lispval
262 vputprop
263_
264size vsize \- fixnum
265.TE
266Vectors are created specifying size and optional fill value using the
267function (\fInew-vector\fP 'x_size ['g_fill ['g_prop]]), or by
268initial values: (\fIvector\fP ['g_val ...]).
269.sh 2 "anatomy of vectors"
270There are some technical details about vectors, that the user should
271know:
272.sh 3 size
273The user is not free to alter this. It is noted when the vector
274is created, and is used by the garbage collector. The garbage
275collector will coallesce two free vectors, which are neighbors
276in the heap. Internally, this is kept as the number of bytes
277of data. Thus, a vector created by (\fIvector\fP 'foo), has a
278size of 4.
279.sh 3 property
280Currently, we expect the property to be either a symbol, or a list
281whose first entry is a symbol. The symbols \fBfclosure\fP and
282\fBstructure-value-argument\fP are magic, and their effect is described in
283Chapter 8. If the property is a (non-null) symbol, the vector
284will be printed out as <symbol>[<size>].
285Another case is if the property is actually a (disembodied) property-list, which
286contains a value for the indicator \fBprint\fP.
287The value is taken to be a Lisp function, which the printer
288will invoke with two arguments: the vector and the current output port.
289Otherwise, the vector will be printed as vector[<size>].
290We have vague (as yet unimplemented) ideas
291about similar mechanisms for evaluation properties.
292Users are cautioned against putting anything other than nil
293in the property entry of a vector.
294.sh 3 "internal order"
295In memory, vectors start with a longword containing the size
296(which is immediate data within the vector).
297The next cell contains a pointer to the property.
298Any remaining cells (if any) are for data.
299Vectors are handled differently from any other object in
300.Fr,
301in that a pointer to a vector is pointer to the first data
302cell, i.e. a pointer to the \fIthird\fP longword of the structure.
303This was done for efficiency in compiled code and for uniformity
304in referencing immediate-vectors (described below).
305The user should never return a pointer to any other part
306of a vector, as this may cause the garbage collector to follow an
307invalid pointer.
308.sh 2 "immediate-vectors"
309Immediate-vectors are similar to vectors. They differ, in
310that binary data are stored in space directly within the vector.
311Thus the garbage collector will preserve the vector itself (if used),
312and will only traverse the property cell.
313The data may be referenced as longwords, shortwords, or even bytes.
314Shorts and bytes are returned sign-extended.
315The compiler open-codes such references,
316and will avoid boxing the resulting integer data, where possible.
317Thus, immediate vectors may be used for efficiently processing
318character data.
319They are also useful in storing results from functions written
320in other languages.
321.sp 1v
322.TS
323box center ;
324c | c | c | c .
325Subpart name Get value Set value Type
326
327=
328datum[\fIi\fP] vrefi-byte vseti-byte fixnum
329 vrefi-word vseti-word fixnum
330 vrefi-long vseti-long fixnum
331_
332property vprop vsetprop lispval
333 vputprop
334_
335size vsize \- fixnum
336 vsize-byte fixnum
337 vsize-word fixnum
338.TE
339To create immediate vectors specifying size and fill data,
340you can use the functions
341\fInew-vectori-byte\fP,
342\fInew-vectori-word\fP,
343or \fInew-vectori-long\fP.
344You can also use the functions
345\fIvectori-byte\fP,
346\fIvectori-word\fP,
347or \fIvectori-long\fP.
348All of these functions are described in
349chapter 2.