BSD 4_2 development
[unix-history] / usr / lisp / ch9.n
CommitLineData
2f61a98b
C
1." $Header: ch9.n 1.4 83/07/21 21:08:57 sklower Exp $
2.Lc Arrays\ and\ Vectors 9
3.pp
4Arrays and vectors are two means of expressing aggregate
5data objects in
6.Fr .
7Vectors may be thought of as sequences of data.
8They are intended as a vehicle for user-defined
9data types.
10This use of vectors is still experimental and subject
11to revision.
12As a simple data structure, they are similar to
13hunks and strings.
14Vectors are used to implement closures,
15and are useful to communicate with foreign functions.
16Both of these topics were discussed in Chapter 8.
17Later in this chapter,
18we describe the current implementation of vectors, and will
19advise the user what is most likely to change.
20.pp
21Arrays in
22.Fr
23provide a programmable data structure access mechanism.
24One possible use for
25.Fr
26arrays is to implement Maclisp style arrays which are simple vectors
27of fixnums, flonums or general lisp values.
28This is described in more detail in \(sc9.3 but first
29we will describe how array references are handled by
30the lisp system.
31.pp
32The structure of an array object is given in \(sc1.3.10 and reproduced here
33for your convenience.
34.sp 1v
35.TS
36box center ;
37c | c | c | c .
38Subpart name Get value Set value Type
39
40=
41access function getaccess putaccess binary, list
42 or symbol
43_
44auxiliary getaux putaux lispval
45_
46data arrayref replace block of contiguous
47 set lispval
48_
49length getlength putlength fixnum
50_
51delta getdelta putdelta fixnum
52.TE
53.sh 2 "general arrays" \n(ch 1
54Suppose the evaluator is told to evaluate \fI(foo\ a\ b)\fP
55and the function cell of the symbol foo contains an array object
56(which we will call foo_arr_obj).
57First the evaluator will evaluate and stack the values of
58.i a
59and
60.i b .
61Next it will stack the array object foo_arr_obj.
62Finally it will call the access function of foo_arr_obj.
63The access function should be a lexpr\*[\(dg\*]
64or a symbol whose
65function cell contains a lexpr.
66.(f
67\*[\(dg\*]A lexpr is a function which accepts any number of arguments
68which are evaluated before the function is called.
69.)f
70The access function is responsible for locating and returning
71a value from the array.
72The array access function is free to interpret the arguments as it wishes.
73The Maclisp compatible array access function which is provided
74in the standard
75.Fr
76system interprets the arguments as subscripts in the same way as
77languages like Fortran and Pascal.
78.pp
79The array access function will also be called upon to store elements in
80the array.
81For example, \fI(store\ (foo\ a\ b)\ c)\fP
82will automatically expand to (foo c a b) and when the evaluator is called
83to evaluate this, it will evaluate the arguments
84.i c ,
85.i b
86and
87.i a .
88Then it will
89stack the array object (which is stored
90in the function cell of foo) and call the array access function
91with (now) four arguments.
92The array access function must be able to tell this is a store operation,
93which it can do by checking the number of arguments it has been
94given (a lexpr can do this very easily).
95.sh 2 "subparts of an array object"
96An array is created by allocating an
97array object with
98.i marray
99and filling in the fields.
100Certain lisp functions interpret the values of the subparts
101of the array object in special
102ways as described in the following text.
103Placing illegal values in these subparts may cause
104the lisp system to fail.
105.sh 3 "access function"
106The purpose of the access function has been described above.
107The contents of the access function should be a lexpr,
108either a binary (compiled function) or a list (interpreted function).
109It may also be a symbol whose function cell contains a function
110definition.
111This subpart
112is used by
113.i eval ,
114.i funcall ,
115and
116.i apply
117when evaluating array references.
118.sh 3 auxiliary
119This can be used for any purpose. If it is a list and the first element
120of that list is the symbol unmarked_array then the data subpart will
121not be marked by the garbage collector (this is used in the Maclisp
122compatible array package and has the potential for causing strange errors
123if used incorrectly).
124.sh 3 data
125This is either nil or points to a block of data space allocated by
126.i segment
127or
128.i small-segment.
129.sh 3 length
130This is a fixnum whose value is the number of elements in the
131data block. This is used by the garbage collector and by
132.i arrayref
133to determine if your index is in bounds.
134.sh 3 delta
135This is a fixnum whose value is the number of bytes in each element of
136the data block.
137This will be four for an array of fixnums or value cells, and eight
138for an array of flonums.
139This is used by the garbage collector and
140.i arrayref
141as well.
142.sh 2 "The Maclisp compatible array package"
143.pp
144A Maclisp style array is similar to what is known as arrays in other
145languages: a block of homogeneous data elements which
146is indexed by one or more integers called subscripts.
147The data elements can be all fixnums, flonums or general lisp objects.
148An array is created by a call to the function
149.i array
150or \fI*array\fP.
151The only difference is that
152.i *array
153evaluates its arguments.
154This call:
155.i "(array foo t 3 5)"
156sets up an array called foo of dimensions 3 by 5.
157The subscripts are zero based.
158The first element is \fI(foo\ 0\ 0)\fP, the next is \fI(foo\ 0\ 1)\fP
159and so on up to \fI(foo\ 2\ 4)\fP.
160The t indicates a general lisp object array which means each element of
161foo can be any type.
162Each element can be any type since all that is stored in the array is
163a pointer to a lisp object, not the object itself.
164.i Array
165does this by allocating an array object
166with
167.i marray
168and then allocating a segment of 15 consecutive value cells with
169.i small-segment
170and storing a pointer to that segment in the data subpart of the array
171object.
172The length and delta subpart of the array object are filled in (with 15
173and 4 respectively) and the access function subpart is set to point to
174the appropriate array access function.
175In this case there is a special access function for two dimensional
176value cell arrays called arrac-twoD, and this access function is used.
177The auxiliary subpart is set to (t\ 3\ 5) which describes the type of array
178and the bounds of the subscripts.
179Finally this array object is placed in the function cell of the symbol foo.
180Now when
181.i "(foo 1 3)"
182is evaluated, the array access function is invoked with three arguments:
1831, 3 and the array object. From the auxiliary field of the
184array object it gets a description of the particular array.
185It then determines which element \fI(foo\ 1\ 3)\fP refers to and
186uses arrayref to extract that element.
187Since this is an array of value cells, what arrayref returns is a
188value cell whose value is what we want, so we evaluate the value cell
189and return it as the value of \fI(foo\ 1\ 3)\fP.
190.pp
191In Maclisp the call \fI(array\ foo\ fixnum\ 25)\fP
192returns an array whose data object is a block of 25 memory words.
193When fixnums are stored in this array, the actual numbers are
194stored instead of pointers to the numbers as is done in general lisp
195object arrays.
196This is efficient under Maclisp but inefficient in
197.Fr
198since every time a value was referenced from an array it had to be copied
199and a pointer to the copy returned to prevent aliasing\*[\(dg\*].
200.(f
201\*[\(dg\*]Aliasing is when two variables are share the same storage location.
202For example if the copying mentioned weren't done then after
203\fI(setq\ x\ (foo\ 2))\fP was done, the value of x and
204(foo\ 2) would share the same
205location.
206Then should the value of (foo\ 2) change, x's value would change as well.
207This is considered dangerous and as a result pointers are never returned
208into the data space of arrays.
209.)f
210Thus t, fixnum and flonum arrays are all implemented in the same
211manner.
212This should not affect the compatibility of Maclisp
213and
214.Fr .
215If there is an application where a block of fixnums or flonums is required,
216then the exact same effect of fixnum and flonum arrays in Maclisp
217can be achieved by using fixnum-block and flonum-block arrays.
218Such arrays are required if you want to pass a large number of arguments to a
219Fortran or C coded function and then get answers back.
220.pp
221The Maclisp compatible array package is
222just one example of how a general array scheme can be implemented.
223Another type of array you could implement would be hashed arrays.
224The subscript could be anything, not just a number.
225The access function would hash the subscript and use the result to
226select an array element.
227With the generality of arrays also comes extra cost; if you just
228want a simple aggregate of (less than 128) general lisp objects
229you would be wise to look into using hunks.
230.sh 2 vectors
231Vectors were invented to fix two shortcommings with hunks.
232They can be longer than 128 elements. They also have a
233tag associated with them, which is intended to say, for example,
234"Think of me as an \fIBlobit\fP." Thus a \fBvector\fP
235is an arbitrary sized hunk with a property list.
236.pp
237Continuing the example,
238the lisp kernel may not know how to print out
239or evaluate \fIblobits\fP, but this is information which will
240be common to all \fIblobits\fP. On the other hand, for each
241individual blobits there are particulars which are likely to change,
242(height, weight, eye-color). This is the part that would
243previously have been stored in the individual entries in the hunk,
244and are stored in the data slots of the vector.
245Once again we summarize the structure of a vector in tabular form:
246.sp 1v
247.TS
248box center ;
249c | c | c | c .
250Subpart name Get value Set value Type
251
252=
253datum[\fIi\fP] vref vset lispval
254_
255property vprop vsetprop lispval
256 vputprop
257_
258size vsize \- fixnum
259.TE
260Vectors are created specifying size and optional fill value using the
261function (\fInew-vector\fP 'x_size ['g_fill ['g_prop]]), or by
262initial values: (\fIvector\fP ['g_val ...]).
263.sh 2 "anatomy of vectors"
264There are some technical details about vectors, that the user should
265know:
266.sh 3 size
267The user is not free to alter this. It is noted when the vector
268is created, and is used by the garbage collector. The garbage
269collector will coallesce two free vectors, which are neighbors
270in the heap. Internally, this is kept as the number of bytes
271of data. Thus, a vector created by (\fIvector\fP 'foo), has a
272size of 4.
273.sh 3 property
274Currently, we expect the property to be either a symbol, or a list
275whose first entry is a symbol. The symbols \fBfclosure\fP and
276\fBstructure-value-argument\fP are magic, and their effect is described in
277Chapter 8. If the property is a (non-null) symbol, the vector
278will be printed out as <symbol>[<size>].
279Another case is if the property is actually a (disembodied) property-list, which
280contains a value for the indicator \fBprint\fP.
281The value is taken to be a Lisp function, which the printer
282will invoke with two arguments: the vector and the current output port.
283Otherwise, the vector will be printed as vector[<size>].
284We have vague (as yet unimplemented) ideas
285about similar mechanisms for evaluation properties.
286Users are cautioned against putting anything other than nil
287in the property entry of a vector.
288.sh 3 "internal order"
289In memory, vectors start with a longword containing the size
290(which is immediate data within the vector).
291The next cell contains a pointer to the property.
292Any remaining cells (if any) are for data.
293Vectors are handled differently from any other object in
294.Fr,
295in that a pointer to a vector is pointer to the first data
296cell, i.e. a pointer to the \fIthird\fP longword of the structure.
297This was done for efficiency in compiled code and for uniformity
298in referencing immediate-vectors (described below).
299The user should never return a pointer to any other part
300of a vector, as this may cause the garbage collector to follow an
301invalid pointer.
302.sh 2 "immediate-vectors"
303Immediate-vectors are similar to vectors. They differ, in
304that binary data are stored in space directly within the vector.
305Thus the garbage collector will preserve the vector itself (if used),
306and will only traverse the property cell.
307The data may be referenced as longwords, shortwords, or even bytes.
308Shorts and bytes are returned sign-extended.
309The compiler open-codes such references,
310and will avoid boxing the resulting integer data, where possible.
311Thus, immediate vectors may be used for efficiently processing
312character data.
313They are also useful in storing results from functions written
314in other languages.
315.sp 1v
316.TS
317box center ;
318c | c | c | c .
319Subpart name Get value Set value Type
320
321=
322datum[\fIi\fP] vrefi-byte vseti-byte fixnum
323 vrefi-word vseti-word fixnum
324 vrefi-long vseti-long fixnum
325_
326property vprop vsetprop lispval
327 vputprop
328_
329size vsize \- fixnum
330 vsize-byte fixnum
331 vsize-word fixnum
332.TE
333To create immediate vectors specifying size and fill data,
334you can use the functions
335\fInew-vectori-byte\fP,
336\fInew-vectori-word\fP,
337or \fInew-vectori-long\fP.
338You can also use the functions
339\fIvectori-byte\fP,
340\fIvectori-word\fP,
341or \fIvectori-long\fP.
342All of these functions are described in
343chapter 2.