Commit | Line | Data |
---|---|---|
2f61a98b C |
1 | ." $Header: ch9.n 1.4 83/07/21 21:08:57 sklower Exp $ |
2 | .Lc Arrays\ and\ Vectors 9 | |
3 | .pp | |
4 | Arrays and vectors are two means of expressing aggregate | |
5 | data objects in | |
6 | .Fr . | |
7 | Vectors may be thought of as sequences of data. | |
8 | They are intended as a vehicle for user-defined | |
9 | data types. | |
10 | This use of vectors is still experimental and subject | |
11 | to revision. | |
12 | As a simple data structure, they are similar to | |
13 | hunks and strings. | |
14 | Vectors are used to implement closures, | |
15 | and are useful to communicate with foreign functions. | |
16 | Both of these topics were discussed in Chapter 8. | |
17 | Later in this chapter, | |
18 | we describe the current implementation of vectors, and will | |
19 | advise the user what is most likely to change. | |
20 | .pp | |
21 | Arrays in | |
22 | .Fr | |
23 | provide a programmable data structure access mechanism. | |
24 | One possible use for | |
25 | .Fr | |
26 | arrays is to implement Maclisp style arrays which are simple vectors | |
27 | of fixnums, flonums or general lisp values. | |
28 | This is described in more detail in \(sc9.3 but first | |
29 | we will describe how array references are handled by | |
30 | the lisp system. | |
31 | .pp | |
32 | The structure of an array object is given in \(sc1.3.10 and reproduced here | |
33 | for your convenience. | |
34 | .sp 1v | |
35 | .TS | |
36 | box center ; | |
37 | c | c | c | c . | |
38 | Subpart name Get value Set value Type | |
39 | ||
40 | = | |
41 | access function getaccess putaccess binary, list | |
42 | or symbol | |
43 | _ | |
44 | auxiliary getaux putaux lispval | |
45 | _ | |
46 | data arrayref replace block of contiguous | |
47 | set lispval | |
48 | _ | |
49 | length getlength putlength fixnum | |
50 | _ | |
51 | delta getdelta putdelta fixnum | |
52 | .TE | |
53 | .sh 2 "general arrays" \n(ch 1 | |
54 | Suppose the evaluator is told to evaluate \fI(foo\ a\ b)\fP | |
55 | and the function cell of the symbol foo contains an array object | |
56 | (which we will call foo_arr_obj). | |
57 | First the evaluator will evaluate and stack the values of | |
58 | .i a | |
59 | and | |
60 | .i b . | |
61 | Next it will stack the array object foo_arr_obj. | |
62 | Finally it will call the access function of foo_arr_obj. | |
63 | The access function should be a lexpr\*[\(dg\*] | |
64 | or a symbol whose | |
65 | function cell contains a lexpr. | |
66 | .(f | |
67 | \*[\(dg\*]A lexpr is a function which accepts any number of arguments | |
68 | which are evaluated before the function is called. | |
69 | .)f | |
70 | The access function is responsible for locating and returning | |
71 | a value from the array. | |
72 | The array access function is free to interpret the arguments as it wishes. | |
73 | The Maclisp compatible array access function which is provided | |
74 | in the standard | |
75 | .Fr | |
76 | system interprets the arguments as subscripts in the same way as | |
77 | languages like Fortran and Pascal. | |
78 | .pp | |
79 | The array access function will also be called upon to store elements in | |
80 | the array. | |
81 | For example, \fI(store\ (foo\ a\ b)\ c)\fP | |
82 | will automatically expand to (foo c a b) and when the evaluator is called | |
83 | to evaluate this, it will evaluate the arguments | |
84 | .i c , | |
85 | .i b | |
86 | and | |
87 | .i a . | |
88 | Then it will | |
89 | stack the array object (which is stored | |
90 | in the function cell of foo) and call the array access function | |
91 | with (now) four arguments. | |
92 | The array access function must be able to tell this is a store operation, | |
93 | which it can do by checking the number of arguments it has been | |
94 | given (a lexpr can do this very easily). | |
95 | .sh 2 "subparts of an array object" | |
96 | An array is created by allocating an | |
97 | array object with | |
98 | .i marray | |
99 | and filling in the fields. | |
100 | Certain lisp functions interpret the values of the subparts | |
101 | of the array object in special | |
102 | ways as described in the following text. | |
103 | Placing illegal values in these subparts may cause | |
104 | the lisp system to fail. | |
105 | .sh 3 "access function" | |
106 | The purpose of the access function has been described above. | |
107 | The contents of the access function should be a lexpr, | |
108 | either a binary (compiled function) or a list (interpreted function). | |
109 | It may also be a symbol whose function cell contains a function | |
110 | definition. | |
111 | This subpart | |
112 | is used by | |
113 | .i eval , | |
114 | .i funcall , | |
115 | and | |
116 | .i apply | |
117 | when evaluating array references. | |
118 | .sh 3 auxiliary | |
119 | This can be used for any purpose. If it is a list and the first element | |
120 | of that list is the symbol unmarked_array then the data subpart will | |
121 | not be marked by the garbage collector (this is used in the Maclisp | |
122 | compatible array package and has the potential for causing strange errors | |
123 | if used incorrectly). | |
124 | .sh 3 data | |
125 | This is either nil or points to a block of data space allocated by | |
126 | .i segment | |
127 | or | |
128 | .i small-segment. | |
129 | .sh 3 length | |
130 | This is a fixnum whose value is the number of elements in the | |
131 | data block. This is used by the garbage collector and by | |
132 | .i arrayref | |
133 | to determine if your index is in bounds. | |
134 | .sh 3 delta | |
135 | This is a fixnum whose value is the number of bytes in each element of | |
136 | the data block. | |
137 | This will be four for an array of fixnums or value cells, and eight | |
138 | for an array of flonums. | |
139 | This is used by the garbage collector and | |
140 | .i arrayref | |
141 | as well. | |
142 | .sh 2 "The Maclisp compatible array package" | |
143 | .pp | |
144 | A Maclisp style array is similar to what is known as arrays in other | |
145 | languages: a block of homogeneous data elements which | |
146 | is indexed by one or more integers called subscripts. | |
147 | The data elements can be all fixnums, flonums or general lisp objects. | |
148 | An array is created by a call to the function | |
149 | .i array | |
150 | or \fI*array\fP. | |
151 | The only difference is that | |
152 | .i *array | |
153 | evaluates its arguments. | |
154 | This call: | |
155 | .i "(array foo t 3 5)" | |
156 | sets up an array called foo of dimensions 3 by 5. | |
157 | The subscripts are zero based. | |
158 | The first element is \fI(foo\ 0\ 0)\fP, the next is \fI(foo\ 0\ 1)\fP | |
159 | and so on up to \fI(foo\ 2\ 4)\fP. | |
160 | The t indicates a general lisp object array which means each element of | |
161 | foo can be any type. | |
162 | Each element can be any type since all that is stored in the array is | |
163 | a pointer to a lisp object, not the object itself. | |
164 | .i Array | |
165 | does this by allocating an array object | |
166 | with | |
167 | .i marray | |
168 | and then allocating a segment of 15 consecutive value cells with | |
169 | .i small-segment | |
170 | and storing a pointer to that segment in the data subpart of the array | |
171 | object. | |
172 | The length and delta subpart of the array object are filled in (with 15 | |
173 | and 4 respectively) and the access function subpart is set to point to | |
174 | the appropriate array access function. | |
175 | In this case there is a special access function for two dimensional | |
176 | value cell arrays called arrac-twoD, and this access function is used. | |
177 | The auxiliary subpart is set to (t\ 3\ 5) which describes the type of array | |
178 | and the bounds of the subscripts. | |
179 | Finally this array object is placed in the function cell of the symbol foo. | |
180 | Now when | |
181 | .i "(foo 1 3)" | |
182 | is evaluated, the array access function is invoked with three arguments: | |
183 | 1, 3 and the array object. From the auxiliary field of the | |
184 | array object it gets a description of the particular array. | |
185 | It then determines which element \fI(foo\ 1\ 3)\fP refers to and | |
186 | uses arrayref to extract that element. | |
187 | Since this is an array of value cells, what arrayref returns is a | |
188 | value cell whose value is what we want, so we evaluate the value cell | |
189 | and return it as the value of \fI(foo\ 1\ 3)\fP. | |
190 | .pp | |
191 | In Maclisp the call \fI(array\ foo\ fixnum\ 25)\fP | |
192 | returns an array whose data object is a block of 25 memory words. | |
193 | When fixnums are stored in this array, the actual numbers are | |
194 | stored instead of pointers to the numbers as is done in general lisp | |
195 | object arrays. | |
196 | This is efficient under Maclisp but inefficient in | |
197 | .Fr | |
198 | since every time a value was referenced from an array it had to be copied | |
199 | and a pointer to the copy returned to prevent aliasing\*[\(dg\*]. | |
200 | .(f | |
201 | \*[\(dg\*]Aliasing is when two variables are share the same storage location. | |
202 | For example if the copying mentioned weren't done then after | |
203 | \fI(setq\ x\ (foo\ 2))\fP was done, the value of x and | |
204 | (foo\ 2) would share the same | |
205 | location. | |
206 | Then should the value of (foo\ 2) change, x's value would change as well. | |
207 | This is considered dangerous and as a result pointers are never returned | |
208 | into the data space of arrays. | |
209 | .)f | |
210 | Thus t, fixnum and flonum arrays are all implemented in the same | |
211 | manner. | |
212 | This should not affect the compatibility of Maclisp | |
213 | and | |
214 | .Fr . | |
215 | If there is an application where a block of fixnums or flonums is required, | |
216 | then the exact same effect of fixnum and flonum arrays in Maclisp | |
217 | can be achieved by using fixnum-block and flonum-block arrays. | |
218 | Such arrays are required if you want to pass a large number of arguments to a | |
219 | Fortran or C coded function and then get answers back. | |
220 | .pp | |
221 | The Maclisp compatible array package is | |
222 | just one example of how a general array scheme can be implemented. | |
223 | Another type of array you could implement would be hashed arrays. | |
224 | The subscript could be anything, not just a number. | |
225 | The access function would hash the subscript and use the result to | |
226 | select an array element. | |
227 | With the generality of arrays also comes extra cost; if you just | |
228 | want a simple aggregate of (less than 128) general lisp objects | |
229 | you would be wise to look into using hunks. | |
230 | .sh 2 vectors | |
231 | Vectors were invented to fix two shortcommings with hunks. | |
232 | They can be longer than 128 elements. They also have a | |
233 | tag associated with them, which is intended to say, for example, | |
234 | "Think of me as an \fIBlobit\fP." Thus a \fBvector\fP | |
235 | is an arbitrary sized hunk with a property list. | |
236 | .pp | |
237 | Continuing the example, | |
238 | the lisp kernel may not know how to print out | |
239 | or evaluate \fIblobits\fP, but this is information which will | |
240 | be common to all \fIblobits\fP. On the other hand, for each | |
241 | individual blobits there are particulars which are likely to change, | |
242 | (height, weight, eye-color). This is the part that would | |
243 | previously have been stored in the individual entries in the hunk, | |
244 | and are stored in the data slots of the vector. | |
245 | Once again we summarize the structure of a vector in tabular form: | |
246 | .sp 1v | |
247 | .TS | |
248 | box center ; | |
249 | c | c | c | c . | |
250 | Subpart name Get value Set value Type | |
251 | ||
252 | = | |
253 | datum[\fIi\fP] vref vset lispval | |
254 | _ | |
255 | property vprop vsetprop lispval | |
256 | vputprop | |
257 | _ | |
258 | size vsize \- fixnum | |
259 | .TE | |
260 | Vectors are created specifying size and optional fill value using the | |
261 | function (\fInew-vector\fP 'x_size ['g_fill ['g_prop]]), or by | |
262 | initial values: (\fIvector\fP ['g_val ...]). | |
263 | .sh 2 "anatomy of vectors" | |
264 | There are some technical details about vectors, that the user should | |
265 | know: | |
266 | .sh 3 size | |
267 | The user is not free to alter this. It is noted when the vector | |
268 | is created, and is used by the garbage collector. The garbage | |
269 | collector will coallesce two free vectors, which are neighbors | |
270 | in the heap. Internally, this is kept as the number of bytes | |
271 | of data. Thus, a vector created by (\fIvector\fP 'foo), has a | |
272 | size of 4. | |
273 | .sh 3 property | |
274 | Currently, we expect the property to be either a symbol, or a list | |
275 | whose first entry is a symbol. The symbols \fBfclosure\fP and | |
276 | \fBstructure-value-argument\fP are magic, and their effect is described in | |
277 | Chapter 8. If the property is a (non-null) symbol, the vector | |
278 | will be printed out as <symbol>[<size>]. | |
279 | Another case is if the property is actually a (disembodied) property-list, which | |
280 | contains a value for the indicator \fBprint\fP. | |
281 | The value is taken to be a Lisp function, which the printer | |
282 | will invoke with two arguments: the vector and the current output port. | |
283 | Otherwise, the vector will be printed as vector[<size>]. | |
284 | We have vague (as yet unimplemented) ideas | |
285 | about similar mechanisms for evaluation properties. | |
286 | Users are cautioned against putting anything other than nil | |
287 | in the property entry of a vector. | |
288 | .sh 3 "internal order" | |
289 | In memory, vectors start with a longword containing the size | |
290 | (which is immediate data within the vector). | |
291 | The next cell contains a pointer to the property. | |
292 | Any remaining cells (if any) are for data. | |
293 | Vectors are handled differently from any other object in | |
294 | .Fr, | |
295 | in that a pointer to a vector is pointer to the first data | |
296 | cell, i.e. a pointer to the \fIthird\fP longword of the structure. | |
297 | This was done for efficiency in compiled code and for uniformity | |
298 | in referencing immediate-vectors (described below). | |
299 | The user should never return a pointer to any other part | |
300 | of a vector, as this may cause the garbage collector to follow an | |
301 | invalid pointer. | |
302 | .sh 2 "immediate-vectors" | |
303 | Immediate-vectors are similar to vectors. They differ, in | |
304 | that binary data are stored in space directly within the vector. | |
305 | Thus the garbage collector will preserve the vector itself (if used), | |
306 | and will only traverse the property cell. | |
307 | The data may be referenced as longwords, shortwords, or even bytes. | |
308 | Shorts and bytes are returned sign-extended. | |
309 | The compiler open-codes such references, | |
310 | and will avoid boxing the resulting integer data, where possible. | |
311 | Thus, immediate vectors may be used for efficiently processing | |
312 | character data. | |
313 | They are also useful in storing results from functions written | |
314 | in other languages. | |
315 | .sp 1v | |
316 | .TS | |
317 | box center ; | |
318 | c | c | c | c . | |
319 | Subpart name Get value Set value Type | |
320 | ||
321 | = | |
322 | datum[\fIi\fP] vrefi-byte vseti-byte fixnum | |
323 | vrefi-word vseti-word fixnum | |
324 | vrefi-long vseti-long fixnum | |
325 | _ | |
326 | property vprop vsetprop lispval | |
327 | vputprop | |
328 | _ | |
329 | size vsize \- fixnum | |
330 | vsize-byte fixnum | |
331 | vsize-word fixnum | |
332 | .TE | |
333 | To create immediate vectors specifying size and fill data, | |
334 | you can use the functions | |
335 | \fInew-vectori-byte\fP, | |
336 | \fInew-vectori-word\fP, | |
337 | or \fInew-vectori-long\fP. | |
338 | You can also use the functions | |
339 | \fIvectori-byte\fP, | |
340 | \fIvectori-word\fP, | |
341 | or \fIvectori-long\fP. | |
342 | All of these functions are described in | |
343 | chapter 2. |