| 1 | ." $Header: ch9.n 1.4 83/07/21 21:08:57 sklower Exp $ |
| 2 | .Lc Arrays\ and\ Vectors 9 |
| 3 | .pp |
| 4 | Arrays and vectors are two means of expressing aggregate |
| 5 | data objects in |
| 6 | .Fr . |
| 7 | Vectors may be thought of as sequences of data. |
| 8 | They are intended as a vehicle for user-defined |
| 9 | data types. |
| 10 | This use of vectors is still experimental and subject |
| 11 | to revision. |
| 12 | As a simple data structure, they are similar to |
| 13 | hunks and strings. |
| 14 | Vectors are used to implement closures, |
| 15 | and are useful to communicate with foreign functions. |
| 16 | Both of these topics were discussed in Chapter 8. |
| 17 | Later in this chapter, |
| 18 | we describe the current implementation of vectors, and will |
| 19 | advise the user what is most likely to change. |
| 20 | .pp |
| 21 | Arrays in |
| 22 | .Fr |
| 23 | provide a programmable data structure access mechanism. |
| 24 | One possible use for |
| 25 | .Fr |
| 26 | arrays is to implement Maclisp style arrays which are simple vectors |
| 27 | of fixnums, flonums or general lisp values. |
| 28 | This is described in more detail in \(sc9.3 but first |
| 29 | we will describe how array references are handled by |
| 30 | the lisp system. |
| 31 | .pp |
| 32 | The structure of an array object is given in \(sc1.3.10 and reproduced here |
| 33 | for your convenience. |
| 34 | .sp 1v |
| 35 | .TS |
| 36 | box center ; |
| 37 | c | c | c | c . |
| 38 | Subpart name Get value Set value Type |
| 39 | |
| 40 | = |
| 41 | access function getaccess putaccess binary, list |
| 42 | or symbol |
| 43 | _ |
| 44 | auxiliary getaux putaux lispval |
| 45 | _ |
| 46 | data arrayref replace block of contiguous |
| 47 | set lispval |
| 48 | _ |
| 49 | length getlength putlength fixnum |
| 50 | _ |
| 51 | delta getdelta putdelta fixnum |
| 52 | .TE |
| 53 | .sh 2 "general arrays" \n(ch 1 |
| 54 | Suppose the evaluator is told to evaluate \fI(foo\ a\ b)\fP |
| 55 | and the function cell of the symbol foo contains an array object |
| 56 | (which we will call foo_arr_obj). |
| 57 | First the evaluator will evaluate and stack the values of |
| 58 | .i a |
| 59 | and |
| 60 | .i b . |
| 61 | Next it will stack the array object foo_arr_obj. |
| 62 | Finally it will call the access function of foo_arr_obj. |
| 63 | The access function should be a lexpr\*[\(dg\*] |
| 64 | or a symbol whose |
| 65 | function cell contains a lexpr. |
| 66 | .(f |
| 67 | \*[\(dg\*]A lexpr is a function which accepts any number of arguments |
| 68 | which are evaluated before the function is called. |
| 69 | .)f |
| 70 | The access function is responsible for locating and returning |
| 71 | a value from the array. |
| 72 | The array access function is free to interpret the arguments as it wishes. |
| 73 | The Maclisp compatible array access function which is provided |
| 74 | in the standard |
| 75 | .Fr |
| 76 | system interprets the arguments as subscripts in the same way as |
| 77 | languages like Fortran and Pascal. |
| 78 | .pp |
| 79 | The array access function will also be called upon to store elements in |
| 80 | the array. |
| 81 | For example, \fI(store\ (foo\ a\ b)\ c)\fP |
| 82 | will automatically expand to (foo c a b) and when the evaluator is called |
| 83 | to evaluate this, it will evaluate the arguments |
| 84 | .i c , |
| 85 | .i b |
| 86 | and |
| 87 | .i a . |
| 88 | Then it will |
| 89 | stack the array object (which is stored |
| 90 | in the function cell of foo) and call the array access function |
| 91 | with (now) four arguments. |
| 92 | The array access function must be able to tell this is a store operation, |
| 93 | which it can do by checking the number of arguments it has been |
| 94 | given (a lexpr can do this very easily). |
| 95 | .sh 2 "subparts of an array object" |
| 96 | An array is created by allocating an |
| 97 | array object with |
| 98 | .i marray |
| 99 | and filling in the fields. |
| 100 | Certain lisp functions interpret the values of the subparts |
| 101 | of the array object in special |
| 102 | ways as described in the following text. |
| 103 | Placing illegal values in these subparts may cause |
| 104 | the lisp system to fail. |
| 105 | .sh 3 "access function" |
| 106 | The purpose of the access function has been described above. |
| 107 | The contents of the access function should be a lexpr, |
| 108 | either a binary (compiled function) or a list (interpreted function). |
| 109 | It may also be a symbol whose function cell contains a function |
| 110 | definition. |
| 111 | This subpart |
| 112 | is used by |
| 113 | .i eval , |
| 114 | .i funcall , |
| 115 | and |
| 116 | .i apply |
| 117 | when evaluating array references. |
| 118 | .sh 3 auxiliary |
| 119 | This can be used for any purpose. If it is a list and the first element |
| 120 | of that list is the symbol unmarked_array then the data subpart will |
| 121 | not be marked by the garbage collector (this is used in the Maclisp |
| 122 | compatible array package and has the potential for causing strange errors |
| 123 | if used incorrectly). |
| 124 | .sh 3 data |
| 125 | This is either nil or points to a block of data space allocated by |
| 126 | .i segment |
| 127 | or |
| 128 | .i small-segment. |
| 129 | .sh 3 length |
| 130 | This is a fixnum whose value is the number of elements in the |
| 131 | data block. This is used by the garbage collector and by |
| 132 | .i arrayref |
| 133 | to determine if your index is in bounds. |
| 134 | .sh 3 delta |
| 135 | This is a fixnum whose value is the number of bytes in each element of |
| 136 | the data block. |
| 137 | This will be four for an array of fixnums or value cells, and eight |
| 138 | for an array of flonums. |
| 139 | This is used by the garbage collector and |
| 140 | .i arrayref |
| 141 | as well. |
| 142 | .sh 2 "The Maclisp compatible array package" |
| 143 | .pp |
| 144 | A Maclisp style array is similar to what is known as arrays in other |
| 145 | languages: a block of homogeneous data elements which |
| 146 | is indexed by one or more integers called subscripts. |
| 147 | The data elements can be all fixnums, flonums or general lisp objects. |
| 148 | An array is created by a call to the function |
| 149 | .i array |
| 150 | or \fI*array\fP. |
| 151 | The only difference is that |
| 152 | .i *array |
| 153 | evaluates its arguments. |
| 154 | This call: |
| 155 | .i "(array foo t 3 5)" |
| 156 | sets up an array called foo of dimensions 3 by 5. |
| 157 | The subscripts are zero based. |
| 158 | The first element is \fI(foo\ 0\ 0)\fP, the next is \fI(foo\ 0\ 1)\fP |
| 159 | and so on up to \fI(foo\ 2\ 4)\fP. |
| 160 | The t indicates a general lisp object array which means each element of |
| 161 | foo can be any type. |
| 162 | Each element can be any type since all that is stored in the array is |
| 163 | a pointer to a lisp object, not the object itself. |
| 164 | .i Array |
| 165 | does this by allocating an array object |
| 166 | with |
| 167 | .i marray |
| 168 | and then allocating a segment of 15 consecutive value cells with |
| 169 | .i small-segment |
| 170 | and storing a pointer to that segment in the data subpart of the array |
| 171 | object. |
| 172 | The length and delta subpart of the array object are filled in (with 15 |
| 173 | and 4 respectively) and the access function subpart is set to point to |
| 174 | the appropriate array access function. |
| 175 | In this case there is a special access function for two dimensional |
| 176 | value cell arrays called arrac-twoD, and this access function is used. |
| 177 | The auxiliary subpart is set to (t\ 3\ 5) which describes the type of array |
| 178 | and the bounds of the subscripts. |
| 179 | Finally this array object is placed in the function cell of the symbol foo. |
| 180 | Now when |
| 181 | .i "(foo 1 3)" |
| 182 | is evaluated, the array access function is invoked with three arguments: |
| 183 | 1, 3 and the array object. From the auxiliary field of the |
| 184 | array object it gets a description of the particular array. |
| 185 | It then determines which element \fI(foo\ 1\ 3)\fP refers to and |
| 186 | uses arrayref to extract that element. |
| 187 | Since this is an array of value cells, what arrayref returns is a |
| 188 | value cell whose value is what we want, so we evaluate the value cell |
| 189 | and return it as the value of \fI(foo\ 1\ 3)\fP. |
| 190 | .pp |
| 191 | In Maclisp the call \fI(array\ foo\ fixnum\ 25)\fP |
| 192 | returns an array whose data object is a block of 25 memory words. |
| 193 | When fixnums are stored in this array, the actual numbers are |
| 194 | stored instead of pointers to the numbers as is done in general lisp |
| 195 | object arrays. |
| 196 | This is efficient under Maclisp but inefficient in |
| 197 | .Fr |
| 198 | since every time a value was referenced from an array it had to be copied |
| 199 | and a pointer to the copy returned to prevent aliasing\*[\(dg\*]. |
| 200 | .(f |
| 201 | \*[\(dg\*]Aliasing is when two variables are share the same storage location. |
| 202 | For example if the copying mentioned weren't done then after |
| 203 | \fI(setq\ x\ (foo\ 2))\fP was done, the value of x and |
| 204 | (foo\ 2) would share the same |
| 205 | location. |
| 206 | Then should the value of (foo\ 2) change, x's value would change as well. |
| 207 | This is considered dangerous and as a result pointers are never returned |
| 208 | into the data space of arrays. |
| 209 | .)f |
| 210 | Thus t, fixnum and flonum arrays are all implemented in the same |
| 211 | manner. |
| 212 | This should not affect the compatibility of Maclisp |
| 213 | and |
| 214 | .Fr . |
| 215 | If there is an application where a block of fixnums or flonums is required, |
| 216 | then the exact same effect of fixnum and flonum arrays in Maclisp |
| 217 | can be achieved by using fixnum-block and flonum-block arrays. |
| 218 | Such arrays are required if you want to pass a large number of arguments to a |
| 219 | Fortran or C coded function and then get answers back. |
| 220 | .pp |
| 221 | The Maclisp compatible array package is |
| 222 | just one example of how a general array scheme can be implemented. |
| 223 | Another type of array you could implement would be hashed arrays. |
| 224 | The subscript could be anything, not just a number. |
| 225 | The access function would hash the subscript and use the result to |
| 226 | select an array element. |
| 227 | With the generality of arrays also comes extra cost; if you just |
| 228 | want a simple aggregate of (less than 128) general lisp objects |
| 229 | you would be wise to look into using hunks. |
| 230 | .sh 2 vectors |
| 231 | Vectors were invented to fix two shortcommings with hunks. |
| 232 | They can be longer than 128 elements. They also have a |
| 233 | tag associated with them, which is intended to say, for example, |
| 234 | "Think of me as an \fIBlobit\fP." Thus a \fBvector\fP |
| 235 | is an arbitrary sized hunk with a property list. |
| 236 | .pp |
| 237 | Continuing the example, |
| 238 | the lisp kernel may not know how to print out |
| 239 | or evaluate \fIblobits\fP, but this is information which will |
| 240 | be common to all \fIblobits\fP. On the other hand, for each |
| 241 | individual blobits there are particulars which are likely to change, |
| 242 | (height, weight, eye-color). This is the part that would |
| 243 | previously have been stored in the individual entries in the hunk, |
| 244 | and are stored in the data slots of the vector. |
| 245 | Once again we summarize the structure of a vector in tabular form: |
| 246 | .sp 1v |
| 247 | .TS |
| 248 | box center ; |
| 249 | c | c | c | c . |
| 250 | Subpart name Get value Set value Type |
| 251 | |
| 252 | = |
| 253 | datum[\fIi\fP] vref vset lispval |
| 254 | _ |
| 255 | property vprop vsetprop lispval |
| 256 | vputprop |
| 257 | _ |
| 258 | size vsize \- fixnum |
| 259 | .TE |
| 260 | Vectors are created specifying size and optional fill value using the |
| 261 | function (\fInew-vector\fP 'x_size ['g_fill ['g_prop]]), or by |
| 262 | initial values: (\fIvector\fP ['g_val ...]). |
| 263 | .sh 2 "anatomy of vectors" |
| 264 | There are some technical details about vectors, that the user should |
| 265 | know: |
| 266 | .sh 3 size |
| 267 | The user is not free to alter this. It is noted when the vector |
| 268 | is created, and is used by the garbage collector. The garbage |
| 269 | collector will coallesce two free vectors, which are neighbors |
| 270 | in the heap. Internally, this is kept as the number of bytes |
| 271 | of data. Thus, a vector created by (\fIvector\fP 'foo), has a |
| 272 | size of 4. |
| 273 | .sh 3 property |
| 274 | Currently, we expect the property to be either a symbol, or a list |
| 275 | whose first entry is a symbol. The symbols \fBfclosure\fP and |
| 276 | \fBstructure-value-argument\fP are magic, and their effect is described in |
| 277 | Chapter 8. If the property is a (non-null) symbol, the vector |
| 278 | will be printed out as <symbol>[<size>]. |
| 279 | Another case is if the property is actually a (disembodied) property-list, which |
| 280 | contains a value for the indicator \fBprint\fP. |
| 281 | The value is taken to be a Lisp function, which the printer |
| 282 | will invoke with two arguments: the vector and the current output port. |
| 283 | Otherwise, the vector will be printed as vector[<size>]. |
| 284 | We have vague (as yet unimplemented) ideas |
| 285 | about similar mechanisms for evaluation properties. |
| 286 | Users are cautioned against putting anything other than nil |
| 287 | in the property entry of a vector. |
| 288 | .sh 3 "internal order" |
| 289 | In memory, vectors start with a longword containing the size |
| 290 | (which is immediate data within the vector). |
| 291 | The next cell contains a pointer to the property. |
| 292 | Any remaining cells (if any) are for data. |
| 293 | Vectors are handled differently from any other object in |
| 294 | .Fr, |
| 295 | in that a pointer to a vector is pointer to the first data |
| 296 | cell, i.e. a pointer to the \fIthird\fP longword of the structure. |
| 297 | This was done for efficiency in compiled code and for uniformity |
| 298 | in referencing immediate-vectors (described below). |
| 299 | The user should never return a pointer to any other part |
| 300 | of a vector, as this may cause the garbage collector to follow an |
| 301 | invalid pointer. |
| 302 | .sh 2 "immediate-vectors" |
| 303 | Immediate-vectors are similar to vectors. They differ, in |
| 304 | that binary data are stored in space directly within the vector. |
| 305 | Thus the garbage collector will preserve the vector itself (if used), |
| 306 | and will only traverse the property cell. |
| 307 | The data may be referenced as longwords, shortwords, or even bytes. |
| 308 | Shorts and bytes are returned sign-extended. |
| 309 | The compiler open-codes such references, |
| 310 | and will avoid boxing the resulting integer data, where possible. |
| 311 | Thus, immediate vectors may be used for efficiently processing |
| 312 | character data. |
| 313 | They are also useful in storing results from functions written |
| 314 | in other languages. |
| 315 | .sp 1v |
| 316 | .TS |
| 317 | box center ; |
| 318 | c | c | c | c . |
| 319 | Subpart name Get value Set value Type |
| 320 | |
| 321 | = |
| 322 | datum[\fIi\fP] vrefi-byte vseti-byte fixnum |
| 323 | vrefi-word vseti-word fixnum |
| 324 | vrefi-long vseti-long fixnum |
| 325 | _ |
| 326 | property vprop vsetprop lispval |
| 327 | vputprop |
| 328 | _ |
| 329 | size vsize \- fixnum |
| 330 | vsize-byte fixnum |
| 331 | vsize-word fixnum |
| 332 | .TE |
| 333 | To create immediate vectors specifying size and fill data, |
| 334 | you can use the functions |
| 335 | \fInew-vectori-byte\fP, |
| 336 | \fInew-vectori-word\fP, |
| 337 | or \fInew-vectori-long\fP. |
| 338 | You can also use the functions |
| 339 | \fIvectori-byte\fP, |
| 340 | \fIvectori-word\fP, |
| 341 | or \fIvectori-long\fP. |
| 342 | All of these functions are described in |
| 343 | chapter 2. |