git.subgeniuskitty.com - unix-history/.git/blame_incremental

... / ...

Commit	Line	Data
	1	." $Header: ch9.n 1.4 83/07/21 21:08:57 sklower Exp $
	2	.Lc Arrays\ and\ Vectors 9
	3	.pp
	4	Arrays and vectors are two means of expressing aggregate
	5	data objects in
	6	.Fr .
	7	Vectors may be thought of as sequences of data.
	8	They are intended as a vehicle for user-defined
	9	data types.
	10	This use of vectors is still experimental and subject
	11	to revision.
	12	As a simple data structure, they are similar to
	13	hunks and strings.
	14	Vectors are used to implement closures,
	15	and are useful to communicate with foreign functions.
	16	Both of these topics were discussed in Chapter 8.
	17	Later in this chapter,
	18	we describe the current implementation of vectors, and will
	19	advise the user what is most likely to change.
	20	.pp
	21	Arrays in
	22	.Fr
	23	provide a programmable data structure access mechanism.
	24	One possible use for
	25	.Fr
	26	arrays is to implement Maclisp style arrays which are simple vectors
	27	of fixnums, flonums or general lisp values.
	28	This is described in more detail in \(sc9.3 but first
	29	we will describe how array references are handled by
	30	the lisp system.
	31	.pp
	32	The structure of an array object is given in \(sc1.3.10 and reproduced here
	33	for your convenience.
	34	.sp 1v
	35	.TS
	36	box center ;
	37	c \| c \| c \| c .
	38	Subpart name Get value Set value Type
	39
	40	=
	41	access function getaccess putaccess binary, list
	42	or symbol
	43	_
	44	auxiliary getaux putaux lispval
	45	_
	46	data arrayref replace block of contiguous
	47	set lispval
	48	_
	49	length getlength putlength fixnum
	50	_
	51	delta getdelta putdelta fixnum
	52	.TE
	53	.sh 2 "general arrays" \n(ch 1
	54	Suppose the evaluator is told to evaluate \fI(foo\ a\ b)\fP
	55	and the function cell of the symbol foo contains an array object
	56	(which we will call foo_arr_obj).
	57	First the evaluator will evaluate and stack the values of
	58	.i a
	59	and
	60	.i b .
	61	Next it will stack the array object foo_arr_obj.
	62	Finally it will call the access function of foo_arr_obj.
	63	The access function should be a lexpr\[\(dg\]
	64	or a symbol whose
	65	function cell contains a lexpr.
	66	.(f
	67	\[\(dg\]A lexpr is a function which accepts any number of arguments
	68	which are evaluated before the function is called.
	69	.)f
	70	The access function is responsible for locating and returning
	71	a value from the array.
	72	The array access function is free to interpret the arguments as it wishes.
	73	The Maclisp compatible array access function which is provided
	74	in the standard
	75	.Fr
	76	system interprets the arguments as subscripts in the same way as
	77	languages like Fortran and Pascal.
	78	.pp
	79	The array access function will also be called upon to store elements in
	80	the array.
	81	For example, \fI(store\ (foo\ a\ b)\ c)\fP
	82	will automatically expand to (foo c a b) and when the evaluator is called
	83	to evaluate this, it will evaluate the arguments
	84	.i c ,
	85	.i b
	86	and
	87	.i a .
	88	Then it will
	89	stack the array object (which is stored
	90	in the function cell of foo) and call the array access function
	91	with (now) four arguments.
	92	The array access function must be able to tell this is a store operation,
	93	which it can do by checking the number of arguments it has been
	94	given (a lexpr can do this very easily).
	95	.sh 2 "subparts of an array object"
	96	An array is created by allocating an
	97	array object with
	98	.i marray
	99	and filling in the fields.
	100	Certain lisp functions interpret the values of the subparts
	101	of the array object in special
	102	ways as described in the following text.
	103	Placing illegal values in these subparts may cause
	104	the lisp system to fail.
	105	.sh 3 "access function"
	106	The purpose of the access function has been described above.
	107	The contents of the access function should be a lexpr,
	108	either a binary (compiled function) or a list (interpreted function).
	109	It may also be a symbol whose function cell contains a function
	110	definition.
	111	This subpart
	112	is used by
	113	.i eval ,
	114	.i funcall ,
	115	and
	116	.i apply
	117	when evaluating array references.
	118	.sh 3 auxiliary
	119	This can be used for any purpose. If it is a list and the first element
	120	of that list is the symbol unmarked_array then the data subpart will
	121	not be marked by the garbage collector (this is used in the Maclisp
	122	compatible array package and has the potential for causing strange errors
	123	if used incorrectly).
	124	.sh 3 data
	125	This is either nil or points to a block of data space allocated by
	126	.i segment
	127	or
	128	.i small-segment.
	129	.sh 3 length
	130	This is a fixnum whose value is the number of elements in the
	131	data block. This is used by the garbage collector and by
	132	.i arrayref
	133	to determine if your index is in bounds.
	134	.sh 3 delta
	135	This is a fixnum whose value is the number of bytes in each element of
	136	the data block.
	137	This will be four for an array of fixnums or value cells, and eight
	138	for an array of flonums.
	139	This is used by the garbage collector and
	140	.i arrayref
	141	as well.
	142	.sh 2 "The Maclisp compatible array package"
	143	.pp
	144	A Maclisp style array is similar to what is known as arrays in other
	145	languages: a block of homogeneous data elements which
	146	is indexed by one or more integers called subscripts.
	147	The data elements can be all fixnums, flonums or general lisp objects.
	148	An array is created by a call to the function
	149	.i array
	150	or \fI*array\fP.
	151	The only difference is that
	152	.i *array
	153	evaluates its arguments.
	154	This call:
	155	.i "(array foo t 3 5)"
	156	sets up an array called foo of dimensions 3 by 5.
	157	The subscripts are zero based.
	158	The first element is \fI(foo\ 0\ 0)\fP, the next is \fI(foo\ 0\ 1)\fP
	159	and so on up to \fI(foo\ 2\ 4)\fP.
	160	The t indicates a general lisp object array which means each element of
	161	foo can be any type.
	162	Each element can be any type since all that is stored in the array is
	163	a pointer to a lisp object, not the object itself.
	164	.i Array
	165	does this by allocating an array object
	166	with
	167	.i marray
	168	and then allocating a segment of 15 consecutive value cells with
	169	.i small-segment
	170	and storing a pointer to that segment in the data subpart of the array
	171	object.
	172	The length and delta subpart of the array object are filled in (with 15
	173	and 4 respectively) and the access function subpart is set to point to
	174	the appropriate array access function.
	175	In this case there is a special access function for two dimensional
	176	value cell arrays called arrac-twoD, and this access function is used.
	177	The auxiliary subpart is set to (t\ 3\ 5) which describes the type of array
	178	and the bounds of the subscripts.
	179	Finally this array object is placed in the function cell of the symbol foo.
	180	Now when
	181	.i "(foo 1 3)"
	182	is evaluated, the array access function is invoked with three arguments:
	183	1, 3 and the array object. From the auxiliary field of the
	184	array object it gets a description of the particular array.
	185	It then determines which element \fI(foo\ 1\ 3)\fP refers to and
	186	uses arrayref to extract that element.
	187	Since this is an array of value cells, what arrayref returns is a
	188	value cell whose value is what we want, so we evaluate the value cell
	189	and return it as the value of \fI(foo\ 1\ 3)\fP.
	190	.pp
	191	In Maclisp the call \fI(array\ foo\ fixnum\ 25)\fP
	192	returns an array whose data object is a block of 25 memory words.
	193	When fixnums are stored in this array, the actual numbers are
	194	stored instead of pointers to the numbers as is done in general lisp
	195	object arrays.
	196	This is efficient under Maclisp but inefficient in
	197	.Fr
	198	since every time a value was referenced from an array it had to be copied
	199	and a pointer to the copy returned to prevent aliasing\[\(dg\].
	200	.(f
	201	\[\(dg\]Aliasing is when two variables are share the same storage location.
	202	For example if the copying mentioned weren't done then after
	203	\fI(setq\ x\ (foo\ 2))\fP was done, the value of x and
	204	(foo\ 2) would share the same
	205	location.
	206	Then should the value of (foo\ 2) change, x's value would change as well.
	207	This is considered dangerous and as a result pointers are never returned
	208	into the data space of arrays.
	209	.)f
	210	Thus t, fixnum and flonum arrays are all implemented in the same
	211	manner.
	212	This should not affect the compatibility of Maclisp
	213	and
	214	.Fr .
	215	If there is an application where a block of fixnums or flonums is required,
	216	then the exact same effect of fixnum and flonum arrays in Maclisp
	217	can be achieved by using fixnum-block and flonum-block arrays.
	218	Such arrays are required if you want to pass a large number of arguments to a
	219	Fortran or C coded function and then get answers back.
	220	.pp
	221	The Maclisp compatible array package is
	222	just one example of how a general array scheme can be implemented.
	223	Another type of array you could implement would be hashed arrays.
	224	The subscript could be anything, not just a number.
	225	The access function would hash the subscript and use the result to
	226	select an array element.
	227	With the generality of arrays also comes extra cost; if you just
	228	want a simple aggregate of (less than 128) general lisp objects
	229	you would be wise to look into using hunks.
	230	.sh 2 vectors
	231	Vectors were invented to fix two shortcommings with hunks.
	232	They can be longer than 128 elements. They also have a
	233	tag associated with them, which is intended to say, for example,
	234	"Think of me as an \fIBlobit\fP." Thus a \fBvector\fP
	235	is an arbitrary sized hunk with a property list.
	236	.pp
	237	Continuing the example,
	238	the lisp kernel may not know how to print out
	239	or evaluate \fIblobits\fP, but this is information which will
	240	be common to all \fIblobits\fP. On the other hand, for each
	241	individual blobits there are particulars which are likely to change,
	242	(height, weight, eye-color). This is the part that would
	243	previously have been stored in the individual entries in the hunk,
	244	and are stored in the data slots of the vector.
	245	Once again we summarize the structure of a vector in tabular form:
	246	.sp 1v
	247	.TS
	248	box center ;
	249	c \| c \| c \| c .
	250	Subpart name Get value Set value Type
	251
	252	=
	253	datum[\fIi\fP] vref vset lispval
	254	_
	255	property vprop vsetprop lispval
	256	vputprop
	257	_
	258	size vsize \- fixnum
	259	.TE
	260	Vectors are created specifying size and optional fill value using the
	261	function (\fInew-vector\fP 'x_size ['g_fill ['g_prop]]), or by
	262	initial values: (\fIvector\fP ['g_val ...]).
	263	.sh 2 "anatomy of vectors"
	264	There are some technical details about vectors, that the user should
	265	know:
	266	.sh 3 size
	267	The user is not free to alter this. It is noted when the vector
	268	is created, and is used by the garbage collector. The garbage
	269	collector will coallesce two free vectors, which are neighbors
	270	in the heap. Internally, this is kept as the number of bytes
	271	of data. Thus, a vector created by (\fIvector\fP 'foo), has a
	272	size of 4.
	273	.sh 3 property
	274	Currently, we expect the property to be either a symbol, or a list
	275	whose first entry is a symbol. The symbols \fBfclosure\fP and
	276	\fBstructure-value-argument\fP are magic, and their effect is described in
	277	Chapter 8. If the property is a (non-null) symbol, the vector
	278	will be printed out as <symbol>[<size>].
	279	Another case is if the property is actually a (disembodied) property-list, which
	280	contains a value for the indicator \fBprint\fP.
	281	The value is taken to be a Lisp function, which the printer
	282	will invoke with two arguments: the vector and the current output port.
	283	Otherwise, the vector will be printed as vector[<size>].
	284	We have vague (as yet unimplemented) ideas
	285	about similar mechanisms for evaluation properties.
	286	Users are cautioned against putting anything other than nil
	287	in the property entry of a vector.
	288	.sh 3 "internal order"
	289	In memory, vectors start with a longword containing the size
	290	(which is immediate data within the vector).
	291	The next cell contains a pointer to the property.
	292	Any remaining cells (if any) are for data.
	293	Vectors are handled differently from any other object in
	294	.Fr,
	295	in that a pointer to a vector is pointer to the first data
	296	cell, i.e. a pointer to the \fIthird\fP longword of the structure.
	297	This was done for efficiency in compiled code and for uniformity
	298	in referencing immediate-vectors (described below).
	299	The user should never return a pointer to any other part
	300	of a vector, as this may cause the garbage collector to follow an
	301	invalid pointer.
	302	.sh 2 "immediate-vectors"
	303	Immediate-vectors are similar to vectors. They differ, in
	304	that binary data are stored in space directly within the vector.
	305	Thus the garbage collector will preserve the vector itself (if used),
	306	and will only traverse the property cell.
	307	The data may be referenced as longwords, shortwords, or even bytes.
	308	Shorts and bytes are returned sign-extended.
	309	The compiler open-codes such references,
	310	and will avoid boxing the resulting integer data, where possible.
	311	Thus, immediate vectors may be used for efficiently processing
	312	character data.
	313	They are also useful in storing results from functions written
	314	in other languages.
	315	.sp 1v
	316	.TS
	317	box center ;
	318	c \| c \| c \| c .
	319	Subpart name Get value Set value Type
	320
	321	=
	322	datum[\fIi\fP] vrefi-byte vseti-byte fixnum
	323	vrefi-word vseti-word fixnum
	324	vrefi-long vseti-long fixnum
	325	_
	326	property vprop vsetprop lispval
	327	vputprop
	328	_
	329	size vsize \- fixnum
	330	vsize-byte fixnum
	331	vsize-word fixnum
	332	.TE
	333	To create immediate vectors specifying size and fill data,
	334	you can use the functions
	335	\fInew-vectori-byte\fP,
	336	\fInew-vectori-word\fP,
	337	or \fInew-vectori-long\fP.
	338	You can also use the functions
	339	\fIvectori-byte\fP,
	340	\fIvectori-word\fP,
	341	or \fIvectori-long\fP.
	342	All of these functions are described in
	343	chapter 2.