BSD 3 development
[unix-history] / .ref-BSD-2 / doc / px / pxin2.n
CommitLineData
1662094b
BJ
1.de mD
2.ta 8n 16n 24n
3..
4.if !\n(xx .so tmac.p
5.nr H1 1
6.if n .ND
7.NH
8Operations
9.NH 2
10Naming conventions and operation summary
11.PP
12As discussed in section 1.10,
13the main interpreter loop decodes the first word of the interpreter
14instruction,
15using the first byte as an operation code,
16and places the second byte,
17the ``subop'',
18in register 3.
19The subop may be used to index the display,
20as a small constant,
21or to indicate one of several relational operators.
22In the cases where a constant is needed, but it
23is not small enough to fit in the byte sub-operator,
24a zero is placed there and the constant follows in the next word.
25Zero is easily tested for,
26as the instruction which places the
27subop in r3 sets the condition code flags,
28and this condition code is still available after the transfer
29to an operation code sequence.
30A construction like
31.DS
32.mD
33_OPER:
34 \fBbne\fR 1f
35 \fBmov\fR (lc)+,r3
361:
37 ...
38.DE
39.IP
40is all that is needed to effect this packing of data.
41This technique saves a great deal of space in the Pascal
42.I obj
43object code.
44.PP
45Table 2.1 gives the codes used in the instruction descriptions
46to indicate the kind of inline data expected by each instruction.
47.KF
48.TS
49box center;
50c s
51l | l
52ci | aw(3.25i).
53Table 2.1 \- Inline data type codes
54_
55Code Description
56=
57a T{
58.fi
59An address offset is given in the word
60following the instruction.
61T}
62_
63l T{
64An index into the display, ready as an offset or a guaranteeably small integer,
65is given in the sub-operation code.
66T}
67_
68r T{
69A relational operator encoded as described
70in section 2.3 is given in the subop.
71T}
72_
73s T{
74A small integer is
75placed in the subop, or in the next word
76if it is zero or too large.
77T}
78_
79v T{
80Variable length inline data.
81T}
82_
83w T{
84A word value in the following word.
85T}
86_
87" T{
88An inline constant string.
89T}
90.TE
91.KE
92.PP
93Before giving a list of the machine opcodes,
94it is useful to note the naming conventions in the interpreter for typed
95operations.
96Machine instructions which have numeric operands use a simple and uniform
97naming convention in which a suffix on the root operation name indicates
98the type of operands expected.
99These are given in Table 2.2.
100Here the expression ``a above b'' means that `a' is on top of the
101stack with `b' below it.
102Short integers are 2 byte integers,
103and long integers are 4 byte integers.
104.TS
105box center;
106c s s
107c s s
108l l l
109c ap-2 a.
110Table 2.2 \- Operator Suffices
111.sp
112Unary operator suffices
113.sp .1i
114Suffix Example Argument type
1152 NEG2 Short integer
1164 SQR4 Long integer
1178 ABS8 Real
118.sp
119.T&
120c s s
121l l l
122c ap-2 a.
123Binary operator suffices
124.sp .1i
125Suffix Example Argument type
1262 ADD2 Two short integers
12724 MUL24 Short above long integer
12842 REL42 Long above short integer
1294 DIV4 Two long integers
13028 DVD28 Short integer above real
13148 REL48 Long integer above real
13282 SUB82 Real above short integer
13384 MUL84 Real above long integer
134.sp
135.T&
136c s s
137l l l
138c ap-2 a.
139Other Suffices
140.sp .1i
141Suffix Example Argument types
142T ADDT Sets
143G RELG Strings
144.TE
145.PP
146We now give the list of machine operations with a reference
147to the appropriate sections and a short description of each.
148The character `*' at the end of a name indicates that all
149operations with the root prefix before the `*' are summarized by
150the one entry.
151.br
152.ne 15
153.TS H
154box center;
155c s s
156lw(14) | lw(12) | lw(40)
157lp-2 | a | l.
158Table 2.3 \- Machine operations
159_
160Mnemonic Reference Description
161=
162.TH
163.so fig2.1.n
164.TE
165.bp
166.NH 2
167Basic control operations
168.LP
169.SH
170ABORT
171.IP
172This operator is used to halt execution immediately with an IOT process
173fault.
174It is used only for debugging
175.I px
176and is never generated by the translator
177.I pi.
178.SH
179HALT
180.IP
181Corresponds to the Pascal procedure
182.I halt ;
183causes execution to terminate with a post-mortem backtrace as if a run-time
184error had occurred.
185.SH
186BEG w1,w2,"
187.IP
188Causes the second part of the block mark to be created, and
189.I w1
190bytes of local variable space to be allocated and cleared to zero.
191Stack overflow is detected here.
192.I W2
193is the first line of the body of this section for error traceback,
194and he inline string (length 8) the character representation of its name.
195.SH
196NODUMP w
197.IP
198Equivalent to
199.SM BEG ,
200and used to begin the main program when the ``p''
201option is disabled so that the post-mortem backtrace will be inhibited.
202.SH
203END
204.IP
205Complementary to the operators
206.SM CALL
207and
208.SM BEG ,
209exits the current block, calling the procedure
210.I blkexit
211to flush buffers for and release any local files.
212Restores the environment of the caller from the block mark.
213If this is the end for the main program, all files are
214.I flushed,
215the profile data file is written if necessary, and
216the routine
217.I psexit
218which prints the statistics if desired (and does not return) is called.
219.SH
220CALL l,a
221.IP
222Saves the current line number, return address, and active display entry pointer
223.I dp
224in the first part of the block mark, then transfers to the entry point
225given by the relative address
226.I a ,
227which is the beginning of a
228.B procedure
229or
230.B function
231at level
232.I l.
233.SH
234PUSH s
235.IP
236Clears
237.I s
238bytes on the stack for, e.g., the return value of a
239.B function
240just before calling the function.
241.SH
242POP s
243.IP
244Pop
245.I s
246bytes off the stack.
247Used, e.g., after a
248.B function
249or
250.B procedure
251returns to remove the arguments from the stack.
252.SH
253TRA a
254.IP
255Transfer control to relative address
256.I a
257as a local
258.B goto
259or part of a structured statement.
260.SH
261LINO s
262.IP
263Set current line number to
264.I s.
265For consistency, check that the expression stack is empty
266as it should be (as this is the start of a statement.)
267This consistency check will fail only if there is a bug in the
268interpreter or the interpreter code has somehow been damaged.
269Increment the statement count and if it exceeds the statement limit,
270generate a fault.
271.SH
272GOTO l,a
273.IP
274Transfer conrol to address
275.I a
276which is in the block at level
277.I l
278of the display.
279This is a non-local
280.B goto.
281Causes each block to be exited as if with
282.SM END ,
283flushing and freeing files with
284.I blkexit,
285until the current display entry is at level
286.I l.
287.SH
288SDUP
289.IP
290Duplicate the one word integer on the top of
291the stack.
292This is used mostly for constructing sets.
293See section 2.11.
294.NH 2
295If and relational operators
296.SH
297IF a
298.IP
299The interpreter conditional transfers all take place using this operator
300which examines the Boolean value on the top of the stack.
301If the value is
302.I true ,
303the subsequent code is executed,
304otherwise control transfers to the specified address.
305.SH
306REL* r
307.IP
308These take two arguments on the stack,
309and the sub-operation code indicates which relational operation is to
310be performed, coded as follows with `a' above `b' on the stack:
311.DS
312.mD
313.TS
314lb lb
315c a.
316Code Operation
317_
3180 a = b
3192 a <> b
3204 a < b
3216 a > b
3228 a <= b
32310 a >= b
324.TE
325.DE
326.IP
327Each operation does a number of tests to set the condition code
328appropriately and then does an indexed branch based on the
329sub-operation code to a test of the condition here specified,
330pushing a Boolean value on the stack.
331.IP
332Consider the statement fragment:
333.DS
334.mD
335\*bif\fR a = b \*bthen\fR
336.DE
337.IP
338If
339.I a
340and
341.I b
342are integers this generates the following code:
343.DS
344.TS
345lp-2w(8) l.
346RV4 \fIa\fR
347RV4 \fIb\fR
348REL4 \&=
349IF \fIElse part offset\fR
350.sp
351.T&
352c s.
353\fI\&... Then part code ...\fR
354.TE
355.DE
356.NH 2
357Boolean operators
358.IP
359The Boolean operators
360.SM AND ,
361.SM OR ,
362and
363.SM NOT
364manipulate values on the top of the stack.
365All Boolean values are kept in single bytes in memory,
366or in single words on the stack.
367Zero represents a Boolean \fIfalse\fP, and one a Boolean \fItrue\fP.
368.NH 2
369Rvalue, constant, and assignment operators
370.SH
371RV* l,a
372.IP
373The rvalue operators load values on the stack.
374They take a block number as a subop and load the appropriate
375number of bytes from that block at the offset specified
376in the following word onto the stack. As an example, consider
377.SM RV4 :
378.DS
379.mD
380_RV4:
381 \fBmov\fR _display(r3),r0
382 \fBadd\fR (lc)+,r0
383 \fBsub\fR $4,sp
384 \fBmov\fR sp,r2
385 \fBmov\fR (r0)+,(r2)+
386 \fBmov\fR (r0)+,(r2)+
387 \fBreturn\fR
388.DE
389.IP
390Here the interpreter first generates the source address in r0 by adding the
391display entry to the offset in the next instruction word.
392It then reserves a long integer space on the stack (4 bytes)
393and moves the data from the source onto the stack.
394The pseudo-operation ``return''
395takes the interpreter back to the main interpreter loop.
396Note that the sub-operation code is already in
397r3 and multiplied by 2 to be immediately usable as a word index
398into the display.
399.SH
400CON* r
401.IP
402The constant operators load a value onto the stack from inline code.
403Small integer values are condensed and loaded by the
404.SM CON1
405operator, which is given by
406.DS
407.mD
408_CON1:
409 \fBmov\fR r3,-(sp)
410 \fBreturn\fR
411.DE
412.IP
413Here note that little work was required as the required constant
414had already been placed in register 3.
415For longer constants, more work is required;
416the operator
417.SM CON
418takes a length specification in the subop and can be used to load
419strings and other variable length data onto the stack.
420.SH
421AS*
422.IP
423The assignment operators are similar to arithmetic and relational operators
424in that they take two operands, both in the stack,
425but the lengths given for them indicate
426first the length of the value on the stack and then the length
427of the target in memory.
428The target address in memory is under the value to be stored.
429Thus the statement
430.DS
431i := 1
432.DE
433.IP
434where
435.I i
436is a full-length, 4 byte, integer,
437will generate the code sequence
438.DS
439.TS
440lp-2w(8) l.
441LV \fIi\fP
442CON1 1
443AS24
444.TE
445.DE
446.IP
447Here
448.SM LV
449will load the address of
450.I i,
451which is actually given as a block number in the subop and an
452offest in the following word,
453onto the stack, occupying a single word.
454.SM CON1 ,
455which is a single word instruction,
456then loads the constant 1,
457which is in its subop,
458onto the stack.
459Since there are not one byte constants on the stack,
460this becomes a 2 byte, single word integer.
461The interpreter then assigns a length 2 integer to a length 4 integer using
462.SM AS24 \&.
463The code sequence for
464.SM AS24
465is given by:
466.DS
467.mD
468_AS24:
469 \fBmov\fR (sp)+,r1
470 \fBsxt\fR r0
471 \fBmov\fR (sp)+,r2
472 \fBmov\fR r0,(r2)+
473 \fBmov\fR r1,(r2)
474 \fBreturn\fR
475.DE
476.IP
477Thus the interpreter gets the single word off the stack,
478extends it to be a 4 byte integer in two registers,
479gets the target address off the stack,
480and finally stores the parts of the value in the target.
481This is a typical use of the constant and assignment operators.
482.NH 2
483Addressing operations
484.SH
485LV l,w
486.IP
487The most common operation performed by the interpreter
488is the ``lvalue'' or ``address of'' operation.
489It is given by:
490.DS
491.mD
492_LV:
493 \fBmov\fR _display(r3),r0
494 \fBadd\fR (lc)+,r0
495 \fBmov\fR r0,-(sp)
496 \fBreturn
497.DE
498.IP
499It calculates an address in the block specified in the subop
500by adding the associated display entry to the
501offset which appears in the following word.
502.SH
503OFF s
504.IP
505The offset operator is used in field names.
506Thus to get the address of
507.LS
508p^.f1
509.LE
510.IP
511.I pi
512would generate the sequence
513.DS
514.mD
515.TS
516lp-2w(8) l.
517RV \fIp\fP
518OFF \fIf1\fP
519.TE
520.DE
521.IP
522where the
523.SM RV
524loads the value of
525.I p,
526given its block in the subop and offset in the following word,
527and the interpreter then adds the offset of the field
528.I f1
529in its record to get the correct address.
530.SM OFF
531takes its argument in the subop if it is small enough.
532.SH
533NIL
534.IP
535The example above is incomplete, lacking a check for a
536.B nil
537pointer.
538The code generated would, in fact, be
539.DS
540.TS
541lp-2w(8) l.
542RV \fIp\fP
543NIL
544OFF \fIf1\fP
545.TE
546.DE
547.IP
548where the
549.SM NIL
550operation checks for a
551.I nil
552pointer and generates the appropriate runtime error if it is.
553.SH
554INX* s,w,w
555.IP
556The operators
557.SM INX2
558and
559.SM INX4
560perform subscripting.
561For example, the statement
562.DS
563a[i] := 2.0
564.DE
565.IP
566with
567.I i
568a short integer, such as a subrange ``1..1000'',
569and
570.I a
571an
572``array [1..1000] of real''
573would generate
574.DS
575.TS
576lp-2w(8) l.
577LV \fIa\fP
578RV2 \fIi\fP
579INX2 8,1,999
580CON8 2.0
581AS8
582.TE
583.DE
584.IP
585Here the
586.SM LV
587operation takes the address of
588.I a
589and places it on the stack.
590The value of
591.I i
592is then placed on top of this on the stack.
593We then perform an indexing of the array address by the
594length 2 index (a length 4 index would use
595.SM INX4 )
596where the individual elements have a size of 8 bytes.
597The code for
598.SM INX2
599is:
600.DS
601.mD
602_INX2:
603 \fBtst\fR r3
604 \fBbne\fR 1f
605 \fBmov\fR (lc)+,r3
6061:
607 \fBmov\fR (sp)+,r1
608 \fBsub\fR (lc)+,r1
609 \fBbmi\fR 1f
610 \fBcmp\fR r1,(lc)+
611 \fBbgt\fR 1f
612 \fBmul\fR r3,r1
613 \fBadd\fR r1,(sp)
614 \fBreturn
6151:
616 \fBerror\fR ESUBSCR
617.DE
618.IP
619Here the index operation subtracts the constant value 1 from the
620supplied subscript,
621this being the low bound of the range of permissible subscripts.
622If the result is negative,
623or if the normalized subscript then exceeds 999, which
624is the maximum permissible subscript if the first is numbered 0,
625the interpreter generates a subscript error.
626Otherwise, the interpreter multiplies the offset by 8 and adds it to the address
627which is already on the stack for
628.I a ,
629to address ``a[i]''.
630Multi-dimension subscripts are translated as a sequence of single subscriptings.
631.SH
632IND*
633.IP
634For indirect references through
635.B var
636parameters and pointers,
637the interpreter has a set of indirection operators which convert a pointer
638on the stack into a value on the stack from that address.
639different
640.SM IND
641operators are necessary because of the possibility of different
642length operands.
643.NH 2
644Arithmetic operators
645.IP
646The interpreter has a large number of arithmetic operators.
647All operators produce results long enough to prevent overflow
648unless the bounds of the base type are exceeded.
649No overflow checking is done on arithmetic, but divide by zero
650and mod by zero are detected.
651.NH 2
652Range checking
653.IP
654The interpreter has a number of range checking operators.
655The important distinction among these operators is between values whose
656legal range begins at 0 and those which do not begin at 0, i.e. with
657a subrange variable whose values range from 45 to 70.
658For those which begin at 0, a simpler ``logical'' comparison against
659the upper bound suffices.
660For others, both the low and upper bounds must be checked independently,
661requiring two comparisons.
662.NH 2
663Case operators
664.IP
665The interpreter includes three operators for
666.B case
667statements which are used depending on the width of the
668.B case
669label type.
670For each width, the structure of the case data is the same, and
671is represented in the following figure.
672.sp 1
673.KF
674.so fig2.2.n
675.KE
676.sp 1
677.IP
678The
679.SM CASEOP
680case statement operators do a sequential search through the
681case label values.
682If they find the label value, they take the corresponding entry
683from the transfer table and cause the interpreter to branch to the
684indicated statement.
685If the specified label is not found, an error results.
686.IP
687The
688.SM CASE
689operators take the number of cases as a subop
690if possible.
691Three different operators are needed to handle single byte,
692word, and double word case transfer table values.
693For example, the
694.SM CASEOP1
695operator has the following code sequence:
696.DS
697.mD
698_CASEOP1:
699 \fBbne\fR 1f
700 \fBmov\fR (lc)+,r3
7011:
702 \fBmov\fR lc,r0
703 \fBadd\fR r3,r0
704 \fBadd\fR r3,r0
705 \fBmov\fR r3,r2
706 \fBtst\fR (sp)+
7071:
708 \fBcmpb\fR (r0)+,-2(sp)
709 \fBbeq\fR 5f
710 \fBsob\fR r3,1b
711 \fBerror\fR ECASE
7125:
713 \fBsub\fR r3,r2
714 \fBadd\fR r2,r2
715 \fBadd\fR lc,r2
716 \fBadd\fR (r2),lc
717 \fBreturn
718.DE
719.IP
720Here the interpreter first computes the address of the beginning
721of the case label value area by adding twice the number of case label
722values to the address of the transfer table, since the transfer
723table entries are full word, 2 byte, address offsets.
724It then searches through the label values, and generates an ECASE
725error if the label is not found.
726If the label is found, we calculate the index of the entry in
727the transfer table which is desired and then add that offset
728to the interpreter location counter.
729.NH 2
730Operations supporting pxp
731.IP
732For the purpose of execution profiling the following operations
733are defined.
734.SH
735PXPBUF w
736.IP
737Causes the interpreter to allocate a count buffer
738with
739.I w
740counters, each of which is a 4 byte integer,
741and to clear the counters to 0.
742The count buffer is placed within an image of the
743.I pmon.out
744file as described in the
745.I "PXP Implementation Notes."
746The contents of this buffer will be written to the file
747.I pmon.out
748when the program terminates.
749.SH
750COUNT s
751.IP
752Increments the counter specified by
753.I s.
754.SH
755TRACNT w,a
756.IP
757Used at the entry point to procedures and functions,
758combining a transfer to the entry point of the block with
759an incrementing of its entry count.
760.NH 2
761Set operations
762.IP
763The set operations
764set union
765.SM ADDT,
766intersection
767.SM MULT,
768and the set relationals
769.SM RELT
770are straightforward.
771The following operations are more interesting.
772.SH
773CARD s
774.IP
775Takes the cardinality of a set of size
776.I s
777bytes on top of the stack, leaving a 2 byte integer count.
778.SM CARD
779uses a table of 4-bit population counts to count set bits
780in each 4-bit nibble of each byte in the set.
781.SH
782CTTOT s,w,w
783.IP
784Constructs a set.
785This operation requires a non-trivial amount of work,
786checking bounds and setting individual bits or ranges of bits.
787This operation sequence is very slow,
788and motivates the presence of the operator
789.SM INCT
790below.
791The arguments to
792.SM CTTOT
793include the number of elements
794.I s
795in the constructed set,
796the lower and upper bounds of the set,
797the two
798.I w
799values,
800and a pair of values on the stack for each range in the set, single
801elements in constructed sets being duplicated with
802.SM SDUP
803to form degenerate ranges.
804.SH
805IN s,w,w
806.IP
807The operator
808.B in
809for sets.
810The value
811.I s
812specifies the size of the set,
813the two
814.I w
815values the lower and upper bounds of the set.
816The value on the stack is checked to be in the set on the stack,
817and a Boolean value of
818.I true
819or
820.I false
821replaces the operands.
822.SH
823INCT
824.IP
825The operator
826.B in
827on a constructed set without constructing it.
828The left operand of
829.B in
830is on top of the stack followed by the number of pairs in the
831constructed set,
832and then the pairs themselves, all as single word integers.
833Pairs designate runs of values and single values are represented by
834a degenerate pair with both value equal.
835A typical situation for this operator to be generated is
836.LS
837\fBif\fR ch \fBin\fR ['+', '-', '*', '/']
838.LE
839.IP
840or
841.LS
842\fBif\fR ch \fBin\fR ['a'..'z', '$', '_']
843.LE
844.IP
845These situations are very common in Pascal, and
846.SM INCT
847makes them run much faster in the interpreter,
848as if they were written as an efficient series of
849.B if
850statements.
851.NH 2
852Miscellaneous
853.IP
854Other miscellaneous operators which are present in the interpreter
855are
856.SM ASRT
857which causes termination if the Boolean value on the stack is not
858.I true,
859and
860.SM STOI ,
861.SM STOD ,
862.SM ITOD ,
863and
864.SM ITOS
865which convert between different length arithmetic operands for
866use in aligning the arguments in
867.B procedure
868and
869.B function
870calls, and with some untyped built-ins, such as
871.SM SIN
872and
873.SM COS \&.
874.IP
875Finally, if the program is run with the run-time testing disabled, there
876are special operators for
877.B for
878statements
879and special indexing operators for arrays
880which have individual element size which is a power of 2.
881The code can run significantly faster using these operators.
882.NH 2
883Functions and procedures
884.IP
885.I Px
886has a large number of built-in procedures and functions.
887The mathematical functions are taken from the standard
888system library.
889The linear congruential random number generator is described in
890the
891.I "Berkeley Pascal User Manual"
892.IP
893The procedures
894.I linelimit
895and
896.I dispose
897are included here but currently ignored.
898One surprise is that the built-ins
899.I pack
900and
901.I unpack
902are here and quite complex, functioning as a memory to memory
903move with a number of semantic checks.
904They do no ``unpacking'' or ``packing'' in the true sense, however,
905as the interpreter supports no packed data types.