+.SH
+Code Generation
+.PP
+The grand plan for code-generation is
+independent of any particular machine;
+it depends largely on a set of tables.
+But this fact does not necessarily make it very easy
+to modify the compiler to produce code for other machines,
+both because there is a good deal of machine-dependent structure
+in the tables, and because in any event such tables are non-trivial to
+prepare.
+.PP
+The arguments to the basic code generation routine
+.II rcexpr
+are a pointer to a tree representing an expression,
+the name of a code-generation table,
+and the number of a register in which the value of the
+expression should be placed.
+.II Rcexpr
+returns the number of the register in which the value actually
+ended up;
+its caller
+may need to produce a
+.II mov
+instruction if the value really needs to be in the given register.
+There are four code generation tables.
+.PP
+.II Regtab
+is the basic one, which actually does the job described
+above: namely,
+compile code which places the value represented by the expression
+tree in a register.
+.PP
+.II Cctab
+is used when the value of the expression is not actually needed,
+but instead the value of the condition codes resulting from
+evaluation of the expression.
+This table is used, for example, to evaluate the expression after
+.II if.
+It is clearly silly to
+calculate the value (0 or 1) of the expression
+`a==b' in the context `if (a==b) ... '
+.PP
+The
+.II sptab
+table is used when the value of an expression is to be pushed on the stack,
+for example when it is an actual argument.
+For example in the function call `f(a)' it is a bad idea to
+load
+.II a
+into a register which is then pushed on the stack,
+when there is a single instruction which does the job.
+.PP
+The
+.II efftab
+table is used when an expression is to be evaluated for its side effects,
+not its value.
+This occurs mostly for expressions which are statements, which have no
+value.
+Thus the code for the statement
+`a = b'
+need produce only the approoriate
+.II mov
+instruction, and need not leave the value of
+.II b
+in a register,
+while in the expression `a + (b = c)'
+the value of `b = c' will appear in a register.
+.PP
+All of the tables besides
+.II regtab
+are rather small, and handle only a relatively few special cases.
+If one of these subsidiary tables does not contain
+an entry applicable to the given expression tree,
+.II rcexpr
+uses
+.II regtab
+to put the value of the expression into a register
+and then fixes things up;
+nothing need be done when the table
+was
+.II efftab,
+but a
+.II tst
+instruction is produced when the table called for was
+.II cctab,
+and a
+.II mov
+instruction,
+pushing the register on the stack,
+when the table was
+.II sptab.
+.PP
+The
+.II rcexpr
+routine itself picks off some special
+cases, then calls
+.II cexpr
+to do the real work.
+.II Cexpr
+tries to find an entry applicable
+to the given tree in the given table, and returns \-1 if
+no such entry is found, letting
+.II rcexpr
+try again with a different table.
+A successful match yields a string
+containing both literal characters
+which are written out and pseudo-operations, or macros, which are expanded.
+Before studying the contents
+of these strings we will consider how table entries are matched
+against trees.
+.PP
+Recall that most non-leaf nodes in an expression tree
+contain the name of the operator,
+the type of the value represented, and pointers to the subtrees (operands).
+They also contain an estimate of the number of registers required to evaluate
+the expression, placed there by the expression-optimizer routines.
+The register counts are used to guide the code generation process,
+which is based on the Sethi-Ullman algorithm.
+.PP
+The main code generation
+tables consist of entries
+each containing an operator number and a pointer
+to a subtable for the corresponding operator.
+A subtable consists of a sequence
+of entries, each with a key describing certain properties of the
+operands of the operator involved; associated with the key is a code string.
+Once the subtable corresponding to the operator is found, the subtable
+is searched linearly until a key is found such that the properties demanded
+by the key are compatible with the operands of the tree node.
+A successful match returns the code string;
+an unsuccessful search, either for the operator in the main table
+or a compatble key in the subtable,
+returns a failure indication.
+.PP
+The tables are all contained in a file
+which must be processed to obtain an assembly language program.
+Thus they are written in a special-purpose language.
+To provided definiteness to the following discussion, here is an
+example of a subtable entry.
+.DS
+%n,aw
+ F
+ add A2,R
+.DE
+The `%' indicates the key;
+the information following (up to a blank line) specifies the code string.
+Very briefly, this entry is in the subtable
+for `+' of
+.II regtab;
+the key specifies that the left operand is any integer, character, or pointer
+expression,
+and the right operand is any word quantity which is directly addressible
+(e.g. a variable or constant).
+The code string calls for the generation of the code
+to compile the left (first) operand into the
+current register (`F')
+and then to produce an `add' instruction which adds the
+second operand (`A2') to the register (`R').
+All of the notation will be explained below.
+.PP
+Only three features of the operands are used in deciding
+whether a match has occurred.
+They are:
+.IP 1.
+Is the type of the operand compatible with that demanded?
+.RT
+.IP 2.
+Is the `degree of difficulty' (in a sense described below) compatible?
+.RT
+.IP 3.
+The table may demand that the operand have a `*'
+(indirection operator) as its highest operator.
+.PP
+As suggested above, the key for a subtable entry
+is indicated by a `%,' and a comma-separated pair
+of specifications for the operands.
+(The second specification is ignored for unary operators).
+A specification indicates
+a type requirement by including one of the following letters.
+If no type letter is present, any integer, character,
+or pointer operand will satisfy the requirement (not float, double, or long).
+.IP b
+A byte (character) operand is required.
+.RT
+.IP w
+A word (integer or pointer) operand is required.
+.RT
+.IP f
+A float or double operand is required.
+.RT
+.IP d
+A double operand is required.
+.RT
+.IP l
+A long (32-bit integer) operand is required.
+.PP
+Before discussing the `degree of difficulty' specification,
+the algorithm has to be explained more completely.
+.II Rcexpr
+(and
+.II cexpr)
+are called with a register number in which to place their result.
+Registers 0, 1, ... are used during evaluation of expressions;
+the maximum register which can be used in this way depends on the
+number of register variables, but in any event only registers
+0 through 4 are available since r5 is used as a stack frame
+header and r6 (sp) and r7 (pc) have special
+hardware properties.
+The code generation routines assume that when called with register
+.II n
+as argument, they may use
+.II n+1,
+\&...
+(up to the first register variable)
+as temporaries.
+Consider the expression `X+Y', where both
+X and Y are expressions.
+As a first approximation, there are three ways of compiling
+code to put this expression in register
+.II n.
+.IP 1.
+If Y is an addressible cell,
+(recursively) put X into register
+.II n
+and add Y to it.
+.RT
+.IP 2.
+If Y is an expression that can be calculated in
+.II k
+registers, where
+.II k
+smaller than the number of registers available,
+compile X into register
+.II n,
+Y into register
+.II n+1,
+and add register
+.II n+1
+to
+.II n.
+.RT
+.IP 3.
+Otherwise, compile Y into register
+.II n,
+save the result in a temporary (actually, on the stack)
+compile X into register
+.II n,
+then add in the temporary.
+.PP
+The distinction between cases 2 and 3 therefore depends
+on whether the right operand can be compiled in fewer than
+.II k
+registers, where
+.II k
+is the number of free registers left after registers 0 through
+.II n
+are taken:
+0 through
+.II n\-1
+are presumed to contain already computed temporary results;
+.II n
+will, in case 2,
+contain the value of the left operand while the right
+is being evaluated.
+.PP
+These considerations should make clear
+the specification codes for the degree of difficulty,
+bearing in mind that a number of special cases are also present:
+.IP z
+is satisfied when the operand is zero, so that special code
+can be produced for expressions like `x = 0'.
+.RT
+.IP 1
+is satisfied when the operand is the constant 1, to optimize
+cases like left and right shift by 1, which can be done
+efficiently on the PDP-11.
+.RT
+.IP c
+is satisfied when the operand is a positive (16-bit)
+constant; this takes care of some special cases in long arithmetic.
+.RT
+.IP a
+is satisfied when the operand is addressible;
+this occurs not only for variables and constants, but also for
+some more complicated constructions, such as indirection through
+a simple variable, `*p++' where
+.II p
+is a register variable (because of the PDP-11's auto-increment address
+mode), and `*(p+c)' where
+.II p
+is a register and
+.II c
+is a constant.
+Precisely, the requirement is that the operand refers to a cell
+whose address can be written as a source or destination of a PDP-11
+instruction.
+.RT
+.IP e
+is satisfied by an operand whose value can be generated in a register
+using no more than
+.II k
+registers, where
+.II k
+is the number of registers left (not counting the current register).
+The `e' stands for `easy.'
+.RT
+.IP n
+is satisfied by any operand.
+The `n' stands for `anything.'
+.PP
+These degrees of difficulty are considered to lie in a linear ordering
+and any operand which satisfies an earlier-mentioned requirement
+will satisfy a later one.
+Since the subtables are searched linearly,
+if a `1' specification is included, almost certainly
+a `z' must be written first to prevent
+expressions containing the constant 0 to be compiled
+as if the 0 were 1.
+.PP
+Finally,
+a key specification may contain a `*' which
+requires the operand to have an indirection as its leading operator.
+Examples below should clarify the utility of this specification.
+.PP
+Now let us consider the contents of the code string
+associated with each subtable entry.
+Conventionally, lower-case letters in this string
+represent literal information which is copied directly
+to the output.
+Upper-case letters generally introduce specific
+macro-operations, some of which may be followed
+by modifying information.
+The code strings in the tables are written with tabs and
+new-lines used freely to suggest instructions which will be generated;
+the table-compiling program compresses tabs (using the 0200 bit of the
+next character) and throws away some of the new-lines.
+For example the macro `F' is ordinarily written on a line by itself;
+but since its expansion will end with a new-line, the new-line
+after `F' itself is dispensable.
+This is all to reduce the size of the stored tables.
+.PP
+The first set of macro-operations is concerned with
+compiling subtrees.
+Recall that this is done by the
+.II cexpr
+routine.
+In the following discussion the `current register'
+is generally the argument register to
+.II cexpr;
+that is, the place where the result is desired.
+The `next register' is numbered one
+higher
+than the current register.
+(This explanation isn't fully true
+because of complications, described below, involving
+operations which require even-odd register pairs.)
+.IP F
+causes a recursive call to
+the
+.II rcexpr
+routine to compile code which places the value of the first (left)
+operand of the operator in the current register.
+.RT
+.IP F1
+generates code which places the value of the first operand in the
+next register.
+It is incorrectly used if there might be no next register;
+that is, if the degree of difficulty of the first operand is not `easy;'
+if not, another register might not be available.
+.RT
+.IP FS
+generates code which pushes the value of the first operand on the stack,
+by calling
+.II rcexpr
+specifying
+.II sptab
+as the table.
+.LP
+Analogously,
+.IP "S, S1, SS"
+compile the second (right) operand
+into the current register, the next register, or onto the stack.
+.LP
+To deal with registers, there are
+.IP R
+which expands into the name of the current register.
+.RT
+.IP R1
+which expands into the name of the next register.
+.RT
+.IP R+
+which expands into the the name of the current register plus 1.
+It was suggested above that this is the same as the next register,
+except for complications; here is one of them.
+Long integer variables have
+32 bits and require 2 registers; in such cases the next register
+is the current register plus 2.
+The code would like to talk about both halves of the
+long quantity, so R refers to the register with the high-order part
+and R+ to the low-order part.
+.RT
+.IP R\-
+This is another complication, involving division and mod.
+These operations involve a pair of registers of which the odd-numbered
+contains the left operand.
+.II Cexpr
+arranges that the current register is odd;
+the R\- notation allows the code to refer to the next lower,
+even-numbered register.
+.LP
+To refer to addressible quantities, there are the notations:
+.IP A1
+causes generation of the address specified by the first operand.
+For this to be legal, the operand must be addressible; its
+key must contain an `a'
+or a more restrictive specification.
+.RT
+.IP A2
+correspondingly generates the address of the second operand
+providing it has one.
+.PP
+We now have enough mechanism to show a complete, if suboptimal,
+table for the + operator on word or byte operands.
+.DS
+%n,z
+ F
+.sp 1
+%n,1
+ F
+ inc R
+.sp 1
+%n,aw
+ F
+ add A2,R
+.sp 1
+%n,e
+ F
+ S1
+ add R1,R
+.sp 1
+%n,n
+ SS
+ F
+ add (sp)+,R
+.DE
+The first two sequences handle some special cases.
+Actually it turns out that handling a right operand of 0
+is unnecessary since the expression-optimizer
+throws out adds of 0.
+Adding 1 by using the `increment' instruction is done next,
+and then the case where the right operand is addressible.
+It must be a word quantity, since the PDP-11 lacks an `add byte' instruction.
+Finally the cases where the right operand either can, or cannot,
+be done in the available registers are treated.
+.PP
+The next macro-instructions are conveniently
+introduced by noticing that the above table is suitable
+for subtraction as well as addition, since no use is made of the
+commutativity of addition.
+All that is needed is substitution of `sub' for `add'
+and `dec' for 'inc.'
+Considerable saving of space is achieved by factoring out
+several similar operations.
+.IP I
+is replaced by a string from another table indexed by the operator
+in the node being expanded.
+This secondary table actually contains two strings per operator.
+.RT
+.IP I\(fm
+is replaced by the second string in the side table
+entry for the current operator.
+.PP
+Thus, given that the entries for `+' and `\-' in the side table
+(which is called
+.II instab)
+are `add' and `inc,' `sub' and `dec'
+respectively,
+the middle of of the above addition table can be written
+.DS
+%n,1
+ F
+ I' R
+
+%n,aw
+ F
+ I A2,R
+.DE
+and it will be suitable for subtraction,
+and several other operators, as well.
+.PP
+Next, there is the question of character and floating-point operations.
+.IP B1
+generates the letter `b' if the first operand is a character,
+`f' if it is float or double, and nothing otherwise.
+It is used in a context like `movB1'
+which generates a `mov', `movb', or `movf'
+instruction according to the type of the operand.
+.RT
+.IP B2
+is just like B1 but applies to the second operand.
+.RT
+.IP BE
+generates `b' if either operand is a character
+and null otherwise.
+.RT
+.IP BF
+generates `f' if the type of the operator node itself is float or double,
+otherwise null.
+.PP
+For example, there is an entry in
+.II efftab
+for the `=' operator
+.DS
+%a,aw
+%ab,a
+ IBE A2,A1
+.DE
+Note first that two key specifications
+can be applied to the same code string.
+Next, observe that when a word is assigned to a byte or to a word,
+or a word is assigned to a byte,
+a single instruction,
+a
+.II mov
+or
+.II movb
+as appropriate, does the job.
+However, when a byte is assigned to a word,
+it must pass through a register to implement the sign-extension rules:
+.DS
+%a,n
+ S
+ IB1 R,A1
+.DE
+.PP
+Next, there is the question of handling indirection properly.
+Consider the expression `X + *Y', where X and Y are expressions,
+Assuming that Y is more complicated than just a variable,
+but on the other hand qualifies as `easy' in the context,
+the expression would be compiled by placing the value of X in a register,
+that of *Y in the next register, and adding the registers.
+It is easy to see that a better job can be done
+by compiling X, then Y (into the next register),
+and producing the
+instruction symbolized by `add (R1),R'.
+This scheme avoids generating
+the instruction `mov (R1),R1'
+required actually to place the value of *Y in a register.
+A related situation occurs
+with the expression `X + *(p+6)', which
+exemplifies a construction
+frequent in structure and array references.
+The addition table shown above would produce
+.DS
+[put X in register R]
+mov p,R1
+add $6,R1
+mov (R1),R1
+add R1,R
+.DE
+when the best code is
+.DS
+[put X in R]
+mov p,R1
+add 6(R1),R
+.DE
+As we said above, a key specification for a code table entry
+may require an operand to have an indirection as its highest operator.
+To make use of the requirement,
+the following macros are provided.
+.IP F*
+the first operand must have the form *X.
+If in particular it has the form *(Y + c), for some constant
+.II c,
+then code is produced which places the value of Y in
+the current register.
+Otherwise, code is produced which loads X into the current register.
+.RT
+.IP F1*
+resembles F* except that the next register is loaded.
+.RT
+.IP S*
+resembles F* except that the second operand is loaded.
+.RT
+.IP S1*
+resembles S* except that the next register is loaded.
+.RT
+.IP FS*
+The first operand must have the form `*X'.
+Push the value of X on the stack.
+.RT
+.IP SS*
+resembles FS* except that it applies to the second operand.
+.LP
+To capture the constant that may have been skipped over
+in the above macros, there are
+.IP #1
+The first operand must have the form *X;
+if in particular it has the form *(Y + c) for
+.II c
+a constant, then the constant is written out,
+otherwise a null string.
+.RT
+.IP #2
+is the same as #1 except that the second operand is used.
+.LP
+Now we can improve the addition table above.
+Just before the `%n,e' entry, put
+.DS
+%n,ew*
+ F
+ S1*
+ add #2(R1),R
+.DE
+and just before the `%n,n' put
+.DS
+%n,nw*
+ SS*
+ F
+ add *(sp)+,R
+.DE
+When using the stacking macros there is no place to use
+the constant
+as an index word, so that particular special case doesn't occur.
+.PP
+The constant mentioned above can actually be more
+general than a number.
+Any quantity acceptable to the assembler as an expression will do,
+in particular the address of a static cell, perhaps with a numeric offset.
+If
+.II x
+is an external character array,
+the expression `x[i+5] = 0' will generate
+the code
+.DS
+mov i,r0
+clrb x+5(r0)
+.DE
+via the table entry (in the `=' part of
+.II efftab)
+.DS
+%e*,z
+ F
+ I'B1 #1(R)
+.DE
+Some machine operations place restrictions on the registers
+used.
+The divide instruction, used to implement the divide and mod
+operations, requires the dividend to be placed in the odd member
+of an even-odd pair;
+other peculiarities
+of multiplication make it simplest to put the multiplicand
+in an odd-numbered register.
+There is no theory which optimally accounts for
+this kind of requirement.
+.II Cexpr
+handles it by checking for a multiply, divide, or mod operation;
+in these cases, its argument register number is incremented by
+one or two so that it is odd, and if the operation was divide or mod,
+so that it is a member of a free even-odd pair.
+The routine which determines the number of registers required
+estimates, conservatively, that
+at least two registers are required for a multiplication
+and three for the other peculiar operators.
+After the expression is compiled,
+the register where the result actually ended up is returned.
+(Divide and mod are actually the same operation except for the
+location of the result).
+.PP
+These operations are the ones which cause results to end up in
+unexpected places,
+and this possibility adds a further level of complexity.
+The simplest way of handling the problem is always to move the
+result to the place where the caller expected it,
+but this will produce unnecessary register moves in many
+simple cases; `a = b*c' would generate
+.DS
+mov b,r1
+mul c,r1
+mov r1,r0
+mov r0,a
+.DE
+The next thought is used the passed-back
+information as to where the result landed to change the notion of the current
+register.
+While compiling the `=' operation above, which comes from a
+table
+entry
+like
+.DS
+%a,e
+ S
+ mov R,A1
+.DE
+it is sufficient to redefine the meaning of `R'
+after processing the `S' which does the multiply.
+This technique is in fact used; the tables are written in such a way
+that correct code is produced.
+The trouble is that the technique cannot be used in general,
+because it invalidates the count of the number of registers
+required for an expression.
+Consider just `a*b + X' where X is some expression.
+The algorithm assumes that the value of a*b,
+once computed, requires just one register.
+If there are three registers available, and X requires two registers to
+compute, then this expression will match a key specifying
+`%n,e'.
+If a*b is computed and left in register 1, then there are, contrary
+to expectations, no longer two registers available to compute X,
+but only one, and bad code will be produced.
+To guard against this possibility,
+.II cexpr
+checks the result returned by recursive calls which implement
+F, S and their relatives.
+If the result is not in the expected register, then the number of
+registers required by the other operand is checked;
+if it can be done using those registers which remain even
+after making unavailable the unexpectedly-occupied
+register, then
+the notions of the `next register' and possibly the `current
+register' are redefined.
+Otherwise a register-copy instruction is produced.
+A register-copy is also always produced
+when the current operator is one of those which have odd-even requirements.
+.PP
+Finally, there are a few loose-end macro operations
+and facts about the tables.
+The operators:
+.IP V
+is used for long operations.
+It is written with an address like a machine instruction;
+it expands into `adc' (add carry) if the operation
+is an additive operator,
+`sbc' (subtract carry) if the operation is a subtractive
+operator, and disappears, along with the rest of the line, otherwise.
+Its purpose is to allow common treatment of logical
+operations, which have no carries, and additive and subtractive
+operations, which generate carries.
+.RT
+.IP T
+generates a `tst' instruction if the first operand
+of the tree does not set the condition codes correctly.
+It is used with divide and mod operations,
+which require a sign-extended 32-bit operand.
+The code table for the operations contains an `sxt'
+(sign-extend) instruction to generate the high-order part of the
+dividend.
+.RT
+.IP H
+is analogous to the `F' and `S' macros,
+except that it calls for the generation of code for
+the current tree
+(not one of its operands)
+using
+.II regtab.
+It is used in
+.II cctab
+for all the operators which, when executed normally,
+set the condition codes properly according to the result.
+It prevents a `tst' instruction from being generated for
+constructions like `if (a+b) ...'
+since after calculation of the value of
+`a+b' a conditional branch can be written immediately.
+.PP
+All of the discussion above is in terms of operators with operands.
+Leaves of the expression tree (variables and constants), however,
+are peculiar in that they have no operands.
+In order to regularize the matching process,
+.II cexpr
+examines its operand to determine if it is a leaf;
+if so, it creates a special `load' operator whose operand
+is the leaf, and substitutes it for the argument tree;
+this allows the table entry for the created operator
+to use the `A1' notation to load the leaf into a register.
+.PP
+Purely to save space in the tables,
+pieces of subtables can be labelled and referred to later.
+It turns out, for example,
+that rather large portions of the
+the
+.II efftab
+table for the `=' and `=+' operators are identical.
+Thus `=' has an entry
+.DS
+%[move3:]
+%a,aw
+%ab,a
+ IBE A2,A1
+.DE
+while part of the `=+' table is
+.DS
+%aw,aw
+% [move3]
+.DE
+Labels are written as `%[ ... : ]',
+before the key specifications;
+references
+are written
+with `% [ ... ]'
+after the key.
+Peculiarities in the implementation
+make it necessary that labels appear before references to them.
+.PP
+The example illustrates the utility
+of allowing separate keys
+to point to the same code string.
+The assignment code
+works properly if either the right operand is a word, or the left operand
+is a byte;
+but since there is no `add byte' instruction the addition code
+has to be restricted to word operands.