Each expression tree, as it is read in, is subjected to
routine and a number of subroutines;
the major things done are
Modifications and simplifications
of the tree so its value may be computed more efficiently
and conveniently by the code generator.
Marking each interior node with an estimate of the number of
registers required to evaluate it.
This register count is needed to guide the code generation algorithm.
One thing that is definitely not done is
discovery or exploitation of common subexpressions, nor is this done anywhere in the
The basic organization is simple: a depth-first scan of the tree.
does nothing for leaf nodes (except for automatics; see below),
to handle unary operators.
it calls itself to process the operands,
then treats each operator separately.
commutative and associative operators, which are handled
Here is a brief catalog of the transformations carried out by
It is not intended to be complete.
Some of the transformations are machine-dependent,
although they may well be useful on machines other than the
As indicated in the discussion of
below, the optimizer can create a node type corresponding
to the location addressed by a register plus a constant offset.
Since this is precisely the implementation of automatic variables
and arguments, where the register is fixed by convention,
such variables are changed to the new form to simplify
Associative and commutative operators are processed by the
the bitwise & operator is turned into a new
operator; `a & b' becomes
This is done because the PDP-11 provides
A similar transformation takes place for
Relationals are turned around so the
more complicated expression is on the left.
(So that `2 > f(x)' becomes `f(x) < 2').
This improves code generation since
the algorithm prefers to have the right operand
require fewer registers than the left.
An expression minus a constant is turned into
the expression plus the negative constant,
to take advantage of the properties of addition.
Operators with constant operands are evaluated.
Right shifts (unless by 1)
are turned into left shifts with a negated right operand,
since the PDP-11 lacks a general right-shift operator.
A number of special cases are simplified, such as division or
routine performs the same sort of processing for unary operators.
`*&x' and `&*x' are simplified to `x'.
is a constant or the address of a static or external
and `*r' are turned into a special kind of name node
the name itself and the offset.
This simplifies subsequent processing
because such constructions can appear as
the the address of a PDP-11 instruction.
When the unary `&' operator is applied to
a name node of the special kind just discussed,
it is reworked to make the addition
this is done because the PDP-11 has no `load address' instruction.
is a register are discovered and marked
as being implementable using the PDP-11
auto-increment and -decrement modes.
If `!' is applied to a relational,
and the sense of the relational is reversed.
Special cases involving reflexive
use of negation and complementation are discovered.
Operations applying to constants are evaluated.
routine, called for associative and commutative operators,
discovers clusters of the same operator at the top levels
of the current tree, and arranges them in a list:
the list would be`a,b,c,d,e,f'.
After each subtree is optimized, the list is sorted in
decreasing difficulty of computation;
the code generation algorithm works best when left operands
The `degree of difficulty'
computed is actually finer than
the mere number of registers required;
a constant is considered simpler
than the address of a static or external, which is simpler
than reference to a variable.
This makes it easy to fold all the constants
and also to merge together the sum of a constant and the address of
or external (since in such nodes there is space for
There are also special cases, like multiplication by 1 and addition of 0.
A special routine is invoked to handle sums of products.
is based on the fact that it is better
to compute `c1*c2*x + c1*y' as `c1*(c2*x + y)'
and makes the divisibility tests required to assure the
correctness of the transformation.
This transformation is rarely
possible with code directly written by the user,
but it invariably occurs as a result of the
implementation of multi-dimensional arrays.
reconstructs a tree from the list
of expressions which result.