.TL
A Tour Through the Portable C Compiler
.AU
S. C. Johnson
.AI
.MH
.SH
Introduction
.PP
A C compiler has been implemented that has proved to be quite portable,
serving as the basis for C compilers on roughly a dozen machines, including
the Honeywell 6000, IBM 370, and Interdata 8/32.
The compiler is highly compatible with the C language standard.
.[
Kernighan Ritchie Prentice 1978
.]
.PP
Among the goals of this compiler are portability, high reliability,
and the use of state-of-the-art techniques and tools wherever practical.
Although the efficiency of the compiling process is not
a primary goal, the compiler is efficient enough, and produces
good enough code, to serve as a production compiler.
.PP
The language implemented is highly compatible with the current PDP-11 version of C.
Moreover, roughly 75% of the compiler, including
nearly all the syntactic and semantic routines, is machine independent.
The compiler also serves as the major portion of the program
.I lint ,
described elsewhere.
.[
Johnson Lint Program Checker
.]
.PP
A number of earlier attempts to make portable compilers are worth noting.
While on CO-OP assignment to Bell Labs in 1973, Alan Snyder wrote a portable C
compiler which was the basis of his Master's Thesis at M.I.T.
.[
Snyder Portable C Compiler
.]
This compiler was very slow and complicated, and contained a number of rather
serious implementation difficulties;
nevertheless, a number of Snyder's ideas appear in this work.
.PP
Most earlier portable compilers, including Snyder's, have proceeded by
defining an intermediate language, perhaps based
on three-address code or code for a stack machine,
and writing a machine independent program to
translate from the source code to this
intermediate code.
The intermediate code is then read by a second pass, and interpreted or compiled.
This approach is elegant, and has a number of advantages, especially if the
target machine is far removed from the host.
It suffers from some disadvantages as well.
Some constructions, like initialization and subroutine prologs,
are difficult or expensive to express in a
machine independent way that still allows them
to be easily adapted to the target assemblers.
Most of these approaches require a symbol table to be constructed
in the second (machine dependent) pass, and/or
require powerful target assemblers.
Also, many conversion operators may be generated
that have no effect on a given machine,
but may be needed on others (for example, pointer to pointer
conversions usually do nothing in C, but must be generated because
there are some machines where they are significant).
.PP
For these reasons, the first pass of the portable compiler
is not entirely machine independent.
It contains some machine dependent features, such as initialization, subroutine
prolog and epilog, certain storage allocation functions,
code for the
.I switch
statement, and code to throw out unneeded conversion operators.
.PP
As a crude measure of the degree of
portability actually achieved,
the Interdata 8/32 C compiler has
roughly 600 machine dependent lines of source out of 4600 in Pass 1, and 1000
out of 3400 in Pass 2.
In total, 1600 out of 8000, or 20%,
of the total source is machine dependent
(12% in Pass 1, 30% in Pass 2).
These percentages can be expected to rise slightly as the
compiler is tuned.
The percentage of machine-dependent code for the IBM is 22%, for
the Honeywell 25%.
If the assembler format and structure were the same for all
these machines, perhaps another 5-10% of the code
would become machine independent.
.PP
These figures are sufficiently misleading as to be almost
meaningless.
A large fraction of the machine dependent code can be converted
in a straightforward, almost mechanical way.
On the other hand, a certain amount of the code requires hard
intellectual effort to convert, since the algorithms
embodied in this part of the code are typically complicated
and machine dependent.
.PP
To summarize, however, if you need a C compiler written for a machine with
a reasonable architecture, the compiler is already three quarters finished!
.SH
Overview
.PP
This paper discusses the structure and organization of the portable compiler.
The intent is to give the big picture, rather than discussing the details of a particular machine
implementation.
After a brief overview and a discussion of the source file structure,
the paper describes the major data structures, and then delves more
closely into the two passes.
Some of the theoretical work on which the compiler is based, and
its application to the compiler, is discussed elsewhere.
.[
johnson portable theory practice
.]
One of the major design issues in any C compiler, the design of
the calling sequence and stack frame, is the subject of a separate
memorandum.
.[
johnson lesk ritchie calling sequence
.]
.PP
The compiler consists of two passes,
.I pass1
and
.I pass2 ,
that together turn C source code into assembler code for the target
machine.
The two passes are preceded by a preprocessor, that
handles the
.B "#define"
and
.B "#include"
statements, and related features (e.g.,
.B #ifdef ,
etc.).
It is a nearly machine independent program, and will
not be further discussed here.
.PP
The output of the preprocessor is a text file that is read as the standard
input of the first pass.
This
produces as standard output another text file that becomes the standard
input of the second pass.
The second pass produces, as standard output, the desired assembler language source code.
The preprocessor and the two passes
all write error messages on the standard error file.
Thus the compiler itself makes few demands on the I/O
library support, aiding in the bootstrapping process.
.PP
Although the compiler is divided into two passes,
this represents historical accident more than deep necessity.
In fact, the compiler can optionally be loaded
so that both passes operate in the same program.
This ``one pass'' operation eliminates the overhead of
reading and writing the intermediate file, so the compiler
operates about 30% faster in this mode.
It also occupies about 30% more space than the larger
of the two component passes.
.PP
Because the compiler is fundamentally structured as two
passes, even when loaded as one, this document primarily
describes the two pass version.
.PP
The first pass does the lexical analysis, parsing, and
symbol table maintenance.
It also constructs parse trees for expressions,
and keeps track of the types of the nodes in these trees.
Additional code is devoted to initialization.
Machine dependent portions of the first pass serve to
generate subroutine prologs and epilogs,
code for switches, and code for branches, label definitions,
alignment operations,
changes of location counter, etc.
.PP
The intermediate file is a text file organized into lines.
Lines beginning with a right parenthesis are copied by the second
pass directly to its output file, with the
parenthesis stripped off.
Thus, when the first pass produces assembly code, such as subroutine prologs,
etc., each line is prefaced with a right parenthesis;
the second pass passes these lines through to the assembler.
.PP
The major job done by the second pass is generation of code
for expressions.
The expression parse trees produced in the first pass are
written onto the
intermediate file in Polish Prefix form:
first, there is a line beginning with a period, followed by the source
file line number and name on which the expression appeared (for debugging purposes).
The successive lines represent the nodes of the parse tree,
one node per line.
Each line contains the node number, type, and
any values (e.g., values of constants)
that may appear in the node.
Lines representing nodes with descendants are immediately
followed by the left subtree of descendants, then the
right.
Since the number of descendants of any node is completely
determined by the node number, there is no need to mark the end
of the tree.
.PP
There are only two other line types in the intermediate file.
Lines beginning with a left square bracket (`[') represent the beginning of
blocks (delimited by { ... } in the C source);
lines beginning with right square brackets (`]') represent the end of blocks.
The remainder of these lines tell how much
stack space, and how many register variables,
are currently in use.
.PP
Thus, the second pass reads the intermediate files, copies the `)' lines,
makes note of the information in the `[' and `]' lines,
and devotes most of its effort to the
`.' lines and their associated expression trees, turning them
into assembly code to evaluate the expressions.
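.PP
As a concrete (and entirely hypothetical) sketch of this dispatch, the second pass can be imagined as classifying each intermediate-file line by its first character; the enum and function names below are invented for illustration, and only the first-character convention follows the text.

```c
/* Line types of the intermediate file, as described above.
 * The names are invented; only the leading characters are from the text. */
enum lineclass {
    ASMTEXT,    /* ')' : assembly text, copied out with the ')' stripped */
    BLOCKBEGIN, /* '[' : block entry, with stack/register usage info     */
    BLOCKEND,   /* ']' : block exit                                      */
    EXPRLINE,   /* '.' : start of an expression tree in Polish prefix    */
    UNKNOWN
};

enum lineclass classify(const char *line)
{
    switch (line[0]) {
    case ')': return ASMTEXT;
    case '[': return BLOCKBEGIN;
    case ']': return BLOCKEND;
    case '.': return EXPRLINE;
    default:  return UNKNOWN;
    }
}
```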
.PP
In the one pass version of the compiler, the expression
trees that are built by the first pass have
been declared to have room for the second pass
information as well.
Instead of writing the trees onto an intermediate file,
each tree is transformed in place into an acceptable
form for the code generator.
The code generator then writes the result of compiling
this tree onto the standard output.
Instead of `[' and `]' lines in the intermediate file, the information is passed
directly to the second pass routines.
Assembly code produced by the first pass is simply written out,
without the need for `)' at the head of each line.
.SH
The Source Files
.PP
The compiler source consists of 22 source files.
Two files,
.I manifest
and
.I macdefs ,
are header files included with all other files.
.I Manifest
has declarations for the node numbers, types, storage classes,
and other global data definitions.
.I Macdefs
has machine-dependent definitions, such as the size and alignment of the
various data representations.
Two machine independent header files,
.I mfile1
and
.I mfile2 ,
contain the data structure and manifest definitions
for the first and second passes, respectively.
In the second pass, a machine dependent header file,
.I mac2defs ,
contains declarations of register names, etc.
.PP
There is a file,
.I common ,
containing (machine independent) routines used in both passes.
These include routines for allocating and freeing trees, walking over trees,
printing debugging information, and printing error messages.
There are two dummy files,
.I comm1.c
and
.I comm2.c ,
that simply include
.I common
within the scope of the appropriate pass1 or pass2 header files.
When the compiler is loaded as a single pass,
.I common
only needs to be included once:
.I comm2.c
is not needed.
.PP
Entire sections of this document are devoted to the detailed structure of the
passes.
For the moment, we just give a brief description of the files.
The first pass
is obtained by compiling and loading
.I scan.c ,
.I cgram.c ,
.I xdefs.c ,
.I pftn.c ,
.I trees.c ,
.I optim.c ,
.I local.c ,
.I code.c ,
and
.I comm1.c .
.I Scan.c
is the lexical analyzer, which is used by
.I cgram.c ,
the result of applying
.I Yacc
.[
Johnson Yacc Compiler cstr
.]
to the input grammar
.I cgram.y .
.I Xdefs.c
is a short file of external definitions.
.I Pftn.c
maintains the symbol table, and does initialization.
.I Trees.c
builds the expression trees, and computes the node types.
.I Optim.c
does some machine independent optimizations on the expression trees.
.I Comm1.c
includes
.I common ,
that contains service routines common to the two passes of the compiler.
All the above files are machine independent.
The files
.I local.c
and
.I code.c
contain machine dependent code for generating subroutine
prologs, switch code, and the like.
.PP
The second pass
is produced by compiling and loading
.I reader.c ,
.I allo.c ,
.I match.c ,
.I comm2.c ,
.I order.c ,
.I local2.c ,
and
.I table.c .
.I Reader.c
reads the intermediate file, and controls the major logic of the
code generation.
.I Allo.c
keeps track of busy and free registers.
.I Match.c
controls the matching
of code templates to subtrees of the expression tree to be compiled.
.I Comm2.c
includes the file
.I common ,
as in the first pass.
The above files are machine independent.
.I Order.c
controls the machine dependent details of the code generation strategy.
.I Local2.c
has many small machine dependent routines,
and tables of opcodes, register types, etc.
.I Table.c
has the code template tables,
which are also clearly machine dependent.
.SH
Data Structure Considerations.
.PP
This section discusses the node numbers, type words, and expression trees, used
throughout both passes of the compiler.
.PP
The file
.I manifest
defines those symbols used throughout both passes.
The intent is to use the same symbol name (e.g., MINUS)
for the given operator throughout the lexical analysis, parsing, tree building,
and code generation phases;
this requires some synchronization with the
.I Yacc
input file,
.I cgram.y ,
as well.
.PP
A token
like MINUS may be seen in the lexical analyzer before it is known
whether it is a unary or binary operator;
clearly, it is necessary to know this by the time the parse tree
is constructed.
Thus, an operator (really a macro) called
UNARY is provided, so that MINUS
and UNARY MINUS are both distinct node numbers.
Similarly, many binary operators exist in an assignment form (for example, \-=),
and the operator ASG may be applied to such node names to generate new ones, e.g. ASG MINUS.
.PP
It is frequently desirable to know if a node represents a leaf (no descendants), a unary operator (one
descendant) or a binary operator (two descendants).
The macro
.I optype(o)
returns one of the manifest constants LTYPE, UTYPE, or BITYPE, respectively, depending
on the node number
.I o .
Similarly,
.I asgop(o)
returns true if
.I o
is an assignment operator number (=, +=, etc.), and
.I logop(o)
returns true if
.I o
is a relational or logical (&&, \(or\(or, or !) operator.
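.PP
A minimal sketch of how such a classification might be tabulated follows; the node numbers and constant values are invented for the example (the compiler's real values live in manifest), and asgop here is a crude stand-in for the real test.

```c
/* Arity classes named in the text; the numeric values are invented. */
#define LTYPE  1   /* leaf: no descendants    */
#define UTYPE  2   /* unary: one descendant   */
#define BITYPE 3   /* binary: two descendants */

/* A few invented node numbers for the sketch. */
enum { NAME, ICON, UMINUS, PLUS, MINUS, ASGMINUS, NODECOUNT };

static const char arity[NODECOUNT] = {
    [NAME]     = LTYPE,
    [ICON]     = LTYPE,
    [UMINUS]   = UTYPE,
    [PLUS]     = BITYPE,
    [MINUS]    = BITYPE,
    [ASGMINUS] = BITYPE,
};

#define optype(o) (arity[o])
#define asgop(o)  ((o) == ASGMINUS)  /* stand-in for the real range test */
```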
.PP
C has a rich typing structure, with a potentially infinite number of types.
To begin with, there are the basic types: CHAR, SHORT, INT, LONG, the unsigned versions
known as
UCHAR, USHORT, UNSIGNED, ULONG, and FLOAT, DOUBLE,
and finally
STRTY (a structure), UNIONTY, and ENUMTY.
Then, there are three operators that can be applied to types to make others:
if
.I t
is a type, we may potentially have types
.I "pointer to t" ,
.I "function returning t" ,
and
.I "array of t's"
generated from
.I t .
Thus, an arbitrary type in C consists of a basic type, and zero or more of these operators.
.PP
In the compiler, a type is represented by an unsigned integer; the rightmost four bits hold the basic
type, and the remaining bits are divided into two-bit fields, containing
0 (no operator), or one of the three operators
described above.
The modifiers are read right to left in the word, starting with the two-bit
field adjacent to the basic type, until a field with 0 in it is reached.
The macros PTR, FTN, and ARY represent the
.I "pointer to" ,
.I "function returning" ,
and
.I "array of"
operators.
The macro values are shifted so that they align with the first two-bit
field; thus PTR+INT
represents the type for an integer pointer, and
.DS
ARY + (PTR<<2) + (FTN<<4) + DOUBLE
.DE
represents the type of an array of pointers to functions returning doubles.
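.PP
The layout can be sketched as follows. The numeric values here are assumptions; only the structure (four basic-type bits, then successive two-bit modifier fields) follows the text, and the two example type words are built exactly as the text builds them.

```c
/* Basic types occupy the low four bits; these values are invented. */
#define INT    4
#define DOUBLE 7

/* Modifiers, pre-shifted into the first two-bit field above the
 * basic type, as described in the text. */
#define PTR  (1 << 4)   /* pointer to          */
#define FTN  (2 << 4)   /* function returning  */
#define ARY  (3 << 4)   /* array of            */

/* "integer pointer", and "array of pointers to functions
 * returning doubles": */
const unsigned intptr = PTR + INT;
const unsigned arypfd = ARY + (PTR << 2) + (FTN << 4) + DOUBLE;
```

Reading a type word back means masking off the low four bits for the basic type, then scanning the two-bit fields right to left until a zero field is found.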
.PP
The type words are ordinarily manipulated by macros.
If
.I t
is a type word,
.I BTYPE(t)
gives the basic type.
.I ISPTR(t) ,
.I ISARY(t) ,
and
.I ISFTN(t)
ask if an object of this type is a pointer, array, or a function,
respectively.
.I MODTYPE(t,b)
sets the basic type of
.I t
to
.I b .
.I DECREF(t)
gives the type resulting from removing the first operator from
.I t .
Thus, if
.I t
is a pointer to
.I t' ,
a function returning
.I t' ,
or an array of
.I t' ,
then
.I DECREF(t)
would equal
.I t' .
.I INCREF(t)
gives the type representing a pointer to
.I t .
Finally, there are operators for dealing with the unsigned types.
.I ISUNSIGNED(t)
returns true if
.I t
is one of the four basic unsigned types;
in this case,
.I DEUNSIGN(t)
gives the associated `signed' type.
Similarly,
.I UNSIGNABLE(t)
returns true if
.I t
is one of the four basic types that could become unsigned, and
.I ENUNSIGN(t)
returns the unsigned analogue of
.I t
in this case.
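.PP
A few of these macros can be sketched against the layout described earlier (four basic-type bits, two-bit modifier fields). The mask values and basic-type numbers are assumptions consistent with that layout, not necessarily the compiler's own definitions.

```c
#define BTMASK 017          /* basic type bits       */
#define TMASK  060          /* first modifier field  */
#define TSHIFT 2            /* width of one modifier */

#define INT    4            /* invented basic-type values */
#define DOUBLE 7
#define PTR  (1 << 4)
#define FTN  (2 << 4)
#define ARY  (3 << 4)

#define BTYPE(t)   ((t) & BTMASK)
#define ISPTR(t)   (((t) & TMASK) == PTR)
#define ISFTN(t)   (((t) & TMASK) == FTN)
#define ISARY(t)   (((t) & TMASK) == ARY)

/* Remove the first modifier: shift the remaining modifiers down one
 * field, then restore the basic-type bits. */
#define DECREF(t)  ((((t) >> TSHIFT) & ~BTMASK) | ((t) & BTMASK))

/* Prepend "pointer to": shift the modifiers up one field and put PTR
 * in the first one. */
#define INCREF(t)  (((((t) & ~BTMASK) << TSHIFT) | PTR) | ((t) & BTMASK))
```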
.PP
The other important global data structure is that of expression trees.
The actual shapes of the nodes are given in
.I mfile1
and
.I mfile2 .
They are not the same in the two passes; the first pass nodes contain
dimension and size information, while the second pass
nodes contain register allocation information.
Nevertheless, all nodes contain fields called
.I op ,
containing the node number,
and
.I type ,
containing the type word.
A function called
.I talloc()
returns a pointer to a new tree node.
To free a node, its
.I op
field need merely be set to FREE.
The other fields in the node will remain intact at least until the next allocation.
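.PP
A minimal pool in this style might look as follows. The pool size, the value of FREE, and the exact field layout are invented; only the allocation convention (a node is free exactly when its op field is FREE) follows the text.

```c
#include <stddef.h>

#define FREE  (-1)     /* invented value marking an unused node */
#define TREESZ 64      /* invented pool size */

typedef struct node {
    int op;                     /* node number */
    unsigned type;              /* type word   */
    struct node *left, *right;  /* descendants (binary nodes) */
    long lval;                  /* leaf value fields */
    int  rval;
} NODE;

static NODE pool[TREESZ];
static int pool_primed;

NODE *talloc(void)
{
    int i;
    if (!pool_primed) {               /* mark every slot free on first use */
        for (i = 0; i < TREESZ; i++)
            pool[i].op = FREE;
        pool_primed = 1;
    }
    for (i = 0; i < TREESZ; i++)
        if (pool[i].op == FREE) {
            pool[i].op = 0;           /* claim; caller sets the real op */
            return &pool[i];
        }
    return NULL;                      /* pool exhausted */
}
```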
.PP
Nodes representing binary operators contain fields,
.I left
and
.I right ,
that contain pointers to the left and right descendants.
Unary operator nodes have the
.I left
field, and a value field called
.I rval .
Leaf nodes, with no descendants, have two value fields:
.I lval
and
.I rval .
.PP
At appropriate times, the function
.I tcheck()
can be called, to check that there are no busy nodes remaining.
This is used as a compiler consistency check.
The function
.I tcopy(p)
takes a pointer
.I p
that points to an expression tree, and returns a pointer
to a disjoint copy of the tree.
The function
.I walkf(p,f)
performs a postorder walk of the tree pointed to by
.I p ,
and applies the function
.I f
to each node.
The function
.I fwalk(p,f,d)
does a preorder walk of the tree pointed to by
.I p .
At each node, it calls a function
.I f ,
passing to it the node pointer, a value passed down
from its ancestor, and two pointers to values to be passed
down to the left and right descendants (if any).
The value
.I d
is the value passed down to the root.
.I Fwalk
is used for a number of tree labeling and debugging activities.
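.PP
An illustrative walkf in this spirit is shown below, with a pared-down node type; note that the real compiler decides whether descendants exist from the node number (via optype), not from null pointers as this sketch does.

```c
typedef struct node {
    int op;
    struct node *left, *right;
} NODE;

static int visited[8];
static int nvisited;

/* Postorder walk: descendants first, then the node itself. */
void walkf(NODE *p, void (*f)(NODE *))
{
    if (p->left)  walkf(p->left, f);
    if (p->right) walkf(p->right, f);
    f(p);
}

void record(NODE *p)
{
    visited[nvisited++] = p->op;
}
```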
.PP
The other major data structure, the symbol table, exists only in pass one,
and will be discussed later.
.SH
Pass One
.PP
The first pass does lexical analysis, parsing, symbol table maintenance,
tree building, optimization, and a number of machine dependent things.
This pass is largely machine independent, and the machine independent
sections can be pretty successfully ignored.
Thus, they will be only sketched here.
.SH
Lexical Analysis
.PP
The lexical analyzer is a conceptually simple routine that reads the input
and returns the tokens of the C language as it encounters them:
names, constants, operators, and keywords.
The conceptual simplicity of this job is confounded a bit by several
other simple jobs that unfortunately must go on simultaneously.
These include
.IP \(bu
Keeping track of the current filename and line number, and occasionally
setting this information as the result of preprocessor control lines.
.IP \(bu
Skipping comments.
.IP \(bu
Properly dealing with octal, decimal, hex, floating
point, and character constants, as well as character strings.
.PP
To achieve speed, the program maintains several tables
that are indexed into by character value, to tell
the lexical analyzer what to do next.
To achieve portability, these tables must be initialized
each time the compiler is run, in order that the
table entries reflect the local character set values.
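.PP
The portable initialization can be sketched like this: filling the table from character strings at run time makes it correct whatever numeric values the local character set assigns, since the compiler's own character constants supply the indices. The class names are invented for the example.

```c
enum cclass { C_OTHER, C_LETTER, C_DIGIT, C_WS };

static unsigned char ctab[256];   /* indexed by character value */

static void setclass(const char *s, enum cclass cl)
{
    while (*s)
        ctab[(unsigned char)*s++] = (unsigned char)cl;
}

void ctab_init(void)
{
    setclass("abcdefghijklmnopqrstuvwxyz"
             "ABCDEFGHIJKLMNOPQRSTUVWXYZ_", C_LETTER);
    setclass("0123456789", C_DIGIT);
    setclass(" \t\n", C_WS);
}
```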
.SH
Parsing
.PP
As mentioned above, the parser is generated by Yacc from the grammar on file
.I cgram.y .
The grammar is relatively readable, but contains some unusual features
that are worth comment.
.PP
Perhaps the strangest feature of the grammar is the treatment of
declarations.
The problem is to keep track of the basic type
and the storage class while interpreting the various
stars, brackets, and parentheses that may surround a given name.
The entire declaration mechanism must be recursive,
since declarations may appear within declarations of
structures and unions, or even within a
.B sizeof
construction
inside a dimension in another declaration!
.PP
There are some difficulties in using a bottom-up parser,
such as produced by Yacc, to handle constructions where a lot
of left context information must be kept around.
The problem is that the original PDP-11 compiler is top-down in
implementation, and some of the semantics of C reflect this.
In a top-down parser, the input rules are restricted
somewhat, but one can naturally associate temporary
storage with a rule at a very early stage in the recognition
of that rule.
In a bottom-up parser, there is more freedom in the
specification of rules, but it is more difficult to know what
rule is being matched until the entire rule is seen.
The parser described by
.I cgram.c
makes effective use of
the bottom-up parsing mechanism in some places (notably the treatment
of expressions), but struggles against the
restrictions in others.
The usual result is that it is necessary to run a stack of values
``on the side'', independent of the Yacc value stack,
in order to be able to store and access information
deep within inner constructions, where the relationship of the
rules being recognized to the total picture is not yet clear.
.PP
In the case of declarations, the attribute information
(type, etc.) for a declaration is carefully
kept immediately to the left of the declarator
(that part of the declaration involving the name).
In this way, when it is time to declare the name, the
name and the type information can be quickly brought together.
The ``$0'' mechanism of Yacc is used to accomplish this.
The result is not pretty, but it works.
The storage class information changes more slowly,
so it is kept in an external variable, and stacked if
necessary.
Some of the grammar could be considerably cleaned up by using
some more recent features of Yacc, notably actions within
rules and the ability to return multiple values for actions.
.PP
A stack is also used to keep track of the current location
to be branched to when a
.B break
or
.B continue
statement is processed.
.PP
This use of external stacks dates from the time when Yacc did not permit
values to be structures.
Some, or most, of this use of external stacks
could be eliminated by redoing the grammar to use the mechanisms now provided.
There are some areas, however, particularly the processing of structure, union,
and enum declarations, function
prologs, and switch statement processing, where having
all the affected data together in an array speeds later
processing; in this case, use of external storage seems essential.
.PP
The
.I cgram.y
file also contains some small functions used as
utility functions in the parser.
These include routines for saving case values and labels in processing switches,
and stacking and popping
values on the external stack described above.
.SH
Storage Classes
.PP
C has a finite, but fairly extensive, number of storage classes
available.
One of the compiler design decisions was to
process the storage class information totally in the first pass;
by the second pass, this information must have
been totally dealt with.
This means that all of the storage allocation must take place in the first
pass, so that references to automatics and
parameters can be turned into references to cells lying a certain
number of bytes offset from certain machine registers.
Much of this transformation is machine dependent, and strongly
depends on the storage class.
.PP
The classes include EXTERN (for externally declared, but not defined variables),
EXTDEF (for external definitions), and similar distinctions for
USTATIC and STATIC, UFORTRAN and FORTRAN (for fortran functions) and
ULABEL and LABEL.
The storage classes REGISTER and AUTO are obvious, as
are STNAME, UNAME, and ENAME (for structure, union, and enumeration
tags), and the associated MOS, MOU, and MOE (for the members).
TYPEDEF is treated as a storage class as well.
There are two special storage classes: PARAM and SNULL.
SNULL is used to distinguish the case where no explicit
storage class has been given; before an entry is made
in the symbol table the true storage class is discovered.
Similarly, PARAM is used for the temporary entry in the symbol
table made before the declaration of function parameters is completed.
.PP
The most complexity in the storage class process comes from
bit fields.
A separate storage class is kept for each width bit field;
a
.I k
bit bit field has storage class
.I k
plus FIELD.
This enables the size to be quickly recovered from the storage class.
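.PP
The encoding can be sketched in a pair of macros; the numeric value of FIELD is invented here, and only needs to lie above all the ordinary storage classes.

```c
#define FIELD 0100                       /* invented base value */
#define ISFIELD(c)   ((c) >= FIELD)      /* is this a bit-field class? */
#define FLDWIDTH(c)  ((c) - FIELD)       /* recover the width k */
```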
.SH
Symbol Table Maintenance.
.PP
The symbol table routines do far more than simply enter
names into the symbol table;
considerable semantic processing and checking is done as well.
For example, if a new declaration comes in, it must be checked
to see if there is a previous declaration of the same symbol.
If there is, there are many cases.
The declarations may agree and be compatible (for example,
an extern declaration can appear twice) in which case the
new declaration is ignored.
The new declaration may add information (such as an explicit array
dimension) to an already present declaration.
The new declaration may be different, but still correct (for example,
an extern declaration of something may be entered,
and then later the definition may be seen).
The new declaration may be incompatible, but appear in an inner block;
in this case, the old declaration is carefully hidden away, and
the new one comes into force until the block is left.
Finally, the declarations may be incompatible, and an error
message must be produced.
.PP
A number of other factors make for additional complexity.
The type declared by the user is not always the type
entered into the symbol table (for example, if a formal parameter to a function
is declared to be an array, C requires that this be changed into
a pointer before entry in the symbol table).
Moreover, there are various kinds of illegal types that
may be declared which are difficult to
check for syntactically (for example,
a function returning an array).
Finally, there is a strange feature in C that requires
structure tag names and member names for structures and unions
to be taken from a different logical symbol table than ordinary identifiers.
Keeping track of which kind of name is involved is a bit of struggle
(consider typedef names used within structure declarations, for example).
.PP
The symbol table handling routines have been rewritten a number of times to
extend features, improve performance, and fix bugs.
They address the above problems with reasonable effectiveness
but a singular lack of grace.
.PP
When a name is read in the input, it is hashed, and the
routine
.I lookup
is called, together with a flag which tells which symbol table
should be searched (actually, both symbol tables are stored in one, and a flag
is used to distinguish individual entries).
If the name is found,
.I lookup
returns the index to the entry found; otherwise, it makes
a new entry, marks it UNDEF (undefined), and returns the
index of the new entry.
This index is stored in the
.I rval
field of a NAME node.
.PP
When a declaration is being parsed, this NAME node is
made part of a tree with UNARY MUL nodes for each *,
LB nodes for each array descriptor (the right descendant
has the dimension), and UNARY CALL nodes for each function
descriptor.
This tree is passed to the routine
.I tymerge ,
along with the attribute type of the whole declaration;
this routine collapses the tree to a single node, by calling
.I tyreduce ,
and then modifies the type to reflect the overall
type of the declaration.
.PP
Dimension and size information is stored in a table called
.I dimtab .
To properly describe a type in C, one needs not just the
type information but also size information (for structures and
enums) and dimension information (for arrays).
Sizes and offsets are dealt with in the compiler by
giving the associated indices into
.I dimtab .
.I Tymerge
and
.I tyreduce
call
.I dstash
to put the discovered dimensions away into the
.I dimtab
array.
.I Tymerge
returns a pointer to a single node that contains
the symbol table index in its
.I rval
field, and the size and dimension indices in
fields
.I csiz
and
.I cdim ,
respectively.
This information is properly considered part of the type in the first pass,
and is carried around at all times.
.PP
To enter an element into the symbol table, the routine
.I defid
is called; it is handed a storage class, and a pointer
to the node produced by
.I tymerge .
.I Defid
calls
.I fixtype ,
which adjusts and checks the given type depending on the storage
class, and converts null types appropriately.
It then calls
.I fixclass ,
which does a similar job for the storage class;
it is here, for example, that
register declarations are either allowed or changed
to auto.
.PP
The new declaration is now compared against an older one,
if present, and several pages of validity checks performed.
If the definitions are compatible, with possibly some added information,
the processing is straightforward.
If the definitions differ, the block levels of the
current and the old declaration are compared.
The current block level is kept in
.I blevel ,
an external variable; the old declaration level is kept in the symbol table.
Block level 0 is for external declarations, 1 is for arguments to functions,
and 2 and above are blocks within a function.
If the current block level is the same as the old declaration, an error
results.
If the current block level is higher, the new declaration overrides the old.
This is done by marking the old symbol table entry ``hidden'', and making
a new entry, marked ``hiding''.
.I Lookup
will skip over hidden entries.
When a block is left, the symbol table is searched,
and any entries defined in that block are destroyed; if
they hid other entries, the old entries are ``unhidden''.
.PP
This nice block structure is warped a bit because labels
do not follow the block structure rules (one can do a
.B goto
into
a block, for example);
default definitions of functions in inner blocks also persist clear out to the outermost scope.
This implies that cleaning up the symbol table after block exit is more
subtle than it might first seem.
.PP
For successful new definitions,
.I defid
also initializes a ``general purpose'' field,
.I offset ,
in the symbol table.
It contains the stack offset for automatics and parameters, the register number
for register variables, the bit offset into the structure for
structure members, and
the internal label number for static variables and labels.
The offset field is set by
.I falloc
for bit fields, and
.I dclstruct
for structures and unions.
.PP
The symbol table entry itself thus contains
the name, type word, size and dimension offsets,
offset value, and declaration block level.
It also has a field of flags, describing what symbol table the
name is in, and whether the entry is hidden, or hides another.
Finally, a field gives the line number of the last use,
or of the definition, of the name.
This is used mainly for diagnostics, but is useful to
.I lint
as well.
.PP
In some special cases, there is more than the above amount of information kept
for the use of the compiler.
This is especially true with structures; for use in initialization,
structure declarations must have access to a list of the members of the
structure.
This list is also kept in
.I dimtab .
Because a structure can be mentioned long before the
members are known, it is necessary to have another
level of indirection in the table.
The two words following the
.I csiz
entry in
.I dimtab
are used to hold the alignment of the structure, and
the index in dimtab of the list of members.
This list contains the symbol table indices for the structure members, terminated by a
\-1.
.SH
Tree Building
.PP
The portable compiler transforms expressions
into expression trees.
As the parser recognizes each rule making up an
expression,
it calls
.I buildtree
which is given an operator number, and pointers to the
left and right descendants.
.I Buildtree
first examines the left and right descendants,
and, if they are both constants, and the operator
is appropriate, simply does the constant
computation at compile time, and returns
the result as a constant.
Otherwise,
.I buildtree
allocates a node for the head of the tree,
attaches the descendants to it, and ensures that
conversion operators are generated if needed, and that
the type of the new node is consistent with the types
of the operands.
There is also a considerable amount of semantic complexity here;
many combinations of types are illegal, and the portable
compiler makes a strong effort to check
the legality of expression types completely.
This is done both for
.I lint
purposes, and to prevent such semantic errors
from being passed through to the code generator.
.PP
The heart of
.I buildtree
is a large table, accessed by the routine
.I opact .
This routine maps the types of the left and right
operands into a rather smaller set of descriptors, and then
accesses a table (actually encoded in a switch statement) which
for each operator and pair of types causes
an action to be returned.
The actions are logical or's of a number of
separate actions, which may be
carried out by
.I buildtree .
These component actions may include
checking the left side to ensure that it
is an lvalue (can be stored into),
applying a type conversion to the left or right operand,
setting the type of the new node to
the type of the left or right operand, calling various
routines to balance the types of the left and right operands,
and suppressing the ordinary conversion
of arrays and function operands to pointers.
An important operation is OTHER, which causes
some special code to be invoked
in
.I buildtree ,
to handle issues which are unique to a particular operator.
Examples of this are
structure and union reference
(actually handled by
the routine
.I stref ),
the building of NAME, ICON, STRING and FCON (floating point constant) nodes,
unary * and &, structure assignment, and calls.
In the case of unary * and &,
.I buildtree
will cancel a * applied to a tree, the top node of which
is &, and conversely.
.PP
Another special operation is
PUN; this
causes the compiler to check for type mismatches,
such as intermixing pointers and integers.
.PP
The treatment of conversion operators is still a rather
strange area of the compiler (and of C!).
The recent introduction of type casts has only confounded this
situation.
Most of the conversion operators are generated by
calls to
.I tymatch
and
.I ptmatch ,
both of which are given a tree, and asked to make the
operands agree in type.
.I Ptmatch
treats the case where one of the operands is a pointer;
.I tymatch
treats all other cases.
Where these routines have decided on the
proper type for an operand, they call
.I makety ,
which is handed a tree, and a type word, dimension offset, and size offset.
If necessary, it inserts a conversion operation to make the
types correct.
Conversion operations are never inserted on the left side of
assignment operators, however.
There are two conversion operators used:
PCONV, if the conversion is to a non-basic type (usually a pointer),
and
SCONV, if the conversion is to a basic type (scalar).
.PP
To allow for maximum flexibility, every node produced by
.I buildtree
is given to a machine dependent routine,
.I clocal ,
immediately after it is produced.
This is to allow more or less immediate rewriting of those
nodes which must be adapted for the local machine.
The conversion operations are given to
.I clocal
as well; on most machines, many of these
conversions do nothing, and should be thrown away
(being careful to retain the type).
If this operation is done too early,
however,
later calls to
.I buildtree
may get confused about the correct type of the
subtrees;
thus
.I clocal
is given the conversion ops only after the entire tree is built.
This topic will be dealt with in more detail later.
.SH
Initialization
.PP
Initialization is one of the messier areas in the portable compiler.
The only consolation is that most of the mess takes place
in the machine independent part, where it
may be safely ignored by the implementor of the
compiler for a particular machine.
.PP
The basic problem is that the semantics of initialization
really calls for a co-routine structure; one collection
of programs reading constants from the input stream, while another,
independent set of programs places these constants into the
appropriate spots in memory.
The dramatic differences in the local assemblers also
come to the fore here.
The parsing problems are dealt with by keeping a rather
extensive stack containing the current
state of the initialization; the assembler
problems are dealt with by having a fair number of machine dependent routines.
.PP
The stack contains the symbol table number,
type, dimension index, and size index for the current identifier
being initialized.
Another entry has the offset, in bits, of the beginning
of the current identifier.
Another entry keeps track of how many elements have been seen,
if the current identifier is an array.
Still another entry keeps track of the current
member of a structure being initialized.
Finally, there is an entry containing flags
which keep track of the current state of the initialization process
(e.g., whether a } has been seen for the current identifier).
.PP
When an initialization begins, the routine
.I beginit
is called; it handles the alignment restrictions, if
any, and calls
.I instk
to create the stack entry.
This is done by first making an entry on the top of the stack for the item
being initialized.
If the top entry is an array, another entry is made on the stack
for the first element.
If the top entry is a structure, another entry is made on the
stack for the first member of the structure.
This continues until the top element of the stack is a scalar.
.I Instk
then returns, and the parser begins collecting initializers.
.PP
When a constant is obtained, the routine
.I doinit
is called; it examines the stack, and does whatever
is necessary to assign the current constant to the
scalar on the top of the stack.
.I Gotscal
is then called, which rearranges the stack so that the
next scalar to be initialized gets placed on top of the stack.
This process continues until the end of the initializers;
.I endinit
cleans up.
If a { or } is encountered in the
string of initializers, it is handled by calling
.I ilbrace
or
.I irbrace ,
respectively.
.PP
A central issue is the treatment of the ``holes'' that
arise as a result of alignment restrictions or explicit
requests for holes in bit fields.
There is a global variable,
.I inoff ,
which contains the current offset in the initialization
(all offsets in the first pass of the compiler are in bits).
.I Doinit
figures out from the top entry on the stack the expected
bit offset of the next identifier; it calls the
machine dependent
routine
.I inforce
which, in a machine dependent way, forces
the assembler to
set aside space if need be so that the
next scalar seen will go into the appropriate
bit offset position.
The scalar itself is passed to one of the machine dependent
routines
.I fincode
(for floating point initialization),
.I incode
(for fields, and other initializations less than an int in
size),
and
.I cinit
(for all other initializations).
The size is passed to all these routines, and it is up to
the machine dependent routines to ensure that the
initializer occupies exactly the right size.
.PP
Character strings represent a bit of an exception.
If a character string is seen as the initializer for
a pointer, the characters making up the string must
be put out under a different location counter.
When the lexical analyzer sees the quote at the head
of a character string, it returns the token STRING,
but does not do anything with the contents.
The parser calls
.I getstr ,
which sets up the appropriate location counters
and flags, and calls
.I lxstr
to read and process the contents of the string.
.PP
If the string is being used to initialize a character array,
.I lxstr
calls
.I putbyte ,
which in effect simulates
.I doinit
for each character read.
If the string is used to initialize a character pointer,
.I lxstr
calls a machine dependent routine,
.I bycode ,
which stashes away each character.
The pointer to this string is then returned,
and processed normally by
.I doinit .
.PP
The null at the end of the string is treated as if it
were read explicitly by
.I lxstr .
.SH
Statements
.PP
The first pass addresses four main areas: declarations, expressions, initialization, and statements.
The statement processing is relatively simple; most of it is carried out in the
parser directly.
Most of the logic is concerned with allocating
label numbers, defining the labels, and branching appropriately.
An external symbol,
.I reached ,
is 1 if a statement can be reached, 0 otherwise; this is
used to do a bit of simple flow analysis as the program is being parsed,
and also to avoid generating the subroutine return sequence if the subroutine
cannot ``fall through'' the last statement.
.PP
Conditional branches are handled by generating an expression
node, CBRANCH,
whose left descendant is the conditional expression and the
right descendant is an ICON node containing the internal label
number to be branched to.
For efficiency, the semantics are that
the label is gone to if the condition is
.I false .
.PP
The switch statement is
compiled by collecting the case entries, and an indication as to whether
there is a default case;
an internal label number is generated for each of these,
and remembered in a big array.
The expression comprising the value to be switched on is
compiled when the switch keyword is encountered,
but the expression tree is headed by
a special node, FORCE, which tells the code generator to
put the expression value into a special distinguished
register (this same mechanism is used for processing the
return statement).
When the end of the switch block is reached, the array
containing the case values is sorted, and checked for
duplicate entries (an error); if all is
correct, the machine dependent routine
.I genswitch
is called, with this array of labels and values in increasing order.
.I Genswitch
can assume that the value to be tested is already in the
register which is the usual integer return value register.
.SH
Optimization
.PP
There is a machine independent file,
.I optim.c ,
which contains a relatively short optimization routine,
.I optim .
Actually the word optimization is something of a misnomer;
the results are not optimum, only improved, and the
routine is in fact not optional; it must
be called for proper operation of the compiler.
.PP
.I Optim
is called after an expression tree is built, but
before the code generator is called.
The essential part of its job is to call
.I clocal
on the conversion operators.
On most machines, the treatment of
& is also essential:
by this time in the processing, the only node which
is a legal descendant of & is NAME.
(Possible descendants of * have been eliminated by
.I buildtree .)
The address of a static name is, almost by definition, a
constant, and can be represented by an ICON node on most machines
(provided that the loader has enough power).
Unfortunately, this is not universally true; on some machines, such as the IBM 370,
the issue of addressability rears its ugly head;
thus, before turning a NAME node into an ICON node,
the machine dependent function
.I andable
is called.
.PP
The optimization attempts of
.I optim
are currently quite limited.
It is primarily concerned with improving the behavior of
the compiler with operations one of whose arguments is a constant.
In the simplest case, the constant is placed on the right if the
operation is commutative.
The compiler also makes a limited search for expressions
such as
.DS
.I "( x + a ) + b"
.DE
where
.I a
and
.I b
are constants, and attempts to combine
.I a
and
.I b
at compile time.
A number of special cases are also examined;
additions of 0 and multiplications by 1 are removed,
although the correct processing of these cases to get
the type of the resulting tree correct is
decidedly nontrivial.
In some cases, the addition or multiplication must be replaced by
a conversion op to keep the types from becoming
fouled up.
In cases where a relational operation is being done,
and one operand is a constant, the operands are permuted, and the operator altered, if necessary,
to put the constant on the right.
Finally, multiplications by a power of 2 are changed to shifts.
.PP
There are dozens of similar optimizations that can be, and should be,
done.
It seems likely that this routine will be expanded in the relatively near future.
.SH
Machine Dependent Stuff
.PP
A number of the first pass machine dependent routines have been discussed above.
In general, the routines are short, and easy to adapt from
machine to machine.
The two exceptions to this general rule are
.I clocal
and
the function prolog and epilog generation routines,
.I bfcode
and
.I efcode .
.PP
.I Clocal
has the job of rewriting, if appropriate and desirable,
the nodes constructed by
.I buildtree .
There are two major areas where this
is important:
NAME nodes and conversion operations.
In the case of NAME nodes,
.I clocal
must rewrite the NAME node to reflect the
actual physical location of the name in the machine.
In effect, the NAME node must be examined, the symbol table
entry found (through the
.I rval
field of the node),
and, based on the storage class of the node,
the tree must be rewritten.
Automatic variables and parameters are typically
rewritten by treating the reference to the variable as
a structure reference, off the register which
holds the stack or argument pointer;
the
.I stref
routine is set up to be called in this way, and to
build the appropriate tree.
In the most general case, the tree consists
of a unary * node, whose descendant is
a + node, with the stack or argument register as left operand,
and a constant offset as right operand.
In the case of LABEL and internal static nodes, the
.I rval
field is rewritten to be the negative of the internal
label number; a negative
.I rval
field is taken to be an internal label number.
Finally, a name of class REGISTER must be converted into a REG node,
and the
.I rval
field replaced by the register number.
In fact, this part of the
.I clocal
routine is nearly machine independent; only for machines
with addressability problems (IBM 370 again!) does it
have to be noticeably different.
.PP
The conversion operator treatment is rather tricky.
It is necessary to handle the application of conversion operators
to constants in
.I clocal ,
in order that all constant expressions can have their values known
at compile time.
In extreme cases, this may mean that some simulation of the
arithmetic of the target machine might have to be done in a
cross-compiler.
In the most common case,
conversions from pointer to pointer do nothing.
For some machines, however, conversion from byte pointer to short or long
pointer might require a shift or rotate operation, which would
have to be generated here.
.PP
The extension of the portable compiler to machines where the size of a pointer
depends on its type would be straightforward, but has not yet been done.
.PP
The other major machine dependent issue involves the subroutine prolog and epilog
generation.
The hard part here is the design of the stack frame
and calling sequence; this design issue is discussed elsewhere.
.[
Johnson Lesk Ritchie calling sequence
.]
The routine
.I bfcode
is called with the number of arguments
the function is defined with, and
an array containing the symbol table indices of the
declared parameters.
.I Bfcode
must generate the code to establish the new stack frame,
save the return address and previous stack pointer
value on the stack, and save whatever
registers are to be used for register variables.
The stack size and the number of register variables are not
known when
.I bfcode
is called, so these numbers must be
referred to by assembler constants, which are
defined when they are known (usually in the second pass,
after all register variables, automatics, and temporaries have been seen).
The final job is to find those parameters which may have been declared
register, and generate the code to initialize
the register with the value passed on the stack.
Once again, for most machines, the general logic of
.I bfcode
remains the same, but the contents of the
.I printf
calls in it will change from machine to machine.
.I Efcode
is rather simpler, having just to generate the default
return at the end of a function.
This may be nontrivial in the case of a function returning a structure or union, however.
.PP
There seems to be no really good place to discuss structures and unions, but
this is as good a place as any.
The C language now supports structure assignment,
the passing of structures as arguments to functions,
and the receiving of structures back from functions.
This was added rather late to C, and thus to the portable compiler.
Consequently, it fits in less well than the older features.
Moreover, most of the burden of making these features work is
placed on the machine dependent code.
.PP
There are both conceptual and practical problems.
Conceptually, the compiler is structured around
the idea that to compute something, you put it into
a register and work on it.
This notion causes a bit of trouble on some machines (e.g., machines with 3-address opcodes), but
matches many machines quite well.
Unfortunately, this notion breaks down with structures.
The closest that one can come is to keep the addresses of the
structures in registers.
The actual code sequences used to move structures vary from the
trivial (a multiple byte move) to the horrible (a
function call), and are very machine dependent.
.PP
The practical problem is more painful.
When a function returning a structure is called, this function
has to have some place to put the structure value.
If it places it on the stack, it has difficulty popping its stack frame.
If it places the value in a static temporary, the routine fails to be
reentrant.
The most logically consistent way of implementing this is for the
caller to pass in a pointer to a spot where the called function
should put the value before returning.
This is relatively straightforward, although a bit tedious, to implement,
but means that the caller must have properly declared
the function type, even if the value is never used.
On some machines, such as the Interdata 8/32, the return value
simply overlays the argument region (which on the 8/32 is part
of the caller's stack frame).
The caller takes care of leaving enough room if the returned value is larger
than the arguments.
This also assumes that the caller knows and declares the
function properly.
.PP
The PDP-11 and the VAX have stack hardware which is used in function calls and returns;
this makes it very inconvenient to
use either of the above mechanisms.
In these machines, a static area within the called function is allocated, and
the function return value is copied into it on return; the function
returns the address of that region.
This is simple to implement, but is non-reentrant.
However, the function can now be called as a subroutine
without being properly declared, without the disaster which would otherwise ensue.
No matter what choice is taken, the convention is that the function
actually returns the address of the return structure value.
.PP
In building expression trees, the portable compiler takes a bit for granted about
structures.
It assumes that functions returning structures
actually return a pointer to the structure, and it assumes that
a reference to a structure is actually a reference to its address.
The structure assignment operator is rebuilt so that the left
operand is the structure being assigned to, but the
right operand is the address of the structure being assigned;
this makes it easier to deal with
.DS
.I "a = b = c"
.DE
and similar constructions.
.PP
There are four special tree nodes associated with these
operations:
STASG (structure assignment), STARG (structure argument
to a function call), and STCALL and UNARY STCALL
(calls of a function with nonzero and zero arguments, respectively).
These four nodes are unique in that the size and alignment information, which can be determined by
the type for all other objects in C,
must be known to carry out these operations; special
fields are set aside in these nodes to contain
this information, and special
intermediate code is used to transmit this information.
.SH
First Pass Summary
.PP
There are many other issues which have been ignored here,
partly to justify the title ``tour'', and partially
because they have seemed to cause little trouble.
There are some debugging flags
which may be turned on, by giving the compiler's first pass
the argument
.DS
\-X[flags]
.DE
Some of the more interesting flags are
\-Xd for the defining and freeing of symbols,
\-Xi for initialization comments, and
\-Xb for various comments about the building of trees.
In many cases, repeating the flag more than once gives more information;
thus,
\-Xddd gives more information than \-Xd.
In the two pass version of the compiler, the
flags should not be set when the output is sent to the second
pass, since the debugging output and the intermediate code both go onto the standard
output.
.PP
We turn now to consideration of the second pass.