Commit | Line | Data |
---|---|---|
8340f87c BJ |
1 | .TL |
2 | A Tour Through the Portable C Compiler | |
3 | .AU | |
4 | S. C. Johnson | |
5 | .AI | |
6 | .MH | |
7 | .SH | |
8 | Introduction | |
9 | .PP | |
10 | A C compiler has been implemented that has proved to be quite portable, | |
11 | serving as the basis for C compilers on roughly a dozen machines, including | |
12 | the Honeywell 6000, IBM 370, and Interdata 8/32. | |
13 | The compiler is highly compatible with the C language standard. | |
14 | .[ | |
15 | Kernighan Ritchie Prentice 1978 | |
16 | .] | |
17 | .PP | |
18 | Among the goals of this compiler are portability, high reliability, | |
19 | and the use of state-of-the-art techniques and tools wherever practical. | |
20 | Although the efficiency of the compiling process is not | |
21 | a primary goal, the compiler is efficient enough, and produces | |
22 | good enough code, to serve as a production compiler. | |
23 | .PP | |
24 | The language implemented is highly compatible with the current PDP-11 version of C. | |
25 | Moreover, roughly 75% of the compiler, including | |
26 | nearly all the syntactic and semantic routines, is machine independent. | |
27 | The compiler also serves as the major portion of the program | |
28 | .I lint , | |
29 | described elsewhere. | |
30 | .[ | |
31 | Johnson Lint Program Checker | |
32 | .] | |
33 | .PP | |
34 | A number of earlier attempts to make portable compilers are worth noting. | |
35 | While on CO-OP assignment to Bell Labs in 1973, Alan Snyder wrote a portable C | |
36 | compiler which was the basis of his Master's Thesis at M.I.T. | |
37 | .[ | |
38 | Snyder Portable C Compiler | |
39 | .] | |
40 | This compiler was very slow and complicated, and contained a number of rather | |
41 | serious implementation difficulties; | |
42 | nevertheless, a number of Snyder's ideas appear in this work. | |
43 | .PP | |
44 | Most earlier portable compilers, including Snyder's, have proceeded by | |
45 | defining an intermediate language, perhaps based | |
46 | on three-address code or code for a stack machine, | |
47 | and writing a machine independent program to | |
48 | translate from the source code to this | |
49 | intermediate code. | |
50 | The intermediate code is then read by a second pass, and interpreted or compiled. | |
51 | This approach is elegant, and has a number of advantages, especially if the | |
52 | target machine is far removed from the host. | |
53 | It suffers from some disadvantages as well. | |
54 | Some constructions, like initialization and subroutine prologs, | |
55 | are difficult or expensive to express in a | |
56 | machine independent way that still allows them | |
57 | to be easily adapted to the target assemblers. | |
58 | Most of these approaches require a symbol table to be constructed | |
59 | in the second (machine dependent) pass, and/or | |
60 | require powerful target assemblers. | |
61 | Also, many conversion operators may be generated | |
62 | that have no effect on a given machine, | |
63 | but may be needed on others (for example, pointer to pointer | |
64 | conversions usually do nothing in C, but must be generated because | |
65 | there are some machines where they are significant). | |
66 | .PP | |
67 | For these reasons, the first pass of the portable compiler | |
68 | is not entirely machine independent. | |
69 | It contains some machine dependent features, such as initialization, subroutine | |
70 | prolog and epilog, certain storage allocation functions, | |
71 | code for the | |
72 | .I switch | |
73 | statement, and code to throw out unneeded conversion operators. | |
74 | .PP | |
75 | As a crude measure of the degree of | |
76 | portability actually achieved, | |
77 | the Interdata 8/32 C compiler has | |
78 | roughly 600 machine dependent lines of source out of 4600 in Pass 1, and 1000 | |
79 | out of 3400 in Pass 2. | |
80 | In total, 1600 out of 8000, or 20%, | |
81 | of the total source is machine dependent | |
82 | (12% in Pass 1, 30% in Pass 2). | |
83 | These percentages can be expected to rise slightly as the | |
84 | compiler is tuned. | |
85 | The percentage of machine-dependent code for the IBM is 22%, for | |
86 | the Honeywell 25%. | |
87 | If the assembler format and structure were the same for all | |
88 | these machines, perhaps another 5-10% of the code | |
89 | would become machine independent. | |
90 | .PP | |
91 | These figures are sufficiently misleading as to be almost | |
92 | meaningless. | |
93 | A large fraction of the machine dependent code can be converted | |
94 | in a straightforward, almost mechanical way. | |
95 | On the other hand, a certain amount of the code requres hard | |
96 | intellectual effort to convert, since the algorithms | |
97 | embodied in this part of the code are typically complicated | |
98 | and machine dependent. | |
99 | .PP | |
100 | To summarize, however, if you need a C compiler written for a machine with | |
101 | a reasonable architecture, the compiler is already three quarters finished! | |
102 | .SH | |
103 | Overview | |
104 | .PP | |
105 | This paper discusses the structure and organization of the portable compiler. | |
106 | The intent is to give the big picture, rather than discussing the details of a particular machine | |
107 | implementation. | |
108 | After a brief overview and a discussion of the source file structure, | |
109 | the paper describes the major data structures, and then delves more | |
110 | closely into the two passes. | |
111 | Some of the theoretical work on which the compiler is based, and | |
112 | its application to the compiler, is discussed elsewhere. | |
113 | .[ | |
114 | johnson portable theory practice | |
115 | .] | |
116 | One of the major design issues in any C compiler, the design of | |
117 | the calling sequence and stack frame, is the subject of a separate | |
118 | memorandum. | |
119 | .[ | |
120 | johnson lesk ritchie calling sequence | |
121 | .] | |
122 | .PP | |
123 | The compiler consists of two passes, | |
124 | .I pass1 | |
125 | and | |
126 | .I pass2 , | |
127 | that together turn C source code into assembler code for the target | |
128 | machine. | |
129 | The two passes are preceded by a preprocessor, that | |
130 | handles the | |
131 | .B "#define" | |
132 | and | |
133 | .B "#include" | |
134 | statements, and related features (e.g., | |
135 | .B #ifdef , | |
136 | etc.). | |
137 | It is a nearly machine independent program, and will | |
138 | not be further discussed here. | |
139 | .PP | |
140 | The output of the preprocessor is a text file that is read as the standard | |
141 | input of the first pass. | |
142 | This | |
143 | produces as standard output another text file that becomes the standard | |
144 | input of the second pass. | |
145 | The second pass produces, as standard output, the desired assembler language source code. | |
146 | The preprocessor and the two passes | |
147 | all write error messages on the standard error file. | |
148 | Thus the compiler itself makes few demands on the I/O | |
149 | library support, aiding in the bootstrapping process. | |
150 | .PP | |
151 | Although the compiler is divided into two passes, | |
152 | this represents historical accident more than deep necessity. | |
153 | In fact, the compiler can optionally be loaded | |
154 | so that both passes operate in the same program. | |
155 | This ``one pass'' operation eliminates the overhead of | |
156 | reading and writing the intermediate file, so the compiler | |
157 | operates about 30% faster in this mode. | |
158 | It also occupies about 30% more space than the larger | |
159 | of the two component passes. | |
160 | .PP | |
161 | Because the compiler is fundamentally structured as two | |
162 | passes, even when loaded as one, this document primarily | |
163 | describes the two pass version. | |
164 | .PP | |
165 | The first pass does the lexical analysis, parsing, and | |
166 | symbol table maintenance. | |
167 | It also constructs parse trees for expressions, | |
168 | and keeps track of the types of the nodes in these trees. | |
169 | Additional code is devoted to initialization. | |
170 | Machine dependent portions of the first pass serve to | |
171 | generate subroutine prologs and epilogs, | |
172 | code for switches, and code for branches, label definitions, | |
173 | alignment operations, | |
174 | changes of location counter, etc. | |
175 | .PP | |
176 | The intermediate file is a text file organized into lines. | |
177 | Lines beginning with a right parenthesis are copied by the second | |
178 | pass directly to its output file, with the | |
179 | parenthesis stripped off. | |
180 | Thus, when the first pass produces assembly code, such as subroutine prologs, | |
181 | etc., each line is prefaced with a right parenthesis; | |
182 | the second pass passes these lines to through to the assembler. | |
183 | .PP | |
184 | The major job done by the second pass is generation of code | |
185 | for expressions. | |
186 | The expression parse trees produced in the first pass are | |
187 | written onto the | |
188 | intermediate file in Polish Prefix form: | |
189 | first, there is a line beginning with a period, followed by the source | |
190 | file line number and name on which the expression appeared (for debugging purposes). | |
191 | The successive lines represent the nodes of the parse tree, | |
192 | one node per line. | |
193 | Each line contains the node number, type, and | |
194 | any values (e.g., values of constants) | |
195 | that may appear in the node. | |
196 | Lines representing nodes with descendants are immediately | |
197 | followed by the left subtree of descendants, then the | |
198 | right. | |
199 | Since the number of descendants of any node is completely | |
200 | determined by the node number, there is no need to mark the end | |
201 | of the tree. | |
202 | .PP | |
203 | There are only two other line types in the intermediate file. | |
204 | Lines beginning with a left square bracket (`[') represent the beginning of | |
205 | blocks (delimited by { ... } in the C source); | |
206 | lines beginning with right square brackets (`]') represent the end of blocks. | |
207 | The remainder of these lines tell how much | |
208 | stack space, and how many register variables, | |
209 | are currently in use. | |
210 | .PP | |
211 | Thus, the second pass reads the intermediate files, copies the `)' lines, | |
212 | makes note of the information in the `[' and `]' lines, | |
213 | and devotes most of its effort to the | |
214 | `.' lines and their associated expression trees, turning them | |
215 | turns into assembly code to evaluate the expressions. | |
216 | .PP | |
217 | In the one pass version of the compiler, the expression | |
218 | trees that are built by the first pass have | |
219 | been declared to have room for the second pass | |
220 | information as well. | |
221 | Instead of writing the trees onto an intermediate file, | |
222 | each tree is transformed in place into an acceptable | |
223 | form for the code generator. | |
224 | The code generator then writes the result of compiling | |
225 | this tree onto the standard output. | |
226 | Instead of `[' and `]' lines in the intermediate file, the information is passed | |
227 | directly to the second pass routines. | |
228 | Assembly code produced by the first pass is simply written out, | |
229 | without the need for ')' at the head of each line. | |
230 | .SH | |
231 | The Source Files | |
232 | .PP | |
233 | The compiler source consists of 22 source files. | |
234 | Two files, | |
235 | .I manifest | |
236 | and | |
237 | .I macdefs , | |
238 | are header files included with all other files. | |
239 | .I Manifest | |
240 | has declarations for the node numbers, types, storage classes, | |
241 | and other global data definitions. | |
242 | .I Macdefs | |
243 | has machine-dependent definitions, such as the size and alignment of the | |
244 | various data representations. | |
245 | Two machine independent header files, | |
246 | .I mfile1 | |
247 | and | |
248 | .I mfile2 , | |
249 | contain the data structure and manifest definitions | |
250 | for the first and second passes, respectively. | |
251 | In the second pass, a machine dependent header file, | |
252 | .I mac2defs , | |
253 | contains declarations of register names, etc. | |
254 | .PP | |
255 | There is a file, | |
256 | .I common , | |
257 | containing (machine independent) routines used in both passes. | |
258 | These include routines for allocating and freeing trees, walking over trees, | |
259 | printing debugging information, and printing error messages. | |
260 | There are two dummy files, | |
261 | .I comm1.c | |
262 | and | |
263 | .I comm2.c , | |
264 | that simply include | |
265 | .I common | |
266 | within the scope of the appropriate pass1 or pass2 header files. | |
267 | When the compiler is loaded as a single pass, | |
268 | .I common | |
269 | only needs to be included once: | |
270 | .I comm2.c | |
271 | is not needed. | |
272 | .PP | |
273 | Entire sections of this document are devoted to the detailed structure of the | |
274 | passes. | |
275 | For the moment, we just give a brief description of the files. | |
276 | The first pass | |
277 | is obtained by compiling and loading | |
278 | .I scan.c , | |
279 | .I cgram.c , | |
280 | .I xdefs.c , | |
281 | .I pftn.c , | |
282 | .I trees.c , | |
283 | .I optim.c , | |
284 | .I local.c , | |
285 | .I code.c , | |
286 | and | |
287 | .I comm1.c . | |
288 | .I Scan.c | |
289 | is the lexical analyzer, which is used by | |
290 | .I cgram.c , | |
291 | the result of applying | |
292 | .I Yacc | |
293 | .[ | |
294 | Johnson Yacc Compiler cstr | |
295 | .] | |
296 | to the input grammar | |
297 | .I cgram.y . | |
298 | .I Xdefs.c | |
299 | is a short file of external definitions. | |
300 | .I Pftn.c | |
301 | maintains the symbol table, and does initialization. | |
302 | .I Trees.c | |
303 | builds the expression trees, and computes the node types. | |
304 | .I Optim.c | |
305 | does some machine independent optimizations on the expression trees. | |
306 | .I Comm1.c | |
307 | includes | |
308 | .I common , | |
309 | that contains service routines common to the two passes of the compiler. | |
310 | All the above files are machine independent. | |
311 | The files | |
312 | .I local.c | |
313 | and | |
314 | .I code.c | |
315 | contain machine dependent code for generating subroutine | |
316 | prologs, switch code, and the like. | |
317 | .PP | |
318 | The second pass | |
319 | is produced by compiling and loading | |
320 | .I reader.c , | |
321 | .I allo.c , | |
322 | .I match.c , | |
323 | .I comm1.c , | |
324 | .I order.c , | |
325 | .I local.c , | |
326 | and | |
327 | .I table.c . | |
328 | .I Reader.c | |
329 | reads the intermediate file, and controls the major logic of the | |
330 | code generation. | |
331 | .I Allo.c | |
332 | keeps track of busy and free registers. | |
333 | .I Match.c | |
334 | controls the matching | |
335 | of code templates to subtrees of the expression tree to be compiled. | |
336 | .I Comm2.c | |
337 | includes the file | |
338 | .I common , | |
339 | as in the first pass. | |
340 | The above files are machine independent. | |
341 | .I Order.c | |
342 | controls the machine dependent details of the code generation strategy. | |
343 | .I Local2.c | |
344 | has many small machine dependent routines, | |
345 | and tables of opcodes, register types, etc. | |
346 | .I Table.c | |
347 | has the code template tables, | |
348 | which are also clearly machine dependent. | |
349 | .SH | |
350 | Data Structure Considerations. | |
351 | .PP | |
352 | This section discusses the node numbers, type words, and expression trees, used | |
353 | throughout both passes of the compiler. | |
354 | .PP | |
355 | The file | |
356 | .I manifest | |
357 | defines those symbols used throughout both passes. | |
358 | The intent is to use the same symbol name (e.g., MINUS) | |
359 | for the given operator throughout the lexical analysis, parsing, tree building, | |
360 | and code generation phases; | |
361 | this requires some synchronization with the | |
362 | .I Yacc | |
363 | input file, | |
364 | .I cgram.y , | |
365 | as well. | |
366 | .PP | |
367 | A token | |
368 | like MINUS may be seen in the lexical analyzer before it is known | |
369 | whether it is a unary or binary operator; | |
370 | clearly, it is necessary to know this by the time the parse tree | |
371 | is constructed. | |
372 | Thus, an operator (really a macro) called | |
373 | UNARY is provided, so that MINUS | |
374 | and UNARY MINUS are both distinct node numbers. | |
375 | Similarly, many binary operators exist in an assignment form (for example, \-=), | |
376 | and the operator ASG may be applied to such node names to generate new ones, e.g. ASG MINUS. | |
377 | .PP | |
378 | It is frequently desirable to know if a node represents a leaf (no descendants), a unary operator (one | |
379 | descendant) or a binary operator (two descendants). | |
380 | The macro | |
381 | .I optype(o) | |
382 | returns one of the manifest constants LTYPE, UTYPE, or BITYPE, respectively, depending | |
383 | on the node number | |
384 | .I o . | |
385 | Similarly, | |
386 | .I asgop(o) | |
387 | returns true if | |
388 | .I o | |
389 | is an assignment operator number (=, +=, etc. ), and | |
390 | .I logop(o) | |
391 | returns true if | |
392 | .I o | |
393 | is a relational or logical (&&, \(or\(or, or !) operator. | |
394 | .PP | |
395 | C has a rich typing structure, with a potentially infinite number of types. | |
396 | To begin with, there are the basic types: CHAR, SHORT, INT, LONG, the unsigned versions | |
397 | known as | |
398 | UCHAR, USHORT, UNSIGNED, ULONG, and FLOAT, DOUBLE, | |
399 | and finally | |
400 | STRTY (a structure), UNIONTY, and ENUMTY. | |
401 | Then, there are three operators that can be applied to types to make others: | |
402 | if | |
403 | .I t | |
404 | is a type, we may potentially have types | |
405 | .I "pointer to t" , | |
406 | .I "function returning t" , | |
407 | and | |
408 | .I "array of t's" | |
409 | generated from | |
410 | .I t . | |
411 | Thus, an arbitrary type in C consists of a basic type, and zero or more of these operators. | |
412 | .PP | |
413 | In the compiler, a type is represented by an unsigned integer; the rightmost four bits hold the basic | |
414 | type, and the remaining bits are divided into two-bit fields, containing | |
415 | 0 (no operator), or one of the three operators | |
416 | described above. | |
417 | The modifiers are read right to left in the word, starting with the two-bit | |
418 | field adjacent to the basic type, until a field with 0 in it is reached. | |
419 | The macros PTR, FTN, and ARY represent the | |
420 | .I "pointer to" , | |
421 | .I "function returning" , | |
422 | and | |
423 | .I "array of" | |
424 | operators. | |
425 | The macro values are shifted so that they align with the first two-bit | |
426 | field; thus PTR+INT | |
427 | represents the type for an integer pointer, and | |
428 | .DS | |
429 | ARY + (PTR<<2) + (FTN<<4) + DOUBLE | |
430 | .DE | |
431 | represents the type of an array of pointers to functions returning doubles. | |
432 | .PP | |
433 | The type words are ordinarily manipulated by macros. | |
434 | If | |
435 | .I t | |
436 | is a type word, | |
437 | .I BTYPE(t) | |
438 | gives the basic type. | |
439 | .I ISPTR(t) , | |
440 | .I ISARY(t) , | |
441 | and | |
442 | .I ISFTN(t) | |
443 | ask if an object of this type is a pointer, array, or a function, | |
444 | respectively. | |
445 | .I MODTYPE(t,b) | |
446 | sets the basic type of | |
447 | .I t | |
448 | to | |
449 | .I b . | |
450 | .I DECREF(t) | |
451 | gives the type resulting from removing the first operator from | |
452 | .I t . | |
453 | Thus, if | |
454 | .I t | |
455 | is a pointer to | |
456 | .I t' , | |
457 | a function returning | |
458 | .I t' , | |
459 | or an array of | |
460 | .I t' , | |
461 | then | |
462 | .I DECREF(t) | |
463 | would equal | |
464 | .I t' . | |
465 | .I INCREF(t) | |
466 | gives the type representing a pointer to | |
467 | .I t . | |
468 | Finally, there are operators for dealing with the unsigned types. | |
469 | .I ISUNSIGNED(t) | |
470 | returns true if | |
471 | .I t | |
472 | is one of the four basic unsigned types; | |
473 | in this case, | |
474 | .I DEUNSIGN(t) | |
475 | gives the associated `signed' type. | |
476 | Similarly, | |
477 | .I UNSIGNABLE(t) | |
478 | returns true if | |
479 | .I t | |
480 | is one of the four basic types that could become unsigned, and | |
481 | .I ENUNSIGN(t) | |
482 | returns the unsigned analogue of | |
483 | .I t | |
484 | in this case. | |
485 | .PP | |
486 | The other important global data structure is that of expression trees. | |
487 | The actual shapes of the nodes are given in | |
488 | .I mfile1 | |
489 | and | |
490 | .I mfile2 . | |
491 | They are not the same in the two passes; the first pass nodes contain | |
492 | dimension and size information, while the second pass | |
493 | nodes contain register allocation information. | |
494 | Nevertheless, all nodes contain fields called | |
495 | .I op , | |
496 | containing the node number, | |
497 | and | |
498 | .I type , | |
499 | containing the type word. | |
500 | A function called | |
501 | .I talloc() | |
502 | returns a pointer to a new tree node. | |
503 | To free a node, its | |
504 | .I op | |
505 | field need merely be set to FREE. | |
506 | The other fields in the node will remain intact at least until the next allocation. | |
507 | .PP | |
508 | Nodes representing binary operators contain fields, | |
509 | .I left | |
510 | and | |
511 | .I right , | |
512 | that contain pointers to the left and right descendants. | |
513 | Unary operator nodes have the | |
514 | .I left | |
515 | field, and a value field called | |
516 | .I rval . | |
517 | Leaf nodes, with no descendants, have two value fields: | |
518 | .I lval | |
519 | and | |
520 | .I rval . | |
521 | .PP | |
522 | At appropriate times, the function | |
523 | .I tcheck() | |
524 | can be called, to check that there are no busy nodes remaining. | |
525 | This is used as a compiler consistency check. | |
526 | The function | |
527 | .I tcopy(p) | |
528 | takes a pointer | |
529 | .I p | |
530 | that points to an expression tree, and returns a pointer | |
531 | to a disjoint copy of the tree. | |
532 | The function | |
533 | .I walkf(p,f) | |
534 | performs a postorder walk of the tree pointed to by | |
535 | .I p , | |
536 | and applies the function | |
537 | .I f | |
538 | to each node. | |
539 | The function | |
540 | .I fwalk(p,f,d) | |
541 | does a preorder walk of the tree pointed to by | |
542 | .I p . | |
543 | At each node, it calls a function | |
544 | .I f , | |
545 | passing to it the node pointer, a value passed down | |
546 | from its ancestor, and two pointers to values to be passed | |
547 | down to the left and right descendants (if any). | |
548 | The value | |
549 | .I d | |
550 | is the value passed down to the root. | |
551 | .a | |
552 | .I Fwalk | |
553 | is used for a number of tree labeling and debugging activities. | |
554 | .PP | |
555 | The other major data structure, the symbol table, exists only in pass one, | |
556 | and will be discussed later. | |
557 | .SH | |
558 | Pass One | |
559 | .PP | |
560 | The first pass does lexical analysis, parsing, symbol table maintenance, | |
561 | tree building, optimization, and a number of machine dependent things. | |
562 | This pass is largely machine independent, and the machine independent | |
563 | sections can be pretty successfully ignored. | |
564 | Thus, they will be only sketched here. | |
565 | .SH | |
566 | Lexical Analysis | |
567 | .PP | |
568 | The lexical analyzer is a conceptually simple routine that reads the input | |
569 | and returns the tokens of the C language as it encounters them: | |
570 | names, constants, operators, and keywords. | |
571 | The conceptual simplicity of this job is confounded a bit by several | |
572 | other simple jobs that unfortunately must go on simultaneously. | |
573 | These include | |
574 | .IP \(bu | |
575 | Keeping track of the current filename and line number, and occasionally | |
576 | setting this information as the result of preprocessor control lines. | |
577 | .IP \(bu | |
578 | Skipping comments. | |
579 | .IP \(bu | |
580 | Properly dealing with octal, decimal, hex, floating | |
581 | point, and character constants, as well as character strings. | |
582 | .PP | |
583 | To achieve speed, the program maintains several tables | |
584 | that are indexed into by character value, to tell | |
585 | the lexical analyzer what to do next. | |
586 | To achieve portability, these tables must be initialized | |
587 | each time the compiler is run, in order that the | |
588 | table entries reflect the local character set values. | |
589 | .SH | |
590 | Parsing | |
591 | .PP | |
592 | As mentioned above, the parser is generated by Yacc from the grammar on file | |
593 | .I cgram.y. | |
594 | The grammar is relatively readable, but contains some unusual features | |
595 | that are worth comment. | |
596 | .PP | |
597 | Perhaps the strangest feature of the grammar is the treatment of | |
598 | declarations. | |
599 | The problem is to keep track of the basic type | |
600 | and the storage class while interpreting the various | |
601 | stars, brackets, and parentheses that may surround a given name. | |
602 | The entire declaration mechanism must be recursive, | |
603 | since declarations may appear within declarations of | |
604 | structures and unions, or even within a | |
605 | .B sizeof | |
606 | construction | |
607 | inside a dimension in another declaration! | |
608 | .PP | |
609 | There are some difficulties in using a bottom-up parser, | |
610 | such as produced by Yacc, to handle constructions where a lot | |
611 | of left context information must be kept around. | |
612 | The problem is that the original PDP-11 compiler is top-down in | |
613 | implementation, and some of the semantics of C reflect this. | |
614 | In a top-down parser, the input rules are restricted | |
615 | somewhat, but one can naturally associate temporary | |
616 | storage with a rule at a very early stage in the recognition | |
617 | of that rule. | |
618 | In a bottom-up parser, there is more freedom in the | |
619 | specification of rules, but it is more difficult to know what | |
620 | rule is being matched until the entire rule is seen. | |
621 | The parser described by | |
622 | .I cgram.c | |
623 | makes effective use of | |
624 | the bottom-up parsing mechanism in some places (notably the treatment | |
625 | of expressions), but struggles against the | |
626 | restrictions in others. | |
627 | The usual result is that it is necessary to run a stack of values | |
628 | ``on the side'', independent of the Yacc value stack, | |
629 | in order to be able to store and access information | |
630 | deep within inner constructions, where the relationship of the | |
631 | rules being recognized to the total picture is not yet clear. | |
632 | .PP | |
633 | In the case of declarations, the attribute information | |
634 | (type, etc.) for a declaration is carefully | |
635 | kept immediately to the left of the declarator | |
636 | (that part of the declaration involving the name). | |
637 | In this way, when it is time to declare the name, the | |
638 | name and the type information can be quickly brought together. | |
639 | The ``$0'' mechanism of Yacc is used to accomplish this. | |
640 | The result is not pretty, but it works. | |
641 | The storage class information changes more slowly, | |
642 | so it is kept in an external variable, and stacked if | |
643 | necessary. | |
644 | Some of the grammar could be considerably cleaned up by using | |
645 | some more recent features of Yacc, notably actions within | |
646 | rules and the ability to return multiple values for actions. | |
647 | .PP | |
648 | A stack is also used to keep track of the current location | |
649 | to be branched to when a | |
650 | .B break | |
651 | or | |
652 | .B continue | |
653 | statement is processed. | |
654 | .PP | |
655 | This use of external stacks dates from the time when Yacc did not permit | |
656 | values to be structures. | |
657 | Some, or most, of this use of external stacks | |
658 | could be eliminated by redoing the grammar to use the mechanisms now provided. | |
659 | There are some areas, however, particularly the processing of structure, union, | |
660 | and enum declarations, function | |
661 | prologs, and switch statement processing, when having | |
662 | all the affected data together in an array speeds later | |
663 | processing; in this case, use of external storage seems essential. | |
664 | .PP | |
665 | The | |
666 | .I cgram.y | |
667 | file also contains some small functions used as | |
668 | utility functions in the parser. | |
669 | These include routines for saving case values and labels in processing switches, | |
670 | and stacking and popping | |
671 | values on the external stack described above. | |
672 | .SH | |
673 | Storage Classes | |
674 | .PP | |
675 | C has a finite, but fairly extensive, number of storage classes | |
676 | available. | |
677 | One of the compiler design decisions was to | |
678 | process the storage class information totally in the first pass; | |
679 | by the second pass, this information must have | |
680 | been totally dealt with. | |
681 | This means that all of the storage allocation must take place in the first | |
682 | pass, so that references to automatics and | |
683 | parameters can be turned into references to cells lying a certain | |
684 | number of bytes offset from certain machine registers. | |
685 | Much of this transformation is machine dependent, and strongly | |
686 | depends on the storage class. | |
687 | .PP | |
688 | The classes include EXTERN (for externally declared, but not defined variables), | |
689 | EXTDEF (for external definitions), and similar distinctions for | |
690 | USTATIC and STATIC, UFORTRAN and FORTRAN (for fortran functions) and | |
691 | ULABEL and LABEL. | |
692 | The storage classes REGISTER and AUTO are obvious, as | |
693 | are STNAME, UNAME, and ENAME (for structure, union, and enumeration | |
694 | tags), and the associated MOS, MOU, and MOE (for the members). | |
695 | TYPEDEF is treated as a storage class as well. | |
696 | There are two special storage classes: PARAM and SNULL. | |
697 | SNULL is used to distinguish the case where no explicit | |
698 | storage class has been given; before an entry is made | |
699 | in the symbol table the true storage class is discovered. | |
700 | Similarly, PARAM is used for the temporary entry in the symbol | |
701 | table made before the declaration of function parameters is completed. | |
702 | .PP | |
703 | The most complexity in the storage class process comes from | |
704 | bit fields. | |
705 | A separate storage class is kept for each width bit field; | |
706 | a | |
707 | .I k | |
708 | bit bit field has storage class | |
709 | .I k | |
710 | plus FIELD. | |
711 | This enables the size to be quickly recovered from the storage class. | |
712 | .SH | |
713 | Symbol Table Maintenance. | |
714 | .PP | |
715 | The symbol table routines do far more than simply enter | |
716 | names into the symbol table; | |
717 | considerable semantic processing and checking is done as well. | |
718 | For example, if a new declaration comes in, it must be checked | |
719 | to see if there is a previous declaration of the same symbol. | |
720 | If there is, there are many cases. | |
721 | The declarations may agree and be compatible (for example, | |
722 | an extern declaration can appear twice) in which case the | |
723 | new declaration is ignored. | |
724 | The new declaration may add information (such as an explicit array | |
725 | dimension) to an already present declaration. | |
726 | The new declaration may be different, but still correct (for example, | |
727 | an extern declaration of something may be entered, | |
728 | and then later the definition may be seen). | |
729 | The new declaration may be incompatible, but appear in an inner block; | |
730 | in this case, the old declaration is carefully hidden away, and | |
731 | the new one comes into force until the block is left. | |
732 | Finally, the declarations may be incompatible, and an error | |
733 | message must be produced. | |
734 | .PP | |
735 | A number of other factors make for additional complexity. | |
736 | The type declared by the user is not always the type | |
737 | entered into the symbol table (for example, if an formal parameter to a function | |
738 | is declared to be an array, C requires that this be changed into | |
739 | a pointer before entry in the symbol table). | |
740 | Moreover, there are various kinds of illegal types that | |
741 | may be declared which are difficult to | |
742 | check for syntactically (for example, | |
743 | a function returning an array). | |
744 | Finally, there is a strange feature in C that requires | |
745 | structure tag names and member names for structures and unions | |
746 | to be taken from a different logical symbol table than ordinary identifiers. | |
747 | Keeping track of which kind of name is involved is a bit of struggle | |
748 | (consider typedef names used within structure declarations, for example). | |
749 | .PP | |
750 | The symbol table handling routines have been rewritten a number of times to | |
751 | extend features, improve performance, and fix bugs. | |
752 | They address the above problems with reasonable effectiveness | |
753 | but a singular lack of grace. | |
754 | .PP | |
755 | When a name is read in the input, it is hashed, and the | |
756 | routine | |
757 | .I lookup | |
758 | is called, together with a flag which tells which symbol table | |
759 | should be searched (actually, both symbol tables are stored in one, and a flag | |
760 | is used to distinguish individual entries). | |
761 | If the name is found, | |
762 | .I lookup | |
763 | returns the index to the entry found; otherwise, it makes | |
764 | a new entry, marks it UNDEF (undefined), and returns the | |
765 | index of the new entry. | |
766 | This index is stored in the | |
767 | .I rval | |
768 | field of a NAME node. | |
769 | .PP | |
770 | When a declaration is being parsed, this NAME node is | |
771 | made part of a tree with UNARY MUL nodes for each *, | |
772 | LB nodes for each array descriptor (the right descendant | |
773 | has the dimension), and UNARY CALL nodes for each function | |
774 | descriptor. | |
775 | This tree is passed to the routine | |
776 | .I tymerge , | |
777 | along with the attribute type of the whole declaration; | |
778 | this routine collapses the tree to a single node, by calling | |
779 | .I tyreduce , | |
780 | and then modifies the type to reflect the overall | |
781 | type of the declaration. | |
782 | .PP | |
783 | Dimension and size information is stored in a table called | |
784 | .I dimtab . | |
785 | To properly describe a type in C, one needs not just the | |
786 | type information but also size information (for structures and | |
787 | enums) and dimension information (for arrays). | |
788 | Sizes and offsets are dealt with in the compiler by | |
789 | giving the associated indices into | |
790 | .I dimtab . | |
791 | .I Tymerge | |
792 | and | |
793 | .I tyreduce | |
794 | call | |
795 | .I dstash | |
796 | to put the discovered dimensions away into the | |
797 | .I dimtab | |
798 | array. | |
799 | .I Tymerge | |
800 | returns a pointer to a single node that contains | |
801 | the symbol table index in its | |
802 | .I rval | |
803 | field, and the size and dimension indices in | |
804 | fields | |
805 | .I csiz | |
806 | and | |
807 | .I cdim , | |
808 | respectively. | |
809 | This information is properly considered part of the type in the first pass, | |
810 | and is carried around at all times. | |
811 | .PP | |
812 | To enter an element into the symbol table, the routine | |
813 | .I defid | |
814 | is called; it is handed a storage class, and a pointer | |
815 | to the node produced by | |
816 | .I tymerge . | |
817 | .I Defid | |
818 | calls | |
819 | .I fixtype , | |
820 | which adjusts and checks the given type depending on the storage | |
821 | class, and converts null types appropriately. | |
822 | It then calls | |
823 | .I fixclass , | |
824 | which does a similar job for the storage class; | |
825 | it is here, for example, that | |
826 | register declarations are either allowed or changed | |
827 | to auto. | |
828 | .PP | |
829 | The new declaration is now compared against an older one, | |
830 | if present, and several pages of validity checks performed. | |
831 | If the definitions are compatible, with possibly some added information, | |
832 | the processing is straightforward. | |
833 | If the definitions differ, the block levels of the | |
834 | current and the old declaration are compared. | |
835 | The current block level is kept in | |
836 | .I blevel , | |
837 | an external variable; the old declaration level is kept in the symbol table. | |
838 | Block level 0 is for external declarations, 1 is for arguments to functions, | |
839 | and 2 and above are blocks within a function. | |
840 | If the current block level is the same as the old declaration, an error | |
841 | results. | |
842 | If the current block level is higher, the new declaration overrides the old. | |
843 | This is done by marking the old symbol table entry ``hidden'', and making | |
844 | a new entry, marked ``hiding''. | |
845 | .I Lookup | |
846 | will skip over hidden entries. | |
847 | When a block is left, the symbol table is searched, | |
848 | and any entries defined in that block are destroyed; if | |
849 | they hid other entries, the old entries are ``unhidden''. | |
850 | .PP | |
851 | This nice block structure is warped a bit because labels | |
852 | do not follow the block structure rules (one can do a | |
853 | .B goto | |
854 | into | |
855 | a block, for example); | |
856 | default definitions of functions in inner blocks also persist clear out to the outermost scope. | |
857 | This implies that cleaning up the symbol table after block exit is more | |
858 | subtle than it might first seem. | |
859 | .PP | |
860 | For successful new definitions, | |
861 | .I defid | |
862 | also initializes a ``general purpose'' field, | |
863 | .I offset , | |
864 | in the symbol table. | |
865 | It contains the stack offset for automatics and parameters, the register number | |
866 | for register variables, the bit offset into the structure for | |
867 | structure members, and | |
868 | the internal label number for static variables and labels. | |
869 | The offset field is set by | |
870 | .I falloc | |
871 | for bit fields, and | |
872 | .I dclstruct | |
873 | for structures and unions. | |
874 | .PP | |
875 | The symbol table entry itself thus contains | |
876 | the name, type word, size and dimension offsets, | |
877 | offset value, and declaration block level. | |
878 | It also has a field of flags, describing what symbol table the | |
879 | name is in, and whether the entry is hidden, or hides another. | |
880 | Finally, a field gives the line number of the last use, | |
881 | or of the definition, of the name. | |
882 | This is used mainly for diagnostics, but is useful to | |
883 | .I lint | |
884 | as well. | |
885 | .PP | |
886 | In some special cases, there is more than the above amount of information kept | |
887 | for the use of the compiler. | |
888 | This is especially true with structures; for use in initialization, | |
889 | structure declarations must have access to a list of the members of the | |
890 | structure. | |
891 | This list is also kept in | |
892 | .I dimtab . | |
893 | Because a structure can be mentioned long before the | |
894 | members are known, it is necessary to have another | |
895 | level of indirection in the table. | |
896 | The two words following the | |
897 | .I csiz | |
898 | entry in | |
899 | .I dimtab | |
900 | are used to hold the alignment of the structure, and | |
901 | the index in dimtab of the list of members. | |
902 | This list contains the symbol table indices for the structure members, terminated by a | |
903 | \-1. | |
904 | .SH | |
905 | Tree Building | |
906 | .PP | |
907 | The portable compiler transforms expressions | |
908 | into expression trees. | |
909 | As the parser recognizes each rule making up an | |
910 | expression, | |
911 | it calls | |
912 | .I buildtree | |
913 | which is given an operator number, and pointers to the | |
914 | left and right descendants. | |
915 | .I Buildtree | |
916 | first examines the left and right descendants, | |
917 | and, if they are both constants, and the operator | |
918 | is appropriate, simply does the constant | |
919 | computation at compile time, and returns | |
920 | the result as a constant. | |
921 | Otherwise, | |
922 | .I buildtree | |
923 | allocates a node for the head of the tree, | |
924 | attaches the descendants to it, and ensures that | |
925 | conversion operators are generated if needed, and that | |
926 | the type of the new node is consistent with the types | |
927 | of the operands. | |
928 | There is also a considerable amount of semantic complexity here; | |
929 | many combinations of types are illegal, and the portable | |
930 | compiler makes a strong effort to check | |
931 | the legality of expression types completely. | |
932 | This is done both for | |
933 | .I lint | |
934 | purposes, and to prevent such semantic errors | |
935 | from being passed through to the code generator. | |
936 | .PP | |
937 | The heart of | |
938 | .I buildtree | |
939 | is a large table, accessed by the routine | |
940 | .I opact . | |
941 | This routine maps the types of the left and right | |
942 | operands into a rather smaller set of descriptors, and then | |
943 | accesses a table (actually encoded in a switch statement) which | |
944 | for each operator and pair of types causes | |
945 | an action to be returned. | |
946 | The actions are logical or's of a number of | |
947 | separate actions, which may be | |
948 | carried out by | |
949 | .I buildtree . | |
950 | These component actions may include | |
951 | checking the left side to ensure that it | |
952 | is an lvalue (can be stored into), | |
953 | applying a type conversion to the left or right operand, | |
954 | setting the type of the new node to | |
955 | the type of the left or right operand, calling various | |
956 | routines to balance the types of the left and right operands, | |
957 | and suppressing the ordinary conversion | |
958 | of arrays and function operands to pointers. | |
959 | An important operation is OTHER, which causes | |
960 | some special code to be invoked | |
961 | in | |
962 | .I buildtree , | |
963 | to handle issues which are unique to a particular operator. | |
964 | Examples of this are | |
965 | structure and union reference | |
966 | (actually handled by | |
967 | the routine | |
968 | .I stref ), | |
969 | the building of NAME, ICON, STRING and FCON (floating point constant) nodes, | |
970 | unary * and &, structure assignment, and calls. | |
971 | In the case of unary * and &, | |
972 | .I buildtree | |
973 | will cancel a * applied to a tree, the top node of which | |
974 | is &, and conversely. | |
975 | .PP | |
976 | Another special operation is | |
977 | PUN; this | |
978 | causes the compiler to check for type mismatches, | |
979 | such as intermixing pointers and integers. | |
980 | .PP | |
981 | The treatment of conversion operators is still a rather | |
982 | strange area of the compiler (and of C!). | |
983 | The recent introduction of type casts has only confounded this | |
984 | situation. | |
985 | Most of the conversion operators are generated by | |
986 | calls to | |
987 | .I tymatch | |
988 | and | |
989 | .I ptmatch , | |
990 | both of which are given a tree, and asked to make the | |
991 | operands agree in type. | |
992 | .I Ptmatch | |
993 | treats the case where one of the operands is a pointer; | |
994 | .I tymatch | |
995 | treats all other cases. | |
996 | Where these routines have decided on the | |
997 | proper type for an operand, they call | |
998 | .I makety , | |
999 | which is handed a tree, and a type word, dimension offset, and size offset. | |
1000 | If necessary, it inserts a conversion operation to make the | |
1001 | types correct. | |
1002 | Conversion operations are never inserted on the left side of | |
1003 | assignment operators, however. | |
1004 | There are two conversion operators used; | |
1005 | PCONV, if the conversion is to a non-basic type (usually a pointer), | |
1006 | and | |
1007 | SCONV, if the conversion is to a basic type (scalar). | |
1008 | .PP | |
1009 | To allow for maximum flexibility, every node produced by | |
1010 | .I buildtree | |
1011 | is given to a machine dependent routine, | |
1012 | .I clocal , | |
1013 | immediately after it is produced. | |
1014 | This is to allow more or less immediate rewriting of those | |
1015 | nodes which must be adapted for the local machine. | |
1016 | The conversion operations are given to | |
1017 | .I clocal | |
1018 | as well; on most machines, many of these | |
1019 | conversions do nothing, and should be thrown away | |
1020 | (being careful to retain the type). | |
1021 | If this operation is done too early, | |
1022 | however, | |
1023 | later calls to | |
1024 | .I buildtree | |
1025 | may get confused about correct type of the | |
1026 | subtrees; | |
1027 | thus | |
1028 | .I clocal | |
1029 | is given the conversion ops only after the entire tree is built. | |
1030 | This topic will be dealt with in more detail later. | |
1031 | .SH | |
1032 | Initialization | |
1033 | .PP | |
1034 | Initialization is one of the messier areas in the portable compiler. | |
1035 | The only consolation is that most of the mess takes place | |
1036 | in the machine independent part, where it | |
1037 | is may be safely ignored by the implementor of the | |
1038 | compiler for a particular machine. | |
1039 | .PP | |
1040 | The basic problem is that the semantics of initialization | |
1041 | really calls for a co-routine structure; one collection | |
1042 | of programs reading constants from the input stream, while another, | |
1043 | independent set of programs places these constants into the | |
1044 | appropriate spots in memory. | |
1045 | The dramatic differences in the local assemblers also | |
1046 | come to the fore here. | |
1047 | The parsing problems are dealt with by keeping a rather | |
1048 | extensive stack containing the current | |
1049 | state of the initialization; the assembler | |
1050 | problems are dealt with by having a fair number of machine dependent routines. | |
1051 | .PP | |
1052 | The stack contains the symbol table number, | |
1053 | type, dimension index, and size index for the current identifier | |
1054 | being initialized. | |
1055 | Another entry has the offset, in bits, of the beginning | |
1056 | of the current identifier. | |
1057 | Another entry keeps track of how many elements have been seen, | |
1058 | if the current identifier is an array. | |
1059 | Still another entry keeps track of the current | |
1060 | member of a structure being initialized. | |
1061 | Finally, there is an entry containing flags | |
1062 | which keep track of the current state of the initialization process | |
1063 | (e.g., tell if a } has been seen for the current identifier.) | |
1064 | .PP | |
1065 | When an initialization begins, the routine | |
1066 | .I beginit | |
1067 | is called; it handles the alignment restrictions, if | |
1068 | any, and calls | |
1069 | .I instk | |
1070 | to create the stack entry. | |
1071 | This is done by first making an entry on the top of the stack for the item | |
1072 | being initialized. | |
1073 | If the top entry is an array, another entry is made on the stack | |
1074 | for the first element. | |
1075 | If the top entry is a structure, another entry is made on the | |
1076 | stack for the first member of the structure. | |
1077 | This continues until the top element of the stack is a scalar. | |
1078 | .I Instk | |
1079 | then returns, and the parser begins collecting initializers. | |
1080 | .PP | |
1081 | When a constant is obtained, the routine | |
1082 | .I doinit | |
1083 | is called; it examines the stack, and does whatever | |
1084 | is necessary to assign the current constant to the | |
1085 | scalar on the top of the stack. | |
1086 | .I gotscal | |
1087 | is then called, which rearranges the stack so that the | |
1088 | next scalar to be initialized gets placed on top of the stack. | |
1089 | This process continues until the end of the initializers; | |
1090 | .I endinit | |
1091 | cleans up. | |
1092 | If a { or } is encountered in the | |
1093 | string of initializers, it is handled by calling | |
1094 | .I ilbrace | |
1095 | or | |
1096 | .I irbrace , | |
1097 | respectively. | |
1098 | .PP | |
1099 | A central issue is the treatment of the ``holes'' that | |
1100 | arise as a result of alignment restrictions or explicit | |
1101 | requests for holes in bit fields. | |
1102 | There is a global variable, | |
1103 | .I inoff , | |
1104 | which contains the current offset in the initialization | |
1105 | (all offsets in the first pass of the compiler are in bits). | |
1106 | .I Doinit | |
1107 | figures out from the top entry on the stack the expected | |
1108 | bit offset of the next identifier; it calls the | |
1109 | machine dependent | |
1110 | routine | |
1111 | .I inforce | |
1112 | which, in a machine dependent way, forces | |
1113 | the assembler to | |
1114 | set aside space if need be so that the | |
1115 | next scalar seen will go into the appropriate | |
1116 | bit offset position. | |
1117 | The scalar itself is passed to one of the machine dependent | |
1118 | routines | |
1119 | .I fincode | |
1120 | (for floating point initialization), | |
1121 | .I incode | |
1122 | (for fields, and other initializations less than an int in | |
1123 | size), | |
1124 | and | |
1125 | .I cinit | |
1126 | (for all other initializations). | |
1127 | The size is passed to all these routines, and it is up to | |
1128 | the machine dependent routines to ensure that the | |
1129 | initializer occupies exactly the right size. | |
1130 | .PP | |
1131 | Character strings represent a bit of an exception. | |
1132 | If a character string is seen as the initializer for | |
1133 | a pointer, the characters making up the string must | |
1134 | be put out under a different location counter. | |
1135 | When the lexical analyzer sees the quote at the head | |
1136 | of a character string, it returns the token STRING, | |
1137 | but does not do anything with the contents. | |
1138 | The parser calls | |
1139 | .I getstr , | |
1140 | which sets up the appropriate location counters | |
1141 | and flags, and calls | |
1142 | .I lxstr | |
1143 | to read and process the contents of the string. | |
1144 | .PP | |
1145 | If the string is being used to initialize a character array, | |
1146 | .I lxstr | |
1147 | calls | |
1148 | .I putbyte , | |
1149 | which in effect simulates | |
1150 | .I doinit | |
1151 | for each character read. | |
1152 | If the string is used to initialize a character pointer, | |
1153 | .I lxstr | |
1154 | calls a machine dependent routine, | |
1155 | .I bycode , | |
1156 | which stashes away each character. | |
1157 | The pointer to this string is then returned, | |
1158 | and processed normally by | |
1159 | .I doinit . | |
1160 | .PP | |
1161 | The null at the end of the string is treated as if it | |
1162 | were read explicitly by | |
1163 | .I lxstr . | |
1164 | .SH | |
1165 | Statements | |
1166 | .PP | |
1167 | The first pass addresses four main areas; declarations, expressions, initialization, and statements. | |
1168 | The statement processing is relatively simple; most of it is carried out in the | |
1169 | parser directly. | |
1170 | Most of the logic is concerned with allocating | |
1171 | label numbers, defining the labels, and branching appropriately. | |
1172 | An external symbol, | |
1173 | .I reached , | |
1174 | is 1 if a statement can be reached, 0 otherwise; this is | |
1175 | used to do a bit of simple flow analysis as the program is being parsed, | |
1176 | and also to avoid generating the subroutine return sequence if the subroutine | |
1177 | cannot ``fall through'' the last statement. | |
1178 | .PP | |
1179 | Conditional branches are handled by generating an expression | |
1180 | node, CBRANCH, | |
1181 | whose left descendant is the conditional expression and the | |
1182 | right descendant is an ICON node containing the internal label | |
1183 | number to be branched to. | |
1184 | For efficiency, the semantics are that | |
1185 | the label is gone to if the condition is | |
1186 | .I false . | |
1187 | .PP | |
1188 | The switch statement is | |
1189 | compiled by collecting the case entries, and an indication as to whether | |
1190 | there is a default case; | |
1191 | an internal label number is generated for each of these, | |
1192 | and remembered in a big array. | |
1193 | The expression comprising the value to be switched on is | |
1194 | compiled when the switch keyword is encountered, | |
1195 | but the expression tree is headed by | |
1196 | a special node, FORCE, which tells the code generator to | |
1197 | put the expression value into a special distinguished | |
1198 | register (this same mechanism is used for processing the | |
1199 | return statement). | |
1200 | When the end of the switch block is reached, the array | |
1201 | containing the case values is sorted, and checked for | |
1202 | duplicate entries (an error); if all is | |
1203 | correct, the machine dependent routine | |
1204 | .I genswitch | |
1205 | is called, with this array of labels and values in increasing order. | |
1206 | .I Genswitch | |
1207 | can assume that the value to be tested is already in the | |
1208 | register which is the usual integer return value register. | |
1209 | .SH | |
1210 | Optimization | |
1211 | .PP | |
1212 | There is a machine independent file, | |
1213 | .I optim.c , | |
1214 | which contains a relatively short optimization routine, | |
1215 | .I optim . | |
1216 | Actually the word optimization is something of a misnomer; | |
1217 | the results are not optimum, only improved, and the | |
1218 | routine is in fact not optional; it must | |
1219 | be called for proper operation of the compiler. | |
1220 | .PP | |
1221 | .I Optim | |
1222 | is called after an expression tree is built, but | |
1223 | before the code generator is called. | |
1224 | The essential part of its job is to call | |
1225 | .I clocal | |
1226 | on the conversion operators. | |
1227 | On most machines, the treatment of | |
1228 | & is also essential: | |
1229 | by this time in the processing, the only node which | |
1230 | is a legal descendant of & is NAME. | |
1231 | (Possible descendants of * have been eliminated by | |
1232 | .I buildtree.) | |
1233 | The address of a static name is, almost by definition, a | |
1234 | constant, and can be represented by an ICON node on most machines | |
1235 | (provided that the loader has enough power). | |
1236 | Unfortunately, this is not universally true; on some machine, such as the IBM 370, | |
1237 | the issue of addressability rears its ugly head; | |
1238 | thus, before turning a NAME node into an ICON node, | |
1239 | the machine dependent function | |
1240 | .I andable | |
1241 | is called. | |
1242 | .PP | |
1243 | The optimization attempts of | |
1244 | .I optim | |
1245 | are currently quite limited. | |
1246 | It is primarily concerned with improving the behavior of | |
1247 | the compiler with operations one of whose arguments is a constant. | |
1248 | In the simplest case, the constant is placed on the right if the | |
1249 | operation is commutative. | |
1250 | The compiler also makes a limited search for expressions | |
1251 | such as | |
1252 | .DS | |
1253 | .I "( x + a ) + b" | |
1254 | .DE | |
1255 | where | |
1256 | .I a | |
1257 | and | |
1258 | .I b | |
1259 | are constants, and attempts to combine | |
1260 | .I a | |
1261 | and | |
1262 | .I b | |
1263 | at compile time. | |
1264 | A number of special cases are also examined; | |
1265 | additions of 0 and multiplications by 1 are removed, | |
1266 | although the correct processing of these cases to get | |
1267 | the type of the resulting tree correct is | |
1268 | decidedly nontrivial. | |
1269 | In some cases, the addition or multiplication must be replaced by | |
1270 | a conversion op to keep the types from becoming | |
1271 | fouled up. | |
1272 | Finally, in cases where a relational operation is being done, | |
1273 | and one operand is a constant, the operands are permuted, and the operator altered, if necessary, | |
1274 | to put the constant on the right. | |
1275 | Finally, multiplications by a power of 2 are changed to shifts. | |
1276 | .PP | |
1277 | There are dozens of similar optimizations that can be, and should be, | |
1278 | done. | |
1279 | It seems likely that this routine will be expanded in the relatively near future. | |
1280 | .SH | |
1281 | Machine Dependent Stuff | |
1282 | .PP | |
1283 | A number of the first pass machine dependent routines have been discussed above. | |
1284 | In general, the routines are short, and easy to adapt from | |
1285 | machine to machine. | |
1286 | The two exceptions to this general rule are | |
1287 | .I clocal | |
1288 | and | |
1289 | the function prolog and epilog generation routines, | |
1290 | .I bfcode | |
1291 | and | |
1292 | .I efcode . | |
1293 | .PP | |
1294 | .I Clocal | |
1295 | has the job of rewriting, if appropriate and desirable, | |
1296 | the nodes constructed by | |
1297 | .I buildtree . | |
1298 | There are two major areas where this | |
1299 | is important; | |
1300 | NAME nodes and conversion operations. | |
1301 | In the case of NAME nodes, | |
1302 | .I clocal | |
1303 | must rewrite the NAME node to reflect the | |
1304 | actual physical location of the name in the machine. | |
1305 | In effect, the NAME node must be examined, the symbol table | |
1306 | entry found (through the | |
1307 | .I rval | |
1308 | field of the node), | |
1309 | and, based on the storage class of the node, | |
1310 | the tree must be rewritten. | |
1311 | Automatic variables and parameters are typically | |
1312 | rewritten by treating the reference to the variable as | |
1313 | a structure reference, off the register which | |
1314 | holds the stack or argument pointer; | |
1315 | the | |
1316 | .I stref | |
1317 | routine is set up to be called in this way, and to | |
1318 | build the appropriate tree. | |
1319 | In the most general case, the tree consists | |
1320 | of a unary * node, whose descendant is | |
1321 | a + node, with the stack or argument register as left operand, | |
1322 | and a constant offset as right operand. | |
1323 | In the case of LABEL and internal static nodes, the | |
1324 | .I rval | |
1325 | field is rewritten to be the negative of the internal | |
1326 | label number; a negative | |
1327 | .I rval | |
1328 | field is taken to be an internal label number. | |
1329 | Finally, a name of class REGISTER must be converted into a REG node, | |
1330 | and the | |
1331 | .I rval | |
1332 | field replaced by the register number. | |
1333 | In fact, this part of the | |
1334 | .I clocal | |
1335 | routine is nearly machine independent; only for machines | |
1336 | with addressability problems (IBM 370 again!) does it | |
1337 | have to be noticeably different, | |
1338 | .a | |
1339 | .PP | |
1340 | The conversion operator treatment is rather tricky. | |
1341 | It is necessary to handle the application of conversion operators | |
1342 | to constants in | |
1343 | .I clocal , | |
1344 | in order that all constant expressions can have their values known | |
1345 | at compile time. | |
1346 | In extreme cases, this may mean that some simulation of the | |
1347 | arithmetic of the target machine might have to be done in a | |
1348 | cross-compiler. | |
1349 | In the most common case, | |
1350 | conversions from pointer to pointer do nothing. | |
1351 | For some machines, however, conversion from byte pointer to short or long | |
1352 | pointer might require a shift or rotate operation, which would | |
1353 | have to be generated here. | |
1354 | .PP | |
1355 | The extension of the portable compiler to machines where the size of a pointer | |
1356 | depends on its type would be straightforward, but has not yet been done. | |
1357 | .PP | |
1358 | The other major machine dependent issue involves the subroutine prolog and epilog | |
1359 | generation. | |
1360 | The hard part here is the design of the stack frame | |
1361 | and calling sequence; this design issue is discussed elsewhere. | |
1362 | .[ | |
1363 | Johnson Lesk Ritchie calling sequence | |
1364 | .] | |
1365 | The routine | |
1366 | .I bfcode | |
1367 | is called with the number of arguments | |
1368 | the function is defined with, and | |
1369 | an array containing the symbol table indices of the | |
1370 | declared parameters. | |
1371 | .I Bfcode | |
1372 | must generate the code to establish the new stack frame, | |
1373 | save the return address and previous stack pointer | |
1374 | value on the stack, and save whatever | |
1375 | registers are to be used for register variables. | |
1376 | The stack size and the number of register variables is not | |
1377 | known when | |
1378 | .I bfcode | |
1379 | is called, so these numbers must be | |
1380 | referred to by assembler constants, which are | |
1381 | defined when they are known (usually in the second pass, | |
1382 | after all register variables, automatics, and temporaries have been seen). | |
1383 | The final job is to find those parameters which may have been declared | |
1384 | register, and generate the code to initialize | |
1385 | the register with the value passed on the stack. | |
1386 | Once again, for most machines, the general logic of | |
1387 | .I bfcode | |
1388 | remains the same, but the contents of the | |
1389 | .I printf | |
1390 | calls in it will change from machine to machine. | |
1391 | .I efcode | |
1392 | is rather simpler, having just to generate the default | |
1393 | return at the end of a function. | |
1394 | This may be nontrivial in the case of a function returning a structure or union, however. | |
1395 | .PP | |
1396 | There seems to be no really good place to discuss structures and unions, but | |
1397 | this is as good a place as any. | |
1398 | The C language now supports structure assignment, | |
1399 | and the passing of structures as arguments to functions, | |
1400 | and the receiving of structures back from functions. | |
1401 | This was added rather late to C, and thus to the portable compiler. | |
1402 | Consequently, it fits in less well than the older features. | |
1403 | Moreover, most of the burden of making these features work is | |
1404 | placed on the machine dependent code. | |
1405 | .PP | |
1406 | There are both conceptual and practical problems. | |
1407 | Conceptually, the compiler is structured around | |
1408 | the idea that to compute something, you put it into | |
1409 | a register and work on it. | |
1410 | This notion causes a bit of trouble on some machines (e.g., machines with 3-address opcodes), but | |
1411 | matches many machines quite well. | |
1412 | Unfortunately, this notion breaks down with structures. | |
1413 | The closest that one can come is to keep the addresses of the | |
1414 | structures in registers. | |
1415 | The actual code sequences used to move structures vary from the | |
1416 | trivial (a multiple byte move) to the horrible (a | |
1417 | function call), and are very machine dependent. | |
1418 | .PP | |
1419 | The practical problem is more painful. | |
1420 | When a function returning a structure is called, this function | |
1421 | has to have some place to put the structure value. | |
1422 | If it places it on the stack, it has difficulty popping its stack frame. | |
1423 | If it places the value in a static temporary, the routine fails to be | |
1424 | reentrant. | |
1425 | The most logically consistent way of implementing this is for the | |
1426 | caller to pass in a pointer to a spot where the called function | |
1427 | should put the value before returning. | |
1428 | This is relatively straightforward, although a bit tedious, to implement, | |
1429 | but means that the caller must have properly declared | |
1430 | the function type, even if the value is never used. | |
1431 | On some machines, such as the Interdata 8/32, the return value | |
1432 | simply overlays the argument region (which on the 8/32 is part | |
1433 | of the caller's stack frame). | |
1434 | The caller takes care of leaving enough room if the returned value is larger | |
1435 | than the arguments. | |
1436 | This also assumes that the caller know and declares the | |
1437 | function properly. | |
1438 | .PP | |
1439 | The PDP-11 and the VAX have stack hardware which is used in function calls and returns; | |
1440 | this makes it very inconvenient to | |
1441 | use either of the above mechanisms. | |
1442 | In these machines, a static area within the called functionis allocated, and | |
1443 | the function return value is copied into it on return; the function | |
1444 | returns the address of that region. | |
1445 | This is simple to implement, but is non-reentrant. | |
1446 | However, the function can now be called as a subroutine | |
1447 | without being properly declared, without the disaster which would otherwise ensue. | |
1448 | No matter what choice is taken, the convention is that the function | |
1449 | actually returns the address of the return structure value. | |
1450 | .PP | |
1451 | In building expression trees, the portable compiler takes a bit for granted about | |
1452 | structures. | |
1453 | It assumes that functions returning structures | |
1454 | actually return a pointer to the structure, and it assumes that | |
1455 | a reference to a structure is actually a reference to its address. | |
1456 | The structure assignment operator is rebuilt so that the left | |
1457 | operand is the structure being assigned to, but the | |
1458 | right operand is the address of the structure being assigned; | |
1459 | this makes it easier to deal with | |
1460 | .DS | |
1461 | .I "a = b = c" | |
1462 | .DE | |
1463 | and similar constructions. | |
1464 | .PP | |
1465 | There are four special tree nodes associated with these | |
1466 | operations: | |
1467 | STASG (structure assignment), STARG (structure argument | |
1468 | to a function call), and STCALL and UNARY STCALL | |
1469 | (calls of a function with nonzero and zero arguments, respectively). | |
1470 | These four nodes are unique in that the size and alignment information, which can be determined by | |
1471 | the type for all other objects in C, | |
1472 | must be known to carry out these operations; special | |
1473 | fields are set aside in these nodes to contain | |
1474 | this information, and special | |
1475 | intermediate code is used to transmit this information. | |
1476 | .SH | |
1477 | First Pass Summary | |
1478 | .PP | |
1479 | There are may other issues which have been ignored here, | |
1480 | partly to justify the title ``tour'', and partially | |
1481 | because they have seemed to cause little trouble. | |
1482 | There are some debugging flags | |
1483 | which may be turned on, by giving the compiler's first pass | |
1484 | the argument | |
1485 | .DS | |
1486 | \-X[flags] | |
1487 | .DE | |
1488 | Some of the more interesting flags are | |
1489 | \-Xd for the defining and freeing of symbols, | |
1490 | \-Xi for initialization comments, and | |
1491 | \-Xb for various comments about the building of trees. | |
1492 | In many cases, repeating the flag more than once gives more information; | |
1493 | thus, | |
1494 | \-Xddd gives more information than \-Xd. | |
1495 | In the two pass version of the compiler, the | |
1496 | flags should not be set when the output is sent to the second | |
1497 | pass, since the debugging output and the intermediate code both go onto the standard | |
1498 | output. | |
1499 | .PP | |
1500 | We turn now to consideration of the second pass. |