Commit | Line | Data |
---|---|---|
c9528a00 C |
1 | .SH |
2 | 9: Hints for Preparing Specifications | |
3 | .PP | |
4 | This section contains miscellaneous hints on preparing efficient, easy to change, | |
5 | and clear specifications. | |
6 | The individual subsections are more or less | |
7 | independent. | |
8 | .SH | |
9 | Input Style | |
10 | .PP | |
11 | It is difficult to | |
12 | provide rules with substantial actions | |
13 | and still have a readable specification file. | |
14 | The following style hints owe much to Brian Kernighan. | |
15 | .IP a. | |
16 | Use all capital letters for token names, all lower case letters for | |
17 | nonterminal names. | |
18 | This rule comes under the heading of ``knowing who to blame when | |
19 | things go wrong.'' | |
20 | .IP b. | |
21 | Put grammar rules and actions on separate lines. | |
22 | This allows either to be changed without | |
23 | an automatic need to change the other. | |
24 | .IP c. | |
25 | Put all rules with the same left hand side together. | |
26 | Put the left hand side in only once, and let all | |
27 | following rules begin with a vertical bar. | |
28 | .IP d. | |
29 | Put a semicolon only after the last rule with a given left hand side, | |
30 | and put the semicolon on a separate line. | |
31 | This allows new rules to be easily added. | |
32 | .IP e. | |
33 | Indent rule bodies by two tab stops, and action bodies by three | |
34 | tab stops. | |
35 | .PP | |
36 | The example in Appendix A is written following this style, as are | |
37 | the examples in the text of this paper (where space permits). | |
38 | The user must make up his own mind about these stylistic questions; | |
39 | the central problem, however, is to make the rules visible through | |
40 | the morass of action code. | |
41 | .SH | |
42 | Left Recursion | |
43 | .PP | |
44 | The algorithm used by the Yacc parser encourages so called ``left recursive'' | |
45 | grammar rules: rules of the form | |
46 | .DS | |
47 | name : name rest_of_rule ; | |
48 | .DE | |
49 | These rules frequently arise when | |
50 | writing specifications of sequences and lists: | |
51 | .DS | |
52 | list : item | |
53 | | list \',\' item | |
54 | ; | |
55 | .DE | |
56 | and | |
57 | .DS | |
58 | seq : item | |
59 | | seq item | |
60 | ; | |
61 | .DE | |
62 | In each of these cases, the first rule | |
63 | will be reduced for the first item only, and the second rule | |
64 | will be reduced for the second and all succeeding items. | |
65 | .PP | |
66 | With right recursive rules, such as | |
67 | .DS | |
68 | seq : item | |
69 | | item seq | |
70 | ; | |
71 | .DE | |
72 | the parser would be a bit bigger, and the items would be seen, and reduced, | |
73 | from right to left. | |
74 | More seriously, an internal stack in the parser | |
75 | would be in danger of overflowing if a very long sequence were read. | |
76 | Thus, the user should use left recursion wherever reasonable. | |
77 | .PP | |
78 | It is worth considering whether a sequence with zero | |
79 | elements has any meaning, and if so, consider writing | |
80 | the sequence specification with an empty rule: | |
81 | .DS | |
82 | seq : /* empty */ | |
83 | | seq item | |
84 | ; | |
85 | .DE | |
86 | Once again, the first rule would always be reduced exactly once, before the | |
87 | first item was read, | |
88 | and then the second rule would be reduced once for each item read. | |
89 | Permitting empty sequences | |
90 | often leads to increased generality. | |
91 | However, conflicts might arise if Yacc is asked to decide | |
92 | which empty sequence it has seen, when it hasn't seen enough to | |
93 | know! | |
94 | .SH | |
95 | Lexical Tie-ins | |
96 | .PP | |
97 | Some lexical decisions depend on context. | |
98 | For example, the lexical analyzer might want to | |
99 | delete blanks normally, but not within quoted strings. | |
100 | Or names might be entered into a symbol table in declarations, | |
101 | but not in expressions. | |
102 | .PP | |
103 | One way of handling this situation is | |
104 | to create a global flag that is | |
105 | examined by the lexical analyzer, and set by actions. | |
106 | For example, suppose a program | |
107 | consists of 0 or more declarations, followed by 0 or more statements. | |
108 | Consider: | |
109 | .DS | |
110 | %{ | |
111 | int dflag; | |
112 | %} | |
113 | ... other declarations ... | |
114 | ||
115 | %% | |
116 | ||
117 | prog : decls stats | |
118 | ; | |
119 | ||
120 | decls : /* empty */ | |
121 | { dflag = 1; } | |
122 | | decls declaration | |
123 | ; | |
124 | ||
125 | stats : /* empty */ | |
126 | { dflag = 0; } | |
127 | | stats statement | |
128 | ; | |
129 | ||
130 | ... other rules ... | |
131 | .DE | |
132 | The flag | |
133 | .I dflag | |
134 | is now 0 when reading statements, and 1 when reading declarations, | |
135 | .ul | |
136 | except for the first token in the first statement. | |
137 | This token must be seen by the parser before it can tell that | |
138 | the declaration section has ended and the statements have | |
139 | begun. | |
140 | In many cases, this single token exception does not | |
141 | affect the lexical scan. | |
142 | .PP | |
143 | This kind of ``backdoor'' approach can be elaborated | |
144 | to a noxious degree. | |
145 | Nevertheless, it represents a way of doing some things | |
146 | that are difficult, if not impossible, to | |
147 | do otherwise. | |
148 | .SH | |
149 | Reserved Words | |
150 | .PP | |
151 | Some programming languages | |
152 | permit the user to | |
153 | use words like ``if'', which are normally reserved, | |
154 | as label or variable names, provided that such use does not | |
155 | conflict with the legal use of these names in the programming language. | |
156 | This is extremely hard to do in the framework of Yacc; | |
157 | it is difficult to pass information to the lexical analyzer | |
158 | telling it ``this instance of `if' is a keyword, and that instance is a variable''. | |
159 | The user can make a stab at it, using the | |
160 | mechanism described in the last subsection, | |
161 | but it is difficult. | |
162 | .PP | |
163 | A number of ways of making this easier are under advisement. | |
164 | Until then, it is better that the keywords be | |
165 | .I reserved \|; | |
166 | that is, be forbidden for use as variable names. | |
167 | There are powerful stylistic reasons for preferring this, anyway. |