Commit | Line | Data |
---|---|---|
4ee66e30 RH |
1 | 5/18/78 |
2 | A new version of Yacc has been installed which contains some new | |
3 | features relating to error recovery, detection of funny conditions in the | |
4 | grammar, and strong typing. Existing grammars should continue to work, | |
5 | with the possible exception of somewhat better error recovery behavior. | |
6 | More details follow: | |
7 | ||
8 | *** Ratfor and EFL Yacc are dead. Long live C! | |
9 | ||
10 | *** The y.tab.c file now uses the # line feature to reflect | |
11 | most error conditions in actions, etc., back to the yacc source | |
12 | file, rather than the y.tab.c file. As always with such features, | |
13 | lookahead may cause the line number to be one too large | |
14 | occasionally. | |
15 | ||
16 | *** The error recovery algorithm has been changed to cause the | |
17 | parser never to reduce on a state where there is a shift | |
18 | on the special token `error'. This has the effect of causing | |
19 | the error recovery action to take place somewhat closer to the | |
20 | location of the error than previously. It does not affect the | |
21 | behavior of the parser in the absence of errors. The parse | |
22 | tables may be 1-2% larger as a result of this change. | |
23 | ||
24 | *** Yacc now detects the existence of nonterminals in the grammar | |
25 | which can never derive any strings of tokens (even the empty string). | |
26 | The simplest example is the grammar: | |
27 | %% | |
28 | s : s 'a' ; | |
29 | Here, one must reduce `s' in order to reduce `s': the | |
30 | parser would always report an error. If such nonterminals are | |
31 | present, Yacc reports all of them, then terminates. | |
32 | ||
33 | *** There is a new reserved word, %start. When used in the declarations | |
34 | section, it may be used to declare the start symbol of the grammar. | |
35 | If %start does not appear, the start symbol is, as at present, the | |
36 | first nonterminal symbol encountered. | |
37 | ||
38 | *** Yacc-produced parsers are notorious for producing a great many | |
39 | comments from lint. The problem is the value stack of the | |
40 | parser, which typically may contain integers, pointers, and | |
41 | possibly even floating point, etc., values. The lack | |
42 | of tight specification of this stack leads to potential | |
43 | nonportability, and considerable loss of the diagnostic power | |
44 | of lint. Thus, some new features have been added which make use | |
45 | of the new structure and union facilities of C. In effect, | |
46 | the user of Yacc may `honestly' declare the value stack, as | |
47 | well as the lexical interface variable, yylval, to be unions | |
48 | of all the types desired. Yacc will keep track of the types | |
49 | declared for all terminals and nonterminals, and automatically | |
50 | insert the appropriate union tag for all constructions such | |
51 | as $1, $$, etc. It is up to the user to supply the appropriate | |
52 | union declaration, and to declare the type of all the terminal | |
53 | and nonterminal symbols which will have values. If the type | |
54 | declaration feature is used at all, it must be used correctly; | |
55 | if it is not used, the default values are integers, as at present. | |
56 | The new type declaration features are described below: | |
57 | ||
58 | *** There is a new keyword, %union. A construction such as | |
59 | %union { | |
60 | int inttag; | |
61 | float floattag; | |
62 | struct mumble *ptrtag; | |
63 | } | |
64 | can be used, in the declarations section, to declare | |
65 | the type of the yacc stack. The declaration is | |
66 | effectively copied to the y.tab.c file, and, if the -d | |
67 | option is present, to the y.tab.h file as well. The | |
68 | declaration is used to declare the typedef YYSTYPE, which is the | |
69 | type of the value stack. If the -d option is present, | |
70 | the declaration | |
71 | extern YYSTYPE yylval; | |
72 | is also placed onto the y.tab.h file. Note that the lexical | |
73 | analyzer must be changed to use the appropriate union tag when | |
74 | assigning values. It is not necessary that the %union | |
75 | mechanism be used, as long as there is a union type YYSTYPE | |
76 | defined in the declarations section. | |
77 | ||
78 | *** The %token, %left, %right, and %nonassoc declarations now | |
79 | accept a union tag, enclosed in angle brackets (<...>), immediately | |
80 | after the keyword. All tokens mentioned in that declaration are | |
81 | taken to have the appropriate type. | |
82 | ||
83 | *** There is a new keyword, %type, also followed by a union tag | |
84 | in angle brackets, which may be used in the declarations section to | |
85 | declare nonterminal symbols to have a particular type. | |
86 | ||
87 | In both cases, whenever a $$ or $n is encountered in an action, | |
88 | the appropriate union tag is supplied by Yacc. Once any type is | |
89 | declared, it is an error to use a $$ or $n whose type is unknown. | |
90 | It is also illegal for a grammar rule whose LHS has a type to have | |
91 | no action when the default action { $$ = $1; } would be | |
92 | inapplicable because $1 has a different type. | |
93 | ||
94 | *** Occasionally, the type of something is | |
95 | not known (for example, when an action within a rule returns a | |
96 | value). In this case, the $$ and $n syntax is extended | |
97 | to permit the declaration of the type: the syntax is | |
98 | $<tag>$ | |
99 | and | |
100 | $<tag>n | |
101 | respectively. This rather strange syntax is necessitated by the | |
102 | need to distinguish the <> surrounding the tag from the < and > | |
103 | operators of C in the action. It is anticipated that the usage | |
104 | will be rare. | |
105 | ||
106 | *** As always, report gripes, bugs, suggestions to SCJ *** | |
107 | ||
108 | 12/01/76 | |
109 | A newer version of Yacc has been installed which copies the actions directly | |
110 | into the parser, rather than gathering them into a separate routine. | |
111 | The advantages include | |
112 | 1. It's faster | |
113 | 2. You can return a value from yyparse (and stop parsing...) by | |
114 | saying `return(x);' in an action | |
115 | 3. There are macros which simulate various interesting parsing | |
116 | actions: | |
117 | YYERROR causes the parser to behave as if a syntax | |
118 | error had been encountered (i.e., do error recovery) | |
119 | YYACCEPT causes a return from yyparse with a value of 0 | |
120 | YYABORT causes a return from yyparse with a value of 1 | |
121 | ||
122 | The repositioning of the actions may cause scope problems | |
123 | for some people who include lexical analyzers in funny places. | |
124 | This can probably be avoided by using another | |
125 | new feature: the `-d' option. | |
126 | Invoking Yacc with the -d option causes the #defines | |
127 | generated by Yacc to be written out onto a file | |
128 | called "y.tab.h". This can then be included as desired | |
129 | in lexical analyzers, etc. | |
130 | ||
131 | 11/28/76 | |
132 | A new version of Yacc has been installed which permits actions within | |
133 | rules. For such actions, $$ and $1, $2, etc. continue to have their | |
134 | usual meanings. An error is reported if any $n refers to | |
135 | a value lying to the right of the action in the rule. | |
136 | ||
137 | These internal actions are assumed to return a value, which is accessed | |
138 | through the $n mechanism. | |
139 | ||
140 | In the y.output file, the actions are referred to by created nonterminal | |
141 | names of the form $$nnn. | |
142 | ||
143 | All actions within rules are assumed to be distinct. If some actions | |
144 | are the same, Yacc might report reduce/reduce conflicts which could | |
145 | be resolved by explicitly identifying identical actions; does anyone | |
146 | have a good idea for a syntax to do this? | |
147 | ||
148 | In the new Yacc, the = sign may now be omitted in action constructions | |
149 | of the form ={ ... }. |
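
As a rough illustration of the typed value stack described in the 5/18/78 entry, the following C sketch writes out the %union example by hand. The helper names `lex_integer`, `add_action`, and `demo`, and the body of `struct mumble`, are hypothetical and not part of Yacc. The member selections only approximate what Yacc inserts for `$$` and `$n` when the symbols are declared `<inttag>`.

```c
#include <assert.h>

/* Hypothetical payload type; the notes only name `struct mumble`. */
struct mumble { int id; };

/* The union from the %union example, written out by hand as YYSTYPE. */
typedef union {
    int inttag;
    float floattag;
    struct mumble *ptrtag;
} YYSTYPE;

/* The lexical interface variable, declared with the union type. */
YYSTYPE yylval;

/* Hypothetical lexer step: an integer token value is stored
   through the matching union tag, as the notes require. */
void lex_integer(int v)
{
    yylval.inttag = v;
}

/* Hypothetical action for a rule such as  expr : expr '+' expr
   with every symbol declared <inttag>. Writing $$ = $1 + $3 in the
   grammar amounts to selecting the inttag member on each value. */
YYSTYPE add_action(YYSTYPE lhs, YYSTYPE rhs)
{
    YYSTYPE result;
    result.inttag = lhs.inttag + rhs.inttag;
    return result;
}

/* Small self-check tying the pieces together; returns the sum. */
int demo(void)
{
    YYSTYPE a, b, sum;

    lex_integer(2);
    a = yylval;
    lex_integer(3);
    b = yylval;
    sum = add_action(a, b);
    assert(sum.inttag == 5);
    return sum.inttag;
}
```

Calling `demo()` runs the lexer stand-in and the action once. In a real grammar the member selection is generated by Yacc from the `<tag>` declarations, so none of these helpers would be written by hand.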