.SH
10: Advanced Topics
.PP
This section discusses a number of advanced features
of Yacc.
.SH
Simulating Error and Accept in Actions
.PP
The parsing actions of error and accept can be simulated
in an action by use of macros YYACCEPT and YYERROR.
YYACCEPT causes
.I yyparse
to return the value 0;
YYERROR causes
the parser to behave as if the current input symbol
had been a syntax error;
.I yyerror
is called, and error recovery takes place.
These mechanisms can be used to simulate parsers
with multiple endmarkers or context-sensitive syntax checking.
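.PP
The effect of the two macros can be pictured with a small hand-written
loop in C.
This is only a sketch under stated assumptions, not the code Yacc generates:
the names toy_parse, toy_error, TOY_ACCEPT, TOY_ERROR and the token codes
are all hypothetical, and the sketch gives up after reporting an error
instead of performing error recovery.
.DS
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes for this sketch; not part of Yacc. */
enum { TOY_NUM = 1, TOY_END_A, TOY_END_B };

static int toy_error_calls = 0;          /* counts calls to toy_error */

/* Stand-in for yyerror: only records that an error was reported. */
static void toy_error(const char *msg) { (void)msg; toy_error_calls++; }

/* Stand-ins for YYACCEPT and YYERROR inside the toy parser below. */
#define TOY_ACCEPT return 0
#define TOY_ERROR  goto syntax_error

/*
 * Accept a run of TOY_NUM tokens closed by either endmarker.
 * values[i] is the value of tokens[i]; a value above limit is
 * rejected by the action even though the token itself is legal.
 */
int toy_parse(const int *tokens, const int *values, size_t n, int limit)
{
    size_t i;

    assert(tokens != NULL && values != NULL);
    for (i = 0; i < n; i++) {
        switch (tokens[i]) {
        case TOY_END_A:
        case TOY_END_B:
            TOY_ACCEPT;                  /* either endmarker ends the parse */
        case TOY_NUM:
            if (values[i] > limit)
                TOY_ERROR;               /* context-sensitive check fails */
            break;
        default:
            goto syntax_error;           /* ordinary syntax error */
        }
    }
syntax_error:                            /* also reached if input runs out */
    toy_error("syntax error");
    return 1;                            /* no recovery in this sketch */
}
```
.DE
Both endmarkers lead to a return value of 0, and an out-of-range value is
reported through the same path as an ordinary syntax error.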
.SH
Accessing Values in Enclosing Rules
.PP
An action may refer to values
returned by actions to the left of the current rule.
The mechanism is simply the same as with ordinary actions,
a dollar sign followed by a digit, but in this case the
digit may be 0 or negative.
Consider
.DS
sent : adj noun verb adj noun
	{ \fIlook at the sentence\fR . . . }
	;

adj : THE { $$ = THE; }
	| YOUNG { $$ = YOUNG; }
	. . .
	;

noun : DOG
	{ $$ = DOG; }
	| CRONE
	{ if( $0 == YOUNG ){
		printf( "what?\en" );
		}
	$$ = CRONE;
	}
	;
	. . .
.DE
In the action following the word CRONE, a check is made that the
preceding token shifted was not YOUNG.
Obviously, this is only possible when a great deal is known about
what might precede the symbol
.I noun
in the input.
There is also a distinctly unstructured flavor about this.
Nevertheless, at times this mechanism will save a great
deal of trouble, especially when a few combinations are to
be excluded from an otherwise regular structure.
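.PP
One way to picture $0 is as an index into the value stack that lands one
slot below the values of the rule being reduced.
The following C sketch assumes the value stack is a plain array that grows
upward; the helper names dollar and crone_follows_young and the token
codes are hypothetical, chosen only for illustration.
.DS
```c
#include <assert.h>

/* Hypothetical token codes standing in for the grammar's tokens. */
enum { THE = 257, YOUNG, DOG, CRONE };

/*
 * Value of $i for a rule whose right-hand side has rhs_len symbols,
 * given a value stack vs whose topmost entry is vs[top].
 * $1..$rhs_len are the rule's own symbols; $0, $-1, ... lie below them.
 */
int dollar(const int *vs, int top, int rhs_len, int i)
{
    int slot = top - rhs_len + i;

    assert(i <= rhs_len && slot >= 0);   /* stay inside the stack */
    return vs[slot];
}

/* Mirrors the CRONE action: returns 1 if the left neighbor was YOUNG. */
int crone_follows_young(const int *vs, int top)
{
    return dollar(vs, top, 1, 0) == YOUNG;   /* rule "noun : CRONE", $0 */
}
```
.DE
With the stack holding YOUNG and then CRONE, the check succeeds; with THE
in the lower slot it does not.
The assertion in dollar is a reminder that nothing guarantees a meaningful
value below the current rule unless the surrounding grammar is known.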
.SH
Support for Arbitrary Value Types
.PP
By default, the values returned by actions and the lexical analyzer are integers.
Yacc can also support
values of other types, including structures.
In addition, Yacc keeps track of the types, and inserts
appropriate union member names so that the resulting parser will
be strictly type checked.
The Yacc value stack (see Section 4)
is declared to be a
.I union
of the various types of values desired.
The user declares the union, and associates union member names
with each token and nonterminal symbol having a value.
When the value is referenced through a $$ or $n construction,
Yacc will automatically insert the appropriate union member name, so that
no unwanted conversions will take place.
In addition, type checking commands such as
.I Lint\|
.[
Johnson Lint Checker 1273
.]
will be far more silent.
.PP
There are three mechanisms used to provide for this typing.
First, there is a way of defining the union; this must be
done by the user since other programs, notably the lexical analyzer,
must know about the union member names.
Second, there is a way of associating a union member name with tokens
and nonterminals.
Finally, there is a mechanism for describing the type of those
few values where Yacc cannot easily determine the type.
.PP
To declare the union, the user includes in the declarations section:
.DS
%union {
	body of union ...
	}
.DE
This declares the Yacc value stack,
and the external variables
.I yylval
and
.I yyval ,
to have type equal to this union.
If Yacc was invoked with the
.B \-d
option, the union declaration
is copied onto the
.I y.tab.h
file.
Alternatively,
the union may be declared in a header file, and a typedef
used to define the type name YYSTYPE to represent
this union.
Thus, the header file might also have said:
.DS
typedef union {
	body of union ...
	} YYSTYPE;
.DE
The header file must be included in the declarations
section, by use of %{ and %}.
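.PP
As a minimal sketch of the header-file form, the C fragment below fills in
a union body and shows a lexical analyzer storing values through
.I yylval .
The member names intval and dval, the token codes, and the two toy_lex
functions are assumptions made for illustration; a real grammar chooses
its own, and the token codes would normally come from Yacc.
.DS
```c
#include <assert.h>

/* Hypothetical member names; a real grammar picks its own. */
typedef union {
    int    intval;     /* e.g. value of a NUMBER token */
    double dval;       /* e.g. value of a REAL token   */
} YYSTYPE;

YYSTYPE yylval;        /* set by the lexical analyzer  */

enum { NUMBER = 257, REAL };   /* hypothetical token codes */

/*
 * Toy lexical analyzer steps: store a value in the member that
 * matches the token kind and return the token code.
 */
int toy_lex_number(int n)
{
    yylval.intval = n;
    return NUMBER;
}

int toy_lex_real(double d)
{
    yylval.dval = d;
    return REAL;
}
```
.DE
The lexical analyzer must name the member explicitly, which is why the
union has to be visible to it.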
.PP
Once YYSTYPE is defined,
the union member names must be associated
with the various terminal and nonterminal names.
The construction
.DS
< name >
.DE
is used to indicate a union member name.
If this follows
one of the
keywords %token,
%left, %right, and %nonassoc,
the union member name is associated with the tokens listed.
Thus, saying
.DS
%left <optype> \'+\' \'\-\'
.DE
will cause any reference to values returned by these two tokens to be
tagged with
the union member name
.I optype .
Another keyword, %type, is
used similarly to associate
union member names with nonterminals.
Thus, one might say
.DS
%type <nodetype> expr stat
.DE
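.PP
The effect of the tags can be sketched in C by writing out the member
names by hand, roughly as they would appear once inserted into an action.
The following assumes a union with members optype and nodetype as in the
declarations above; struct node, mknode, reduce_plus and OP_PLUS are
hypothetical names, and the array rhs stands in for the value stack
entries of one rule.
.DS
```c
#include <assert.h>
#include <stdlib.h>

struct node {                 /* hypothetical parse-tree node */
    int op;
    struct node *left, *right;
};

typedef union {
    int          optype;      /* tag used for the operator tokens */
    struct node *nodetype;    /* tag used for expr and stat       */
} YYSTYPE;

enum { OP_PLUS = 1 };         /* hypothetical operator code */

/* Allocate a node; returns NULL if malloc fails. */
struct node *mknode(int op, struct node *l, struct node *r)
{
    struct node *p = malloc(sizeof *p);

    if (p != NULL) {
        p->op = op;
        p->left = l;
        p->right = r;
    }
    return p;
}

/*
 * Roughly what "$$ = mknode( $2, $1, $3 );" amounts to for a rule
 * "expr : expr '+' expr" once the member names are filled in.
 * rhs[0..2] hold the values of $1..$3.
 */
YYSTYPE reduce_plus(const YYSTYPE *rhs)
{
    YYSTYPE yyval;

    yyval.nodetype = mknode(rhs[1].optype, rhs[0].nodetype, rhs[2].nodetype);
    return yyval;
}
```
.DE
Each reference picks exactly one member, so an operator code is never
read as a pointer or the reverse.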
.PP
There remain a couple of cases where these mechanisms are insufficient.
If there is an action within a rule, the value returned
by this action has no
.I "a priori"
type.
Similarly, reference to left context values (such as $0 \- see the
previous subsection) leaves Yacc with no easy way of knowing the type.
In this case, a type can be imposed on the reference by inserting
a union member name, between < and >, immediately after
the first $.
An example of this usage is
.DS
rule : aaa { $<intval>$ = 3; } bbb
	{ fun( $<intval>2, $<other>0 ); }
	;
.DE
This syntax has little to recommend it, but the situation arises rarely.
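.PP
The example rule can be traced by hand in C.
The sketch below assumes an upward-growing array for the value stack and
treats the embedded action as occupying a slot of its own, so that the
rule has three right-hand-side positions.
The type given to the member other, the variable last_sum, and the
function names final_action and embedded_action are hypothetical.
.DS
```c
#include <assert.h>

typedef union {
    int  intval;
    long other;       /* hypothetical type for the member "other" */
} YYSTYPE;

static long last_sum;  /* lets a caller observe what fun received */

/* Hypothetical stand-in for the fun() called in the action. */
void fun(int a, long b) { last_sum = a + b; }

/*
 * vs[top] is the value of bbb.  The rule has three right-hand-side
 * slots (aaa, the embedded action, bbb), so $2 is vs[top - 1] and
 * $0 is vs[top - 3], the value just left of the rule.
 */
void final_action(const YYSTYPE *vs, int top)
{
    assert(top >= 3);
    fun(vs[top - 1].intval, vs[top - 3].other);   /* $<intval>2, $<other>0 */
}

/* The embedded action "$<intval>$ = 3;" as a plain assignment. */
void embedded_action(YYSTYPE *yyval)
{
    yyval->intval = 3;
}
```
.DE
Since neither the embedded action nor the left context has a declared
type, the member must be spelled out at each reference, which is the job
of the explicit tag.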
.PP
A sample specification is given in Appendix C.
The facilities in this subsection are not triggered until they are used:
in particular, the use of %type will turn on these mechanisms.
When they are used, there is a fairly strict level of checking.
For example, use of $n or $$ to refer to something with no defined type
is diagnosed.
If these facilities are not triggered, the Yacc value stack is used to
hold
.I int' s,
as was true historically.