BSD 4_3_Reno release
.\" @(#)ssA 6.1 (Berkeley) 5/8/86
.\"
.SH
10: Advanced Topics
.PP
This section discusses a number of advanced features
of Yacc.
.SH
Simulating Error and Accept in Actions
.PP
The parsing actions of error and accept can be simulated
in an action by use of the macros YYACCEPT and YYERROR.
YYACCEPT causes
.I yyparse
to return the value 0;
YYERROR causes
the parser to behave as if the current input symbol
had been a syntax error;
.I yyerror
is called, and error recovery takes place.
These mechanisms can be used to simulate parsers
with multiple endmarkers or context-sensitive syntax checking.
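.PP
As a rough illustration of the effect of these two macros, the following
hand-written C fragment mimics a parser with two endmarkers and one
context-sensitive check.
It is a sketch only, not Yacc output: the token names, the limit MAX_ITEMS,
the function
.I toy_parse ,
and the macro bodies are all assumptions made for this example, and the
error path merely reports and returns instead of attempting recovery.
```c
#include <stdio.h>

/* Hypothetical token codes, invented for this sketch. */
enum { ITEM = 257, END_A, END_B };

#define MAX_ITEMS 3        /* assumed context-sensitive limit */

static int error_calls;    /* how many times yyerror was called */

static void yyerror(const char *msg)
{
    error_calls++;
    fprintf(stderr, "%s\n", msg);
}

/* Stand-ins for the real macros: accept returns 0 from the parser,
 * error transfers control to the error-handling code. */
#define YYACCEPT return 0
#define YYERROR  goto errlab

/* Toy "parser": reads token codes until an action accepts or fails. */
int toy_parse(const int *tokens)
{
    int items = 0;

    for (;;) {
        switch (*tokens++) {
        case ITEM:
            /* context-sensitive check made inside an action */
            if (++items > MAX_ITEMS)
                YYERROR;
            break;
        case END_A:
        case END_B:
            /* either endmarker ends the parse successfully */
            YYACCEPT;
        default:
            YYERROR;
        }
    }

errlab:
    yyerror("syntax error");
    return 1;              /* a real parser would attempt recovery here */
}
```
.PP
Under these assumptions, the token sequence ITEM ITEM END_A and the
sequence consisting of END_B alone both return 0, while a sequence
with four ITEMs calls
.I yyerror
once and returns 1.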
.SH
Accessing Values in Enclosing Rules
.PP
An action may refer to values
returned by actions to the left of the current rule.
The mechanism is the same as with ordinary actions,
a dollar sign followed by a digit, but in this case the
digit may be 0 or negative.
Consider
.DS
sent    :       adj  noun  verb  adj  noun
                        { \fIlook at the sentence\fR . . . }
        ;

adj     :       THE             { $$ = THE; }
        |       YOUNG           { $$ = YOUNG; }
        . . .
        ;

noun    :       DOG
                        { $$ = DOG; }
        |       CRONE
                        { if( $0 == YOUNG ){
                                printf( "what?\en" );
                                }
                          $$ = CRONE;
                        }
        ;
        . . .
.DE
In the action following the word CRONE, a check is made that the
preceding token shifted was not YOUNG.
Obviously, this is only possible when a great deal is known about
what might precede the symbol
.I noun
in the input.
There is also a distinctly unstructured flavor about this.
Nevertheless, at times this mechanism will save a great
deal of trouble, especially when a few combinations are to
be excluded from an otherwise regular structure.
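.PP
The reach of $0 below the current rule can be pictured with a toy value stack.
The C sketch below is an assumption-laden model, not the generated parser:
it keeps values only (a real Yacc parser also keeps a parallel state stack),
and the token numbers and the helper names
.I push ,
.I dollar ,
.I reduce ,
and
.I parse_adj_crone
are invented for this example.
```c
#include <stdio.h>

/* Hypothetical token numbers for this sketch. */
enum { THE = 257, YOUNG, DOG, CRONE };

static int vs[16];      /* toy value stack */
static int top = -1;    /* index of the topmost value */

static void push(int v) { vs[++top] = v; }

/* Value of $i while reducing a rule whose right side has n symbols.
 * i = 1..n are the rule's own components; i = 0 or a negative i
 * reaches values left on the stack by the enclosing rule. */
static int dollar(int i, int n) { return vs[top - n + i]; }

/* Replace the n values of the right side by the result $$. */
static void reduce(int n, int result) { top -= n; push(result); }

/* Model of the input "<adjective> CRONE".
 * Returns 1 if the CRONE action complained, 0 otherwise. */
int parse_adj_crone(int adjective)
{
    int complained = 0;

    top = -1;
    push(adjective);            /* shift THE or YOUNG               */
    reduce(1, adjective);       /* adj : THE | YOUNG  { $$ = ...; } */
    push(CRONE);                /* shift CRONE                      */

    /* noun : CRONE  { if( $0 == YOUNG ) ... ; $$ = CRONE; } */
    if (dollar(0, 1) == YOUNG) {
        printf("what?\n");
        complained = 1;
    }
    reduce(1, CRONE);
    return complained;
}
```
.PP
In this model $0 is simply the slot just below the first component of the
rule being reduced, which here holds the value returned by the preceding
.I adj .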
.SH
Support for Arbitrary Value Types
.PP
By default, the values returned by actions and the lexical analyzer are integers.
Yacc can also support
values of other types, including structures.
In addition, Yacc keeps track of the types, and inserts
appropriate union member names so that the resulting parser will
be strictly type checked.
The Yacc value stack (see Section 4)
is declared to be a
.I union
of the various types of values desired.
The user declares the union, and associates union member names
with each token and nonterminal symbol having a value.
When the value is referenced through a $$ or $n construction,
Yacc will automatically insert the appropriate union name, so that
no unwanted conversions will take place.
In addition, type checking commands such as
.I Lint\|
.[
Johnson Lint Checker 1273
.]
will be far more silent.
.PP
There are three mechanisms used to provide for this typing.
First, there is a way of defining the union; this must be
done by the user since other programs, notably the lexical analyzer,
must know about the union member names.
Second, there is a way of associating a union member name with tokens
and nonterminals.
Finally, there is a mechanism for describing the type of those
few values where Yacc cannot easily determine the type.
.PP
To declare the union, the user includes in the declarations section:
.DS
%union {
        body of union ...
        }
.DE
This declares the Yacc value stack,
and the external variables
.I yylval
and
.I yyval ,
to have type equal to this union.
If Yacc was invoked with the
.B \-d
option, the union declaration
is copied into the
.I y.tab.h
file.
Alternatively,
the union may be declared in a header file, and a typedef
used to define the type name YYSTYPE to represent
this union.
Thus, the header file might also have said:
.DS
typedef union {
        body of union ...
        } YYSTYPE;
.DE
The header file must be included in the declarations
section, by use of %{ and %}.
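.PP
To make concrete why the lexical analyzer must know the union member names,
here is a minimal C sketch of such a header together with a toy analyzer.
The members
.I intval
and
.I opval ,
the token numbers, and the function
.I toy_yylex
(which reads from a string instead of standard input) are assumptions for
this example; the source leaves the body of the union unspecified.
```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical union body; both members are invented here. */
typedef union {
    int  intval;    /* value carried by a NUMBER token   */
    char opval;     /* value carried by an operator token */
} YYSTYPE;

/* Assumed token numbers; Yacc would normally assign these. */
enum { NUMBER = 257, OP = 258 };

/* Normally defined by the generated parser; defined here so the
 * sketch stands alone. */
YYSTYPE yylval;

/* Toy lexical analyzer: scans one token from *pp, advances *pp,
 * stores the token's value in the matching member of yylval, and
 * returns the token number (0 at end of input). */
int toy_yylex(const char **pp)
{
    const char *p = *pp;

    while (*p == ' ')
        p++;
    if (*p == '\0') {
        *pp = p;
        return 0;
    }
    if (isdigit((unsigned char)*p)) {
        char *end;
        yylval.intval = (int)strtol(p, &end, 10);
        *pp = end;
        return NUMBER;
    }
    yylval.opval = *p;
    *pp = p + 1;
    return OP;
}
```
.PP
Because the analyzer writes to a named member of
.I yylval ,
it and the parser must agree on one definition of the union, which is why
a shared header file is convenient.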
.PP
Once YYSTYPE is defined,
the union member names must be associated
with the various terminal and nonterminal names.
The construction
.DS
< name >
.DE
is used to indicate a union member name.
If this follows
one of the
keywords %token,
%left, %right, and %nonassoc,
the union member name is associated with the tokens listed.
Thus, saying
.DS
%left <optype> \'+\' \'\-\'
.DE
will cause any reference to values returned by these two tokens to be
tagged with
the union member name
.I optype .
Another keyword, %type, is
used similarly to associate
union member names with nonterminals.
Thus, one might say
.DS
%type <nodetype> expr stat
.DE
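.PP
What tagging a reference with a union member name amounts to can be shown
by expanding an action by hand.
The C sketch below assumes a rule of the form
expr : expr \'+\' expr { $$ = mknode( $2, $1, $3 ); }
together with the two declarations above; the structure
.I node ,
the helper
.I mknode ,
and the function
.I reduce_binary
are hypothetical, and the code Yacc actually generates differs in detail.
```c
#include <stdlib.h>

/* Hypothetical tree node built by the actions. */
struct node {
    int op;
    struct node *left, *right;
};

typedef union {
    int          optype;    /* from  %left <optype> '+' '-'     */
    struct node *nodetype;  /* from  %type <nodetype> expr stat */
} YYSTYPE;

/* Hypothetical constructor; returns NULL if allocation fails. */
static struct node *mknode(int op, struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);

    if (n != NULL) {
        n->op = op;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Hand expansion of  { $$ = mknode( $2, $1, $3 ); }.
 * v[0], v[1], v[2] stand for $1, $2, $3; *result stands for $$.
 * Each reference is tagged with the member declared for its symbol. */
void reduce_binary(YYSTYPE *result, const YYSTYPE *v)
{
    result->nodetype = mknode(v[1].optype,      /* $2: token '+'        */
                              v[0].nodetype,    /* $1: nonterminal expr */
                              v[2].nodetype);   /* $3: nonterminal expr */
}
```
.PP
Since every access names a member explicitly, the C compiler sees the
declared type of each value and no unwanted conversions take place.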
.PP
There remain a couple of cases where these mechanisms are insufficient.
If there is an action within a rule, the value returned
by this action has no
.I "a priori"
type.
Similarly, reference to left context values (such as $0 \- see the
previous subsection) leaves Yacc with no easy way of knowing the type.
In this case, a type can be imposed on the reference by inserting
a union member name, between < and >, immediately after
the first $.
An example of this usage is
.DS
rule    :       aaa  { $<intval>$ = 3; }  bbb
                        { fun( $<intval>2, $<other>0 ); }
        ;
.DE
This syntax has little to recommend it, but the situation arises rarely.
.PP
A sample specification is given in Appendix C.
The facilities in this subsection are not triggered until they are used:
in particular, the use of %type will turn on these mechanisms.
When they are used, there is a fairly strict level of checking.
For example, use of $n or $$ to refer to something with no defined type
is diagnosed.
If these facilities are not triggered, the Yacc value stack is used to
hold
.I int 's,
as was true historically.