BSD 4_2 development
[unix-history] / usr / doc / yacc / ssc
CommitLineData
c9528a00
C
1.SH
2Appendix C: An Advanced Example
3.PP
4This Appendix gives an example of a grammar using some
5of the advanced features discussed in Section 10.
6The desk calculator example in Appendix A is
7modified to provide a desk calculator that
8does floating point interval arithmetic.
9The calculator understands floating point
10constants, the arithmetic operations +, \-, *, /,
11unary \-, and = (assignment), and has 26 floating
12point variables, ``a'' through ``z''.
13Moreover, it also understands
14.I intervals ,
15written
16.DS
17 ( x , y )
18.DE
19where
20.I x
21is less than or equal to
22.I y .
23There are 26 interval valued variables ``A'' through ``Z''
24that may also be used.
25The usage is similar to that in Appendix A; assignments
26return no value, and print nothing, while expressions print
27the (floating or interval) value.
28.PP
29This example explores a number of interesting features
30of Yacc and C.
31Intervals are represented by a structure, consisting of the
32left and right endpoint values, stored as
33.I double 's.
34This structure is given a type name, INTERVAL, by using
35.I typedef .
36The Yacc value stack can also contain floating point scalars, and
37integers (used to index into the arrays holding the variable values).
38Notice that this entire strategy depends strongly on being able to
39assign structures and unions in C.
40In fact, many of the actions call functions that return structures
41as well.
42.PP
43It is also worth noting the use of YYERROR to handle error conditions:
44division by an interval containing 0, and an interval presented in
45the wrong order.
46In effect, the error recovery mechanism of Yacc is used to throw away the
47rest of the offending line.
48.PP
49In addition to the mixing of types on the value stack,
50this grammar also demonstrates an interesting use of syntax to
51keep track of the type (e.g. scalar or interval) of intermediate
52expressions.
53Note that a scalar can be automatically promoted to an interval if
54the context demands an interval value.
55This causes a large number of conflicts when the grammar is run through
56Yacc: 18 Shift/Reduce and 26 Reduce/Reduce.
57The problem can be seen by looking at the two input lines:
58.DS
59 2.5 + ( 3.5 \- 4. )
60.DE
61and
62.DS
63 2.5 + ( 3.5 , 4. )
64.DE
65Notice that the 2.5 is to be used in an interval valued expression
66in the second example, but this fact is not known until
67the ``,'' is read; by this time, 2.5 is finished, and the parser cannot go back
68and change its mind.
69More generally, it might be necessary to look ahead an arbitrary number of
70tokens to decide whether to convert a scalar to an interval.
71This problem is evaded by having two rules for each binary interval
72valued operator: one when the left operand is a scalar, and one when
73the left operand is an interval.
74In the second case, the right operand must be an interval,
75so the conversion will be applied automatically.
76Despite this evasion, there are still many cases where the
77conversion may be applied or not, leading to the above
78conflicts.
79They are resolved by listing the rules that yield scalars first
80in the specification file; in this way, the conflicts will
81be resolved in the direction of keeping scalar
82valued expressions scalar valued until they are forced to become
83intervals.
84.PP
85This way of handling multiple types is very instructive, but not very general.
86If there were many kinds of expression types, instead of just two,
87the number of rules needed would increase dramatically, and the conflicts
88even more dramatically.
89Thus, while this example is instructive, it is better practice in a
90more normal programming language environment to
91keep the type information as part of the value, and not as part
92of the grammar.
93.PP
94Finally, a word about the lexical analysis.
95The only unusual feature is the treatment of floating point constants.
96The C library routine
97.I atof
98is used to do the actual conversion from a character string
99to a double precision value.
100If the lexical analyzer detects an error,
101it responds by returning a token that
102is illegal in the grammar, provoking a syntax error
103in the parser, and thence error recovery.
104.DS L
105
106%{
107
108# include <stdio.h>
109# include <ctype.h>
110
111typedef struct interval {
112 double lo, hi;
113 } INTERVAL;
114
115INTERVAL vmul(), vdiv();
116
117double atof();
118
119double dreg[ 26 ];
120INTERVAL vreg[ 26 ];
121
122%}
123
124%start lines
125
126%union {
127 int ival;
128 double dval;
129 INTERVAL vval;
130 }
131
132%token <ival> DREG VREG /* indices into dreg, vreg arrays */
133
134%token <dval> CONST /* floating point constant */
135
136%type <dval> dexp /* expression */
137
138%type <vval> vexp /* interval expression */
139
140 /* precedence information about the operators */
141
142%left \'+\' \'\-\'
143%left \'*\' \'/\'
144%left UMINUS /* precedence for unary minus */
145
146%%
147
148lines : /* empty */
149 | lines line
150 ;
151
152line : dexp \'\en\'
153 { printf( "%15.8f\en", $1 ); }
154 | vexp \'\en\'
155 { printf( "(%15.8f , %15.8f )\en", $1.lo, $1.hi ); }
156 | DREG \'=\' dexp \'\en\'
157 { dreg[$1] = $3; }
158 | VREG \'=\' vexp \'\en\'
159 { vreg[$1] = $3; }
160 | error \'\en\'
161 { yyerrok; }
162 ;
163
164dexp : CONST
165 | DREG
166 { $$ = dreg[$1]; }
167 | dexp \'+\' dexp
168 { $$ = $1 + $3; }
169 | dexp \'\-\' dexp
170 { $$ = $1 \- $3; }
171 | dexp \'*\' dexp
172 { $$ = $1 * $3; }
173 | dexp \'/\' dexp
174 { $$ = $1 / $3; }
175 | \'\-\' dexp %prec UMINUS
176 { $$ = \- $2; }
177 | \'(\' dexp \')\'
178 { $$ = $2; }
179 ;
180
181vexp : dexp
182 { $$.hi = $$.lo = $1; }
183 | \'(\' dexp \',\' dexp \')\'
184 {
185 $$.lo = $2;
186 $$.hi = $4;
187 if( $$.lo > $$.hi ){
188 printf( "interval out of order\en" );
189 YYERROR;
190 }
191 }
192 | VREG
193 { $$ = vreg[$1]; }
194 | vexp \'+\' vexp
195 { $$.hi = $1.hi + $3.hi;
196 $$.lo = $1.lo + $3.lo; }
197 | dexp \'+\' vexp
198 { $$.hi = $1 + $3.hi;
199 $$.lo = $1 + $3.lo; }
200 | vexp \'\-\' vexp
201 { $$.hi = $1.hi \- $3.lo;
202 $$.lo = $1.lo \- $3.hi; }
203 | dexp \'\-\' vexp
204 { $$.hi = $1 \- $3.lo;
205 $$.lo = $1 \- $3.hi; }
206 | vexp \'*\' vexp
207 { $$ = vmul( $1.lo, $1.hi, $3 ); }
208 | dexp \'*\' vexp
209 { $$ = vmul( $1, $1, $3 ); }
210 | vexp \'/\' vexp
211 { if( dcheck( $3 ) ) YYERROR;
212 $$ = vdiv( $1.lo, $1.hi, $3 ); }
213 | dexp \'/\' vexp
214 { if( dcheck( $3 ) ) YYERROR;
215 $$ = vdiv( $1, $1, $3 ); }
216 | \'\-\' vexp %prec UMINUS
217 { $$.hi = \-$2.lo; $$.lo = \-$2.hi; }
218 | \'(\' vexp \')\'
219 { $$ = $2; }
220 ;
221
222%%
223
224# define BSZ 50 /* buffer size for floating point numbers */
225
226 /* lexical analysis */
227
228yylex(){
229 register c;
230
231 while( (c=getchar()) == \' \' ){ /* skip over blanks */ }
232
233 if( isupper( c ) ){
234 yylval.ival = c \- \'A\';
235 return( VREG );
236 }
237 if( islower( c ) ){
238 yylval.ival = c \- \'a\';
239 return( DREG );
240 }
241
242 if( isdigit( c ) || c==\'.\' ){
243 /* gobble up digits, points, exponents */
244
245 char buf[BSZ+1], *cp = buf;
246 int dot = 0, exp = 0;
247
248 for( ; (cp\-buf)<BSZ ; ++cp,c=getchar() ){
249
250 *cp = c;
251 if( isdigit( c ) ) continue;
252 if( c == \'.\' ){
253 if( dot++ || exp ) return( \'.\' ); /* will cause syntax error */
254 continue;
255 }
256
257 if( c == \'e\' ){
258 if( exp++ ) return( \'e\' ); /* will cause syntax error */
259 continue;
260 }
261
262 /* end of number */
263 break;
264 }
265 *cp = \'\e0\';
266 if( (cp\-buf) >= BSZ ) printf( "constant too long: truncated\en" );
267 else ungetc( c, stdin ); /* push back last char read */
268 yylval.dval = atof( buf );
269 return( CONST );
270 }
271 return( c );
272 }
273
274INTERVAL hilo( a, b, c, d ) double a, b, c, d; {
275 /* returns the smallest interval containing a, b, c, and d */
276 /* used by *, / routines */
277 INTERVAL v;
278
279 if( a>b ) { v.hi = a; v.lo = b; }
280 else { v.hi = b; v.lo = a; }
281
282 if( c>d ) {
283 if( c>v.hi ) v.hi = c;
284 if( d<v.lo ) v.lo = d;
285 }
286 else {
287 if( d>v.hi ) v.hi = d;
288 if( c<v.lo ) v.lo = c;
289 }
290 return( v );
291 }
292
293INTERVAL vmul( a, b, v ) double a, b; INTERVAL v; {
294 return( hilo( a*v.hi, a*v.lo, b*v.hi, b*v.lo ) );
295 }
296
297dcheck( v ) INTERVAL v; {
298 if( v.hi >= 0. && v.lo <= 0. ){
299 printf( "divisor interval contains 0.\en" );
300 return( 1 );
301 }
302 return( 0 );
303 }
304
305INTERVAL vdiv( a, b, v ) double a, b; INTERVAL v; {
306 return( hilo( a/v.hi, a/v.lo, b/v.hi, b/v.lo ) );
307 }
308.DE
309.bp