[unix-history] / usr / doc / yacc / ss6

.SH
6: Precedence
.PP
There is one common situation
where the rules given above for resolving conflicts are not sufficient;
this is in the parsing of arithmetic expressions.
Most of the commonly used constructions for arithmetic expressions can be naturally
described by the notion of
.I precedence
levels for operators, together with information about left
or right associativity.
It turns out that ambiguous grammars with appropriate disambiguating rules
can be used to create parsers that are faster and easier to
write than parsers constructed from unambiguous grammars.
The basic notion is to write grammar rules
of the form
.DS
expr  :  expr  OP  expr
.DE
and
.DS
expr  :  UNARY  expr
.DE
for all binary and unary operators desired.
This creates a very ambiguous grammar, with many parsing conflicts.
As disambiguating rules, the user specifies the precedence, or binding
strength, of all the operators, and the associativity
of the binary operators.
This information is sufficient to allow Yacc to resolve the parsing conflicts
in accordance with these rules, and construct a parser that realizes the desired
precedences and associativities.
.PP
The precedences and associativities are attached to tokens in the declarations section.
This is done by a series of lines beginning with a Yacc keyword: %left, %right,
or %nonassoc, followed by a list of tokens.
All of the tokens on the same line are assumed to have the same precedence level
and associativity; the lines are listed in
order of increasing precedence or binding strength.
Thus,
.DS
%left  \'+\'  \'\-\'
%left  \'*\'  \'/\'
.DE
describes the precedence and associativity of the four arithmetic operators.
Plus and minus are left associative, and have lower precedence than
star and slash, which are also left associative.
The keyword %right is used to describe right associative operators,
and the keyword %nonassoc is used to describe operators, like
the operator .LT. in Fortran, that may not associate with themselves; thus,
.DS
A  .LT.  B  .LT.  C
.DE
is illegal in Fortran, and such an operator would be described with the keyword
%nonassoc in Yacc.
As an example of the behavior of these declarations, the description
.DS
%right  \'=\'
%left  \'+\'  \'\-\'
%left  \'*\'  \'/\'

%%

expr	:	expr  \'=\'  expr
	|	expr  \'+\'  expr
	|	expr  \'\-\'  expr
	|	expr  \'*\'  expr
	|	expr  \'/\'  expr
	|	NAME
	;
.DE
might be used to structure the input
.DS
a  =  b  =  c*d  \-  e  \-  f*g
.DE
as follows:
.DS
a = ( b = ( ((c*d)\-e) \- (f*g) ) )
.DE
When this mechanism is used,
unary operators must, in general, be given a precedence.
Sometimes a unary operator and a binary operator
have the same symbolic representation, but different precedences.
An example is unary and binary \'\-\'; unary minus may be given the same
strength as multiplication, or even higher, while binary minus has a lower strength than
multiplication.
The keyword, %prec, changes the precedence level associated with a particular grammar rule.
%prec appears immediately after the body of the grammar rule, before the action or closing semicolon,
and is followed by a token name or literal.
It
causes the precedence of the grammar rule to become that of the following token name or literal.
For example, to make unary minus have the same precedence as multiplication the rules might resemble:
.DS
%left  \'+\'  \'\-\'
%left  \'*\'  \'/\'

%%

expr	:	expr  \'+\'  expr
	|	expr  \'\-\'  expr
	|	expr  \'*\'  expr
	|	expr  \'/\'  expr
	|	\'\-\'  expr      %prec  \'*\'
	|	NAME
	;
.DE
.PP
A token declared
by %left, %right, and %nonassoc need not be, but may be, declared by %token as well.
.PP
The precedences and associativities are used by Yacc to
resolve parsing conflicts; they give rise to disambiguating rules.
Formally, the rules work as follows:
.IP 1.
The precedences and associativities are recorded for those tokens and literals
that have them.
.IP 2.
A precedence and associativity is associated with each grammar rule; it is the precedence
and associativity of the last token or literal in the body of the rule.
If the %prec construction is used, it overrides this default.
Some grammar rules may have no precedence and associativity associated with them.
.IP 3.
When there is a reduce/reduce conflict, or there is a shift/reduce conflict
and either the input symbol or the grammar rule has no precedence and associativity,
then the two disambiguating rules given at the beginning of the section are used,
and the conflicts are reported.
.IP 4.
If there is a shift/reduce conflict, and both the grammar rule and the input character
have precedence and associativity associated with them, then the conflict is resolved
in favor of the action (shift or reduce) associated with the higher precedence.
If the precedences are the same, then the associativity is used; left
associative implies reduce, right associative implies shift, and nonassociating
implies error.
.PP
Conflicts resolved by precedence are not counted in the number of shift/reduce and reduce/reduce
conflicts reported by Yacc.
This means that mistakes in the specification of precedences may
disguise errors in the input grammar; it is a good idea to be sparing
with precedences, and use them in an essentially ``cookbook'' fashion,
until some experience has been gained.
The
.I y.output
file
is very useful in deciding whether the parser is actually doing
what was intended.
Commit	Line	Data
2240a03d TL	1	.SH
	2	6: Precedence
	3	.PP
	4	There is one common situation
	5	where the rules given above for resolving conflicts are not sufficient;
	6	this is in the parsing of arithmetic expressions.
	7	Most of the commonly used constructions for arithmetic expressions can be naturally
	8	described by the notion of
	9	.I precedence
	10	levels for operators, together with information about left
	11	or right associativity.
	12	It turns out that ambiguous grammars with appropriate disambiguating rules
	13	can be used to create parsers that are faster and easier to
	14	write than parsers constructed from unambiguous grammars.
	15	The basic notion is to write grammar rules
	16	of the form
	17	.DS
	18	expr : expr OP expr
	19	.DE
	20	and
	21	.DS
	22	expr : UNARY expr
	23	.DE
	24	for all binary and unary operators desired.
	25	This creates a very ambiguous grammar, with many parsing conflicts.
	26	As disambiguating rules, the user specifies the precedence, or binding
	27	strength, of all the operators, and the associativity
	28	of the binary operators.
	29	This information is sufficient to allow Yacc to resolve the parsing conflicts
	30	in accordance with these rules, and construct a parser that realizes the desired
	31	precedences and associativities.
	32	.PP
	33	The precedences and associativities are attached to tokens in the declarations section.
	34	This is done by a series of lines beginning with a Yacc keyword: %left, %right,
	35	or %nonassoc, followed by a list of tokens.
	36	All of the tokens on the same line are assumed to have the same precedence level
	37	and associativity; the lines are listed in
	38	order of increasing precedence or binding strength.
	39	Thus,
	40	.DS
	41	%left \'+\' \'\-\'
	42	%left \'*\' \'/\'
	43	.DE
	44	describes the precedence and associativity of the four arithmetic operators.
	45	Plus and minus are left associative, and have lower precedence than
	46	star and slash, which are also left associative.
	47	The keyword %right is used to describe right associative operators,
	48	and the keyword %nonassoc is used to describe operators, like
	49	the operator .LT. in Fortran, that may not associate with themselves; thus,
	50	.DS
	51	A .LT. B .LT. C
	52	.DE
	53	is illegal in Fortran, and such an operator would be described with the keyword
	54	%nonassoc in Yacc.
	55	As an example of the behavior of these declarations, the description
	56	.DS
	57	%right \'=\'
	58	%left \'+\' \'\-\'
	59	%left \'*\' \'/\'
	60
	61	%%
	62
	63	expr : expr \'=\' expr
	64	\| expr \'+\' expr
65	\| expr \'\-\' expr
66	\| expr \'*\' expr
67	\| expr \'/\' expr
68	\| NAME
69	;
70	.DE
71	might be used to structure the input
72	.DS
73	a = b = cd \- e \- fg
74	.DE
75	as follows:
76	.DS
77	a = ( b = ( ((cd)\-e) \- (fg) ) )
78	.DE
79	When this mechanism is used,
80	unary operators must, in general, be given a precedence.
81	Sometimes a unary operator and a binary operator
82	have the same symbolic representation, but different precedences.
83	An example is unary and binary \'\-\'; unary minus may be given the same
84	strength as multiplication, or even higher, while binary minus has a lower strength than
85	multiplication.
86	The keyword, %prec, changes the precedence level associated with a particular grammar rule.
87	%prec appears immediately after the body of the grammar rule, before the action or closing semicolon,
88	and is followed by a token name or literal.
89	It
90	causes the precedence of the grammar rule to become that of the following token name or literal.
91	For example, to make unary minus have the same precedence as multiplication the rules might resemble:
92	.DS
93	%left \'+\' \'\-\'
94	%left \'*\' \'/\'
95
96	%%
97
98	expr : expr \'+\' expr
99	\| expr \'\-\' expr
100	\| expr \'*\' expr
101	\| expr \'/\' expr
102	\| \'\-\' expr %prec \'*\'
103	\| NAME
104	;
105	.DE
106	.PP
107	A token declared
108	by %left, %right, and %nonassoc need not be, but may be, declared by %token as well.
109	.PP
110	The precedences and associativities are used by Yacc to
111	resolve parsing conflicts; they give rise to disambiguating rules.
112	Formally, the rules work as follows:
113	.IP 1.
114	The precedences and associativities are recorded for those tokens and literals
115	that have them.
116	.IP 2.
117	A precedence and associativity is associated with each grammar rule; it is the precedence
118	and associativity of the last token or literal in the body of the rule.
119	If the %prec construction is used, it overrides this default.
120	Some grammar rules may have no precedence and associativity associated with them.
121	.IP 3.
122	When there is a reduce/reduce conflict, or there is a shift/reduce conflict
123	and either the input symbol or the grammar rule has no precedence and associativity,
124	then the two disambiguating rules given at the beginning of the section are used,
125	and the conflicts are reported.
126	.IP 4.
127	If there is a shift/reduce conflict, and both the grammar rule and the input character
128	have precedence and associativity associated with them, then the conflict is resolved
129	in favor of the action (shift or reduce) associated with the higher precedence.
130	If the precedences are the same, then the associativity is used; left
131	associative implies reduce, right associative implies shift, and nonassociating
132	implies error.
133	.PP
134	Conflicts resolved by precedence are not counted in the number of shift/reduce and reduce/reduce
135	conflicts reported by Yacc.
136	This means that mistakes in the specification of precedences may
137	disguise errors in the input grammar; it is a good idea to be sparing
138	with precedences, and use them in an essentially ``cookbook'' fashion,
139	until some experience has been gained.
140	The
141	.I y.output
142	file
143	is very useful in deciding whether the parser is actually doing
144	what was intended.