| 1 | .SH |
| 2 | 10: Advanced Topics |
| 3 | .PP |
| 4 | This section discusses a number of advanced features |
| 5 | of Yacc. |
| 6 | .SH |
| 7 | Simulating Error and Accept in Actions |
| 8 | .PP |
| 9 | The parsing actions of error and accept can be simulated |
| 10 | in an action by use of the macros YYACCEPT and YYERROR. |
| 11 | YYACCEPT causes |
| 12 | .I yyparse |
| 13 | to return the value 0; |
| 14 | YYERROR causes |
| 15 | the parser to behave as if the current input symbol |
| 16 | had been a syntax error; |
| 17 | .I yyerror |
| 18 | is called, and error recovery takes place. |
| 19 | These mechanisms can be used to simulate parsers |
| 20 | with multiple endmarkers or context-sensitive syntax checking. |
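As a rough illustration of the control flow only, the following C sketch models a parser loop whose actions use the two macros. Everything in it is hypothetical: the token codes, toy_parse, toy_yyerror, and the macro definitions are stand-ins written for this example, not the code Yacc generates. The sketch also returns after reporting the error, where a real parser would attempt error recovery.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; STOP and SEMI both act as endmarkers. */
enum { WORD = 257, STOP, SEMI, BAD };

static int error_count = 0;

/* Stand-in for yyerror: only counts how often it was called. */
static void toy_yyerror(const char *msg)
{
    (void)msg;
    error_count++;
}

/* Assumed definitions for this sketch only. */
#define YYACCEPT return 0
#define YYERROR  goto yyerrlab

/*
 * Toy parser loop.  The "action" for either endmarker accepts at once;
 * the "action" for BAD treats the token as a syntax error even though
 * a grammar might have allowed it.
 */
static int toy_parse(const int *tokens, size_t n)
{
    size_t i;

    assert(tokens != NULL);
    for (i = 0; i < n; i++) {
        switch (tokens[i]) {
        case STOP:
        case SEMI:
            YYACCEPT;          /* multiple endmarkers */
        case BAD:
            YYERROR;           /* context-sensitive rejection */
        default:
            break;             /* ordinary token: keep going */
        }
    }
    return 1;                  /* input ended without an endmarker */

yyerrlab:
    toy_yyerror("syntax error");
    return 1;                  /* a real parser would try to recover here */
}
```

Calling toy_parse on a token array that reaches SEMI or STOP before any BAD token returns 0 without calling toy_yyerror; a BAD token reached first causes one call to toy_yyerror and a nonzero return.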
| 21 | .SH |
| 22 | Accessing Values in Enclosing Rules |
| 23 | .PP |
| 24 | An action may refer to values |
| 25 | returned by actions to the left of the current rule. |
| 26 | The mechanism is simply the same as with ordinary actions, |
| 27 | a dollar sign followed by a digit, but in this case the |
| 28 | digit may be 0 or negative. |
| 29 | Consider |
| 30 | .DS |
| 31 | sent : adj noun verb adj noun |
| 32 | { \fIlook at the sentence\fR . . . } |
| 33 | ; |
| 34 | |
| 35 | adj : THE { $$ = THE; } |
| 36 | | YOUNG { $$ = YOUNG; } |
| 37 | . . . |
| 38 | ; |
| 39 | |
| 40 | noun : DOG |
| 41 | { $$ = DOG; } |
| 42 | | CRONE |
| 43 | { if( $0 == YOUNG ){ |
| 44 | printf( "what?\en" ); |
| 45 | } |
| 46 | $$ = CRONE; |
| 47 | } |
| 48 | ; |
| 49 | . . . |
| 50 | .DE |
| 51 | In the action following the word CRONE, a check is made that the |
| 52 | preceding token shifted was not YOUNG. |
| 53 | Obviously, this is only possible when a great deal is known about |
| 54 | what might precede the symbol |
| 55 | .I noun |
| 56 | in the input. |
| 57 | There is also a distinctly unstructured flavor about this. |
| 58 | Nevertheless, at times this mechanism will save a great |
| 59 | deal of trouble, especially when a few combinations are to |
| 60 | be excluded from an otherwise regular structure. |
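The indexing behind $0 can be pictured with a toy value stack. The C sketch below is an illustration under stated assumptions, not Yacc's generated code: the token codes and the names vstack, push_value, dollar, and crone_follows_young are invented for this example. It assumes that all components of the current rule are already on the stack when the action runs, so $1 is the first of them and $0 is the slot just beneath.

```c
#include <assert.h>

/* Hypothetical token codes; a real parser takes these from y.tab.h. */
enum { THE = 257, YOUNG, DOG, CRONE };

#define STACK_MAX 16

static int vstack[STACK_MAX];  /* toy value stack, one int per symbol */
static int depth = 0;          /* number of values currently stacked */

/* Push the value of a symbol that has just been shifted or reduced. */
static void push_value(int v)
{
    assert(depth < STACK_MAX);
    vstack[depth++] = v;
}

/*
 * Value of $n for a rule whose right-hand side has rhs_len symbols,
 * all of them already on the stack.  $1 is the first component, so
 * $0 is the slot just below it and $-1 the one below that.
 */
static int dollar(int n, int rhs_len)
{
    int index = depth - rhs_len + n - 1;

    assert(index >= 0 && index < depth);
    return vstack[index];
}

/* Mimics the test in the action for "noun : CRONE". */
static int crone_follows_young(void)
{
    return dollar(0, 1) == YOUNG;
}
```

After push_value(YOUNG) and push_value(CRONE), dollar(1, 1) yields CRONE and dollar(0, 1) yields YOUNG, the value left by the preceding adj.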
| 61 | .SH |
| 62 | Support for Arbitrary Value Types |
| 63 | .PP |
| 64 | By default, the values returned by actions and the lexical analyzer are integers. |
| 65 | Yacc can also support |
| 66 | values of other types, including structures. |
| 67 | In addition, Yacc keeps track of the types, and inserts |
| 68 | appropriate union member names so that the resulting parser will |
| 69 | be strictly type checked. |
| 70 | The Yacc value stack (see Section 4) |
| 71 | is declared to be a |
| 72 | .I union |
| 73 | of the various types of values desired. |
| 74 | The user declares the union, and associates union member names |
| 75 | with each token and nonterminal symbol having a value. |
| 76 | When the value is referenced through a $$ or $n construction, |
| 77 | Yacc will automatically insert the appropriate union name, so that |
| 78 | no unwanted conversions will take place. |
| 79 | In addition, type checking commands such as |
| 80 | .I Lint\| |
| 81 | .[ |
| 82 | Johnson Lint Checker 1273 |
| 83 | .] |
| 84 | will be far more silent. |
| 85 | .PP |
| 86 | There are three mechanisms used to provide for this typing. |
| 87 | First, there is a way of defining the union; this must be |
| 88 | done by the user since other programs, notably the lexical analyzer, |
| 89 | must know about the union member names. |
| 90 | Second, there is a way of associating a union member name with tokens |
| 91 | and nonterminals. |
| 92 | Finally, there is a mechanism for describing the type of those |
| 93 | few values where Yacc cannot easily determine the type. |
| 94 | .PP |
| 95 | To declare the union, the user includes in the declaration section: |
| 96 | .DS |
| 97 | %union { |
| 98 | body of union ... |
| 99 | } |
| 100 | .DE |
| 101 | This declares the Yacc value stack, |
| 102 | and the external variables |
| 103 | .I yylval |
| 104 | and |
| 105 | .I yyval , |
| 106 | to have type equal to this union. |
| 107 | If Yacc is invoked with the |
| 108 | .B \-d |
| 109 | option, the union declaration |
| 110 | is copied into the |
| 111 | .I y.tab.h |
| 112 | file. |
| 113 | Alternatively, |
| 114 | the union may be declared in a header file, and a typedef |
| 115 | used to define the type name YYSTYPE to represent |
| 116 | this union. |
| 117 | Thus, the header file might instead have said: |
| 118 | .DS |
| 119 | typedef union { |
| 120 | body of union ... |
| 121 | } YYSTYPE; |
| 122 | .DE |
| 123 | The header file must be included in the declarations |
| 124 | section, by use of %{ and %}. |
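The reason the lexical analyzer must know the union member names can be seen in a small C sketch. The union body, the token code NUMBER, struct node, and toy_yylex are assumptions made for this illustration; only YYSTYPE, yylval, and the member names optype and nodetype come from the text. The lexer stores each token's value in the member that matches the token's declared type.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

struct node { int op; int value; };   /* hypothetical tree node */

/* One possible union body; the members are chosen for illustration. */
typedef union {
    int          intval;
    char         optype;
    struct node *nodetype;
} YYSTYPE;

YYSTYPE yylval;                       /* value of the current token */

enum { NUMBER = 257 };                /* hypothetical token code */

static const char *input = "";        /* text still to be scanned */

/* Toy lexer: numbers set yylval.intval, other characters yylval.optype. */
static int toy_yylex(void)
{
    assert(input != NULL);
    while (*input == ' ')
        input++;
    if (*input == '\0')
        return 0;                     /* endmarker */
    if (isdigit((unsigned char)*input)) {
        int v = 0;

        while (isdigit((unsigned char)*input))
            v = v * 10 + (*input++ - '0');
        yylval.intval = v;
        return NUMBER;
    }
    yylval.optype = *input;
    return *input++;                  /* literal character token */
}
```

With input pointing at "42 +", successive calls return NUMBER (with yylval.intval set to 42), then '+' (with yylval.optype set to '+'), then 0.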
| 125 | .PP |
| 126 | Once YYSTYPE is defined, |
| 127 | the union member names must be associated |
| 128 | with the various terminal and nonterminal names. |
| 129 | The construction |
| 130 | .DS |
| 131 | < name > |
| 132 | .DE |
| 133 | is used to indicate a union member name. |
| 134 | If this follows |
| 135 | one of the |
| 136 | keywords %token, |
| 137 | %left, %right, and %nonassoc, |
| 138 | the union member name is associated with the tokens listed. |
| 139 | Thus, saying |
| 140 | .DS |
| 141 | %left <optype> \'+\' \'\-\' |
| 142 | .DE |
| 143 | will cause any reference to values returned by these two tokens to be |
| 144 | tagged with |
| 145 | the union member name |
| 146 | .I optype . |
| 147 | Another keyword, %type, is |
| 148 | used similarly to associate |
| 149 | union member names with nonterminals. |
| 150 | Thus, one might say |
| 151 | .DS |
| 152 | %type <nodetype> expr stat |
| 153 | .DE |
| 154 | .PP |
| 155 | There remain a couple of cases where these mechanisms are insufficient. |
| 156 | If there is an action within a rule, the value returned |
| 157 | by this action has no |
| 158 | .I "a priori" |
| 159 | type. |
| 160 | Similarly, reference to left context values (such as $0; see the |
| 161 | previous subsection) leaves Yacc with no easy way of knowing the type. |
| 162 | In these cases, a type can be imposed on the reference by inserting |
| 163 | a union member name, between < and >, immediately after |
| 164 | the first $. |
| 165 | An example of this usage is |
| 166 | .DS |
| 167 | rule : aaa { $<intval>$ = 3; } bbb |
| 168 | { fun( $<intval>2, $<other>0 ); } |
| 169 | ; |
| 170 | .DE |
| 171 | This syntax has little to recommend it, but the situation arises rarely. |
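One way to read the tagged references is as plain member selections on the union. The C sketch below models that reading; it is not the code Yacc emits. The union body, the type double for the member other, the stack helpers push and slot, and the body of fun are all assumptions made for the example. The action in the middle of the rule is counted as the second component, which is why its value is reached as $2.

```c
#include <assert.h>

/* Hypothetical union; "other" is given type double for illustration. */
typedef union {
    int    intval;
    double other;
} YYSTYPE;

#define STACK_MAX 16

static YYSTYPE vstack[STACK_MAX];   /* toy value stack */
static int depth = 0;
static YYSTYPE yyval;               /* value being returned by an action */

static void push(YYSTYPE v)
{
    assert(depth < STACK_MAX);
    vstack[depth++] = v;
}

/* Stack slot for $n in a rule with rhs_len components already stacked. */
static YYSTYPE *slot(int n, int rhs_len)
{
    int index = depth - rhs_len + n - 1;

    assert(index >= 0 && index < depth);
    return &vstack[index];
}

/* Hypothetical body for the function called in the final action. */
static double fun(int a, double b)
{
    return a + b;
}

/* Models the embedded action  { $<intval>$ = 3; }  */
static void embedded_action(void)
{
    yyval.intval = 3;
}

/* Models  { fun( $<intval>2, $<other>0 ); }  for "aaa action bbb". */
static double final_action(void)
{
    return fun(slot(2, 3)->intval, slot(0, 3)->other);
}
```

If the stack holds a left-context value whose other member is 0.5, then the value of aaa, then the value 3 left by the embedded action, then the value of bbb, final_action passes 3 and 0.5 to fun.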
| 172 | .PP |
| 173 | A sample specification is given in Appendix C. |
| 174 | The facilities in this subsection are not triggered until they are used: |
| 175 | in particular, the use of %type will turn on these mechanisms. |
| 176 | When they are used, there is a fairly strict level of checking. |
| 177 | For example, use of $n or $$ to refer to something with no defined type |
| 178 | is diagnosed. |
| 179 | If these facilities are not triggered, the Yacc value stack is used to |
| 180 | hold |
| 181 | .I int 's, |
| 182 | as was true historically. |