Bell 32V development
[unix-history] / usr / doc / eqn / e5
CommitLineData
755c0b0d
TL
1.NH
2Language Theory
3.PP
4The basic structure of the language is
5not a particularly original one.
6Equations are pictured as a set of ``boxes,''
7pieced together in various ways.
8For example, something with a subscript is
9just a box followed by another box moved downward
10and shrunk
11by an appropriate amount.
12A fraction is just a box centered above another box,
13at the right altitude,
14with a line of correct length drawn between them.
15.PP
16The grammar for the language is shown below.
17For purposes of exposition, we have collapsed
18some productions. In the original grammar, there
19are about 70 productions, but many of these
20are simple ones used only to guarantee
21that some keyword is recognized early enough in the parsing process.
22Symbols in
23capital letters
24are terminal symbols;
25lower case
26symbols are non-terminals, i.e., syntactic categories.
27The vertical bar \(bv indicates an alternative;
28the brackets [ ] indicate optional material.
29A
30.UC TEXT
31is a string of non-blank characters or
32any string inside double quotes;
33the other terminal symbols represent literal occurrences
34of the corresponding keyword.
35.P1
36.ce 0
37.ta .3i
38.ps 9
39.ne 17
40.in 1
41eqn : box | eqn box
42.sp 5p
43box : text
44 | { eqn }
45 | box OVER box
46 | SQRT box
47 | box SUB box | box SUP box
48 | [ L | C | R ]PILE { list }
49 | LEFT text eqn [ RIGHT text ]
50 | box [ FROM box ] [ TO box ]
51 | SIZE text box
52 | [ROMAN | BOLD | ITALIC] box
53 | box [HAT | BAR | DOT | DOTDOT | TILDE]
54 | DEFINE text text
55.sp 5p
56list : eqn | list ABOVE eqn
57.sp 5p
58text : TEXT
59.ps 10
60.in 0
61.P2
62.PP
63The grammar makes it obvious why there are few exceptions.
64For example, the observation that something can be replaced by a more complicated something
65in braces is implicit in the productions:
66.P1
67.ce 0
68 eqn : box | eqn box
69 box : text | { eqn }
70.P2
71Anywhere a single character could be used,
72.ul
73any
74legal construction can be used.
75.PP
76Clearly, our grammar is highly ambiguous.
77What, for instance, do we do with the input
78.P1
79a over b over c ?
80.P2
81Is it
82.P1
83{a over b} over c
84.P2
85or is it
86.P1
87a over {b over c} ?
88.P2
89.PP
90To answer questions like this, the grammar
91is supplemented with a small set of rules that describe the precedence
92and associativity
93of operators.
94In particular, we specify (more or less arbitrarily)
95that
96.ul
97over
98associates to the left,
99so the first alternative above is the one chosen.
100On the other hand,
101.ul
102sub
103and
104.ul
105sup
106bind to the right,
107because this is closer to standard mathematical practice.
108That is, we assume $x sup a sup b$ is $x sup {(a sup b )}$,
109not $(x sup a ) sup b$.
110.PP
111The precedence rules resolve the ambiguity in a construction like
112.P1
113a sup 2 over b
114.P2
115We define
116.ul
117sup
118to have a higher precedence than
119.ul
120over,
121so this construction is parsed as
122$a sup 2 over b$ instead of $a sup {2 over b}$.
123.PP
124Naturally, a user can always
125force a particular parsing
126by placing braces around expressions.
127.PP
128The ambiguous grammar approach seems to be quite useful.
129The grammar we use is small enough to be easily understood,
130for it contains none of the productions that would be
131normally used for resolving ambiguity.
132Instead the supplemental information about
133precedence and associativity (also small enough to be understood)
134provides the compiler-compiler
135with the information it needs
136to make a fast, deterministic parser for
137the specific language we want.
138When the language is supplemented by the disambiguating rules,
139it is in fact
140.UC LR(1)
141and thus easy to parse[5].
142.PP
143The output code is generated as the input is scanned.
144Any time a production
145of the grammar is recognized,
146(potentially) some
147.UC TROFF
148commands are output.
149For example, when the lexical analyzer
150reports that it has found a
151.UC TEXT
152(i.e., a string of contiguous characters),
153we have recognized the production:
154.P1
155text : TEXT
156.P2
157The translation of this is simple.
158We generate a local name for the string,
159then hand the name and the string to
160.UC TROFF,
161and let
162.UC TROFF
163perform the storage management.
164All we save is the name of the string,
165its height, and its baseline.
166.PP
167As another example,
168the translation associated with the production
169.P1
170box : box OVER box
171.P2
172is:
173.P1
174.ce 0
175.in 1
176.ne 14
177Width of output box =
178 slightly more than largest input width
179Height of output box =
180 slightly more than sum of input heights
181Base of output box =
182 slightly more than height of bottom input box
183String describing output box =
184 move down;
185 move right enough to center bottom box;
186 draw bottom box (i.e., copy string for bottom box);
187 move up; move left enough to center top box;
188 draw top box (i.e., copy string for top box);
189 move down and left; draw line full width;
190 return to proper base line.
191.in 0
192.P2
193Most of the other productions have
194equally simple semantic actions.
195Picturing the output as a set of properly placed boxes
196makes the right sequence of positioning commands
197quite obvious.
198The main difficulty is in finding the right numbers to use
199for esthetically pleasing positioning.
200.PP
201With a grammar, it is usually clear how to extend the language.
202For instance, one of our users
203suggested a
204.UC TENSOR
205operator, to make constructions like
206.EQ
207~ sub size 7 m sup size 7 l
208{bold T from n to k} sub size 7 i sup size 7 j
209.EN
210Grammatically, this is easy:
211it is sufficient to add a production like
212.P1
213box : TENSOR { list }
214.P2
215Semantically, we need only juggle the boxes to the right places.