.\"
.\" Copyright (c) 1982 Regents of the University of California
.\" @(#)asdocs1.me 1.7 %G%
.\"
.EQ
delim $$
.EN
.(l C
.i "\*(VS \*(AM"
.sp 2.0v
John F. Reiser
Bell Laboratories,
Holmdel, NJ
.sp 1.0v
.i and
.sp 1.0v
Robert R. Henry\**
.(f
\**Preparation of this paper supported in part
by the National Science Foundation under grant MCS #78-07291.
.)f
Electronics Research Laboratory
University of California
Berkeley, CA 94720
.sp 1.0v
November 5, 1979
.sp 1.0v
.i Revised
\*(TD
.)l
.SH 1 Introduction
.pp
This document describes the usage and input syntax
of the \*(UX \*(VX-11 assembler
.i as .
.i As
is designed for assembling the code produced by the
\*(CL compiler;
certain concessions have been made to handle code written
directly by people,
but in general little sympathy has been extended.
This document is intended only for the writer of a compiler or a maintainer
of the assembler.
.SH 2 "Assembler Revisions since November 5, 1979"
.pp
There has been one major change to
.i as
since the last release.
.i As
has been updated to assemble the new instructions and
data formats for
.q G
and
.q H
floating point numbers,
as well as the new queue instructions.
.SH 2 "Features Supported, but No Longer Encouraged as of \*(TD"
.pp
The following feature of
.i as
is supported, but no longer encouraged.
.ip -
The colon operator for field initialization is likely to disappear.
.SH 1 "Usage"
.pp
.i As
is invoked with these command arguments:
.br
.sp 0.25v
as
[
.b \-LVWJR
]
[
.b \-d $n$
]
[
.b \-DTS
]
[
.b \-t
.i directory
]
[
.b \-o
.i output
]
[ $name sub 1$ ] $...$
[ $name sub n$ ]
.br
.sp 0.25v
.pp
The
.b \-L
flag instructs the assembler to save labels beginning with a
.q L
in the symbol table portion of the
.i output
file.
Such labels are not saved by default,
as the default action of the link editor
.i ld
is to discard them anyway.
.pp
The
.b \-V
flag tells the assembler to place its interpass temporary
file into virtual memory.
In normal circumstances,
the system manager will decide where the temporary file should lie.
Our experiments
with very large temporary files show that placing the temporary
file into virtual memory saves about 13% of the assembly time
when the size of the temporary file is about 350K bytes.
Most assembler sources will not be this long.
.pp
The
.b \-W
flag turns off all warning error reporting.
.pp
The
.b \-J
flag forces \*(UX style pseudo\-branch
instructions with destinations further away than a
byte displacement to be
turned into jump instructions with 4 byte offsets.
The
.b \-J
flag buys you nothing if
.b \-d2
is set.
(See \(sc8.4, and future work described in \(sc11.)
.pp
The
.b \-R
flag effectively turns
.q "\fB.data\fP $n$"
directives into
.q "\fB.text\fP $n$"
directives.
This obviates the need to run editor scripts on assembler source to
.q "read\-only"
fix initialized data segments.
Uninitialized data (via
.b .lcomm
and
.b .comm
directives)
is still assembled into the data or bss segments.
.pp
The
.b \-d
flag specifies the number of bytes
which the assembler should allow for a displacement when the value of the
displacement expression is undefined in the first pass.
The possible values of
.i n
are 1, 2, or 4;
the assembler uses 4 bytes
if
.b \-d
is not specified.
See \(sc8.2.
.pp
Provided the
.b \-V
flag is not set,
the
.b \-t
flag causes the assembler to place its single temporary file
in the
.i directory
instead of in
.i /tmp .
.pp
The
.b \-o
flag causes the output to be placed on the file
.i output .
By default,
the output of the assembler is placed in the file
.i a.out
in the current directory.
.pp
The input to the assembler is normally taken from the standard input.
If file arguments occur,
then the input is taken sequentially from the files
$name sub 1$,
$name sub 2~...~name sub n$.
This is not to say that the files are assembled separately;
$name sub 1$ is effectively concatenated to $name sub 2$,
so multiple definitions of a symbol cannot occur among the input sources.
.pp
The
.b \-D
(debug),
.b \-T
(token trace),
and the
.b \-S
(symbol table)
flags enable assembler trace information,
provided that the assembler has been compiled with
the debugging code enabled.
The information printed is long and boring,
but useful when debugging the assembler.
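.pp
As an illustration of how the arguments above combine,
the following Python sketch builds an argument list for a driver program.
The helper name as_command and its parameters are hypothetical
and are not part of the assembler;
only the flags described above are used.

```python
def as_command(sources=(), output=None, displacement=None, tmpdir=None,
               keep_local_labels=False):
    """Build an argument list for as (hypothetical helper, not part of as)."""
    argv = ["as"]
    if keep_local_labels:
        argv.append("-L")               # keep labels that begin with "L"
    if displacement is not None:
        if displacement not in (1, 2, 4):
            raise ValueError("-d accepts only 1, 2, or 4")
        argv.append("-d%d" % displacement)
    if tmpdir is not None:
        argv += ["-t", tmpdir]          # temporary file goes here, not /tmp
    if output is not None:
        argv += ["-o", output]          # otherwise the output is a.out
    argv += list(sources)               # no sources: read standard input
    return argv


print(" ".join(as_command(["a.s", "b.s"], output="prog.o", displacement=2)))
```

The file names a.s, b.s and prog.o are placeholders.
Because the sources are effectively concatenated,
the two files in this example are assembled as a single unit.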
.SH 1 "Lexical conventions"
.pp
Assembler tokens include identifiers (alternatively,
.q symbols
or
.q names ),
constants,
and operators.
.SH 2 "Identifiers"
.pp
An identifier consists of a sequence of alphanumeric characters
(including
period
.q "\fB\|.\|\fP" ,
underscore
.q "\*(US" ,
and
dollar
.q "\*(DL" ).
The first character may not be numeric.
Identifiers may be (practically) arbitrarily long;
all characters are significant.
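.pp
The identifier rule can be written as a regular expression.
The Python sketch below is only an illustration;
it assumes that the alphanumeric characters are the ASCII letters and digits,
and the name is_identifier is hypothetical.

```python
import re

# Letters, digits, period, underscore and dollar; the first character
# may not be a digit.  ASCII-only is an assumption of this sketch.
IDENTIFIER_RE = re.compile(r"[A-Za-z._$][A-Za-z0-9._$]*")


def is_identifier(token):
    """Return True if the whole token has the form of an identifier."""
    return IDENTIFIER_RE.fullmatch(token) is not None


for token in ("_main", "L12", ".L$x", "9lives"):
    print(token, is_identifier(token))
```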
.SH 2 "Constants"
.SH 3 "Scalar constants"
.pp
All scalar (non floating point)
constants are (potentially) 128 bits wide.
Such constants are interpreted as two's complement numbers.
Note that 64 bit (quad word) and 128 bit (octa word) integers
are only partially supported by the \*(VX hardware.
In addition,
128 bit integers are only supported by the extended \*(VX architecture.
.i As
supports 64 and 128 bit integers
only so they can be used as immediate constants
or to fill initialized data space.
.i As
cannot perform arithmetic on constants larger than 32 bits.
.pp
Scalar constants are initially evaluated to a full 128 bits,
but are pared down by discarding high order copies of the sign bit
and categorizing the number as a long, quad or octa integer.
Numbers with less precision than 32 bits are treated as 32 bit quantities.
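.pp
The paring rule can be sketched in Python as follows.
This is an illustration of the rule as stated here,
not the assembler's own code;
the function name classify_scalar is hypothetical,
and the constant is given as a signed Python integer
rather than as a 128 bit pattern.

```python
def classify_scalar(value):
    """Return the smallest of 'long', 'quad', 'octa' that holds value.

    Discarding high order copies of the sign bit is equivalent to
    finding the narrowest two's complement width that still represents
    the number; anything narrower than 32 bits is treated as a long.
    """
    for name, bits in (("long", 32), ("quad", 64), ("octa", 128)):
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return name
    raise ValueError("constant does not fit in 128 bits")


for value in (5, -1, 1 << 31, 1 << 63):
    print(value, classify_scalar(value))
```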
.pp
The digits are
.q 0123456789abcdefABCDEF
with the obvious values.
.pp
An octal constant consists of a sequence of digits with a leading zero.
.pp
A decimal constant consists of a sequence of digits without a leading zero.
.pp
A hexadecimal constant consists of the characters
.q 0x
(or
.q 0X )
followed by a sequence of digits.
.pp
A single-character constant consists of a single quote
.q "\|\(fm\|"
followed by an \*(AC character,
including \*(AC newline.
The constant's value is the code for the
given character.
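.pp
The four forms can be told apart by their first characters.
The Python sketch below illustrates this;
the names parse_scalar and to_int are hypothetical.
Rejecting a digit that is out of range for the base
(such as 9 in an octal constant)
is an assumption of the sketch, not documented assembler behavior.

```python
DIGITS = "0123456789abcdef"


def to_int(text, base):
    """Convert text in the given base, accepting only that base's digits."""
    if not text or any(ch.lower() not in DIGITS[:base] for ch in text):
        raise ValueError("bad digits for base %d: %r" % (base, text))
    return int(text, base)


def parse_scalar(token):
    """Return the value of an octal, decimal, hex or character constant."""
    if token.startswith("'"):
        if len(token) != 2:
            raise ValueError("character constant needs exactly one character")
        return ord(token[1])            # the code for the given character
    if token[:2] in ("0x", "0X"):
        return to_int(token[2:], 16)    # hexadecimal
    if token.startswith("0"):
        return to_int(token, 8)         # leading zero: octal
    return to_int(token, 10)            # no leading zero: decimal


for token in ("017", "17", "0x1F", "'A"):
    print(token, parse_scalar(token))
```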
.SH 3 "Floating Point Constants"
.pp
Floating point constants are internally represented
in the \*(VX floating point format
that is specified by the lexical form of the constant.
Using the meta notation that
[dec] is a decimal digit (\c
.q "0123456789" ),
[expt] is a type specification character (\c
.q "fFdDhHgG" ),
[expe] is an exponent delimiter and type specification character (\c
.q "eEfFdDhHgG" ),
$x sup roman "*"$ means 0 or more occurrences of $x$,
$x sup +$ means 1 or more occurrences of $x$,
then the general lexical form of a floating point number is:
.ce 1
0[expe]([+-])$roman "[dec]" sup +$(.)($roman "[dec]" sup roman "*"$)([expt]([+-])($roman "[dec]" sup +$))
.ce 0
The standard semantic interpretation is used for the
signed integer, fraction and signed power of 10 exponent.
If the exponent delimiter is specified,
it must be either an
.q e
or
.q E ,
or must agree with the initial type specification character that is used.
The type specification character specifies
the type and representation of the constructed number, as follows:
.(b
.TS
center tab(;);
c l c
c l n.
type character;floating representation;size (bits)
_
f, F;F format floating;32
d, D;D format floating;64
g, G;G format floating;64
h, H;H format floating;128
.TE
.)b
Note that
.q G
and
.q H
format floating point numbers are not supported
by all implementations of the \*(VX architecture.
.i As
does not require the augmented architecture in order to run.
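.pp
A recognizer for this lexical form might look like the Python sketch below.
It rests on several assumptions:
parenthesized items in the form are taken to be optional,
the leading character is taken to be one of the type characters in the table,
and the exponent delimiter rule follows the prose above.
The names FLOAT_RE and classify_float are hypothetical,
and no conversion to a \*(VX floating point format is attempted.

```python
import re

# Size in bits for each type character, from the table above.
FLOAT_BITS = {"f": 32, "d": 64, "g": 64, "h": 128}

FLOAT_RE = re.compile(
    r"0(?P<type>[fFdDgGhH])"                    # leading zero and type
    r"[+-]?[0-9]+\.?[0-9]*"                     # sign, integer, fraction
    r"(?:(?P<delim>[eEfFdDgGhH])[+-]?[0-9]*)?"  # optional exponent part
)


def classify_float(token):
    """Return (format letter, size in bits), or None if token is malformed."""
    m = FLOAT_RE.fullmatch(token)
    if m is None:
        return None
    kind = m.group("type").lower()
    delim = m.group("delim")
    # The delimiter must be e or E, or agree with the type character.
    if delim is not None and delim.lower() not in ("e", kind):
        return None
    return kind.upper(), FLOAT_BITS[kind]


for token in ("0f1.5", "0d1.5e10", "0d2.5d-3", "0f1.5d3"):
    print(token, classify_float(token))
```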
.pp
The assembler uses the library routine
.i atof()
to convert
.q F
and
.q D
numbers,
and uses its own conversion routine
(derived from
.i atof ,
and believed to be numerically accurate)
to convert
.q G
and
.q H
floating point numbers.
.pp
Collectively,
all floating point numbers,
together with quad and octa scalars, are called
.i Bignums .
When
.i as
requires a Bignum,
a 32 bit scalar quantity may also be used.
.SH 3 "String Constants"
.pp
A string constant is defined using
the same syntax and semantics as the \*(CL language uses.
Strings begin and end with a
.q "''"
(double quote).
The \*(DM assembler conventions for flexible string quoting are
not implemented.
All \*(CL backslash conventions are observed;
the backslash conventions
peculiar to the \*(PD assembler are not observed.
Strings are known by their value and their length;
the assembler does not implicitly end strings with a null byte.
.SH 2 "Operators"
.pp
There are several single-character
operators;
see \(sc6.1.
.SH 2 "Blanks"
.pp
Blank and tab characters
may be interspersed freely between tokens,
but may not be used within tokens (except character constants).
A blank or tab is required to separate adjacent
identifiers or constants not otherwise separated.
.SH 2 "Scratch Mark Comments"
.pp
The character
.q "#"
introduces a comment,
which extends through the end of the line on which it appears.
Comments starting in column 1,
having the format
.q "# $expression~~string$" ,
are interpreted as an indication that the assembler is now assembling
file
.i string
at line
.i expression .
Thus, one can use the \*(CL preprocessor on an assembly language source file,
and use the
.i #include
and
.i #define
preprocessor directives.
(Note that there may not be an assembler comment starting in column
1 if the assembler source is given to the \*(CL preprocessor,
as it will be interpreted by the preprocessor in a way not intended.)
Comments are otherwise ignored by the assembler.
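.pp
The column 1 convention can be illustrated with the Python sketch below.
It assumes that the expression is a plain decimal line number
and that the string is enclosed in double quotes,
which is the form the preprocessor normally emits;
the assembler may accept more general forms.
The name parse_line_comment is hypothetical.

```python
import re

# "#" in column 1, a decimal line number, then a double-quoted file name.
LINE_COMMENT_RE = re.compile(r'#[ \t]+([0-9]+)[ \t]+"([^"]*)"')


def parse_line_comment(line):
    """Return (line number, file name), or None for an ordinary comment."""
    m = LINE_COMMENT_RE.match(line)     # match() anchors at column 1
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


print(parse_line_comment('# 12 "prog.s"'))
print(parse_line_comment('# an ordinary comment'))
```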
.SH 2 "\*(CL Style Comments"
.pp
The assembler will recognize \*(CL style comments,
introduced with the prologue
.b "/*"
and ending with the epilogue
.b "*/" .
\*(CL style comments may extend across multiple lines,
and are the preferred comment style
to use if one chooses to use the \*(CL preprocessor.
.SH 1 "Segments and Location Counters"
.pp
Assembled code and data fall into three segments: the text segment,
the data segment,
and the bss segment.
The \*(UX operating system makes
some assumptions about the content of these segments;
the assembler does not.
Within the text and data segments there are a number of subsegments,
distinguished by number (\c
.q "\fBtext\fP 0" ,
.q "\fBtext\fP 1" ,
$...$
.q "\fBdata\fP 0" ,
.q "\fBdata\fP 1" ,
$...$).
Currently there are four subsegments each in text and data.
The subsegments are for programming convenience only.
.pp
Before writing the output file,
the assembler zero-pads each text subsegment to a multiple of four
bytes and then concatenates the subsegments in order to form the text segment;
an analogous operation is done for the data segment.
Requesting that the loader define symbols and storage regions is the only
action allowed by the assembler with respect to the bss segment.
Assembly begins in
.q "\fBtext\fP 0" .
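.pp
The padding and concatenation step can be sketched in Python as follows.
This illustrates the rule just described and is not the assembler's own code;
the name concatenate_subsegments is hypothetical,
and each subsegment is modelled as a byte string.

```python
def concatenate_subsegments(subsegments):
    """Zero-pad each subsegment to a multiple of 4 bytes and join them.

    Returns the finished segment and the offset at which each
    subsegment begins within it.
    """
    segment = bytearray()
    bases = []
    for data in subsegments:
        bases.append(len(segment))
        segment += data
        segment += bytes(-len(data) % 4)    # zero padding, 0 to 3 bytes
    return bytes(segment), bases


text, bases = concatenate_subsegments([b"\x01\x02\x03\x04\x05", b"\xaa"])
print(len(text), bases)
```

In this example the five byte subsegment is padded to eight bytes,
so the second subsegment begins at offset 8 of the finished segment.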
.pp
Associated with each (sub)segment is an implicit location counter which
begins at zero and is incremented by 1 for each byte assembled into the
(sub)segment.
There is no way to explicitly reference a location counter.
Note that the location counters of subsegments other than
.q "\fBtext\fP 0"
and
.q "\fBdata\fP 0"
behave peculiarly due to the concatenation used to form
the text and data segments.