| 1 | .\" |
| 2 | .\" Copyright (c) 1982 Regents of the University of California |
| 3 | .\" @(#)asdocs4.me 1.9 %G% |
| 4 | .\" |
| 5 | .EQ |
| 6 | delim $$ |
| 7 | .EN |
| 8 | .SH 1 "Machine instructions" |
| 9 | .pp |
| 10 | The syntax of machine instruction statements accepted by |
| 11 | .i as |
| 12 | is generally similar to the syntax of \*(DM. |
| 13 | There are differences, |
| 14 | however. |
| 15 | .SH 2 "Character set" |
| 16 | .pp |
| 17 | .i As |
| 18 | uses the character |
| 19 | .q \*(DL |
| 20 | instead of |
| 21 | .q # |
| 22 | for immediate constants, |
| 23 | and the character |
| 24 | .q * |
| 25 | instead of |
| 26 | .q @ |
| 27 | for indirection. |
| 28 | Opcodes and register names |
| 29 | are spelled with lower-case rather than upper-case letters. |
| 30 | .SH 2 "Specifying Displacement Lengths" |
| 31 | .pp |
| 32 | Under certain circumstances, |
| 33 | the following constructs are (optionally) recognized by |
| 34 | .i as |
| 35 | to indicate the number of bytes to allocate for |
| 36 | the displacement used when constructing |
| 37 | displacement and displacement deferred addressing modes: |
| 38 | .(b |
| 39 | .TS |
| 40 | center; |
| 41 | c c l |
| 42 | cb cb l. |
| 43 | primary alternate length |
| 44 | _ |
| 45 | B\` B^ byte (1 byte) |
| 46 | W\` W^ word (2 bytes) |
| 47 | L\` L^ long word (4 bytes) |
| 48 | .TE |
| 49 | .)b |
| 50 | .pp |
| 51 | One can also use lower case |
| 52 | .b b , |
| 53 | .b w |
| 54 | or |
| 55 | .b l |
| 56 | instead of the upper |
| 57 | case letters. |
| 58 | There must be no space between the size specifier letter and the |
| 59 | .q "^" |
| 60 | or |
| 61 | .q "\`" . |
| 62 | The constructs |
| 63 | .b "S^" |
| 64 | and |
| 65 | .b "G^" |
| 66 | are not recognized |
| 67 | by |
| 68 | .i as , |
| 69 | as they are by the \*(DM assembler. |
| 70 | It is preferable to use the |
| 71 | .q "\`" |
| 72 | displacement specifier, so that the |
| 73 | .q "^" |
| 74 | is not |
| 75 | misinterpreted as the |
| 76 | .b xor |
| 77 | operator. |
| 78 | .pp |
| 79 | Literal values |
| 80 | (including floating-point literals used where the |
| 81 | hardware expects a floating-point operand) |
| 82 | are assembled as short |
| 83 | literals if possible, |
| 84 | hence not needing the |
| 85 | .b "S^" |
| 86 | \*(DM directive. |
| 87 | .pp |
| 88 | If the displacement length modifier is present, |
| 89 | then the displacement is |
| 90 | .b always |
| 91 | assembled with that displacement length, |
| 92 | even if it will fit into a smaller field, |
| 93 | or if significance is lost. |
| 94 | If the length modifier is not present, |
| 95 | and if the value of the displacement is known exactly in |
| 96 | .i as 's |
| 97 | first pass, |
| 98 | then |
| 99 | .i as |
| 100 | determines the length automatically, |
| 101 | assembling it in the shortest possible way. |
| 102 | Otherwise, |
| 103 | .i as |
| 104 | will use the value specified by the |
| 105 | .b \-d |
| 106 | argument, |
| 107 | which defaults to 4 bytes. |
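The selection rules above can be summarized in a short Python sketch. It is an illustration only, not the assembler's actual code: the function name `displacement_length` and its parameters are hypothetical, and the sketch assumes that displacements are treated as signed quantities, as the hardware sign-extends byte and word displacements.

```python
def displacement_length(value, explicit=None, known_in_pass1=True, default_d=4):
    """Return the number of bytes allocated for a displacement.

    value          -- the displacement value (ignored when not yet known)
    explicit       -- 'b', 'w' or 'l' (either case) if a length modifier
                      was written, else None
    known_in_pass1 -- True if the value is known exactly in the first pass
    default_d      -- hypothetical stand-in for the -d argument (default 4)
    """
    if explicit is not None:
        # A length modifier is always honored, even if the value would
        # fit in a smaller field or significance is lost.
        return {"b": 1, "w": 2, "l": 4}[explicit.lower()]
    if known_in_pass1:
        # Assumption: choose the smallest signed field that holds the value.
        for nbytes in (1, 2, 4):
            bits = 8 * nbytes
            if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
                return nbytes
        raise ValueError("displacement does not fit in 32 bits")
    # Value not known in the first pass: fall back to the -d setting.
    return default_d
```

For example, `displacement_length(100)` selects one byte, while `displacement_length(100, explicit="L")` selects four bytes because the modifier takes precedence.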
| 108 | .SH 2 "case\fIx\fP Instructions" |
| 109 | .pp |
| 110 | .i As |
| 111 | considers the instructions |
| 112 | .b caseb , |
| 113 | .b casel , |
| 114 | .b casew |
| 115 | to have three operands. |
| 116 | The displacements must be explicitly computed by |
| 117 | .i as , |
| 118 | using one or more |
| 119 | .b .word |
| 120 | statements. |
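As an illustration of what the displacement words contain, the following Python sketch builds the table that would follow a case instruction. It relies on the hardware convention that each 16-bit displacement is relative to the address of the first displacement word. The function name `case_table` and the addresses used are hypothetical. In assembler source the same values would typically be written as label differences in `.word` statements.

```python
import struct

def case_table(table_base, targets):
    """Return the bytes of a case displacement table.

    table_base -- address of the first displacement word
    targets    -- addresses of the code for each case, in selector order
    """
    words = []
    for addr in targets:
        disp = addr - table_base          # relative to the table start
        if not -32768 <= disp <= 32767:
            raise ValueError("case target out of word displacement range")
        words.append(disp)
    # Each entry is a signed 16-bit word in little-endian byte order.
    return struct.pack("<%dh" % len(words), *words)
```

Each target must lie within the range of a signed word displacement from the table, which is why the sketch rejects anything farther away.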
| 121 | .SH 2 "Extended branch instructions" |
| 122 | .pp |
| 123 | These opcodes (formed in general |
| 124 | by substituting a |
| 125 | .q j |
| 126 | for the initial |
| 127 | .q b |
| 128 | of the standard opcodes) |
| 129 | take as their branch destination |
| 130 | the name of a label in the current subsegment. |
| 131 | It is an error if the destination is known to be in a different subsegment, |
| 132 | and it is a warning if the destination is not defined within |
| 133 | the object module being assembled. |
| 134 | .pp |
| 135 | If the branch destination is close enough, |
| 136 | then the corresponding |
| 137 | short branch |
| 138 | .q b |
| 139 | instruction is assembled. |
| 140 | Otherwise the assembler chooses a sequence |
| 141 | of one or more instructions which together have the same effect as if the |
| 142 | .q b |
| 143 | instruction had a larger span. |
| 144 | In general, |
| 145 | .i as |
| 146 | chooses the inverse branch followed by a |
| 147 | .b brw , |
| 148 | but a |
| 149 | .b brw |
| 150 | is sometimes pooled among several |
| 151 | .q j |
| 152 | instructions with the same destination. |
| 153 | .pp |
| 154 | .i As |
| 155 | is unable to perform the same long/short branch generation |
| 156 | for other instructions with a fixed byte displacement, |
| 157 | such as the |
| 158 | .b sob , |
| 159 | .b aob |
| 160 | families, |
| 161 | or for the |
| 162 | .b acbx |
| 163 | family of instructions which has a fixed word displacement. |
| 164 | This would be desirable, |
| 165 | but is prohibitive because of the complexity of these instructions. |
| 166 | .pp |
| 167 | If the |
| 168 | .b \-J |
| 169 | assembler option is given, |
| 170 | a |
| 171 | .b jmp |
| 172 | instruction is used instead of a |
| 173 | .b brw |
| 174 | instruction |
| 175 | for |
| 176 | .b ALL |
| 177 | .q j |
| 178 | instructions with distant destinations. |
| 179 | This makes assembly of large (>32K bytes) |
| 180 | programs (inefficiently) |
| 181 | possible. |
| 182 | .i As |
| 183 | does not try to use clever combinations of |
| 184 | .b brb , |
| 185 | .b brw |
| 186 | and |
| 187 | .b jmp |
| 188 | instructions. |
| 189 | The |
| 190 | .b jmp |
| 191 | instructions use PC relative addressing, |
| 192 | with the length of the offset given by the |
| 193 | .b \-d |
| 194 | assembler |
| 195 | option. |
| 196 | .pp |
| 197 | These are the extended branch instructions |
| 198 | .i as |
| 199 | recognizes: |
| 200 | .(b |
| 201 | .TS |
| 202 | center; |
| 203 | lb lb lb lb. |
| 204 | jeql jeqlu jneq jnequ |
| 205 | jgeq jgequ jgtr jgtru |
| 206 | jleq jlequ jlss jlssu |
| 207 | jbcc jbsc jbcs jbss |
| 208 | |
| 209 | jlbc jlbs |
| 210 | jcc jcs |
| 211 | jvc jvs |
| 212 | jbc jbs |
| 213 | jbr |
| 214 | .TE |
| 215 | .)b |
| 216 | .pp |
| 217 | Note that |
| 218 | .b jbr |
| 219 | turns into |
| 220 | .b brb |
| 221 | if its target is close enough; |
| 222 | otherwise a |
| 223 | .b brw |
| 224 | is used. |
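The two-level choice described in this section can be sketched in Python as follows. This is a simplified model, not the assembler's implementation: the function `expand`, the label names `target` and `skip`, and the partial `INVERSE` table are hypothetical, the distance is taken as already known, and the pooling of a shared `brw` among several instructions is not modeled. The sketch also assumes that a destination beyond word range is an error unless the `-J` behavior is requested.

```python
# Inverse conditions for a few of the conditional branches (partial table).
INVERSE = {
    "jeql": "bneq", "jneq": "beql",
    "jgtr": "bleq", "jleq": "bgtr",
    "jgeq": "blss", "jlss": "bgeq",
}

def expand(op, distance, use_jmp=False):
    """Return the instruction sequence chosen for an extended branch.

    op       -- extended opcode, e.g. 'jeql' or 'jbr'
    distance -- signed byte distance to the destination
    use_jmp  -- models the -J option: use jmp for all distant destinations
    """
    short = "brb" if op == "jbr" else "b" + op[1:]
    if -128 <= distance <= 127:
        return [short + " target"]            # close enough: short branch
    if use_jmp:
        long_branch = "jmp target"
    elif -32768 <= distance <= 32767:
        long_branch = "brw target"
    else:
        raise ValueError("destination out of brw range; -J would be needed")
    if op == "jbr":
        return [long_branch]                  # unconditional: no inverse test
    # Conditional: inverse branch around the long unconditional branch.
    return [INVERSE[op] + " skip", long_branch, "skip:"]
```

For instance, `expand("jeql", 1000)` yields an inverse `bneq` around a `brw`, while `expand("jeql", 1000, use_jmp=True)` substitutes a `jmp` for the `brw`.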
| 225 | .SH 1 "Diagnostics" |
| 226 | .pp |
| 227 | Diagnostics are intended to be self-explanatory and appear on |
| 228 | the standard output. |
| 229 | Diagnostics report either an |
| 230 | .i error |
| 231 | or a |
| 232 | .i warning . |
| 233 | Error diagnostics complain about lexical, syntactic and some |
| 234 | semantic errors, and abort the assembly. |
| 235 | .pp |
| 236 | The majority of the warnings complain about the use of \*(VX |
| 237 | features not supported by all implementations of the architecture. |
| 238 | .i As |
| 239 | will warn if new opcodes are used, |
| 240 | if |
| 241 | .q G |
| 242 | or |
| 243 | .q H |
| 244 | floating-point numbers are used, |
| 245 | and will complain about mixed floating conversions. |
| 246 | .SH 1 "Limits" |
| 247 | .(b |
| 248 | .TS |
| 249 | center; |
| 250 | l l. |
| 251 | limit what |
| 252 | _ |
| 253 | Arbitrary\** Files to assemble |
| 254 | BUFSIZ Significant characters per name |
| 255 | Arbitrary Characters per input line |
| 256 | Arbitrary Characters per string |
| 257 | Arbitrary Symbols |
| 258 | 4 Text segments |
| 259 | 4 Data segments |
| 260 | .TE |
| 261 | .)b |
| 262 | .(f |
| 263 | \**The number of characters available to the \fIargv\fP line |
| 264 | is, however, restricted by \*(UX to 10240. |
| 265 | .)f |
| 266 | .SH 1 "Annoyances and Future Work" |
| 267 | .pp |
| 268 | Most of the annoyances deal with restrictions on the extended |
| 269 | branch instructions. |
| 270 | .pp |
| 271 | .i As |
| 272 | uses only a two-level algorithm for resolving extended branch |
| 273 | instructions into short or long displacements. |
| 274 | What is really needed is a general mechanism |
| 275 | to turn a short conditional jump into a |
| 276 | reverse conditional jump over one of |
| 277 | .b two |
| 278 | possible unconditional branches, |
| 279 | either a |
| 280 | .b brw |
| 281 | or a |
| 282 | .b jmp |
| 283 | instruction. |
| 284 | Currently, the |
| 285 | .b \-J |
| 286 | option forces the |
| 287 | .b jmp |
| 288 | instruction to |
| 289 | .i always |
| 290 | be used, |
| 291 | instead of falling back to it only when the |
| 292 | shorter |
| 293 | .b brw |
| 294 | instruction cannot reach the destination. |
| 295 | .pp |
| 296 | The assembler should also recognize extended branch instructions for |
| 297 | .b sob , |
| 298 | .b aob , |
| 299 | and |
| 300 | .b acbx |
| 301 | instructions. |
| 302 | .b Sob |
| 303 | instructions will be easy; |
| 304 | .b aob |
| 305 | will be harder because the synthesized instruction |
| 306 | uses the index operand twice, |
| 307 | so one must be careful of side effects, |
| 308 | and the |
| 309 | .b acbx |
| 310 | family will be much harder (in the general case) |
| 311 | because the comparison depends on the sign of the addend operand, |
| 312 | and two operands are used more than once. |
| 313 | Augmenting |
| 314 | .i as |
| 315 | with these extended loop instructions |
| 316 | will allow the peephole optimizer to produce much better |
| 317 | loop optimizations, |
| 318 | since it currently assumes the worst |
| 319 | case about the size of the loop body. |
| 320 | .pp |
| 321 | The string temporary file is not put in memory when the \-V flag is set. |
| 322 | The string table in the generated a.out contains some strings |
| 323 | and names that are never referenced from the symbol table; |
| 324 | the loader removes these unreferenced strings, however. |