# Overview #

TODO: Write introduction. Goal is to build a cross compiler targeting pdp11-aout.

TODO: What kind of joint header do I want across all the articles in a set, linking them together?

This document guides you through building a cross compiler using GCC on
FreeBSD. This cross compiler will run on a modern AMD64 machine but emit code
which runs on a DEC PDP-11. In addition to the compiler, these instructions
also build associated tooling like an assembler, linker, etc.

In this manner, modern programming tools like `make`, `git`, `vi`, and more can
be used to write modern C in your usual style while targeting the PDP-11.


# Installation #

These instructions were tested on FreeBSD 12 with GCC 7.3.0 from ports as the
host compiler. The cross compiler was built from the GCC 10.2.0 and Binutils
2.35.1 source code.

Building GCC requires GNU Make. On FreeBSD either install via `pkg install
gmake` or build from ports under `devel/gmake`. On Linux your `make` command is
probably `gmake` in disguise. Run `make --version` and see if the first line is
something like `GNU Make 4.2.1`.

In addition to GCC, we will also need to compile GNU Binutils since it contains
the assembler, linker, and other necessary tools.

Obtain suitable source code tarballs from these links.

- <https://www.gnu.org/software/binutils/>

- <https://www.gnu.org/software/gcc/>

I like to build all my cross compilers under one folder in my home directory,
each with a version specific sub-folder.

    setenv PREFIX "$HOME/cross-compiler/pdp11-gcc10.2.0"

Remember to make any `$PATH` changes permanent. For `tcsh` on FreeBSD, this
means editing `~/.cshrc`. To set the `$PATH` for this session, execute the
following.

    setenv PATH "$PREFIX/bin:$PATH"

The `$TARGET` environment variable is critical as it tells GCC what kind of
cross compiler we desire. In our case, this [target
triplet](https://wiki.osdev.org/Target_Triplet) is requesting code for the
PDP-11 architecture, wrapped in an `a.out` container, with no hosted
environment. That means this is a bare-metal target. There will be no C
standard library, only the C language itself.

    setenv TARGET pdp11-aout

Both GCC and binutils are best built from outside the source tree. Make two
directories to hold the build detritus. Use a clean build directory each time
you reconfigure or rebuild.

    cd $HOME/cross-compiler/pdp11-gcc10.2.0
    mkdir workdir-binutils
    mkdir workdir-gcc

Build binutils first. Assuming you saved the source code in
`~/cross-compiler/pdp11-gcc10.2.0/`, simply do the following.

    cd $HOME/cross-compiler/pdp11-gcc10.2.0
    tar xzf binutils-2.35.1.tar.gz
    cd workdir-binutils

Now configure, build and install binutils.

    ../binutils-2.35.1/configure --target=$TARGET --prefix="$PREFIX" \
        --with-sysroot --disable-nls --disable-werror
    gmake
    gmake install

Verify that you can access a series of files in your `$PATH` named
`pdp11-aout-*` (e.g. `pdp11-aout-as`), and that checking their version with
`pdp11-aout-as --version` results in something like `GNU Binutils 2.35.1`.

With binutils built and installed, now it's time to build GCC.

Follow a similar process to unpack the source code, but note the new
requirement to download dependencies. In older versions of GCC this command was
`./contrib/download-dependencies` instead of
`./contrib/download_prerequisites`.

    cd $HOME/cross-compiler/pdp11-gcc10.2.0
    tar xzf gcc-10.2.0.tar.gz
    cd gcc-10.2.0
    ./contrib/download_prerequisites
    cd ../workdir-gcc

Configuring GCC proceeds similarly to binutils. Both GNU `as` and GNU `ld` are
part of binutils, hence the `--with-gnu-as` and `--with-gnu-ld` options telling
GCC to use them.

    ../gcc-10.2.0/configure --target=$TARGET --prefix="$PREFIX" \
        --disable-nls --enable-languages=c --without-headers \
        --with-gnu-as --with-gnu-ld --disable-libssp
    gmake all-gcc
    gmake install-gcc

Verify that `pdp11-aout-gcc --version` from your `$PATH` reports something like
`pdp11-aout-gcc 10.2.0`.

That's it, you're done. You now have a cross compiler that will run on your
workstation and output PDP-11 compatible binaries in `a.out` format.

At this point you can [skip ahead to the next section](TODO) or continue
reading about some potential pitfalls of the cross compiler we've just built.


# Potential Pitfalls #

Below are a few problems I ran into while using my cross compiler, some of
which may apply when compiling your own code for the PDP-11. I hope that by
mentioning the problems here, along with symptoms and workarounds, you might be
saved some time when encountering them.

## Compiling libgcc ##

Our newly built cross compiler expects `libgcc` to exist at link time, but we
didn't build it. So what is `libgcc` anyway? Quoting from the [GCC
manual](https://gcc.gnu.org/onlinedocs/gccint/Libgcc.html):

> GCC provides a low-level runtime library, libgcc.a or libgcc_s.so.1 on some
> platforms. GCC generates calls to routines in this library automatically,
> whenever it needs to perform some operation that is too complicated to emit
> inline code for.
>
> Most of the routines in libgcc handle arithmetic operations that the target
> processor cannot perform directly. This includes integer multiply and divide on
> some machines, and all floating-point and fixed-point operations on other
> machines. libgcc also includes routines for exception handling, and a handful
> of miscellaneous operations.
>
> Some of these routines can be defined in mostly machine-independent C. Others
> must be hand-written in assembly language for each processor that needs them.

Why didn't we build `libgcc`? Because we encountered this [error
message](./pdp11-cross-compiler-libgcc-errormsg.txt).


### Problem ###

Consider the following C code which performs division and modulus operations on
16-bit unsigned integers.

    #include "pdp11.h"    /* my own bare-metal header; presumably where printf() comes from */
    #include <stdint.h>

    uint16_t a=8, b=64;
    printf("b %% a = %o\n", b % a);    /* a literal percent sign is written %% */
    printf("b / a = %o\n", b / a);

If we try to compile this code, we receive two errors from the linker.

    pdp11-aout-ld: example.o:example.o:(.text+0x8e): undefined reference to `__umodhi3'
    pdp11-aout-ld: example.o:example.o:(.text+0xac): undefined reference to `__udivhi3'

The two functions referenced, `__umodhi3` and `__udivhi3`, are part of `libgcc`.
The names reference the **u**nsigned **mod**ulo or **div**ision on
**h**alf-**i**nteger types. Per the [GCC
manual](https://gcc.gnu.org/onlinedocs/gccint/Machine-Modes.html#Machine-Modes),
the half-integer mode uses a two-byte integer.


### Solution ###

There are two ways around this problem.

The first (and superior) option is figuring out how to build `libgcc`. The
command to initiate the build is `gmake all-target-libgcc`, executed under the
same environment in which `gmake all-gcc` was executed earlier in this guide.
If you figure out what I'm doing wrong, let me know.

The second option is to implement your own functions for `__umodhi3()`,
`__udivhi3()`, and whatever else might come up. It's not hard to make something
functional, though catching all the edge cases could be challenging.


## Using uint32 ##

Although the PDP-11 utilizes a 16-bit word, GCC is clever enough to allow
operations on 32-bit words by breaking them up into smaller operations. For
example, in the following assembly code generated by GCC, note how the 32-bit
word is pushed onto the stack as two separate words.

    uint32_t a=0710004010;          uint16_t a=010;

    add $-4, sp                     add $-2, sp
    mov $3440, (sp)                 mov $10, (sp)
    mov $4010, 2(sp)


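To see where the two octal constants in that listing come from, the split can be reproduced on the workstation. This is a small host-side sketch compiled with the native compiler, not the cross compiler. The helper names `high_word`, `low_word`, and `print_words` are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Upper 16 bits of a 32-bit value. */
static uint16_t high_word(uint32_t v)
{
    return (uint16_t)(v >> 16);
}

/* Lower 16 bits of a 32-bit value. */
static uint16_t low_word(uint32_t v)
{
    return (uint16_t)(v & 0xFFFFu);
}

/* Print both halves in octal, the usual PDP-11 notation. */
static void print_words(uint32_t v)
{
    printf("high=%o low=%o\n", (unsigned)high_word(v), (unsigned)low_word(v));
}
```

Calling `print_words(0710004010)` prints `high=3440 low=4010`, matching the two `mov` instructions: the high word lands at `(sp)` and the low word at `2(sp)`.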
### Problem ###

Whenever I try to make real use of code with `uint32_t`, I encounter internal
compiler errors like the following.

    memtest.c:119:1: error: insn does not satisfy its constraints:
     }
     ^
    (insn 95 44 45 (set (reg:HI 1 r1)
            (reg/f:HI 16 virtual-incoming-args)) "memtest.c":114 14 {movhi}
         (nil))
    memtest.c:119:1: internal compiler error: in extract_constrain_insn_cached, at recog.c:2225
    no stack trace because unwind library not available
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <https://gcc.gnu.org/bugs/> for instructions.
    *** Error code 1

In each case, adding a single `uint32_t` operation in one spot in the code
resulted in a compiler error in a completely different part of the code.
Removing the offending `uint32_t` line caused the program to again compile and
execute normally. In each case, I already had `uint32_t` related code working
elsewhere in the program.


### Solution ###

Until I track down the bug causing these errors, I've been using structs
containing pairs of `uint16_t` words and writing helper functions to perform
operations on them.


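The workaround might look roughly like the following. This is a sketch of the idea rather than the actual helper code. The struct name `u32pair` and the functions `u32pair_add` and `u32pair_cmp` are hypothetical. The high word is placed first only to mirror the stack layout shown earlier.

```c
#include <stdint.h>

/* Hypothetical stand-in for uint32_t: two 16-bit halves, high word first. */
struct u32pair {
    uint16_t hi;
    uint16_t lo;
};

/* Add two pairs, propagating the carry from the low word; wraps modulo 2^32. */
static struct u32pair u32pair_add(struct u32pair a, struct u32pair b)
{
    struct u32pair sum;

    sum.lo = (uint16_t)(a.lo + b.lo);
    /* After unsigned wraparound the low sum is smaller than either operand. */
    sum.hi = (uint16_t)(a.hi + b.hi + (sum.lo < a.lo ? 1 : 0));
    return sum;
}

/* Three-way compare: returns -1, 0, or 1 for a < b, a == b, a > b. */
static int u32pair_cmp(struct u32pair a, struct u32pair b)
{
    if (a.hi != b.hi)
        return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo)
        return a.lo < b.lo ? -1 : 1;
    return 0;
}
```

For example, adding `{1, 0xFFFF}` and `{0, 1}` carries into the high word and yields `{2, 0}`. Subtraction, shifts, and multiplication would follow the same pattern of working one 16-bit word at a time.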
## GNU Assembler Bug ##

If you're stuck using an older version of GNU binutils, as I was while cross
compiling from a SPARCstation 20, there is a bug in the GNU assembler that
crops up whenever double-indirection is used in GCC. It was present until at
least GNU Binutils 2.28 but appears to be fixed no later than 2.32 per the
following code snippet in `binutils-2.32/gas/config/tc-pdp11.c`.

    if (*str == '@' || *str == '*')
      {
        /* @(Rn) == @0(Rn): Mode 7, Indexed deferred.
           Check for auto-increment deferred.  */
        if ( ...


### Problem ###

One of the addressing modes supported by the PDP-11 is 'index deferred',
represented by `@X(Rn)`. This operand indicates that `X` is added to the
contents of `Rn` to form the address of a pointer, and that pointer is then
dereferenced to reach the final location. For example, consider the following
five values, one stored in a register and the other four in memory. Then
`@2(R1)` is the value `222`.

    R1: 1000
    1000: 2000
    1002: 3000
    2000: 111
    3000: 222

Similarly, `@0(R1)` is the value `111`. In most PDP-11 assemblers, including
DEC's MACRO-11 assembler, the string `@(Rn)` is an alias to `@0(Rn)`. But when
the GNU assembler encounters `@(Rn)` it assembles it as though it were `(Rn)`,
a single level of indirection instead of two levels!

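A small host-side simulation can make the difference between one and two levels of indirection concrete. It is only a sketch: the `mem` array, the helpers `rd`, `wr`, `ea_reg_deferred`, `ea_index_deferred`, and `load_example` are all hypothetical, and the memory contents are illustrative octal values (1000: 2000, 1002: 3000, 2000: 111, 3000: 222). With `R1` holding `01000`, reading through `ea_index_deferred(2, 01000)` reaches `0222`, reading through `ea_index_deferred(0, 01000)` reaches `0111`, and reading through `ea_reg_deferred(01000)` stops one level short at `02000`.

```c
#include <stdint.h>

/* Simulated 64 KiB of PDP-11 memory, stored as 16-bit words. */
static uint16_t mem[32768];

/* Read the word at a (word-aligned) byte address. */
static uint16_t rd(uint16_t addr)
{
    return mem[addr >> 1];
}

/* Write the word at a (word-aligned) byte address. */
static void wr(uint16_t addr, uint16_t val)
{
    mem[addr >> 1] = val;
}

/* (Rn), register deferred: the register itself holds the operand address. */
static uint16_t ea_reg_deferred(uint16_t rn)
{
    return rn;
}

/* @X(Rn), index deferred: the word at X+Rn holds the operand address. */
static uint16_t ea_index_deferred(uint16_t x, uint16_t rn)
{
    return rd((uint16_t)(x + rn));
}

/* Illustrative memory contents, all octal. */
static void load_example(void)
{
    wr(01000, 02000);
    wr(01002, 03000);
    wr(02000, 0111);
    wr(03000, 0222);
}
```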
If we're only writing assembly then we can work around this bug by always using
the form `@0(Rn)`. But what if we're writing C and using GCC to compile it?
Consider the following C code example, taken directly from some stack-based
debugger code written for the PDP-11.

    uint16_t ** csp = (uint16_t **) 070000;
    *csp = (uint16_t *) 060000;
    **csp = 0;

When GCC compiles this to assembly it generates code of the form `@(Rn)` when
assigning a value to `**csp` thus causing the value `0` to overwrite the value
`060000` at `*csp` if GNU `as` is used to assemble the code.


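The effect of the bug on those three statements can be modelled on the workstation. The sketch below assumes GCC keeps `csp` in a register and emits the final store through `@(Rn)`. The simulated memory and the functions `store_at_deferred` and `run_csp_example` are hypothetical, and a marker value is planted at `060000` purely so the store of `0` can be observed.

```c
#include <stdint.h>

/* Simulated 64 KiB of PDP-11 memory, stored as 16-bit words. */
static uint16_t mem[32768];

static uint16_t rd(uint16_t addr)
{
    return mem[addr >> 1];
}

static void wr(uint16_t addr, uint16_t val)
{
    mem[addr >> 1] = val;
}

/*
 * Store val through the operand `@(Rn)`.
 * buggy == 0: treated as @0(Rn), two levels of indirection.
 * buggy != 0: treated as (Rn), modelling the old assembler behaviour.
 */
static void store_at_deferred(uint16_t rn, uint16_t val, int buggy)
{
    uint16_t ea = buggy ? rn : rd(rn);
    wr(ea, val);
}

/* Run the three C statements and return the word left at 070000. */
static uint16_t run_csp_example(int buggy)
{
    uint16_t csp = 070000;              /* uint16_t ** csp = (uint16_t **) 070000; */

    wr(csp, 060000);                    /* *csp = (uint16_t *) 060000; */
    wr(060000, 0177777);                /* marker so the final store is visible */
    store_at_deferred(csp, 0, buggy);   /* **csp = 0; */
    return rd(csp);
}
```

With the correct interpretation, `run_csp_example(0)` returns `060000` and the word at `060000` becomes `0`. With the buggy interpretation, `run_csp_example(1)` returns `0` and the marker at `060000` is left untouched.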
### Solution ###

The following patch, tested on GNU binutils 2.28, fixes the bug. It's a little
hacky since it overloads the `operand->code` variable to pass unrelated state
information to `parse_reg()`.

    --- tc-pdp11.c 2017-06-24 22:33:00.260210000 -0700
    +++ tc-pdp11.c.fixed 2017-06-24 22:32:12.455205000 -0700
    @@ -431,6 +431,9 @@
     {
       LITTLENUM_TYPE literal_float[2];

    +  /* Store the value (if any) passed by parse_op_noreg() before parse_reg() overwrites it. */
    +  int deferred = operand->code;
    +
       str = skip_whitespace (str);

       switch (*str)
    @@ -451,6 +454,15 @@
               operand->code |= 020;
               str++;
             }
    +      /*
    +       * This catches the case where @(Rn) is interpreted as (Rn) rather than @0(Rn)
    +       */
    +      else if (deferred)
    +        {
    +          operand->additional = 1;
    +          operand->word = 0;
    +          operand->code |= 060;
    +        }
           else
             {
               operand->code |= 010;
    @@ -581,6 +593,12 @@

       if (*str == '@' || *str == '*')
         {
    +      /*
    +       * operand->code is overwritten by parse_reg() inside parse_op_no_deferred()
    +       * We use it to temporarily catch the alias @(Rn) -> @0(Rn) since
    +       * parse_op_no_deferred() starts at str+1 and thus misses the '@'.
    +       */
    +      operand->code |= 010;
           str = parse_op_no_deferred (str + 1, operand);
           if (operand->error)
             return str;

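The octal constants in the patch are easier to read with the PDP-11 operand layout in mind: each operand is a 6-bit field with three mode bits above three register bits. So `010` is mode 1 (`(Rn)`), `020` is mode 2 (`(Rn)+`), and `060` is mode 6 (`X(Rn)`). If the deferred bit `010` is then OR-ed onto `060`, the result is `070`, mode 7, which is `@X(Rn)`. The sketch below only illustrates that encoding. The enum names and `operand_field` are hypothetical and do not appear in binutils.

```c
#include <stdint.h>

/* Addressing modes touched by the patch (mode numbers, before shifting). */
enum {
    MODE_REG_DEFERRED   = 1,  /* (Rn)   */
    MODE_AUTOINC        = 2,  /* (Rn)+  */
    MODE_INDEX          = 6,  /* X(Rn)  */
    MODE_INDEX_DEFERRED = 7   /* @X(Rn) */
};

/* Build the 6-bit operand field: mode in bits 3-5, register in bits 0-2. */
static uint8_t operand_field(unsigned mode, unsigned reg)
{
    return (uint8_t)(((mode & 7u) << 3) | (reg & 7u));
}
```

For `R1`, the buggy reading of `@(R1)` corresponds to the field `011`, while the intended `@0(R1)` is `071` followed by an extra word holding the index `0`. That extra word is what `operand->additional = 1` and `operand->word = 0` request in the patch.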