Commit | Line | Data |
---|---|---|
be80b0b2 OB |
1 | .bd S B 3 |
2 | .TL | |
3 | Setting up the Third Berkeley Software Tape* | |
4 | .AU | |
5 | William N. Joy | |
6 | Ozalp Babaoglu | |
7 | .AI | |
8 | Computer Science Division | |
9 | Department of Electrical Engineering and Computer Science | |
10 | University of California, Berkeley | |
11 | Berkeley, California 94720 | |
12 | .PP | |
13 | .de IR | |
14 | \fI\\$1\fP\\$2 | |
15 | .. | |
16 | .de UX | |
17 | \s-2UNIX\s0\\$1 | |
18 | .. | |
19 | .FS | |
20 | *An early version of this paper appeared under the title | |
21 | ``Setting up the Berkeley Virtual Memory Extensions to the | |
22 | \s-2UNIX\s0 | |
23 | Operating System'' | |
24 | and, no doubt, references to this paper by this name exist elsewhere | |
25 | in the documentation. | |
26 | Portions of this document are adapted from | |
27 | ``Setting Up Unix/32V Version 1.0'' | |
28 | by Thomas B. London and John F. Reiser. | |
29 | .FE | |
30 | The distribution tape can be used only on a DEC VAX-11/780** | |
31 | with RM03 or RP06 disks and with | |
32 | TE16 tape drives. | |
33 | We have the ability to make tapes for systems with UNIBUS** disks, but | |
34 | .FS | |
35 | ** DEC, VAX, UNIBUS and MASSBUS are trademarks of | |
36 | Digital Equipment Corporation. | |
37 | .FE | |
38 | .FS | |
39 | \(dg \s-2UNIX\s0 is a Trademark of Bell Laboratories. | |
40 | .FE | |
41 | such tapes are inherently rather system-specific, and will not be | |
42 | discussed here. | |
43 | The tape consists of some preliminary bootstrapping programs followed by | |
44 | one dump of a filesystem (see | |
45 | .IR dump (1)) | |
46 | and one tape archive image (see | |
47 | .IR tar (1)); | |
48 | if needed, | |
49 | individual files can be extracted | |
50 | after the initial construction of the filesystems. | |
51 | .PP | |
52 | If you are set up to do it, | |
53 | it is a good idea immediately to make a copy of the | |
54 | tape to guard against disaster. | |
55 | The tape is 9-track 1600 BPI and contains some 512-byte records | |
56 | followed by many 10240-byte records. | |
57 | There are interspersed tapemarks; end-of-tape is signalled | |
58 | by a double end-of-file. | |
59 | .PP | |
60 | The tape contains binary images | |
61 | of the system and all the user level programs, along with source | |
62 | and manual sections for them. | |
63 | There are about 4200 | |
64 | .UX \(dg | |
65 | files altogether. | |
66 | The first tape file contains bootstrapping programs. | |
67 | The second tape file is to be put on one filesystem | |
68 | called the `root filesystem', and contains essential binaries and enough | |
69 | other files to allow the system to run. | |
70 | The third tape file has all of the source and documentation. | |
71 | Altogether the files provided on the tape occupy approximately 40000 512 byte | |
72 | blocks.\(dd | |
73 | .FS | |
74 | \(dd\s-2UNIX\s0 traditionally talks in terms of 512 character blocks, and | |
75 | for consistency | |
76 | across different versions of | |
77 | .UX | |
78 | and to avoid mass confusion, user programs in the Virtual Vax version of the | |
79 | system also talk in terms of 512 byte blocks, despite the fact that the file | |
80 | system allocates 1024 byte blocks of disk space. | |
81 | All user programs such as | |
82 | .IR ls (1) | |
83 | and | |
84 | .IR du (1) | |
85 | speak in terms of 512 byte blocks; only system maintenance programs such | |
86 | as | |
87 | .IR mkfs (1), | |
88 | .IR icheck (1), | |
89 | .IR dump (1), | |
90 | and | |
91 | .IR df (1), | |
92 | speak in terms of 1024 byte blocks. It is true that i/o is most efficient in 1024 | |
93 | byte quantities, but it is most natural for the user to think of this | |
94 | as ``2 blocks at a time.'' | |
95 | In any case, packs remain sectored 512 bytes per sector, and at the lowest | |
96 | driver levels the system deals with 512 byte disk records. | |
97 | .FE | |
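The relationship between the two block sizes described in the footnote can be sketched in a few lines of shell. This is only an illustration: the helper name to_512_blocks is hypothetical and is not a program on the tape, and the arithmetic expansion used here assumes a modern POSIX shell rather than the shell distributed with this system.

```shell
# Convert a count of 1024-byte file system blocks (the unit used by
# mkfs, icheck, dump and df) into the 512-byte blocks that ls and du
# report.  to_512_blocks is a hypothetical helper name.
to_512_blocks() {
    echo $(( $1 * 2 ))
}

# Root file system size given to the standalone mkfs in this document:
to_512_blocks 7942      # prints 15884
```

A file system created with 7942 blocks by mkfs would therefore be described as 15884 blocks by the user-level programs.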
98 | .SH | |
99 | Making a disk from tape | |
100 | .PP | |
101 | This description is an annotated version of the `sysgen' manual | |
102 | page in section 8 of the UNIX Programmer's Manual. | |
103 | Before you begin to work on the remainder of this manual, be sure you | |
104 | have an up to date manual, and that you have applied all updates | |
105 | to the manual which were provided with it, in the correct order. | |
106 | .PP | |
107 | Perform the following bootstrap procedure to obtain | |
108 | a disk with a root filesystem on it. | |
109 | .IP 1. | |
110 | Mount the magtape on drive 0 at load point, making | |
111 | sure that the ring is not inserted. | |
112 | .IP 2. | |
113 | Mount a disk pack on drive 0. | |
114 | .IP 3. | |
115 | Key in at 50000 and execute the following boot program: | |
116 | You may enter in lower-case; the LSI-11 will echo in upper-case. | |
117 | The machine's printouts are shown in boldface, | |
118 | explanatory comments are within ( ). | |
119 | Terminate each line you type by carriage return or line-feed. | |
120 | .RT | |
121 | .DS | |
122 | \fB>>>\|\fRHALT | |
123 | \fB>>>\|\fRUNJAM | |
124 | \fB>>>\|\fRINIT | |
125 | \fB>>>\|\fRD 50000 20009FDE | |
126 | \fB>>>\|\fRD+ D0512001 | |
127 | \fB>>>\|\fRD+ 3204A101 | |
128 | \fB>>>\|\fRD+ C114C08F | |
129 | \fB>>>\|\fRD+ A1D40424 | |
130 | \fB>>>\|\fRD+ 008FD00C | |
131 | \fB>>>\|\fRD+ C1800000 | |
132 | \fB>>>\|\fRD+ 8F320800 | |
133 | \fB>>>\|\fRD+ 10A1FE00 | |
134 | \fB>>>\|\fRD+ 00C139D0 | |
135 | \fB>>>\|\fRD+ 00000004 | |
136 | \fB>>>\|\fRE 50000/NE:A | |
137 | \&... (machine prints out values, check typing) | |
138 | \fB>>>\|\fRSTART 50000 | |
139 | .DE | |
140 | .IP | |
141 | The tape should move and the CPU should halt at location 5002A. | |
142 | If it doesn't, you probably have entered the program | |
143 | incorrectly. | |
144 | Start over and check your typing. | |
145 | .IP 4. | |
146 | Start the CPU with | |
147 | .DS | |
148 | \fB>>>\|\fRSTART 0 | |
149 | .DE | |
150 | .IP 5. | |
151 | The console should type | |
152 | .DS | |
153 | .I | |
154 | \fB=\fR | |
155 | .R | |
156 | .DE | |
157 | If the disk pack is already formatted, skip to step 6. | |
158 | Otherwise, format the pack with: | |
159 | .DS | |
160 | (bring in standalone RP06 formatter) | |
161 | \fB=\|\fRrp6fmt | |
162 | \fBformat : Format RP06/RM03 Disk\fR | |
163 | ||
164 | \fBMBA no. : \fR0 (format spindle on mba \# 0) | |
165 | \fBunit : \fR0 (format unit zero) | |
166 | (this procedure should take about 20 minutes) | |
167 | (some diagnostic messages may appear here) | |
168 | ||
169 | \fBunit : \fR-1 (exit from formatter) | |
170 | \fB=\fR (back at tape boot level) | |
171 | .DE | |
172 | .IP 6. | |
173 | Next, verify the readability of the pack via | |
174 | .DS | |
175 | (bring in RP06 verifier) | |
176 | \fB=\|\fRrpread | |
177 | \fBdread : Read RP06/RM03 Disk\fR | |
178 | ||
179 | \fBdisk unit : \fR0 (specify unit zero) | |
180 | \fBstart block : \fR0 (start at block zero) | |
181 | \fBno. blocks :\fR (default is entire pack) | |
182 | ||
183 | (this procedure should take about 10 minutes) | |
184 | (some diagnostic messages may appear here) | |
185 | \fB# Data Check errors : nn\fR (number of soft errors) | |
186 | \fB# Other errors : xx\fR (number of hard errors) | |
187 | \fBdisk unit: \fR\-1 (exit from rpread) | |
188 | \fB=\fR (back to tape boot) | |
189 | .DE | |
190 | If the number of `Other errors' is not zero, consideration | |
191 | should be given to obtaining a clean pack before proceeding | |
192 | further. | |
193 | .IP 7. | |
194 | Create the root file system with the following procedure: | |
195 | .DS | |
196 | (bring in a standalone version of the \fImkfs\fR (1) program) | |
197 | \fB=\|\fRmkfs | |
198 | \fBfile sys size:\fR 7942 (number of 1024 byte blocks in root) | |
199 | \fBfile system:\fR hp(0,0) (root is on drive zero; first filsys there) | |
200 | \fBisize = 5072\fR (number of inodes in root filesystem) | |
201 | \fBm/n = 3 500\fR (interleave parameters) | |
202 | \fB=\fR (back at tape boot level) | |
203 | .DE | |
204 | You now have an empty UNIX root filesystem. | |
205 | To restore the data which you need to boot the system, type | |
206 | .DS | |
207 | (bring in a standalone \fIrestor\fR\|(1) program) | |
208 | \fB=\|\fRrestor | |
209 | \fBTape?\fR ht(1,1) (1600 bpi, second tape file) | |
210 | \fBDisk?\fR hp(0,0) (into root file system) | |
211 | \fBLast chance before scribbling on disk.\fR (just hit return) | |
212 | (30 second pause then tape should move) | |
213 | \&... | |
214 | \fB=\fR (back at tape boot level) | |
215 | .DE | |
216 | Now, you are ready to boot up | |
217 | .UX. | |
218 | .SH | |
219 | Booting UNIX | |
220 | .PP | |
221 | Now boot UNIX: | |
222 | .DS | |
223 | (load bootstrap program) | |
224 | \fB=\fR\|boot | |
225 | \fBBoot\fR | |
226 | \fB: \fRhp(0,0)vmunix (bring in \fIvmunix\fR off root system) | |
227 | .DE | |
228 | The bootstrap should then print out the sizes of the different | |
229 | parts of the system (text, initialized and uninitialized data) and then | |
230 | the system should start with a message which looks like: | |
231 | .DS | |
232 | .B | |
233 | 61656+61072+70120 start 0x4b4 | |
234 | VM/UNIX (Berkeley Version 2.1) 1/5/80 | |
235 | real mem = \fIxxx\fB | |
236 | avail mem = \fIyyy\fB | |
237 | ERASE IS CONTROL-H!!! | |
238 | \fB#\fR | |
239 | .R | |
240 | .DE | |
241 | The | |
242 | .I | |
243 | mem | |
244 | .R | |
245 | messages give the | |
246 | amount of real (physical) memory and the | |
247 | memory available to user programs | |
248 | in bytes. | |
249 | For example, if your machine has only 512K bytes of memory, then | |
250 | xxx will be 524288, i.e. exactly 512K. | |
251 | The ``ERASE-IS'' message is part of /.profile | |
252 | which was executed by the root shell when it started. You will | |
253 | probably want to change /.profile somewhat. | |
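As a quick check of the figure to expect in the ``real mem'' line, the byte count for a given memory size can be worked out as below. This is a sketch only; kbytes_to_bytes is a hypothetical helper name, and the arithmetic expansion assumes a modern POSIX shell.

```shell
# Convert a memory size in K bytes (1K = 1024 bytes) to bytes.
# kbytes_to_bytes is a hypothetical helper name.
kbytes_to_bytes() {
    echo $(( $1 * 1024 ))
}

kbytes_to_bytes 512     # prints 524288
```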
254 | .PP | |
255 | UNIX is now running, | |
256 | and the `UNIX Programmer's manual' applies; | |
257 | references below of the form X(Y) mean the subsection named | |
258 | X in section Y of the manual. | |
259 | The `#' is the prompt from the Shell, | |
260 | and indicates you are the super-user. | |
261 | You should first check the integrity of the root file system by | |
262 | giving the command | |
263 | .DS | |
264 | \fB#\fR chk /dev/rrp0a | |
265 | .DE | |
266 | which abbreviates | |
267 | .DS | |
268 | icheck /dev/rrp0a | |
269 | dcheck /dev/rrp0a | |
270 | .DE | |
271 | The output from | |
272 | .I chk | |
273 | should look something like: | |
274 | .DS | |
275 | .B | |
276 | icheck /dev/rrp0a | |
277 | /dev/rrp0a: | |
278 | files \0153 (r=111,d=12,b=8,c=22) | |
279 | used 1065 (i=27,ii=0,iii=0,d=1038) | |
280 | free 6558 | |
281 | missing \0\0\00 | |
282 | dcheck /dev/rrp0a | |
283 | /dev/rrp0a: | |
284 | entries link cnt | |
285 | 1 0 0 | |
286 | .R | |
287 | .DE | |
288 | .PP | |
289 | The diagnostic from \fIdcheck\fR is normal, as inode number 1 is reserved for | |
290 | placement of bad blocks, but currently unimplemented. | |
291 | .PP | |
292 | The next thing to do is to extract the rest of the data from | |
293 | the tape. | |
294 | Comments are enclosed in ( ); don't type these. | |
295 | The number given to /etc/mkfs is the | |
296 | size of the filesystem to be created, in 1024 character blocks, | |
297 | just as given to the standalone version of | |
298 | .I mkfs | |
299 | above. | |
300 | (If you have an RM-03 rather than an RP-06 use ``41040'' rather than | |
301 | ``145673'' in the procedure below.) | |
302 | .DS | |
303 | \fB#\|\fRdate \fIyymmddhhmm\fR (set date, see \fIdate\fR\|(1)) | |
304 | \fB#\|\fR/etc/mkfs /dev/rrp0g 145673 (create empty user filesystem) | |
305 | \fBisize = 65488\fR (this is the number of available inodes) | |
306 | \fBm/n = 3 500\fR (freelist interleave parameters) | |
307 | (this takes a few minutes) | |
308 | \fB#\|\fR/etc/mount /dev/rp0g /usr (mount the usr filesystem) | |
309 | \fB#\|\fRcd /usr (make /usr the current directory) | |
310 | \fB#\|\fRcp /dev/rmt5 /dev/null (skip first tape file (tp format)) | |
311 | \fB#\|\fRcp /dev/rmt5 /dev/null (skip second tape file (root)) | |
312 | \fB#\|\fRumask 22 | |
313 | \fB#\|\fRtar xbf 20 /dev/rmt1 (extract the usr filesystem) | |
314 | \fB#\|\fRdd if=/usr/mdec/uboot of=/dev/rrp0a bs=1b count=1 | |
315 | (write boot block so \fIbproc\fR(8) disk boot scheme will work) | |
316 | \fB#\|\fRcd / (back to root) | |
317 | \fB#\|\fR/etc/umount /dev/rp0g (unmount /usr) | |
318 | .DE | |
319 | All of the data on the tape has now been extracted. | |
320 | The tape will rewind automatically. | |
321 | .PP | |
322 | You should now check the consistency of the /usr file system by doing | |
323 | .DS | |
324 | \fB#\fR chk /dev/rrp0g | |
325 | .DE | |
326 | In order to use the /usr file system, you should now remount it by | |
327 | saying | |
328 | .DS | |
329 | \fB#\fR /etc/mount /dev/rp0g /usr | |
330 | .DE | |
331 | Since all directories created by | |
332 | .IR tar (1) | |
333 | will have mode 755 (due to the | |
334 | .IR umask (1) | |
335 | above), you should run the commands | |
336 | .DS | |
337 | \fB#\fR cd /usr/src/cmd/Admin | |
338 | \fB#\fR DESTDIR=/ | |
339 | \fB#\fR export DESTDIR | |
340 | \fB#\fR ./mk MODES | |
341 | .DE | |
342 | which will reset the permissions of several directories (such as /tmp) | |
343 | and also correctly set ownerships and modes of ``set-user-id'' programs | |
344 | (although they were correct already, this does no harm.) | |
345 | .SH | |
346 | Making a UNIX boot floppy | |
347 | .PP | |
348 | The next thing to do is to make a | |
349 | .UX | |
350 | boot floppy, by placing some files on a clean floppy using | |
351 | .IR arff (1). | |
352 | Place a clean floppy in the console, and issue the following commands: | |
353 | .DS | |
354 | \fB#\fR cd /usr/src/sys/floppy | |
355 | \fB#\fR arff cr * | |
356 | .DE | |
357 | This will put a copy of each file in the directory /usr/src/sys/floppy | |
358 | onto the floppy. | |
359 | You should now be able to reboot using the procedures in | |
360 | .IR boot (8). | |
361 | Try this after saying | |
362 | .DS | |
363 | \fB#\fR sync | |
364 | .DE | |
365 | which allows the system to initiate all i/o before you halt the CPU. | |
366 | It is traditional to say | |
367 | .DS | |
368 | \fB#\fR sync | |
369 | \fB#\fR sync | |
370 | .DE | |
371 | to give the system a little time, and to then reboot. | |
372 | .PP | |
373 | The boot floppy you created does not have enough files on | |
374 | it to deadstart the machine with. If your standard console floppy does | |
375 | not have a multi-segment RT-11 directory structure on it, and if you make | |
376 | a copy of it using | |
377 | .IR flcopy (1), | |
378 | delete unneeded files to make space for the | |
379 | .UX | |
380 | boot programs, and then add the files with | |
381 | .DS | |
382 | \fB#\fR cd /usr/src/sys/floppy | |
383 | \fB#\fR arff r * | |
384 | .DE | |
385 | you should be able to cold-start the machine from this floppy. | |
386 | The following set of files is believed to be adequate to cold-start | |
387 | the machine: | |
388 | .TS | |
389 | center; | |
390 | l l l. | |
391 | boot.exe restar.cmd wcsmon.sys | |
392 | consol.sys restar.ilv wcssrv.bin | |
393 | filea.pat vmb.exe | |
394 | pcs.pat wcs\fInnn\fR.pat | |
395 | .TE | |
396 | It is also useful to have the console help files on your floppy; | |
397 | their names end in ``.hlp''. | |
398 | .SH | |
399 | Taking the system up and down | |
400 | .PP | |
401 | To bring the system up to a multi-user configuration after a boot | |
402 | all you have to do is hit control-d on the console. The system | |
403 | will then perform | |
404 | /etc/rc, | |
405 | a multi-user restart script, and come up on the terminals which are | |
406 | indicated in /etc/ttys. | |
407 | See | |
408 | .IR init.vm (8) | |
409 | and | |
410 | .IR ttys (5). | |
411 | To take the system down to a single user state you can use | |
412 | .DS | |
413 | \fB#\fR kill 1 | |
414 | .DE | |
415 | when you are up multi-user. | |
416 | This will kill all processes and give you a shell on the console, | |
417 | as if you had just booted. | |
418 | .PP | |
419 | If you wish to change the lines which are active you can edit the file | |
420 | /etc/ttys, changing the first characters of lines, and then do | |
421 | .DS | |
422 | \fB#\fR kill \-1 1 | |
423 | .DE | |
424 | See | |
425 | .IR init.vm (8) | |
426 | for more information. | |
427 | .SH | |
428 | Adding devices | |
429 | .PP | |
430 | The UNIX system running is configured to run | |
431 | with the given disk | |
432 | and tape, a console, and 8 DZ11 lines. | |
433 | This is probably not the correct | |
434 | configuration. | |
435 | You will have to correct the configuration table to reflect | |
436 | the true state of your machine. | |
437 | .PP | |
438 | Before you mess with the source for the system it is wise to make | |
439 | a backup. The following will do this: | |
440 | .DS | |
441 | \fB#\fR cd /usr/src | |
442 | \fB#\fR mkdir distsys distsys/h distsys/sys | |
443 | \fB#\fR cd sys/sys | |
444 | \fB#\fR cp * /usr/src/distsys/sys | |
445 | \fB#\fR cd ../h | |
446 | \fB#\fR cp * /usr/src/distsys/h | |
447 | .DE | |
448 | This allows you to find out what you have done to the distribution | |
449 | system by later running the command | |
450 | .IR diffdir (1), | |
451 | comparing these directories. | |
452 | .PP | |
453 | \fBN.B.: Note that the system header files in /usr/src/sys/h are linked | |
454 | to the files in /usr/include/sys. Since programs which depend on constants | |
455 | in /usr/include/sys/param.h must correspond to the running system, you | |
456 | should be careful to not break these links.\fR | |
457 | .PP | |
458 | There are certain magic numbers and | |
459 | configuration parameters embedded in various | |
460 | device drivers that you may have to change. | |
461 | The device addresses of each device | |
462 | are defined in each driver. | |
463 | In case you have any non-standard device | |
464 | addresses, | |
465 | just change the address and recompile. | |
466 | Also, if the device's interrupt vector address(es) | |
467 | are not currently known to the system (this is likely), | |
468 | then the file /usr/src/sys/sys/univec.c must be modified | |
469 | appropriately: namely, the proper interrupt routine addresses | |
470 | must be placed in the table `UNIvec'. Use the DZ11 | |
471 | as an example (as distributed, the DZ11 vectors are | |
472 | assumed to be at locations c0 and c4 (hexadecimal)). | |
473 | .PP | |
474 | You will notice that the system, as distributed, has conditional code | |
475 | in it. The current Berkeley system, on ``Ernie Co-vax'' is made | |
476 | by defining IDENT in the ``makefile'' to be | |
477 | .DS | |
478 | IDENT= \-DERNIE \-DUCB | |
479 | .DE | |
480 | This enables the conditional code both for Berkeley and for this | |
481 | particular machine. | |
482 | It is traditional to pick a monicker for your machine, and change IDENT | |
483 | to reflect it, and to then put in changes conditionally whenever | |
484 | this makes sense. | |
485 | You can be guided by the ERNIE conditional code. | |
486 | .PP | |
487 | The system comes with 4 drivers which we are using: | |
488 | A KL/DL driver | |
489 | .I kl.c, | |
490 | a Versatec printer/plotter driver | |
491 | .I vp.c, | |
492 | and two copies of an (old and simpleminded) UNIBUS disk driver, | |
493 | named | |
494 | .I rm.c | |
495 | and | |
496 | .I rp.c. | |
497 | These last two are not in the most aesthetically pleasing of shapes, | |
498 | but are fully functional. | |
499 | There is also an RK driver | |
500 | .I rk.c, | |
501 | but we are not using it, and it may need a little work. | |
502 | .PP | |
503 | The disk and tape drivers | |
504 | for the MASSBUS devices | |
505 | (hp.c, ht.c) | |
506 | are set up to run 1 drive and should | |
507 | be changed if you have more. | |
508 | .PP | |
509 | Now, make sure you add any new drivers which you have to the list | |
510 | of DRIVERS in the makefile, and to the FILES and CFILES variables so they | |
511 | will be compiled and included in listings. You can also delete | |
512 | drivers which you don't need from FILES and CFILES, or change the code | |
513 | so that nothing will be compiled by using a ``#ifndef''. | |
514 | .PP | |
515 | The | |
516 | .I makefile | |
517 | has several useful entry points: | |
518 | .IP clean 15n | |
519 | Cleans out the directory, removing | |
520 | .B \&.o | |
521 | files and the like. | |
522 | .IP lint 15n | |
523 | Runs lint on the system; the system was almost lint-free as sent | |
524 | to you. See | |
525 | .I linterrs | |
526 | for the remaining | |
527 | .I lint | |
528 | when it was distributed. | |
529 | .IP depend 15n | |
530 | Creates a new makefile indicating dependencies on header files by | |
531 | running a search through | |
532 | .B \&.c | |
533 | files looking for ``#include'' lines. | |
534 | Make sure you format your code like the rest of the system so that | |
535 | this will work. | |
536 | .IP print 15n | |
537 | Produces a nice listing of most everything in the system | |
538 | directory in a canonical order. | |
539 | .IP symbols.sort 15n | |
540 | Creates a new file for sorting symbols in the system namelist. | |
541 | If you have locally written programs which use the system namelist | |
542 | you can put the symbols which they reference in | |
543 | .I symbols.raw | |
544 | and they will be moved at system generation to the front of the | |
545 | system namelist for quicker access. | |
546 | .IP tags 15n | |
547 | Creates a | |
548 | .I tags | |
549 | file for | |
550 | .I ex, | |
551 | to make editing of the system much easier. | |
552 | .PP | |
553 | Before running | |
554 | .I make, | |
555 | you should check the definition of the constants in | |
556 | /usr/src/sys/h/param.h. | |
557 | The constants | |
558 | NBUF, NINODE, NFILE, NPROC, and NTEXT | |
559 | can be changed, and also TIMEZONE and perhaps HZ if you run on 50 cycles.\(dg | |
560 | .FS | |
561 | \(dg If you change NINODE, NFILE, NPROC or NTEXT, then the programs | |
562 | .IR analyze (1), | |
563 | .IR ps (1), | |
564 | .IR pstat (1) | |
565 | and | |
566 | .IR w (1) | |
567 | will have to be recompiled. | |
568 | A procedure for doing this is given below. | |
569 | .FE | |
570 | As distributed, the system is tuned for a fairly large machine. | |
571 | (There are also tunable constants in the file | |
572 | /usr/src/sys/h/vm.h | |
573 | but ignore them for the time being.) | |
574 | .PP | |
575 | To generate a new VM/UNIX do | |
576 | .DS | |
577 | \fB#\fR make clean | |
578 | \fB#\fR make depend | |
579 | \fB#\fR make | |
580 | .DE | |
581 | and when this works | |
582 | .DS | |
583 | \fB#\fR make lint | |
584 | .DE | |
585 | to discover any residual bugs. | |
586 | .PP | |
587 | The final object file (vmunix) should be | |
588 | moved to the root, and then booted to try it out. | |
589 | It is best to name it /newvmunix so as not to destroy | |
590 | the working system until you're sure it does work. | |
591 | It is also a good idea to keep the old working version around as | |
592 | /oldvmunix (and perhaps even a /oldervmunix) to guard against disaster. | |
593 | .PP | |
594 | \fBBe sure to always have the current system in /vmunix when you are | |
595 | running multi-user or commands such as \fIps\|\fR(1)\fB | |
596 | \fBand\fR \fIw\|\fR(1)\fB will not work.\fR | |
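Once a new kernel has been booted as /newvmunix and found to work, the naming scheme above amounts to a simple rotation. The following is a minimal sketch of that rotation, not a procedure supplied with the distribution: install_kernel is a hypothetical helper name, and it takes the root directory as an argument only so that it can be tried out in a scratch directory first.

```shell
# install_kernel ROOT
# Rotate kernel images under ROOT (normally /):
#   oldvmunix -> oldervmunix, vmunix -> oldvmunix, newvmunix -> vmunix
# install_kernel is a hypothetical helper, not part of the distribution.
install_kernel() {
    root=${1:-/}
    # Refuse to do anything unless a new kernel is actually present.
    [ -f "$root/newvmunix" ] || { echo "no newvmunix in $root" >&2; return 1; }
    # Keep the two previous generations around to guard against disaster.
    [ -f "$root/oldvmunix" ] && mv "$root/oldvmunix" "$root/oldervmunix"
    [ -f "$root/vmunix" ] && mv "$root/vmunix" "$root/oldvmunix"
    # The tested kernel becomes the current system.
    mv "$root/newvmunix" "$root/vmunix"
}
```

Run it only when single-user and follow it with a reboot from /vmunix, so that the file named /vmunix again matches the running system.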
597 | .SH | |
598 | Special Files | |
599 | .PP | |
600 | Next you must put in special files for the new devices in | |
601 | the directory /dev using | |
602 | .IR mknod (1). | |
603 | Print the configuration file | |
604 | /usr/src/sys/sys/conf.c. | |
605 | This is the major | |
606 | device switch of each device class (block and character). | |
607 | There is one line for each device configured in your system | |
608 | and a null line for place holding for those devices | |
609 | not configured. | |
610 | The essential block special files were installed above; | |
611 | for any new devices, | |
612 | the major device number is selected by counting the | |
613 | line number (from zero) | |
614 | of the device's entry in the block configuration table. | |
615 | Thus the first entry in the table bdevsw would be | |
616 | major device zero. | |
617 | This number is also printed in the table along the right margin. | |
618 | .PP | |
619 | The minor device is the drive number, | |
620 | unit number or partition as described | |
621 | under each device in section 4. | |
622 | For tapes where the unit is dial selectable, | |
623 | a special file may be made for each possible | |
624 | selection. | |
625 | You can also add entries for other disk drives. | |
626 | .PP | |
627 | In reality, device names are arbitrary. | |
628 | It is usually | |
629 | convenient to have a system for deriving names, but it doesn't | |
630 | have to be the one presented above. | |
631 | .PP | |
632 | Some further notes on minor device numbers. | |
633 | The hp driver uses the 0100 bit of the minor device number to | |
634 | indicate whether or not to interleave a filesystem across | |
635 | more than one physical device. | |
636 | See | |
637 | .IR hp (4) | |
638 | for more detail. | |
639 | The ht driver uses the 04 bit to indicate whether | |
640 | or not to rewind the tape when it is closed. The | |
641 | 010 bit indicates the density of the tape on TE16 drives. | |
642 | Again, see | |
643 | .IR ht (4). | |
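The way the ht driver packs flags into the minor device number can be illustrated by picking the bits apart. The sketch below makes two assumptions that the text does not state. First, that a set 04 bit means ``do not rewind on close''; this is inferred from the use of /dev/rmt5 to skip tape files and /dev/rmt1 for the final extraction earlier in this document. Second, that the digit in those names is the minor number. The sense of the 010 density bit is not assumed at all; consult ht(4). The name ht_minor_flags is hypothetical, and the arithmetic expansion assumes a modern POSIX shell.

```shell
# Print the flag bits of an ht (magtape) minor device number.
# ht_minor_flags is a hypothetical helper.  Constants with a leading
# zero are octal, matching the notation used in the text.
ht_minor_flags() {
    minor=$1
    # 04 bit: rewind behaviour on close (polarity is an assumption).
    if [ $(( minor & 04 )) -ne 0 ]; then
        echo "norewind"
    else
        echo "rewind"
    fi
    # 010 bit: tape density on TE16 drives (meaning not assumed here).
    if [ $(( minor & 010 )) -ne 0 ]; then
        echo "density-bit-set"
    else
        echo "density-bit-clear"
    fi
}

ht_minor_flags 5
```

With minor number 5 the sketch reports the 04 bit set and the 010 bit clear.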
644 | .PP | |
645 | The naming of character devices is similar to block devices. | |
646 | Here the names are even more arbitrary except that | |
647 | devices meant to be used | |
648 | for teletype access should (to avoid confusion, no other reason) be named | |
649 | /dev/ttyX, where X is some string (as in `0' or `d0'). | |
650 | While it is possible to use truly arbitrary strings here, the accounting | |
651 | and notably the | |
652 | .IR ps (1) | |
653 | command make good use of the fact that tty names | |
654 | (at Berkeley) are distinct in the first 2 characters. In fact, we use | |
655 | the following convention: | |
656 | ``ttyN'', with N a number, for normal DZ ports; ``ttydX'', with X a single | |
657 | character (starting from 0), for dialups; ``ttykX'', with X a single letter, | |
658 | for KL ports; and ``console'' (abbrev ``co'') for the console. | |
659 | This works out well. | |
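The naming convention just described is regular enough to be recognized mechanically. As an illustration only, a shell function might classify a terminal name as follows; tty_kind is a hypothetical helper, and the labels it prints are made up for this example.

```shell
# Classify a terminal name according to the Berkeley naming convention.
# tty_kind is a hypothetical helper; the printed labels are illustrative.
tty_kind() {
    case $1 in
        console|co) echo console ;;   # the console, abbreviated "co"
        ttyd?)      echo dialup ;;    # ttydX: dialup lines
        ttyk?)      echo kl ;;        # ttykX: KL ports
        tty[0-9]*)  echo dz ;;        # ttyN: normal DZ ports
        *)          echo unknown ;;
    esac
}

tty_kind ttyd0      # prints dialup
tty_kind tty3       # prints dz
```

Because the kinds differ within the first two characters after ``tty'', the order of the patterns matters only for readability.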
660 | .PP | |
661 | The files console, tty0-tty7, mem, kmem, kUmem, floppy and null are | |
662 | already correctly configured, as are special files for the default | |
663 | paging area | |
664 | /dev/drum, and the raw and block versions of the root and /usr file | |
665 | systems. | |
666 | .PP | |
667 | The disk and magtape drivers provide a `raw' interface | |
668 | to the device which provides direct transmission | |
669 | between the user's core and the device and allows | |
670 | reading or writing large records. | |
671 | The raw device counts as a character device, | |
672 | and conventionally has the name of the corresponding | |
673 | standard block special file with `r' prepended. | |
674 | Thus the raw magtape | |
675 | files are called /dev/rmtX. | |
676 | .PP | |
677 | Whenever special files are created, | |
678 | care should be taken to change | |
679 | the access modes | |
680 | (\fIchmod\fR\|(1)) | |
681 | on these files to appropriate values. | |
682 | .SH | |
683 | Basics of Disk Layout | |
684 | .PP | |
685 | If | |
686 | there are to be more filesystems mounted than just the root | |
687 | and /usr, | |
688 | use | |
689 | .IR mkfs (1) | |
690 | to create any new filesystem and | |
691 | put its mounting in the file /etc/rc (see | |
692 | .IR init (8) | |
693 | and | |
694 | .IR mount (1)). | |
695 | (You might look at /etc/rc anyway to | |
696 | see what has been provided for you.) | |
697 | .PP | |
698 | There are several considerations in deciding how to adjust the arrangement | |
699 | of things on your disks: | |
700 | the most important is making sure there is adequate space | |
701 | for what is required; | |
702 | secondarily, throughput should be maximized. | |
703 | Paging space is an important parameter. | |
704 | The system | |
705 | as distributed has 33440 (512 byte) blocks in which to page. | |
706 | This should be large enough for most sites. | |
707 | You can change this if local wisdom indicates this is not good. | |
708 | .PP | |
709 | Many common system programs (C, the editor, the assembler etc.) | |
710 | create intermediate files in the /tmp directory, | |
711 | so the filesystem where this is stored also should be made | |
712 | large enough to accommodate | |
713 | most high-water marks. | |
714 | The root filesystem as distributed is quite large, and there should be | |
715 | no problem. | |
716 | All the programs that create files in /tmp take | |
717 | care to delete them, but most are not immune to | |
718 | events like being hung up upon, and can leave dregs. | |
719 | The directory should be examined every so often and the old | |
720 | files deleted. | |
721 | .PP | |
722 | Exhaustion of user-file space is certain to occur | |
723 | now and then; | |
724 | the only mechanisms for controlling this phenomenon | |
725 | are occasional use of | |
726 | .IR du (1), | |
727 | .IR df (1), | |
728 | .IR quot (1), | |
729 | threatening | |
730 | messages of the day, and personal letters. | |
731 | .PP | |
732 | The efficiency with which UNIX is able to use the CPU | |
733 | is largely dictated by the configuration of disk controllers. | |
734 | For general time-sharing applications, | |
735 | the best strategy is to try to split the root filesystem and system binaries | |
736 | (/usr), the temporary files and paging activity (/tmp and /dev/drum), | |
737 | and the user files among three disk arms. | |
738 | We will discuss such considerations more below. | |
739 | .PP | |
740 | Once you have decided how to make best use | |
741 | of your hardware, the question is how to initialize it. | |
742 | If you have the equipment, | |
743 | the best way to move a filesystem | |
744 | is to dump it | |
745 | (\fIdump\fR\|(1)) | |
746 | to magtape, | |
747 | use | |
748 | .IR mkfs (1) | |
749 | to create the new filesystem, | |
750 | and restore (\fIrestor\fR\|(1)) the tape. | |
751 | If for some reason you don't want to use magtape, | |
752 | dump accepts an argument telling where to put the dump; | |
753 | you might use another disk. | |
754 | Sometimes a filesystem has to be increased in logical size | |
755 | without copying. | |
756 | The super-block of the device has a word | |
757 | giving the highest address which can be allocated. | |
758 | For relatively small increases, this word can be patched | |
759 | using the debugger (\fIadb\fR\|(1)) | |
760 | and the free list reconstructed using | |
761 | .IR icheck (1). | |
762 | The size should not be increased very greatly | |
763 | by this technique, however, | |
764 | since although the allocatable space will increase | |
765 | the maximum number of files will not (that is, the i-list | |
766 | size can't be changed). | |
767 | Read and understand the description given in | |
768 | .IR filesys (5) | |
769 | before playing around in this way. | |
770 | .PP | |
771 | If you have to merge a filesystem into another, existing one, | |
772 | the best bet is to | |
773 | use | |
774 | .IR tar (1). | |
775 | If you must shrink a filesystem, the best bet is to dump | |
776 | the original and restor it onto the new filesystem. | |
777 | However, this will not work if the i-list on the smaller filesystem | |
778 | is smaller than the maximum allocated inode on the larger. | |
779 | If this is the case, reconstruct the filesystem from scratch | |
780 | on another filesystem (perhaps using \fItar\fR(1)) and then dump it. | |
781 | If you | |
782 | are playing with the root filesystem and only have one drive | |
783 | the procedure is more complicated. | |
784 | What you do is the following: | |
785 | .IP 1. | |
786 | GET A SECOND PACK!!!! | |
787 | .IP 2. | |
788 | Dump the root filesystem to tape using | |
789 | .IR dump (1). | |
790 | .IP 3. | |
791 | Bring the system down and mount the new pack. | |
792 | .IP 4. | |
793 | Load the standalone versions of | |
794 | .IR mkfs (1) | |
795 | and | |
796 | .IR restor (1) | |
797 | from the floppy | |
798 | with a procedure like: | |
799 | .DS | |
800 | \fB>>>\fRUNJAM | |
801 | \fB>>>\fRINIT | |
802 | \fB>>>\fRLOAD MKFS | |
803 | LOAD DONE, xxxx BYTES LOADED | |
804 | \fB>>>\fRST 2 | |
805 | ||
806 | \&... | |
807 | ||
808 | \fB>>>\fRH | |
809 | HALTED AT yyyy | |
810 | \fB>>>\fRU | |
811 | \fB>>>\fRI | |
812 | \fB>>>\fRLOAD RESTOR | |
813 | LOAD DONE, zzzz BYTES LOADED | |
814 | ||
815 | \&... etc | |
816 | .DE | |
817 | .IP 5. | |
818 | Boot normally | |
819 | using the newly created disk filesystem. | |
820 | .PP | |
821 | Note that if you change the disk partition tables or add new disk | |
822 | drivers they should also be added to the standalone system in | |
823 | /usr/src/sys/stand. | |
824 | .SH | |
825 | System Identification | |
826 | .PP | |
827 | You should edit the files: | |
828 | .DS | |
829 | /usr/include/ident.h | |
830 | /usr/include/whoami.h | |
831 | /usr/include/whoami | |
832 | .DE | |
833 | to correspond to your system, and then recompile and install | |
834 | .IR getty.vm (8) | |
835 | via: | |
836 | .DS | |
837 | \fB#\fR cd /usr/src/cmd | |
838 | \fB#\fR DESTDIR=/ | |
839 | \fB#\fR export DESTDIR | |
840 | \fB#\fR Admin/mk getty.vm.c | |
841 | .DE | |
842 | This will arrange for an appropriate banner to be printed on terminals | |
843 | before users log in. | |
844 | .SH | |
845 | Adding New Users | |
846 | .PP | |
847 | See | |
848 | .IR adduser (8); | |
849 | local needs will undoubtedly dictate a somewhat | |
850 | different procedure. | |
851 | .SH | |
852 | Multiple Users | |
853 | .PP | |
854 | If UNIX is to support simultaneous | |
855 | access from more than just the console terminal, | |
856 | the file /etc/ttys (\fIttys\fR\|(5)) has to be edited. | |
857 | To add a new terminal be sure the device is configured | |
858 | and the special file exists, then set | |
859 | the first character of the appropriate line of /etc/ttys to 1 | |
860 | (or add a new line). | |
861 | You should also edit the file | |
862 | /etc/ttytype | |
863 | placing the type of the new terminal there (see \fIttytype\fR\|(5)). | |
864 | The file | |
865 | /etc/ttywhere is also a useful one to keep up to date. | |
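.PP
The /etc/ttys edit described above can be sketched as follows.
This is only an illustration in Python, not a tool supplied on the tape.
It assumes the line layout given in ttys(5): one enable character, one
character selecting the getty mode, then the terminal name.
The function name enable_tty, the default mode character and the sample
lines are all hypothetical.

```python
def enable_tty(lines, name, mode="0"):
    """Return a copy of the /etc/ttys lines with terminal `name` enabled.

    Assumed layout of each line (see ttys(5)): character 0 is the
    enable flag ('0' or '1'), character 1 is the getty mode, and the
    rest of the line is the terminal name.
    """
    out = []
    found = False
    for line in lines:
        if line[2:] == name:
            line = "1" + line[1:]      # set the first character to 1
            found = True
        out.append(line)
    if not found:
        out.append("1" + mode + name)  # or add a new line
    return out


sample = ["14console", "00tty0", "00tty1"]  # made-up file contents
print(enable_tty(sample, "tty1"))
```

The sketch returns a new list rather than rewriting the file, so the
result can be inspected before /etc/ttys is replaced.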
866 | .PP | |
867 | Note that /usr/src/cmd/init.vm.c will have to be recompiled if there are to be | |
868 | more than 100 terminals. | |
869 | Also note that if the special file is inaccessible when \fIinit.vm\fR tries to create a process | |
870 | for it, the system will thrash trying and retrying to open it. | |
871 | .SH | |
872 | File System Health | |
873 | .PP | |
874 | Periodically (say every day or so) and always after a crash, | |
875 | you should check all the filesystems for consistency | |
876 | (\fIicheck, dcheck\fR\|(1)). | |
877 | It is quite important to execute | |
878 | .IR sync (8) | |
879 | before rebooting or taking the machine down. | |
880 | This is done automatically every 30 seconds by the | |
881 | .IR update (8) | |
882 | program when a multiple-user system is running, | |
883 | but you should do it anyway to make sure. | |
884 | .PP | |
885 | Dumping of the filesystem should be done regularly, | |
886 | since once the system is going it is very easy to | |
887 | become complacent. | |
888 | Complete and incremental dumps are easily done with | |
889 | .IR dump (1). | |
890 | See the scripts | |
891 | /etc/dumpusr | |
892 | and | |
893 | /etc/dumproot | |
894 | which we use to dump our file systems. | |
895 | You should arrange to do a towers-of-hanoi dump sequence; we tune | |
896 | ours so that almost all files are dumped on two tapes and kept for at | |
897 | least a week in most every case. We take full dumps every month (and keep | |
898 | these indefinitely). | |
899 | Note that | |
900 | /etc/dumpusr | |
901 | maintains the file /etc/lastdumpdone. | |
902 | This can be printed out at login for people who can then easily | |
903 | start a dump if one is needed.\(dg | |
904 | .FS | |
905 | \(dg More precisely, we have three sets of dump tapes: 10 daily tapes, | |
906 | 5 weekly sets of 2 tapes, and fresh sets of three tapes monthly. | |
907 | We do daily dumps circularly on the daily tapes with sequence | |
908 | `3 2 5 4 7 6 9 8 9 9 9 ...'. | |
909 | Each weekly is a level 1 and the daily dump sequence level | |
910 | restarts after each weekly dump. | |
911 | Full dumps are level 0 and the daily sequence restarts after each full dump | |
912 | also. | |
913 | Thus a typical dump sequence would be: | |
914 | .TS H | |
915 | c c c c c | |
916 | n n n l l. | |
917 | tape name	level number	date	opr	size | |
918 | _ | |
919 | .TH | |
920 | FULL	0	Nov 24, 1979	jkf	137K | |
921 | D1	3	Nov 28, 1979	jkf	29K | |
922 | D2	2	Nov 29, 1979	rrh	34K | |
923 | D3	5	Nov 30, 1979	rrh	19K | |
924 | D4	4	Dec 1, 1979	rrh	22K | |
925 | W1	1	Dec 2, 1979	etc	40K | |
926 | D5	3	Dec 4, 1979	rrh	15K | |
927 | D6	2	Dec 5, 1979	jkf	25K | |
928 | D7	5	Dec 6, 1979	jkf	15K | |
929 | D8	4	Dec 7, 1979	rrh	19K | |
930 | W2	1	Dec 9, 1979	etc	118K | |
931 | D9	3	Dec 11, 1979	rrh	15K | |
932 | D10	2	Dec 12, 1979	rrh	26K | |
933 | D1	5	Dec 15, 1979	rrh	14K | |
934 | W3	1	Dec 17, 1979	etc	71K | |
935 | D2	3	Dec 18, 1979	etc	13K | |
936 | FULL	0	Dec 22, 1979	etc	135K | |
937 | .TE | |
938 | We do weeklies often enough that dailies always fit on one tape and | |
939 | in fact never get to the sequence of 9's in the daily level numbers. | |
940 | .FE | |
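.PP
The daily level sequence in the footnote, and its effect on a restore,
can be sketched as follows.
This is a Python illustration only; the function names are hypothetical.
It relies on the usual dump rule that a dump at level n saves the files
changed since the most recent dump at a lower level.

```python
def daily_levels(n):
    """First n daily dump levels: 3 2 5 4 7 6 9 8, then 9 repeated."""
    head = [3, 2, 5, 4, 7, 6, 9, 8]
    return [head[i] if i < len(head) else 9 for i in range(n)]


def restore_chain(levels):
    """Indices of the dumps needed to rebuild the filesystem as of the
    last dump in `levels` (oldest first).

    Walking backwards, a dump is needed only if its level is lower
    than that of the dump already chosen after it.
    """
    chain = []
    need_below = None
    for i in range(len(levels) - 1, -1, -1):
        if need_below is None or levels[i] < need_below:
            chain.append(i)
            need_below = levels[i]
            if need_below == 0:
                break              # a full dump ends the chain
    return chain[::-1]


print(daily_levels(11))
# Levels of FULL, D1, D2, D3, D4 from the table above.
print(restore_chain([0, 3, 2, 5, 4]))
```

For the first five rows of the table the chain is the full dump, D2 and
D4; once the weekly level 1 dump W1 has been taken, only the full dump
and W1 are needed.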
941 | .PP | |
942 | Dumping of files by name is best done by | |
943 | .IR tar (1) | |
944 | but the number of files is somewhat limited. | |
945 | Finally, if there are enough drives, entire | |
946 | disks can be copied with | |
947 | .IR dd (1) | |
948 | using the raw special files and an appropriate | |
949 | block size. | |
950 | .SH | |
951 | Converting 32/V Filesystems | |
952 | .PP | |
953 | The best way to convert filesystems from 32/V | |
954 | to the new format | |
955 | is to use | |
956 | .IR tar (1). | |
957 | After converting, you can still restore files from your old-format dump | |
958 | tapes (yes the dump format is different, sorry about that), by using | |
959 | ``512restor'' instead of ``restor''. | |
960 | If you wish, you can move whole file systems from 32/V to the new system | |
961 | by using ``dump'' and then ``512restor''. | |
962 | .SH | |
963 | Regenerating the system | |
964 | .PP | |
965 | It is quite easy to regenerate the system, and it is a good | |
966 | idea to try this once right away to build confidence. | |
967 | The system consists of three major parts: | |
968 | the kernel itself (/usr/src/sys/sys), the user programs | |
969 | (/usr/src/cmd and subdirectories), and the libraries | |
970 | (/usr/src/lib*). | |
971 | The major part of this is /usr/src/cmd. | |
972 | .PP | |
973 | We have already seen how to recompile the system itself. | |
974 | The three major libraries are the C library in /usr/src/libc | |
975 | and the \s-2FORTRAN\s0 libraries /usr/src/libI77 and /usr/src/libF77. In each | |
976 | case the library is remade by changing into the corresponding directory | |
977 | and doing | |
978 | .DS | |
979 | \fB#\fR make | |
980 | .DE | |
981 | and then installed by | |
982 | .DS | |
983 | \fB#\fR make install | |
984 | .DE | |
985 | Similar to the system, | |
986 | .DS | |
987 | \fB#\fR make clean | |
988 | .DE | |
989 | cleans up. | |
990 | The source for all other libraries is kept in subdirectories of | |
991 | /usr/src/lib; each has a makefile and can be recompiled by the above | |
992 | recipe. | |
993 | .PP | |
994 | Recompiling all user programs is accomplished by using | |
995 | a directory in /usr/src/cmd, called Admin, which contains | |
996 | two files: mk and destinations. | |
997 | The file mk is a shell script for recompiling files in /usr/src/cmd. | |
998 | For instance, to recompile ``date.c'', | |
999 | all one has to do is | |
1000 | .DS | |
1001 | \fB#\fR cd /usr/src/cmd | |
1002 | \fB#\fR Admin/mk date.c | |
1003 | .DE | |
1004 | This will place a stripped version of the binary of ``date'' | |
1005 | in /usr/dist3/bin/date, since date normally resides in /bin, and | |
1006 | Admin is building a filesystem-like tree rooted at /usr/dist3. | |
1007 | You will have to make the directory dist3 for this to work. | |
1008 | It is possible to use any directory for the destination; it isn't necessary | |
1009 | to use the default /usr/dist3. | |
1010 | You can set the default by doing: | |
1011 | .DS | |
1012 | \fB#\fR DESTDIR=\fIpathname\fR | |
1013 | \fB#\fR export DESTDIR | |
1014 | .DE | |
1015 | .PP | |
1016 | To regenerate all the system source you can do | |
1017 | .DS | |
1018 | \fB#\fR DESTDIR=/usr/newsys | |
1019 | \fB#\fR export DESTDIR | |
1020 | \fB#\fR cd /usr | |
1021 | \fB#\fR rm -r newsys | |
1022 | \fB#\fR mkdir newsys | |
1023 | \fB#\fR cd /usr/src/cmd | |
1024 | \fB#\fR Admin/mk * > Admin/errs 2>&1 & | |
1025 | .DE | |
1026 | This will take about 4 hours on a reasonably configured machine. | |
1027 | When it finishes, you can move the hierarchy into the normal places | |
1028 | using | |
1029 | .IR mv (1) | |
1030 | and | |
1031 | .IR cp (1), | |
1032 | and then execute | |
1033 | .DS | |
1034 | \fB#\fR DESTDIR=/ | |
1035 | \fB#\fR export DESTDIR | |
1036 | \fB#\fR cd /usr/src/cmd/Admin | |
1037 | \fB#\fR mk ALIASES | |
1038 | \fB#\fR mk MODES | |
1039 | .DE | |
1040 | to link files together as necessary and to set all the right set-user-id | |
1041 | bits. | |
1042 | .SH | |
1043 | Making orderly changes | |
1044 | .PP | |
1045 | In order to keep track of changes to system source, we migrate changed | |
1046 | versions of commands into /usr/src/cmd through the directory /usr/src/new, | |
1047 | and move superseded versions out of /usr/src/cmd into /usr/src/old for a time before removing them. | |
1048 | Locally written commands which aren't distributed are kept in /usr/src/local | |
1049 | and their binaries are kept in /usr/local. This allows /usr/bin, /usr/ucb | |
1050 | and /bin to correspond to the distribution tape (and to the manuals that | |
1051 | people can buy). People wishing to use /usr/local commands are made | |
1052 | aware that they aren't in the base manual. As manual updates incorporate | |
1053 | these commands they are moved to /usr/ucb. | |
1054 | .PP | |
1055 | A directory /usr/junk to throw garbage into, as well as binary directories | |
1056 | /usr/old and /usr/new are very useful. The man command supports manual | |
1057 | directories such as /usr/man/manj for junk and /usr/man/manl for local | |
1058 | to make this or something similar practical. | |
1059 | .SH | |
1060 | Interpreting system activity | |
1061 | .PP | |
1062 | The | |
1063 | .I vmstat | |
1064 | program provided with the system is designed to be an aid to monitoring | |
1065 | systemwide activity. Together with the | |
1066 | .IR ps (1) | |
1067 | command (as in ``ps av''), it can be used to investigate systemwide | |
1068 | virtual memory activity. | |
1069 | You should modify | |
1070 | .IR vmstat (1m) | |
1071 | so that it prints out disk statistics for the disks | |
1072 | you have, changing the headers to something appropriate for your | |
1073 | system. | |
1074 | Examine the definitions of DK_UNIT in the disk drivers supplied to see | |
1075 | how the code in \fIvmstat\fR and \fIiostat\fR and the system correspond. | |
1076 | .PP | |
1077 | By running | |
1078 | .I vmstat | |
1079 | when the system is active you can judge the system activity in several | |
1080 | dimensions: job distribution, virtual memory load, paging and swapping | |
1081 | activity, disk and cpu utilization. | |
1082 | Ideally, most jobs should be either running (RQ) or sleeping (SL), | |
1083 | there should be little paging or swapping activity, there should | |
1084 | be available bandwidth on the disk devices (most single arms peak | |
1085 | out at about 30-35 tps in practice), and the user cpu utilization (US) should | |
1086 | be high (about 60%). | |
1087 | .PP | |
1088 | If the system is busy, then the number of active jobs may be large, | |
1089 | and several of these jobs may often be in disk wait (DW). If the virtual | |
1090 | memory is very active, then the paging demon may be running (SR will | |
1091 | be non-zero). It is healthy for the paging demon to free pages when | |
1092 | the virtual memory gets active; it is triggered by the amount of free | |
1093 | memory dropping below a threshold and increases its pace as free memory | |
1094 | goes to zero. | |
1095 | .PP | |
1096 | If you run | |
1097 | .I vmstat | |
1098 | when the system is busy (a ``vmstat 5'' is best, since that is how | |
1099 | often most of the numbers are recomputed by the system), you can find | |
1100 | imbalances by noting abnormal job distributions. If a large number | |
1101 | of jobs are in disk wait (DW) or page wait (PW), then the disk subsystem | |
1102 | is overloaded or imbalanced. If you have a large number of non-dma | |
1103 | devices or open teletype lines which are ``ringing'', or user programs | |
1104 | which are doing high-speed non-buffered input/output, then the system | |
1105 | time may go very high (60-70% or higher). | |
1106 | .PP | |
1107 | If the system is very heavily loaded, or if you have very little memory | |
1108 | relative to your load (512K is little in most any case), then the system | |
1109 | may be forced to swap. This is likely to be accompanied by a noticeable | |
1110 | reduction in system performance as the system does not swap ``working | |
1111 | sets'', but rather forces jobs to reinitialize their resident sets | |
1112 | by demand paging. If you expect to be in a memory-poor environment | |
1113 | for an extended period you might consider administratively limiting system | |
1114 | load. | |
1115 | .SH | |
1116 | Tunable constants | |
1117 | .PP | |
1118 | There is a modicum of tuning available in the page replacement algorithm | |
1119 | if it appears to be badly tuned for your configuration. | |
1120 | The page replacement (clock) algorithm is run whenever there are | |
1121 | not LOTSFREE pages available | |
1122 | (this and all other constants discussed here are defined in the system | |
1123 | header file /usr/src/sys/h/vm.h). | |
1124 | This sets up resistance to consumption of the remaining free memory | |
1125 | at a minimal rate SLOWSCAN, which gives the desired number of seconds | |
1126 | between successive examinations of each page. The rate at which the clock | |
1127 | algorithm is run increases linearly to a desired rate of FASTSCAN when | |
1128 | there is no free memory. Thus as the available free memory decreases, | |
1129 | the clock algorithm works harder to hold on to what is left. | |
1130 | If less than DESFREE pages are available and the paging rate is high, | |
1131 | then the system will begin to swap processes out. If less than MINFREE | |
1132 | pages are available then the system will begin to swap, regardless of the | |
1133 | paging rate. | |
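.PP
The thresholds just described can be summarized in a small sketch.
It is written in Python for illustration and is not the kernel code; the
function names and the page counts in the example are hypothetical.
In particular, interpolating the revolution time linearly between
SLOWSCAN and FASTSCAN is an assumption; the kernel may interpolate the
scan rate instead.

```python
def clock_revolution_seconds(free_pages, lotsfree, slowscan, fastscan):
    """Desired seconds between examinations of each page, or None when
    the clock algorithm is idle because LOTSFREE pages are available.

    Assumption: the time moves linearly from SLOWSCAN (at LOTSFREE
    free pages) to FASTSCAN (at no free pages).
    """
    if free_pages >= lotsfree:
        return None
    frac = free_pages / lotsfree
    return fastscan + (slowscan - fastscan) * frac


def should_swap(free_pages, paging_rate_high, desfree, minfree):
    """True when the system would begin to swap processes out."""
    if free_pages < minfree:
        return True                    # regardless of the paging rate
    return free_pages < desfree and paging_rate_high


# Illustrative numbers only: 30 and 20 second revolution times.
print(clock_revolution_seconds(50, 100, 30, 20))
print(should_swap(30, True, 50, 20))
```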
1134 | .PP | |
1135 | When it has to swap, the system first tries to find a process which has | |
1136 | been blocked for a long time and swap it out. If there are no | |
1137 | jobs of this flavor, then it will choose among the remaining jobs in | |
1138 | a strictly round-robin fashion, based on core residency time. It attempts | |
1139 | to guarantee (during periods of very heavy load) enough core residency to | |
1140 | a process to allow it to at least rebuild its set of active pages (since | |
1141 | it must do so by demand paging). Processes which are swapped out | |
1142 | with large numbers of active pages similarly receive lower priority for | |
1143 | swapin, favoring small jobs to return to the core resident set quickly. | |
1144 | .PP | |
1145 | It is | |
1146 | .I very | |
1147 | desirable that the system run under reasonably heavy load with little | |
1148 | swapping, with the memory partitioning being done by the clock replacement | |
1149 | algorithm, rather than by the swapping algorithm. The costs associated | |
1150 | with paging activity are the time spent in the paging demon, the overhead | |
1151 | associated with reclaim page faults (RE), and the extra disk activity | |
1152 | associated with pageins and pageouts. | |
1153 | We will discuss disk considerations later; when kept to about 40 reclaim | |
1154 | faults per second, the cost of reclaims is less than 1% of total processor | |
1155 | time. The cpu time (shown by ``ps l2'') accumulated by the pageout demon | |
1156 | will show how much overhead it is generating. | |
1157 | .PP | |
1158 | The system, as distributed, runs the replacement algorithm whenever less | |
1159 | than 1/8 of the total user memory is free. | |
1160 | This is done starting with a 30 second revolution time of the clock | |
1161 | algorithm and speeding up to a 20 second revolution time when there is no | |
1162 | free memory. | |
1163 | The goal here is to use as much memory as possible (i.e. have the | |
1164 | free list short) but to not have the system run out and start to swap. | |
1165 | You can experiment with changing the writable copies of these variables | |
1166 | (e.g. ``lotsfree'' is the writable copy of LOTSFREE) using | |
1167 | .IR adb , | |
1168 | as in: | |
1169 | .DS | |
1170 | \fB#\fR adb \-w /vmunix /dev/kmem | |
1171 | /m 0 #ffffffff | |
1172 | lotsfree/D | |
1173 | ---adb prints value of lotsfree--- | |
1174 | /W 0t100 | |
1175 | .DE | |
1176 | Here the ``/W 0t100'' command changed the value of | |
1177 | .I lotsfree | |
1178 | to be 100 (decimal). | |
1179 | .SH | |
1180 | Balancing disk load | |
1181 | .PP | |
1182 | It is critical for good performance to balance disk load. | |
1183 | There are at least five components of the disk load which you can | |
1184 | divide between the available disks: | |
1185 | .DS | |
1186 | 1. The root file system. | |
1187 | 2. The /tmp file system. | |
1188 | 3. The /usr file system. | |
1189 | 4. The user files. | |
1190 | 5. The paging activity. | |
1191 | .DE | |
1192 | In our system we currently have 1.75M bytes of memory and 2 disks: | |
1193 | an RP06 and an AMPEX 9300. We run with the root, /tmp, and paging activity | |
1194 | on the RP06, while the | |
1195 | /usr | |
1196 | file system and user files are on the 9300. | |
1197 | This gives a fairly even split of file activity in our environment. | |
1198 | .PP | |
1199 | A split such as this is a good initial guess if you have two arms. | |
1200 | If you are fortunate enough to have three arms, you can try splitting the | |
1201 | files up various ways. The most important things to remember are to | |
1202 | even out the disk load as much as possible, and to do this by | |
1203 | decoupling file systems (on separate arms) between which heavy copying occurs. | |
1204 | Note that a long term average balanced load is not important; it is | |
1205 | much more important to have instantaneously balanced | |
1206 | load when the system is busy. | |
1207 | .PP | |
1208 | Intelligent experimentation with a few file system arrangements can | |
1209 | pay off in much improved performance. It is particularly easy to | |
1210 | move the root, the | |
1211 | /tmp | |
1212 | file system and the paging area. Place the | |
1213 | user files and the | |
1214 | /usr | |
1215 | directory as space needs dictate and experiment | |
1216 | with the other, more easily moved file systems. | |
1217 | .PP | |
1218 | Finally, when you have your configuration worked out, you should set | |
1219 | the constant MAXPGIO based on the maximum number of transfers you can | |
1220 | expect from your paging device. The system assumes that if more transfers | |
1221 | than this occur, then the system is overloaded, and unless there is | |
1222 | a reasonable amount of free memory, it then begins to swap. The swap scheduler | |
1223 | also considers jobs small if their size when they were swapped out was less | |
1224 | than twice this constant. Such small jobs have a better chance of getting | |
1225 | swapped back in when the core situation is tight, since they can be | |
1226 | expected to be able to run in a small number of page frames. | |
1227 | .SH | |
1228 | Process size limitations | |
1229 | .PP | |
1230 | As distributed, the system provides for a maximum of 64M bytes of | |
1231 | resident user virtual address space. The sizes of the text and data | |
1232 | segments of a single process are currently limited to 4M bytes each, and | |
1233 | the stack segment size is limited to 512K bytes. These | |
1234 | can be increased by changing the constants MAXTSIZ, MAXDSIZ and MAXSSIZ | |
1235 | in the file | |
1236 | /usr/src/sys/h/vm.h. | |
1237 | You must be careful in doing this that you have adequate paging space. | |
1238 | As configured above, the system has only 16M bytes of paging area.\(dg | |
1239 | .FS | |
1240 | \(dg Recovery from running out of paging area is currently | |
1241 | not handled gracefully: the system panics. | |
1242 | .FE | |
1243 | .PP | |
1244 | To increase the amount of resident virtual space possible, | |
1245 | you can alter the constant USRPTSIZE (in | |
1246 | /usr/src/sys/h/vm.h) | |
1247 | and correspondingly change the definitions of | |
1248 | .I Usrptmap | |
1249 | and | |
1250 | .I Syssize | |
1251 | in | |
1252 | /usr/src/sys/sys/locore.s. | |
1253 | .PP | |
1254 | The 4M byte limitation on individual segment size is enforced by | |
1255 | the constants MAXTSIZ, MAXDSIZ and MAXSSIZ for the text, data and stack | |
1256 | segments respectively. These can be increased, given the availability of | |
1257 | adequate amounts of swap space, up to 16M bytes. | |
1258 | In order to increase the size of the stack or data | |
1259 | segments beyond 16M, you will have to increase the amount of space which can be | |
1260 | mapped by the corresponding disk map. To increase, for instance, the | |
1261 | maximum segment size to 32M bytes it would be adequate to double both | |
1262 | the initial and maximum sizes for the diskmap. Thus defining (in | |
1263 | /usr/src/sys/h/dmap.h) | |
1264 | .DS | |
1265 | #define DMMIN 32 | |
1266 | #define DMMAX 8192 | |
1267 | .DE | |
1268 | (i.e. a minimum segment size of 16K bytes and a maximum size of 4M bytes) | |
1269 | would allow the 16 diskmap slots to map 32M bytes. | |
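.PP
The arithmetic behind these values can be checked with a short sketch,
written in Python for illustration.
It assumes that the 16 diskmap slots start at DMMIN blocks, double in
size from one slot to the next until they reach DMMAX, and stay at DMMAX
thereafter; this is suggested by the ``initial and maximum sizes''
wording above but is not taken from the kernel source.
The 512-byte unit follows from the text (32 blocks is 16K bytes).
The function name diskmap_bytes is hypothetical.

```python
def diskmap_bytes(dmmin, dmmax, slots=16, block=512):
    """Bytes one disk map can cover, assuming slot sizes double from
    dmmin up to dmmax (both in `block`-byte units)."""
    size = dmmin
    total = 0
    for _ in range(slots):
        total += size
        size = min(size * 2, dmmax)   # double until the maximum is hit
    return total * block


MB = 1 << 20
# Half the values above, assumed here to be the distributed settings.
print(diskmap_bytes(16, 4096) >= 16 * MB)
# The doubled values from the text.
print(diskmap_bytes(32, 8192) >= 32 * MB)
```

Under these assumptions both settings leave some headroom above the
stated 16M and 32M byte segment limits.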
1270 | .SH | |
1271 | Other limitations | |
1272 | .PP | |
1273 | Due to the fact that the file system block numbers are stored in | |
1274 | page table | |
1275 | .B pg_blkno | |
1276 | entries, the maximum size of a file system is limited to | |
1277 | 2^20 1024 byte blocks. Thus no file system can be larger than 1024M bytes. | |
1278 | .PP | |
1279 | The number of system buffers is limited to be less than 64 | |
1280 | because of the way that the MASSBUS adaptor map registers are initialized. | |
1281 | The construct ``(128<<9)'' appears in the | |
1282 | /usr/src/sys/sys/mba.c | |
1283 | and | |
1284 | /usr/src/sys/sys/hp.c | |
1285 | code, silently enforcing this restriction. | |
1286 | .SH | |
1287 | Scaling down | |
1288 | .PP | |
1289 | If you have less than 1M byte of memory you may wish to scale the | |
1290 | paging system down, by reducing some fixed table sizes not directly | |
1291 | related to the paging system. | |
1292 | For instance, for a fairly large system we have raised NBUF from 32 to 48, | |
1293 | NCLIST from 150 to 500 (also increasing the basic clist block size CBSIZE | |
1294 | from 12 to 28), and NPROC, NINODE and NFILE from the values | |
1295 | distributed with \s-2UNIX/32V\s0. | |
1296 | We also increased TTLOWAT (from 50 to 125) and TTHIWAT (from 150 to 650). | |
1297 | If you | |
1298 | pull NCLIST down, you should adjust these also. | |
1299 | You can use | |
1300 | .IR pstat (1m) | |
1301 | to find out how many of these structures are typically in use. | |
1302 | Although the document is pretty much obsolete for the \s-2VAX\s0, | |
1303 | you can see the last few pages of ``Regenerating System Software'' | |
1304 | in Volume 2B of the programmer's manual for hints on setting some of these | |
1305 | constants. | |
1306 | .SH | |
1307 | Odds and Ends | |
1308 | .PP | |
1309 | The programs | |
1310 | dump, | |
1311 | icheck, quot, dcheck, ncheck, and df | |
1312 | (source in /usr/src/cmd) | |
1313 | should be changed to | |
1314 | reflect your default mounted filesystem devices. | |
1315 | Print the first few lines of these | |
1316 | programs and the changes will be obvious. | |
1317 | You will probably want to amend some of /usr/lib/crontab, and will | |
1318 | certainly want to add to /usr/lib/Mail.rc. | |
1319 | .PP | |
1320 | You should periodically examine the file /usr/adm/messages, which acts | |
1321 | as a system error log (see dmesg(1m)). | |
1322 | In particular, memory controller errors are checked for every 10 minutes | |
1323 | and a diagnostic is produced (printing memory controller register C) | |
1324 | if there were any errors. | |
1325 | .in +4i | |
1326 | .sp 3 | |
1327 | Good Luck | |
1328 | .sp 1 | |
1329 | William N. Joy | |
1330 | .br | |
1331 | Ozalp Babaoglu |