| 1 | .bd S B 3 |
| 2 | .TL |
| 3 | Setting up the Third Berkeley Software Tape* |
| 4 | .AU |
| 5 | William N. Joy |
| 6 | Ozalp Babaoglu |
| 7 | .AI |
| 8 | Computer Science Division |
| 9 | Department of Electrical Engineering and Computer Science |
| 10 | University of California, Berkeley |
| 11 | Berkeley, California 94720 |
| 12 | .PP |
| 13 | .de IR |
| 14 | \fI\\$1\fP\\$2 |
| 15 | .. |
| 16 | .de UX |
| 17 | \s-2UNIX\s0\\$1 |
| 18 | .. |
| 19 | .FS |
| 20 | *An early version of this paper appeared under the title |
| 21 | ``Setting up the Berkeley Virtual Memory Extensions to the |
| 22 | \s-2UNIX\s0 |
| 23 | Operating System'' |
| 24 | and, no doubt, references to this paper by this name exist elsewhere |
| 25 | in the documentation. |
| 26 | Portions of this document are adapted from |
| 27 | ``Setting Up Unix/32V Version 1.0'' |
| 28 | by Thomas B. London and John F. Reiser. |
| 29 | .FE |
| 30 | The distribution tape can be used only on a DEC VAX-11/780** |
| 31 | with RM03 or RP06 disks and with |
| 32 | TE16 tape drives. |
| 33 | We have the ability to make tapes for systems with UNIBUS** disks, but |
| 34 | .FS |
| 35 | ** DEC, VAX, UNIBUS and MASSBUS are trademarks of |
| 36 | Digital Equipment Corporation. |
| 37 | .FE |
| 38 | .FS |
| 39 | \(dg \s-2UNIX\s0 is a Trademark of Bell Laboratories. |
| 40 | .FE |
| 41 | such tapes are inherently rather system-specific, and will not be |
| 42 | discussed here. |
| 43 | The tape consists of some preliminary bootstrapping programs followed by |
| 44 | one dump of a filesystem (see |
| 45 | .IR dump (1)) |
| 46 | and one tape archive image (see |
| 47 | .IR tar (1)); |
| 48 | if needed, |
| 49 | individual files can be extracted |
| 50 | after the initial construction of the filesystems. |
| 51 | .PP |
| 52 | If you are set up to do it, |
| 53 | it is a good idea immediately to make a copy of the |
| 54 | tape to guard against disaster. |
| 55 | The tape is 9-track 1600 BPI and contains some 512-byte records |
| 56 | followed by many 10240-byte records. |
| 57 | There are interspersed tapemarks; end-of-tape is signalled |
| 58 | by a double end-of-file. |
| 59 | .PP |
| 60 | The tape contains binary images |
| 61 | of the system and all the user level programs, along with source |
| 62 | and manual sections for them. |
| 63 | There are about 4200 |
| 64 | .UX \(dg |
| 65 | files altogether. |
| 66 | The first tape file contains bootstrapping programs. |
| 67 | The second tape file is to be put on one filesystem |
| 68 | called the `root filesystem', and contains essential binaries and enough |
| 69 | other files to allow the system to run. |
| 70 | The third tape file has all of the source and documentation. |
| 71 | Altogether the files provided on the tape occupy approximately 40000 512 byte |
| 72 | blocks.\(dd |
| 73 | .FS |
| 74 | \(dd\s-2UNIX\s0 traditionally talks in terms of 512 character blocks, and |
| 75 | for consistency |
| 76 | across different versions of |
| 77 | .UX |
| 78 | and to avoid mass confusion, user programs in the Virtual Vax version of the |
| 79 | system also talk in terms of 512 byte blocks, despite the fact that the file |
| 80 | system allocates 1024 byte blocks of disk space. |
| 81 | All user programs such as |
| 82 | .IR ls (1) |
| 83 | and |
| 84 | .IR du (1) |
| 85 | speak in terms of 512 byte blocks; only system maintenance programs such |
| 86 | as |
| 87 | .IR mkfs (1), |
| 88 | .IR icheck (1), |
| 89 | .IR dump (1), |
| 90 | and |
| 91 | .IR df (1), |
| 92 | speak in terms of 1024 byte blocks. It is true that i/o is most efficient in 1024 |
| 93 | byte quantities, but it is most natural for the user to think of this |
| 94 | as ``2 blocks at a time.'' |
| 95 | In any case, packs remain sectored 512 bytes per sector, and at the lowest |
| 96 | driver levels the system deals with 512 byte disk records. |
| 97 | .FE |
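The block arithmetic described in the footnote can be sketched as follows. This is only an illustration, assuming a modern POSIX shell with arithmetic expansion (not the shell shipped on the tape); the function names `to_1k` and `to_512` are hypothetical and are not part of the distribution.

```shell
# Convert a count of 512-byte blocks (as reported by ls and du) to
# 1024-byte blocks (as used by mkfs and df), rounding up.
to_1k() {
    echo $(( ($1 + 1) / 2 ))
}

# Convert a count of 1024-byte blocks to 512-byte blocks.
to_512() {
    echo $(( $1 * 2 ))
}

to_1k 40000     # the whole distribution, in 1024-byte blocks: 20000
to_512 7942     # the root filesystem size given to mkfs, in 512-byte blocks: 15884
```

Rounding up in `to_1k` reflects the fact that a partially filled 1024-byte block still occupies a whole block of disk space.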
| 98 | .SH |
| 99 | Making a disk from tape |
| 100 | .PP |
| 101 | This description is an annotated version of the `sysgen' manual |
| 102 | page in section 8 of the UNIX Programmer's Manual. |
| 103 | Before you begin to work on the remainder of this manual, be sure you |
| 104 | have an up to date manual, and that you have applied all updates |
| 105 | to the manual which were provided with it, in the correct order. |
| 106 | .PP |
| 107 | Perform the following bootstrap procedure to obtain |
| 108 | a disk with a root filesystem on it. |
| 109 | .IP 1. |
| 110 | Mount the magtape on drive 0 at load point, making |
| 111 | sure that the ring is not inserted. |
| 112 | .IP 2. |
| 113 | Mount a disk pack on drive 0. |
| 114 | .IP 3. |
| 115 | Key in at 50000 and execute the following boot program: |
| 116 | You may enter it in lower-case; the LSI-11 will echo in upper-case. |
| 117 | The machine's printouts are shown in boldface; |
| 118 | explanatory comments are within ( ). |
| 119 | Terminate each line you type by carriage return or line-feed. |
| 120 | .RT |
| 121 | .DS |
| 122 | \fB>>>\|\fRHALT |
| 123 | \fB>>>\|\fRUNJAM |
| 124 | \fB>>>\|\fRINIT |
| 125 | \fB>>>\|\fRD 50000 20009FDE |
| 126 | \fB>>>\|\fRD+ D0512001 |
| 127 | \fB>>>\|\fRD+ 3204A101 |
| 128 | \fB>>>\|\fRD+ C114C08F |
| 129 | \fB>>>\|\fRD+ A1D40424 |
| 130 | \fB>>>\|\fRD+ 008FD00C |
| 131 | \fB>>>\|\fRD+ C1800000 |
| 132 | \fB>>>\|\fRD+ 8F320800 |
| 133 | \fB>>>\|\fRD+ 10A1FE00 |
| 134 | \fB>>>\|\fRD+ 00C139D0 |
| 135 | \fB>>>\|\fRD+ 00000004 |
| 136 | \fB>>>\|\fRE 50000/NE:A |
| 137 | \&... (machine prints out values, check typing) |
| 138 | \fB>>>\|\fRSTART 50000 |
| 139 | .DE |
| 140 | .IP |
| 141 | The tape should move and the CPU should halt at location 5002A. |
| 142 | If it doesn't, you probably have entered the program |
| 143 | incorrectly. |
| 144 | Start over and check your typing. |
| 145 | .IP 4. |
| 146 | Start the CPU with |
| 147 | .DS |
| 148 | \fB>>>\|\fRSTART 0 |
| 149 | .DE |
| 150 | .IP 5. |
| 151 | The console should type |
| 152 | .DS |
| 153 | .I |
| 154 | \fB=\fR |
| 155 | .R |
| 156 | .DE |
| 157 | If the disk pack is already formatted, skip to step 6. |
| 158 | Otherwise, format the pack with: |
| 159 | .DS |
| 160 | (bring in standalone RP06 formatter) |
| 161 | \fB=\|\fRrp6fmt |
| 162 | \fBformat : Format RP06/RM03 Disk\fR |
| 163 | |
| 164 | \fBMBA no. : \fR0 (format spindle on mba \# 0) |
| 165 | \fBunit : \fR0 (format unit zero) |
| 166 | (this procedure should take about 20 minutes) |
| 167 | (some diagnostic messages may appear here) |
| 168 | |
| 169 | \fBunit : \fR-1 (exit from formatter) |
| 170 | \fB=\fR (back at tape boot level) |
| 171 | .DE |
| 172 | .IP 6. |
| 173 | Next, verify the readability of the pack via |
| 174 | .DS |
| 175 | (bring in RP06 verifier) |
| 176 | \fB=\|\fRrpread |
| 177 | \fBdread : Read RP06/RM03 Disk\fR |
| 178 | |
| 179 | \fBdisk unit : \fR0 (specify unit zero) |
| 180 | \fBstart block : \fR0 (start at block zero) |
| 181 | \fBno. blocks :\fR (default is entire pack) |
| 182 | |
| 183 | (this procedure should take about 10 minutes) |
| 184 | (some diagnostic messages may appear here) |
| 185 | \fB# Data Check errors : nn\fR (number of soft errors) |
| 186 | \fB# Other errors : xx\fR (number of hard errors) |
| 187 | \fBdisk unit: \fR\-1 (exit from rpread) |
| 188 | \fB=\fR (back to tape boot) |
| 189 | .DE |
| 190 | If the number of `Other errors' is not zero, consideration |
| 191 | should be given to obtaining a clean pack before proceeding |
| 192 | further. |
| 193 | .IP 7. |
| 194 | Create the root file system with the following procedure: |
| 195 | .DS |
| 196 | (bring in a standalone version of the \fImkfs\fR (1) program) |
| 197 | \fB=\|\fRmkfs |
| 198 | \fBfile sys size:\fR 7942 (number of 1024 byte blocks in root) |
| 199 | \fBfile system:\fR hp(0,0) (root is on drive zero; first filsys there) |
| 200 | \fBisize = 5072\fR (number of inodes in root filesystem) |
| 201 | \fBm/n = 3 500\fR (interleave parameters) |
| 202 | \fB=\fR (back at tape boot level) |
| 203 | .DE |
| 204 | You now have an empty UNIX root filesystem. |
| 205 | To restore the data which you need to boot the system, type |
| 206 | .DS |
| 207 | (bring in a standalone \fIrestor\fR\|(1) program) |
| 208 | \fB=\|\fRrestor |
| 209 | \fBTape?\fR ht(1,1) (1600 bpi, second tape file) |
| 210 | \fBDisk?\fR hp(0,0) (into root file system) |
| 211 | \fBLast chance before scribbling on disk.\fR (just hit return) |
| 212 | (30 second pause then tape should move) |
| 213 | \&... |
| 214 | \fB=\fR (back at tape boot level) |
| 215 | .DE |
| 216 | Now, you are ready to boot up |
| 217 | .UX. |
| 218 | .SH |
| 219 | Booting UNIX |
| 220 | .PP |
| 221 | Now boot UNIX: |
| 222 | .DS |
| 223 | (load bootstrap program) |
| 224 | \fB=\fR\|boot |
| 225 | \fBBoot\fR |
| 226 | \fB: \fRhp(0,0)vmunix (bring in \fIvmunix\fR off root system) |
| 227 | .DE |
| 228 | The bootstrap should then print out the sizes of the different |
| 229 | parts of the system (text, initialized and uninitialized data) and then |
| 230 | the system should start with a message which looks like: |
| 231 | .DS |
| 232 | .B |
| 233 | 61656+61072+70120 start 0x4b4 |
| 234 | VM/UNIX (Berkeley Version 2.1) 1/5/80 |
| 235 | real mem = \fIxxx\fB |
| 236 | avail mem = \fIyyy\fB |
| 237 | ERASE IS CONTROL-H!!! |
| 238 | \fB#\fR |
| 239 | .R |
| 240 | .DE |
| 241 | The |
| 242 | .I |
| 243 | mem |
| 244 | .R |
| 245 | messages give the |
| 246 | amount of real (physical) memory and the |
| 247 | memory available to user programs |
| 248 | in bytes. |
| 249 | For example, if your machine has only 512K bytes of memory, then |
| 250 | xxx will be 524288, i.e. exactly 512K. |
| 251 | The ``ERASE-IS'' message is part of /.profile |
| 252 | which was executed by the root shell when it started. You will |
| 253 | probably want to change /.profile somewhat. |
| 254 | .PP |
| 255 | UNIX is now running, |
| 256 | and the `UNIX Programmer's manual' applies; |
| 257 | references below of the form X(Y) mean the subsection named |
| 258 | X in section Y of the manual. |
| 259 | The `#' is the prompt from the Shell, |
| 260 | and indicates you are the super-user. |
| 261 | You should first check the integrity of the root file system by |
| 262 | giving the command |
| 263 | .DS |
| 264 | \fB#\fR chk /dev/rrp0a |
| 265 | .DE |
| 266 | which abbreviates |
| 267 | .DS |
| 268 | icheck /dev/rrp0a |
| 269 | dcheck /dev/rrp0a |
| 270 | .DE |
| 271 | The output from |
| 272 | .I chk |
| 273 | should look something like: |
| 274 | .DS |
| 275 | .B |
| 276 | icheck /dev/rrp0a |
| 277 | /dev/rrp0a: |
| 278 | files \0153 (r=111,d=12,b=8,c=22) |
| 279 | used 1065 (i=27,ii=0,iii=0,d=1038) |
| 280 | free 6558 |
| 281 | missing \0\0\00 |
| 282 | dcheck /dev/rrp0a |
| 283 | /dev/rrp0a: |
| 284 | entries link cnt |
| 285 | 1 0 0 |
| 286 | .R |
| 287 | .DE |
| 288 | .PP |
| 289 | The diagnostic from \fIdcheck\fR is normal, as inode number 1 is reserved for |
| 290 | placement of bad blocks, although this is currently unimplemented. |
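The figures in the sample icheck output above can be cross-checked with a little arithmetic. The sketch below assumes a POSIX shell, and the helper name `sum` is hypothetical. The last line also assumes 64-byte on-disk inodes (16 per 1024-byte block) plus one boot block and one super-block. That layout is not stated in this document, so treat it as an assumption.

```shell
# Add up the numbers given as arguments and print the total.
sum() {
    total=0
    for n in "$@"; do
        total=$(( total + n ))
    done
    echo "$total"
}

sum 111 12 8 22                    # per-type file counts; matches "files 153"
sum 27 0 0 1038                    # i, ii, iii and d blocks; matches "used 1065"
echo $(( 7942 - (1065 + 6558) ))   # blocks neither used nor free: 319
echo $(( 5072 / 16 + 2 ))          # assumed inode list, boot block and super-block: 319
```

If the totals on your own system do not add up in this way, that suggests a damaged filesystem rather than a quirk of the output format.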
| 291 | .PP |
| 292 | The next thing to do is to extract the rest of the data from |
| 293 | the tape. |
| 294 | Comments are enclosed in ( ); don't type these. |
| 295 | The number in the first command is the |
| 296 | size of the filesystem to be created, in 1024 character blocks, |
| 297 | just as given to the standalone version of |
| 298 | .I mkfs |
| 299 | above. |
| 300 | (If you have an RM-03 rather than an RP-06 use ``41040'' rather than |
| 301 | ``145673'' in the procedure below.) |
| 302 | .DS |
| 303 | \fB#\|\fRdate \fIyymmddhhmm\fR (set date, see \fIdate\fR\|(1)) |
| 304 | \fB#\|\fR/etc/mkfs /dev/rrp0g 145673 (create empty user filesystem) |
| 305 | \fBisize = 65488\fR (this is the number of available inodes) |
| 306 | \fBm/n = 3 500\fR (freelist interleave parameters) |
| 307 | (this takes a few minutes) |
| 308 | \fB#\|\fR/etc/mount /dev/rp0g /usr (mount the usr filesystem) |
| 309 | \fB#\|\fRcd /usr (make /usr the current directory) |
| 310 | \fB#\|\fRcp /dev/rmt5 /dev/null (skip first tape file (tp format)) |
| 311 | \fB#\|\fRcp /dev/rmt5 /dev/null (skip second tape file (root)) |
| 312 | \fB#\|\fRumask 22 |
| 313 | \fB#\|\fRtar xbf 20 /dev/rmt1 (extract the usr filesystem) |
| 314 | \fB#\|\fRdd if=/usr/mdec/uboot of=/dev/rrp0a bs=1b count=1 |
| 315 | (write boot block so \fIbproc\fR(8) disk boot scheme will work) |
| 316 | \fB#\|\fRcd / (back to root) |
| 317 | \fB#\|\fR/etc/umount /dev/rp0g (unmount /usr) |
| 318 | .DE |
| 319 | All of the data on the tape has now been extracted. |
| 320 | The tape will rewind automatically. |
| 321 | .PP |
| 322 | You should now check the consistency of the /usr file system by doing |
| 323 | .DS |
| 324 | \fB#\fR chk /dev/rrp0g |
| 325 | .DE |
| 326 | In order to use the /usr file system, you should now remount it by |
| 327 | saying |
| 328 | .DS |
| 329 | \fB#\fR /etc/mount /dev/rp0g /usr |
| 330 | .DE |
| 331 | Since all directories created by |
| 332 | .IR tar (1) |
| 333 | will have mode 755 (due to the |
| 334 | .IR umask (1) |
| 335 | above), you should run the commands |
| 336 | .DS |
| 337 | \fB#\fR cd /usr/src/cmd/Admin |
| 338 | \fB#\fR DESTDIR=/ |
| 339 | \fB#\fR export DESTDIR |
| 340 | \fB#\fR ./mk MODES |
| 341 | .DE |
| 342 | which will reset the permissions of several directories (such as /tmp) |
| 343 | and also correctly set ownerships and modes of ``set-user-id'' programs |
| 344 | (although they were correct already, this does no harm.) |
| 345 | .SH |
| 346 | Making a UNIX boot floppy |
| 347 | .PP |
| 348 | The next thing to do is to make a |
| 349 | .UX |
| 350 | boot floppy, by placing some files on a clean floppy using |
| 351 | .IR arff (1). |
| 352 | Place a clean floppy in the console, and issue the following commands: |
| 353 | .DS |
| 354 | \fB#\fR cd /usr/src/sys/floppy |
| 355 | \fB#\fR arff cr * |
| 356 | .DE |
| 357 | This will put a copy of each file in the directory /usr/src/sys/floppy |
| 358 | onto the floppy. |
| 359 | You should now be able to reboot using the procedures in |
| 360 | .IR boot (8). |
| 361 | Try this after saying |
| 362 | .DS |
| 363 | \fB#\fR sync |
| 364 | .DE |
| 365 | which allows the system to initiate all i/o before you halt the CPU. |
| 366 | It is traditional to say |
| 367 | .DS |
| 368 | \fB#\fR sync |
| 369 | \fB#\fR sync |
| 370 | .DE |
| 371 | to give the system a little time, and to then reboot. |
| 372 | .PP |
| 373 | The boot floppy you created does not have enough files on |
| 374 | it to deadstart the machine with. If your standard console floppy does |
| 375 | not have a multi-segment RT-11 directory structure on it, and if you make |
| 376 | a copy of it using |
| 377 | .IR flcopy (1), |
| 378 | delete unneeded files to make space for the |
| 379 | .UX |
| 380 | boot programs, and then add the files with |
| 381 | .DS |
| 382 | \fB#\fR cd /usr/src/sys/floppy |
| 383 | \fB#\fR arff r * |
| 384 | .DE |
| 385 | you should be able to cold-start the machine from this floppy. |
| 386 | The following set of files is believed to be adequate to cold-start |
| 387 | the machine: |
| 388 | .TS |
| 389 | center; |
| 390 | l l l. |
| 391 | boot.exe restar.cmd wcsmon.sys |
| 392 | consol.sys restar.ilv wcssrv.bin |
| 393 | filea.pat vmb.exe |
| 394 | pcs.pat wcs\fInnn\fR.pat |
| 395 | .TE |
| 396 | It is also useful to have the console help files on your floppy; |
| 397 | their names end in ``.hlp''. |
| 398 | .SH |
| 399 | Taking the system up and down |
| 400 | .PP |
| 401 | To bring the system up to a multi-user configuration after a boot |
| 402 | all you have to do is hit control-d on the console. The system |
| 403 | will then perform |
| 404 | /etc/rc, |
| 405 | a multi-user restart script, and come up on the terminals which are |
| 406 | indicated in /etc/ttys. |
| 407 | See |
| 408 | .IR init.vm (8) |
| 409 | and |
| 410 | .IR ttys (5). |
| 411 | To take the system down to a single user state you can use |
| 412 | .DS |
| 413 | \fB#\fR kill 1 |
| 414 | .DE |
| 415 | when you are up multi-user. |
| 416 | This will kill all processes and give you a shell on the console, |
| 417 | as if you had just booted. |
| 418 | .PP |
| 419 | If you wish to change the lines which are active you can edit the file |
| 420 | /etc/ttys, changing the first characters of lines, and then do |
| 421 | .DS |
| 422 | \fB#\fR kill \-1 1 |
| 423 | .DE |
| 424 | See |
| 425 | .IR init.vm (8) |
| 426 | for more information. |
| 427 | .SH |
| 428 | Adding devices |
| 429 | .PP |
| 430 | The UNIX system running is configured to run |
| 431 | with the given disk |
| 432 | and tape, a console, and 8 DZ11 lines. |
| 433 | This is probably not the correct |
| 434 | configuration. |
| 435 | You will have to correct the configuration table to reflect |
| 436 | the true state of your machine. |
| 437 | .PP |
| 438 | Before you mess with the source for the system it is wise to make |
| 439 | a backup. The following will do this: |
| 440 | .DS |
| 441 | \fB#\fR cd /usr/src |
| 442 | \fB#\fR mkdir distsys distsys/h distsys/sys |
| 443 | \fB#\fR cd sys/sys |
| 444 | \fB#\fR cp * /usr/src/distsys/sys |
| 445 | \fB#\fR cd ../h |
| 446 | \fB#\fR cp * /usr/src/distsys/h |
| 447 | .DE |
| 448 | This allows you to find out what you have done to the distribution |
| 449 | system by later running the command |
| 450 | .IR diffdir (1), |
| 451 | comparing these directories. |
| 452 | .PP |
| 453 | \fBN.B.: Note that the system header files in /usr/src/sys/h are linked |
| 454 | to the files in /usr/include/sys. Since programs which depend on constants |
| 455 | in /usr/include/sys/param.h must correspond to the running system, you |
| 456 | should be careful to not break these links.\fR |
| 457 | .PP |
| 458 | There are certain magic numbers and |
| 459 | configuration parameters embedded in various |
| 460 | device drivers that you may have to change. |
| 461 | The device addresses of each device |
| 462 | are defined in each driver. |
| 463 | In case you have any non-standard device |
| 464 | addresses, |
| 465 | just change the address and recompile. |
| 466 | Also, if the device's interrupt vector address(es) |
| 467 | are not currently known to the system (this is likely), |
| 468 | then the file /usr/src/sys/sys/univec.c must be modified |
| 469 | appropriately: namely, the proper interrupt routine addresses |
| 470 | must be placed in the table `UNIvec'. Use the DZ11 |
| 471 | as an example (as distributed, the DZ11 vectors are |
| 472 | assumed to be at locations c0 and c4 (hexadecimal)). |
| 473 | .PP |
| 474 | You will notice that the system, as distributed, has conditional code |
| 475 | in it. The current Berkeley system, on ``Ernie Co-vax'' is made |
| 476 | by defining IDENT in the ``makefile'' to be |
| 477 | .DS |
| 478 | IDENT= \-DERNIE \-DUCB |
| 479 | .DE |
| 480 | This enables the conditional code both for Berkeley and for this |
| 481 | particular machine. |
| 482 | It is traditional to pick a monicker for your machine, and change IDENT |
| 483 | to reflect it, and to then put in changes conditionally whenever |
| 484 | this makes sense. |
| 485 | You can be guided by the ERNIE conditional code. |
| 486 | .PP |
| 487 | The system comes with 4 drivers which we are using: |
| 488 | A KL/DL driver |
| 489 | .I kl.c, |
| 490 | a Versatec printer/plotter driver |
| 491 | .I vp.c, |
| 492 | and two copies of an (old and simpleminded) UNIBUS disk driver, |
| 493 | named |
| 494 | .I rm.c |
| 495 | and |
| 496 | .I rp.c. |
| 497 | These last two are not in the most aesthetically pleasing of shapes, |
| 498 | but are fully functional. |
| 499 | There is also an RK driver |
| 500 | .I rk.c, |
| 501 | but we are not using it, and it may need a little work. |
| 502 | .PP |
| 503 | The disk and tape drivers |
| 504 | for the MASSBUS devices |
| 505 | (hp.c, ht.c) |
| 506 | are set up to run 1 drive and should |
| 507 | be changed if you have more. |
| 508 | .PP |
| 509 | Now, make sure you add any new drivers which you have to the list |
| 510 | of DRIVERS in the makefile, and to the FILES and CFILES variables so they |
| 511 | will be compiled and included in listings. You can also delete |
| 512 | drivers which you don't need from FILES and CFILES, or change the code |
| 513 | so that nothing will be compiled by using a ``#ifndef''. |
| 514 | .PP |
| 515 | The |
| 516 | .I makefile |
| 517 | has several useful entry points: |
| 518 | .IP clean 15n |
| 519 | Cleans out the directory, removing |
| 520 | .B \&.o |
| 521 | files and the like. |
| 522 | .IP lint 15n |
| 523 | Runs lint on the system; the system was almost lint-free as sent |
| 524 | to you. See |
| 525 | .I linterrs |
| 526 | for the remaining |
| 527 | .I lint |
| 528 | when it was distributed. |
| 529 | .IP depend 15n |
| 530 | Creates a new makefile indicating dependencies on header files by |
| 531 | running a search through |
| 532 | .B \&.c |
| 533 | files looking for ``#include'' lines. |
| 534 | Make sure you format your code like the rest of the system so that |
| 535 | this will work. |
| 536 | .IP print 15n |
| 537 | Produces a nice listing of most everything in the system |
| 538 | directory in a canonical order. |
| 539 | .IP symbols.sort 15n |
| 540 | Creates a new file for sorting symbols in the system namelist. |
| 541 | If you have locally written programs which use the system namelist |
| 542 | you can put the symbols which they reference in |
| 543 | .I symbols.raw |
| 544 | and they will be moved at system generation to the front of the |
| 545 | system namelist for quicker access. |
| 546 | .IP tags 15n |
| 547 | Creates a |
| 548 | .I tags |
| 549 | file for |
| 550 | .I ex, |
| 551 | to make editing of the system much easier. |
| 552 | .PP |
| 553 | Before running |
| 554 | .I make, |
| 555 | you should check the definition of the constants in |
| 556 | /usr/src/sys/h/param.h. |
| 557 | The constants |
| 558 | NBUF, NINODE, NFILE, NPROC, and NTEXT |
| 559 | can be changed, and also TIMEZONE and perhaps HZ if you run on 50 cycles.\(dg |
| 560 | .FS |
| 561 | \(dg If you change NINODE, NFILE, NPROC or NTEXT, then the programs |
| 562 | .IR analyze (1), |
| 563 | .IR ps (1), |
| 564 | .IR pstat (1) |
| 565 | and |
| 566 | .IR w (1) |
| 567 | will have to be recompiled. |
| 568 | A procedure for doing this is given below. |
| 569 | .FE |
| 570 | As distributed, the system is tuned for a fairly large machine. |
| 571 | (There are also tunable constants in the file |
| 572 | /usr/src/sys/h/vm.h |
| 573 | but ignore them for the time being.) |
| 574 | .PP |
| 575 | To generate a new VM/UNIX do |
| 576 | .DS |
| 577 | \fB#\fR make clean |
| 578 | \fB#\fR make depend |
| 579 | \fB#\fR make |
| 580 | .DE |
| 581 | and when this works |
| 582 | .DS |
| 583 | \fB#\fR make lint |
| 584 | .DE |
| 585 | to discover any residual bugs. |
| 586 | .PP |
| 587 | The final object file (vmunix) should be |
| 588 | moved to the root, and then booted to try it out. |
| 589 | It is best to name it /newvmunix so as not to destroy |
| 590 | the working system until you're sure it does work. |
| 591 | It is also a good idea to keep the old working version around as |
| 592 | /oldvmunix (and perhaps even a /oldervmunix) to guard against disaster. |
| 593 | .PP |
| 594 | \fBBe sure to always have the current system in /vmunix when you are |
| 595 | running multi-user or commands such as \fIps\|\fR(1)\fB |
| 596 | \fBand\fR \fIw\|\fR(1)\fB will not work.\fR |
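The naming scheme described above (/newvmunix for the untested system, /vmunix for the running one, /oldvmunix and /oldervmunix as fallbacks) implies a rotation once the new system has been booted and found to work. The document does not spell out this step, so the following is only one possible sketch. The function name `rotate_kernel` and the `ROOT` variable are hypothetical; `ROOT` exists so that the procedure can be rehearsed in a scratch directory.

```shell
# Promote newvmunix to vmunix under the directory named by ROOT (default /),
# keeping the two previous systems as oldvmunix and oldervmunix.
rotate_kernel() {
    root=${ROOT:-/}
    # Refuse to do anything unless there is a new system to promote.
    [ -f "$root/newvmunix" ] || return 1
    if [ -f "$root/oldvmunix" ]; then
        mv "$root/oldvmunix" "$root/oldervmunix"
    fi
    if [ -f "$root/vmunix" ]; then
        mv "$root/vmunix" "$root/oldvmunix"
    fi
    mv "$root/newvmunix" "$root/vmunix"
}

# Example rehearsal in a scratch directory:
#   ROOT=/tmp/ktest; mkdir $ROOT; cp vmunix $ROOT/newvmunix; rotate_kernel
```

Doing the rotation only after a successful test boot keeps a known working system bootable at every point.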
| 597 | .SH |
| 598 | Special Files |
| 599 | .PP |
| 600 | Next you must put in special files for the new devices in |
| 601 | the directory /dev using |
| 602 | .IR mknod (1). |
| 603 | Print the configuration file |
| 604 | /usr/src/sys/sys/conf.c. |
| 605 | This is the major |
| 606 | device switch of each device class (block and character). |
| 607 | There is one line for each device configured in your system |
| 608 | and a null line for place holding for those devices |
| 609 | not configured. |
| 610 | The essential block special files were installed above; |
| 611 | for any new devices, |
| 612 | the major device number is selected by counting the |
| 613 | line number (from zero) |
| 614 | of the device's entry in the block configuration table. |
| 615 | Thus the first entry in the table bdevsw would be |
| 616 | major device zero. |
| 617 | This number is also printed in the table along the right margin. |
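Counting table entries from zero can be sketched as below. The entry names in the example are placeholders, not the actual contents of conf.c, and the helper `major_of` is hypothetical. A POSIX shell is assumed.

```shell
# Print the major device number of a driver, i.e. the zero-based position
# of its entry in a switch table whose entries are given in table order.
# Usage: major_of NAME ENTRY0 ENTRY1 ...
major_of() {
    want=$1
    shift
    i=0
    for entry in "$@"; do
        if [ "$entry" = "$want" ]; then
            echo "$i"
            return 0
        fi
        i=$(( i + 1 ))
    done
    return 1    # no such entry in the table
}

# Placeholder entries; "nodev" stands for a null place-holding line.
major_of second first second nodev third    # prints 1
```

Null place-holding lines still count, which is why removing a driver from the table should leave its line in place rather than shift the later entries.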
| 618 | .PP |
| 619 | The minor device is the drive number, |
| 620 | unit number or partition as described |
| 621 | under each device in section 4. |
| 622 | For tapes where the unit is dial selectable, |
| 623 | a special file may be made for each possible |
| 624 | selection. |
| 625 | You can also add entries for other disk drives. |
| 626 | .PP |
| 627 | In reality, device names are arbitrary. |
| 628 | It is usually |
| 629 | convenient to have a system for deriving names, but it doesn't |
| 630 | have to be the one presented above. |
| 631 | .PP |
| 632 | Some further notes on minor device numbers. |
| 633 | The hp driver uses the 0100 bit of the minor device number to |
| 634 | indicate whether or not to interleave a filesystem across |
| 635 | more than one physical device. |
| 636 | See |
| 637 | .IR hp (4) |
| 638 | for more detail. |
| 639 | The ht driver uses the 04 bit to indicate whether |
| 640 | or not to rewind the tape when it is closed. The |
| 641 | 010 bit indicates the density of the tape on TE16 drives. |
| 642 | Again, see |
| 643 | .IR ht (4). |
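The flag bits can be tested with a simple mask. The sketch below assumes a POSIX shell (which accepts octal constants with a leading 0) and assumes that the digit in a name such as /dev/rmt5 is the minor device number, which is a naming convention rather than a guarantee. The helper `has_bit` is hypothetical.

```shell
# Report whether a flag bit is set in a minor device number.
# Usage: has_bit MINOR MASK   (MASK may be written in octal, e.g. 04 or 010)
has_bit() {
    if [ $(( $1 & $2 )) -ne 0 ]; then
        echo set
    else
        echo clear
    fi
}

has_bit 5 04     # rewind bit for minor device 5: set
has_bit 1 04     # rewind bit for minor device 1: clear
has_bit 5 010    # density bit for minor device 5: clear
```

The extraction procedure earlier in this document uses /dev/rmt5 to skip tape files and /dev/rmt1 for the final extraction, after which the tape rewinds. This is consistent with the 04 bit meaning "do not rewind on close", but see the manual page for the authoritative description.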
| 644 | .PP |
| 645 | The naming of character devices is similar to block devices. |
| 646 | Here the names are even more arbitrary except that |
| 647 | devices meant to be used |
| 648 | for teletype access should (to avoid confusion, no other reason) be named |
| 649 | /dev/ttyX, where X is some string (as in `0' or `d0'). |
| 650 | While it is possible to use truly arbitrary strings here, the accounting |
| 651 | and notably the |
| 652 | .IR ps (1) |
| 653 | command make good use of the fact that tty names |
| 654 | (at Berkeley) are distinct in the first 2 characters. In fact, we use |
| 655 | the following convention: |
| 656 | ``ttyN'', with N a number, for normal DZ ports; ``ttydX'', with X a single |
| 657 | character (starting from 0), for dialups; ``ttykX'', with X a single letter, |
| 658 | for KL ports; and ``console'' (abbrev ``co'') for the console. |
| 659 | This works out well. |
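The Berkeley naming convention just described can be expressed as a pattern match. This is only a sketch of the convention as stated here; the function name `classify_tty` is hypothetical, and other sites are free to use different names.

```shell
# Classify a terminal name according to the Berkeley convention.
classify_tty() {
    case $1 in
        console|co) echo "console" ;;
        ttyd?)      echo "dialup" ;;
        ttyk?)      echo "KL port" ;;
        tty[0-9]*)  echo "DZ port" ;;
        *)          echo "unknown" ;;
    esac
}

classify_tty tty7     # DZ port
classify_tty ttyd0    # dialup
classify_tty ttyka    # KL port
```

Because the dialup and KL patterns are tested before the numeric one, each name falls into exactly one class.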
| 660 | .PP |
| 661 | The files console, tty0-tty7, mem, kmem, kUmem, floppy and null are |
| 662 | already correctly configured, as are special files for the default |
| 663 | paging area |
| 664 | /dev/drum, and the raw and block versions of the root and /usr file |
| 665 | systems. |
| 666 | .PP |
| 667 | The disk and magtape drivers provide a `raw' interface |
| 668 | to the device which provides direct transmission |
| 669 | between the user's core and the device and allows |
| 670 | reading or writing large records. |
| 671 | The raw device counts as a character device, |
| 672 | and conventionally has the name of the corresponding |
| 673 | standard block special file with `r' prepended. |
| 674 | Thus the raw magtape |
| 675 | files are called /dev/rmtX. |
| 676 | .PP |
| 677 | Whenever special files are created, |
| 678 | care should be taken to change |
| 679 | the access modes |
| 680 | .IR (chmod (1)) |
| 681 | on these files to appropriate values. |
| 682 | .SH |
| 683 | Basics of Disk Layout |
| 684 | .PP |
| 685 | If |
| 686 | there are to be more filesystems mounted than just the root |
| 687 | and /usr, |
| 688 | use |
| 689 | .IR mkfs (1) |
| 690 | to create any new filesystem and |
| 691 | put its mounting in the file /etc/rc (see |
| 692 | .IR init (8) |
| 693 | and |
| 694 | .IR mount (1)). |
| 695 | (You might look at /etc/rc anyway to |
| 696 | see what has been provided for you.) |
| 697 | .PP |
| 698 | There are several considerations in deciding how to adjust the arrangement |
| 699 | of things on your disks: |
| 700 | the most important is making sure there is adequate space |
| 701 | for what is required; |
| 702 | secondarily, throughput should be maximized. |
| 703 | Paging space is an important parameter. |
| 704 | The system |
| 705 | as distributed has 33440 (512 byte) blocks in which to page. |
| 706 | This should be large enough for most sites. |
| 707 | You can change this if local wisdom indicates this is not good. |
| 708 | .PP |
| 709 | Many common system programs (C, the editor, the assembler etc.) |
| 710 | create intermediate files in the /tmp directory, |
| 711 | so the filesystem where this is stored also should be made |
| 712 | large enough to accommodate |
| 713 | most high-water marks. |
| 714 | The root filesystem as distributed is quite large, and there should be |
| 715 | no problem. |
| 716 | All the programs that create files in /tmp take |
| 717 | care to delete them, but most are not immune to |
| 718 | events like being hung up upon, and can leave dregs. |
| 719 | The directory should be examined every so often and the old |
| 720 | files deleted. |
| 721 | .PP |
| 722 | Exhaustion of user-file space is certain to occur |
| 723 | now and then; |
| 724 | the only mechanisms for controlling this phenomenon |
| 725 | are occasional use of |
| 726 | .IR du (1), |
| 727 | .IR df (1), |
| 728 | .IR quot (1), |
| 729 | threatening |
| 730 | messages of the day, and personal letters. |
| 731 | .PP |
| 732 | The efficiency with which UNIX is able to use the CPU |
| 733 | is largely dictated by the configuration of disk controllers. |
| 734 | For general time-sharing applications, |
| 735 | the best strategy is to try to split the root filesystem and system binaries |
| 736 | (/usr), the temporary files and paging activity (/tmp and /dev/drum), |
| 737 | and the user files among three disk arms. |
| 738 | We will discuss such considerations more below. |
| 739 | .PP |
| 740 | Once you have decided how to make best use |
| 741 | of your hardware, the question is how to initialize it. |
| 742 | If you have the equipment, |
| 743 | the best way to move a filesystem |
| 744 | is to dump it |
| 745 | .IR (dump (1)) |
| 746 | to magtape, |
| 747 | use |
| 748 | .IR mkfs (1) |
| 749 | to create the new filesystem, |
| 750 | and restore (\fIrestor\fR\|(1)) the tape. |
| 751 | If for some reason you don't want to use magtape, |
| 752 | dump accepts an argument telling where to put the dump; |
| 753 | you might use another disk. |
| 754 | Sometimes a filesystem has to be increased in logical size |
| 755 | without copying. |
| 756 | The super-block of the device has a word |
| 757 | giving the highest address which can be allocated. |
| 758 | For relatively small increases, this word can be patched |
| 759 | using the debugger (\fIadb\fR\|(1)) |
| 760 | and the free list reconstructed using |
| 761 | .IR icheck (1). |
| 762 | The size should not be increased very greatly |
| 763 | by this technique, however, |
| 764 | since although the allocatable space will increase |
| 765 | the maximum number of files will not (that is, the i-list |
| 766 | size can't be changed). |
| 767 | Read and understand the description given in |
| 768 | .IR filesys (5) |
| 769 | before playing around in this way. |
| 770 | .PP |
| 771 | If you have to merge a filesystem into another, existing one, |
| 772 | the best bet is to |
| 773 | use |
| 774 | .IR tar (1). |
| 775 | If you must shrink a filesystem, the best bet is to dump |
| 776 | the original and restor it onto the new filesystem. |
| 777 | However, this will not work if the i-list on the smaller filesystem |
| 778 | is smaller than the maximum allocated inode on the larger. |
| 779 | If this is the case, reconstruct the filesystem from scratch |
| 780 | on another filesystem (perhaps using \fItar\fR(1)) and then dump it. |
| 781 | If you |
| 782 | are playing with the root filesystem and only have one drive |
| 783 | the procedure is more complicated. |
| 784 | What you do is the following: |
| 785 | .IP 1. |
| 786 | GET A SECOND PACK!!!! |
| 787 | .IP 2. |
| 788 | Dump the root filesystem to tape using |
| 789 | .IR dump (1). |
| 790 | .IP 3. |
| 791 | Bring the system down and mount the new pack. |
| 792 | .IP 4. |
| 793 | Load the standalone versions of |
| 794 | .IR mkfs (1) |
| 795 | and |
| 796 | .IR restor (1) |
| 797 | from the floppy |
| 798 | with a procedure like: |
| 799 | .DS |
| 800 | \fB>>>\fRUNJAM |
| 801 | \fB>>>\fRINIT |
| 802 | \fB>>>\fRLOAD MKFS |
| 803 | LOAD DONE, xxxx BYTES LOADED |
| 804 | \fB>>>\fRST 2 |
| 805 | |
| 806 | \&... |
| 807 | |
| 808 | \fB>>>\fRH |
| 809 | HALTED AT yyyy |
| 810 | \fB>>>\fRU |
| 811 | \fB>>>\fRI |
| 812 | \fB>>>\fRLOAD RESTOR |
| 813 | LOAD DONE, zzzz BYTES LOADED |
| 814 | |
| 815 | \&... etc |
| 816 | .DE |
| 817 | .IP 5. |
| 818 | Boot normally |
| 819 | using the newly created disk filesystem. |
| 820 | .PP |
| 821 | Note that if you change the disk partition tables or add new disk |
| 822 | drivers, they should also be added to the standalone system in
| 823 | /usr/src/sys/stand. |
| 824 | .SH |
| 825 | System Identification |
| 826 | .PP |
| 827 | You should edit the files: |
| 828 | .DS |
| 829 | /usr/include/ident.h |
| 830 | /usr/include/whoami.h |
| 831 | /usr/include/whoami |
| 832 | .DE |
| 833 | to correspond to your system, and then recompile and install |
| 834 | .IR getty.vm (8) |
| 835 | via: |
| 836 | .DS |
| 837 | \fB#\fR cd /usr/src/cmd |
| 838 | \fB#\fR DESTDIR=/ |
| 839 | \fB#\fR export DESTDIR |
| 840 | \fB#\fR Admin/mk getty.vm.c |
| 841 | .DE |
| 842 | This will arrange for an appropriate banner to be printed on terminals |
| 843 | before users log in. |
| 844 | .SH |
| 845 | Adding New Users |
| 846 | .PP |
| 847 | See |
| 848 | .IR adduser (8); |
| 849 | local needs will undoubtedly dictate a somewhat |
| 850 | different procedure. |
| 851 | .SH |
| 852 | Multiple Users |
| 853 | .PP |
| 854 | If UNIX is to support simultaneous |
| 855 | access from more than just the console terminal, |
| 856 | the file /etc/ttys (\fIttys\fR\|(5)) has to be edited. |
| 857 | To add a new terminal be sure the device is configured |
| 858 | and the special file exists, then set |
| 859 | the first character of the appropriate line of /etc/ttys to 1 |
| 860 | (or add a new line). |
| 861 | You should also edit the file |
| 862 | /etc/ttytype |
| 863 | placing the type of the new terminal there (see \fIttytype\fR\|(5)). |
| 864 | The file |
| 865 | /etc/ttywhere is also a useful one to keep up to date. |
| 866 | .PP |
| 867 | Note that /usr/src/cmd/init.vm.c will have to be recompiled if there are to be |
| 868 | more than 100 terminals. |
| 869 | Also note that if the special file is inaccessible when \fIinit.vm\fR tries to create a process |
| 870 | for it, the system will thrash trying and retrying to open it. |
| 871 | .SH |
| 872 | File System Health |
| 873 | .PP |
| 874 | Periodically (say every day or so) and always after a crash, |
| 875 | you should check all the filesystems for consistency |
| 876 | (\fIicheck, dcheck\fR\|(1)). |
| 877 | It is quite important to execute |
| 878 | .IR sync (8) |
| 879 | before rebooting or taking the machine down. |
| 880 | This is done automatically every 30 seconds by the |
| 881 | .IR update (8) |
| 882 | program when a multiple-user system is running, |
| 883 | but you should do it anyway to make sure. |
| 884 | .PP |
| 885 | Dumping of the filesystem should be done regularly, |
| 886 | since once the system is going it is very easy to |
| 887 | become complacent. |
| 888 | Complete and incremental dumps are easily done with |
| 889 | .IR dump (1). |
| 890 | See the scripts |
| 891 | /etc/dumpusr |
| 892 | and |
| 893 | /etc/dumproot |
| 894 | which we use to dump our file systems. |
| 895 | You should arrange to do a towers-of-hanoi dump sequence; we tune |
| 896 | ours so that almost all files are dumped on two tapes and kept for at |
| 897 | least a week in most every case. We take full dumps every month (and keep |
| 898 | these indefinitely). |
| 899 | Note that
| 900 | /etc/dumpusr
| 901 | maintains the file /etc/lastdumpdone.
| 902 | This can be printed out at login by people who can then easily
| 903 | start a dump if one is needed.\(dg
| 904 | .FS |
| 905 | \(dg More precisely, we have three sets of dump tapes: 10 daily tapes, |
| 906 | 5 weekly sets of 2 tapes, and fresh sets of three tapes monthly. |
| 907 | We do daily dumps circularly on the daily tapes with sequence |
| 908 | `3 2 5 4 7 6 9 8 9 9 9 ...'. |
| 909 | Each weekly is a level 1 and the daily dump sequence level |
| 910 | restarts after each weekly dump. |
| 911 | Full dumps are level 0 and the daily sequence restarts after each full dump |
| 912 | also. |
| 913 | Thus a typical dump sequence would be: |
| 914 | .TS H |
| 915 | c c c c c |
| 916 | n n n l l. |
| 917 | tape name	level number	date	opr	size
| 918 | _
| 919 | .TH
| 920 | FULL	0	Nov 24, 1979	jkf	137K
| 921 | D1	3	Nov 28, 1979	jkf	29K
| 922 | D2	2	Nov 29, 1979	rrh	34K
| 923 | D3	5	Nov 30, 1979	rrh	19K
| 924 | D4	4	Dec 1, 1979	rrh	22K
| 925 | W1	1	Dec 2, 1979	etc	40K
| 926 | D5	3	Dec 4, 1979	rrh	15K
| 927 | D6	2	Dec 5, 1979	jkf	25K
| 928 | D7	5	Dec 6, 1979	jkf	15K
| 929 | D8	4	Dec 7, 1979	rrh	19K
| 930 | W2	1	Dec 9, 1979	etc	118K
| 931 | D9	3	Dec 11, 1979	rrh	15K
| 932 | D10	2	Dec 12, 1979	rrh	26K
| 933 | D1	5	Dec 15, 1979	rrh	14K
| 934 | W3	1	Dec 17, 1979	etc	71K
| 935 | D2	3	Dec 18, 1979	etc	13K
| 936 | FULL	0	Dec 22, 1979	etc	135K
| 937 | .TE |
| 938 | We do weeklies often enough that dailies always fit on one tape and
| 939 | in fact never get to the sequence of 9's in the daily level numbers. |
| 940 | .FE |
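.PP
The daily level sequence described in the footnote can be written down as a
small shell function.
This is only a sketch; the name
.I daily_level
is hypothetical and is not part of /etc/dumpusr.
Its argument is the number of the daily dump since the last weekly or
full dump, counting from 1.

```shell
#!/bin/sh
# Sketch: dump level for the n-th daily dump (n >= 1) since the last
# weekly or full dump, following the sequence 3 2 5 4 7 6 9 8 9 9 9 ...
# daily_level is a hypothetical helper name, not a distributed script.
daily_level() {
    case $1 in
        1) echo 3 ;;
        2) echo 2 ;;
        3) echo 5 ;;
        4) echo 4 ;;
        5) echo 7 ;;
        6) echo 6 ;;
        7) echo 9 ;;
        8) echo 8 ;;
        *) echo 9 ;;   # n >= 9 (or any unrecognized argument): level 9
    esac
}

# Print the levels for the first ten dailies on one line.
for n in 1 2 3 4 5 6 7 8 9 10; do
    printf '%s ' "$(daily_level "$n")"
done
echo
```

If weekly dumps are taken as often as described, only the first four or
five entries of the sequence are ever used.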
| 941 | .PP |
| 942 | Dumping of files by name is best done by |
| 943 | .IR tar (1) |
| 944 | but the number of files is somewhat limited. |
| 945 | Finally, if there are enough drives, entire
| 946 | disks can be copied with |
| 947 | .IR dd (1) |
| 948 | using the raw special files and an appropriate |
| 949 | block size. |
| 950 | .SH |
| 951 | Converting 32/V Filesystems |
| 952 | .PP |
| 953 | The best way to convert filesystems from 32/V |
| 954 | to the new format |
| 955 | is to use |
| 956 | .IR tar (1). |
| 957 | After converting, you can still restore files from your old-format dump |
| 958 | tapes (yes the dump format is different, sorry about that), by using |
| 959 | ``512restor'' instead of ``restor''. |
| 960 | If you wish, you can move whole file systems from 32/V to the new system |
| 961 | by using ``dump'' and then ``512restor''. |
| 962 | .SH |
| 963 | Regenerating the system |
| 964 | .PP |
| 965 | It is quite easy to regenerate the system, and it is a good |
| 966 | idea to try this once right away to build confidence. |
| 967 | The system consists of three major parts: |
| 968 | the kernel itself (/usr/src/sys/sys), the user programs |
| 969 | (/usr/src/cmd and subdirectories), and the libraries |
| 970 | (/usr/src/lib*). |
| 971 | The major part of this is /usr/src/cmd. |
| 972 | .PP |
| 973 | We have already seen how to recompile the system itself. |
| 974 | The three major libraries are the C library in /usr/src/libc |
| 975 | and the \s-2FORTRAN\s0 libraries /usr/src/libI77 and /usr/src/libF77. In each |
| 976 | case the library is remade by changing into the corresponding directory |
| 977 | and doing |
| 978 | .DS |
| 979 | \fB#\fR make |
| 980 | .DE |
| 981 | and then installed by |
| 982 | .DS |
| 983 | \fB#\fR make install |
| 984 | .DE |
| 985 | As with the system,
| 986 | .DS |
| 987 | \fB#\fR make clean |
| 988 | .DE |
| 989 | cleans up. |
| 990 | The source for all other libraries is kept in subdirectories of |
| 991 | /usr/src/lib; each has a makefile and can be recompiled by the above |
| 992 | recipe. |
| 993 | .PP |
| 994 | Recompiling all user programs is accomplished by using |
| 995 | a directory in /usr/src/cmd, called Admin, which contains |
| 996 | two files: mk and destinations. |
| 997 | The file mk is a shell script for recompiling files in /usr/src/cmd. |
| 998 | For instance, to recompile ``date.c'', |
| 999 | all one has to do is |
| 1000 | .DS |
| 1001 | \fB#\fR cd /usr/src/cmd |
| 1002 | \fB#\fR Admin/mk date.c |
| 1003 | .DE |
| 1004 | This will place a stripped version of the binary of ``date''
| 1005 | in /usr/dist3/bin/date, since date normally resides in /bin, and
| 1006 | Admin is building a filesystem-like tree rooted at /usr/dist3.
| 1007 | You will have to make the directory dist3 for this to work.
| 1008 | It is possible to use any directory for the destination; it isn't necessary
| 1009 | to use the default /usr/dist3.
| 1010 | You can set the destination by doing:
| 1011 | .DS |
| 1012 | \fB#\fR DESTDIR=\fIpathname\fR |
| 1013 | \fB#\fR export DESTDIR |
| 1014 | .DE |
| 1015 | .PP |
| 1016 | To regenerate all the system source you can do |
| 1017 | .DS |
| 1018 | \fB#\fR DESTDIR=/usr/newsys |
| 1019 | \fB#\fR export DESTDIR |
| 1020 | \fB#\fR cd /usr |
| 1021 | \fB#\fR rm -r newsys |
| 1022 | \fB#\fR mkdir newsys |
| 1023 | \fB#\fR cd /usr/src/cmd |
| 1024 | \fB#\fR Admin/mk * > Admin/errs 2>&1 &
| 1025 | .DE |
| 1026 | This will take about 4 hours on a reasonably configured machine. |
| 1027 | When it has finished, you can move the hierarchy into the normal places
| 1028 | using |
| 1029 | .IR mv (1) |
| 1030 | and |
| 1031 | .IR cp (1), |
| 1032 | and then execute |
| 1033 | .DS |
| 1034 | \fB#\fR DESTDIR=/ |
| 1035 | \fB#\fR export DESTDIR |
| 1036 | \fB#\fR cd /usr/src/cmd/Admin |
| 1037 | \fB#\fR mk ALIASES |
| 1038 | \fB#\fR mk MODES |
| 1039 | .DE |
| 1040 | to link files together as necessary and to set all the right set-user-id |
| 1041 | bits. |
| 1042 | .SH |
| 1043 | Making orderly changes |
| 1044 | .PP |
| 1045 | In order to keep track of changes to system source we migrate changed
| 1046 | versions of commands into /usr/src/cmd through the directory /usr/src/new
| 1047 | and out of /usr/src/cmd into /usr/src/old for a time before removing them.
| 1048 | Locally written commands which aren't distributed are kept in /usr/src/local
| 1049 | and their binaries are kept in /usr/local. This allows /usr/bin, /usr/ucb
| 1050 | and /bin to correspond to the distribution tape (and to the manuals that |
| 1051 | people can buy). People wishing to use /usr/local commands are made |
| 1052 | aware that they aren't in the base manual. As manual updates incorporate |
| 1053 | these commands they are moved to /usr/ucb. |
| 1054 | .PP |
| 1055 | A directory /usr/junk to throw garbage into, as well as binary directories |
| 1056 | /usr/old and /usr/new, are very useful. The man command supports manual
| 1057 | directories such as /usr/man/manj for junk and /usr/man/manl for local |
| 1058 | to make this or something similar practical. |
| 1059 | .SH |
| 1060 | Interpreting system activity |
| 1061 | .PP |
| 1062 | The |
| 1063 | .I vmstat |
| 1064 | program provided with the system is designed to be an aid to monitoring |
| 1065 | systemwide activity. Together with the |
| 1066 | .IR ps (1) |
| 1067 | command (as in ``ps av''), it can be used to investigate systemwide |
| 1068 | virtual memory activity.
| 1069 | You should modify |
| 1070 | .IR vmstat (1m) |
| 1071 | so that it prints out disk statistics for the disks |
| 1072 | you have, changing the headers to something appropriate for your |
| 1073 | system. |
| 1074 | Examine the definitions of DK_UNIT in the disk drivers supplied to see |
| 1075 | how the code in \fIvmstat\fR and \fIiostat\fR and the system correspond. |
| 1076 | .PP |
| 1077 | By running |
| 1078 | .I vmstat |
| 1079 | when the system is active you can judge the system activity in several |
| 1080 | dimensions: job distribution, virtual memory load, paging and swapping |
| 1081 | activity, disk and cpu utilization. |
| 1082 | Ideally, most jobs should be either running (RQ) or sleeping (SL), |
| 1083 | there should be little paging or swapping activity, there should |
| 1084 | be available bandwidth on the disk devices (most single arms peak |
| 1085 | out at about 30-35 tps in practice), and the user cpu utilization (US) should |
| 1086 | be high (about 60%). |
| 1087 | .PP |
| 1088 | If the system is busy, then the number of active jobs may be large, |
| 1089 | and several of these jobs may often be in disk wait (DW). If the virtual |
| 1090 | memory is very active, then the paging demon may be running (SR will |
| 1091 | be non-zero). It is healthy for the paging demon to free pages when |
| 1092 | the virtual memory gets active; it is triggered by the amount of free |
| 1093 | memory dropping below a threshold and increases its pace as free memory |
| 1094 | goes to zero. |
| 1095 | .PP |
| 1096 | If you run |
| 1097 | .I vmstat |
| 1098 | when the system is busy (a ``vmstat 5'' is best, since that is how |
| 1099 | often most of the numbers are recomputed by the system), you can find |
| 1100 | imbalances by noting abnormal job distributions. If a large number |
| 1101 | of jobs are in disk wait (DW) or page wait (PW), then the disk subsystem |
| 1102 | is overloaded or imbalanced. If you have a large number of non-DMA
| 1103 | devices or open teletype lines which are ``ringing'', or user programs |
| 1104 | which are doing high-speed non-buffered input/output, then the system |
| 1105 | time may go very high (60-70% or higher). |
| 1106 | .PP |
| 1107 | If the system is very heavily loaded, or if you have very little memory |
| 1108 | relative to your load (512K is little in most any case), then the system |
| 1109 | may be forced to swap. This is likely to be accompanied by a noticeable |
| 1110 | reduction in system performance as the system does not swap ``working |
| 1111 | sets'', but rather forces jobs to reinitialize their resident sets |
| 1112 | by demand paging. If you expect to be in a memory-poor environment |
| 1113 | for an extended period you might consider administratively limiting system |
| 1114 | load. |
| 1115 | .SH |
| 1116 | Tunable constants |
| 1117 | .PP |
| 1118 | There is a modicum of tuning available in the paging replacement algorithm |
| 1119 | if it appears to be badly tuned for your configuration. |
| 1120 | The page replacement (clock) algorithm is run whenever there are |
| 1121 | not LOTSFREE pages available |
| 1122 | (this and all other constants discussed here are defined in the system |
| 1123 | header file /usr/src/sys/h/vm.h). |
| 1124 | This sets up resistance to consumption of the remaining free memory |
| 1125 | at a minimal rate SLOWSCAN, which gives the desired number of seconds |
| 1126 | between successive examinations of each page. The rate at which the clock |
| 1127 | algorithm is run increases linearly to a desired rate of FASTSCAN when |
| 1128 | there is no free memory. Thus as the available free memory decreases, |
| 1129 | the clock algorithm works harder to hold on to what is left. |
| 1130 | If fewer than DESFREE pages are available and the paging rate is high,
| 1131 | then the system will begin to swap processes out. If fewer than MINFREE
| 1132 | pages are available then the system will begin to swap, regardless of the |
| 1133 | paging rate. |
| 1134 | .PP |
| 1135 | When it has to swap, the system first tries to find a process which has |
| 1136 | been blocked for a long time and swap it out. If there are no
| 1137 | jobs of this flavor, then it will choose among the remaining jobs in |
| 1138 | a strictly round-robin fashion, based on core residency time. It attempts |
| 1139 | to guarantee (during periods of very heavy load) enough core residency to |
| 1140 | a process to allow it to at least rebuild its set of active pages (since |
| 1141 | it must do so by demand paging). Processes which are swapped out |
| 1142 | with large numbers of active pages similarly receive lower priority for |
| 1143 | swapin, favoring small jobs to return to the core resident set quickly. |
| 1144 | .PP |
| 1145 | It is |
| 1146 | .I very |
| 1147 | desirable that the system run under reasonably heavy load with little |
| 1148 | swapping, with the memory partitioning being done by the clock replacement |
| 1149 | algorithm, rather than by the swapping algorithm. The costs associated |
| 1150 | with paging activity are the time spent in the paging demon, the overhead |
| 1151 | associated with reclaim page faults (RE), and the extra disk activity |
| 1152 | associated with pagins and pageouts. |
| 1153 | We will discuss disk considerations later; when kept to about 40 reclaim |
| 1154 | faults per second, the cost of reclaims is less than 1% of total processor |
| 1155 | time. The cpu time (shown by ``ps l2'') accumulated by the pageout demon |
| 1156 | will show how much overhead it is generating. |
| 1157 | .PP |
| 1158 | The system, as distributed, runs the replacement algorithm whenever less |
| 1159 | than 1/8 of the total user memory is free. |
| 1160 | This is done starting with a 30 second revolution time of the clock |
| 1161 | algorithm and speeding up to a 20 second revolution time when there is no
| 1162 | free memory. |
| 1163 | The goal here is to use as much memory as possible (i.e. have the |
| 1164 | free list short) but to not have the system run out and start to swap. |
| 1165 | You can experiment with changing the writable copies of these variables |
| 1166 | (e.g. ``lotsfree'' is the writable copy of LOTSFREE) using |
| 1167 | .IR adb ,
| 1168 | as in: |
| 1169 | .DS |
| 1170 | \fB#\fR adb \-w /vmunix /dev/kmem |
| 1171 | /m 0 #ffffffff |
| 1172 | lotsfree/D |
| 1173 | ---adb prints value of lotsfree--- |
| 1174 | /W 0t100 |
| 1175 | .DE |
| 1176 | Here the ``/W 0t100'' command changed the value of |
| 1177 | .I lotsfree |
| 1178 | to be 100 (decimal). |
| 1179 | .SH |
| 1180 | Balancing disk load |
| 1181 | .PP |
| 1182 | It is critical for good performance to balance disk load. |
| 1183 | There are at least five components of the disk load which you can |
| 1184 | divide between the available disks: |
| 1185 | .DS |
| 1186 | 1. The root file system. |
| 1187 | 2. The /tmp file system. |
| 1188 | 3. The /usr file system. |
| 1189 | 4. The user files. |
| 1190 | 5. The paging activity. |
| 1191 | .DE |
| 1192 | In our system we currently have 1.75M bytes of memory and 2 disks: |
| 1193 | an RP06 and an AMPEX 9300. We run with the root, /tmp, and paging activity |
| 1194 | on the RP06, while the
| 1195 | /usr |
| 1196 | file system and user files are on the 9300. |
| 1197 | This gives a fairly even split of file activity in our environment. |
| 1198 | .PP |
| 1199 | A split such as this is a good initial guess if you have two arms. |
| 1200 | If you are fortunate enough to have three arms, you can try splitting the
| 1201 | files up various ways. The most important things to remember are to
| 1202 | even out the disk load as much as possible, and to do this by
| 1203 | decoupling file systems (on separate arms) between which heavy copying occurs.
| 1204 | Note that a long term average balanced load is not important; it is
| 1205 | much more important to have instantaneously balanced |
| 1206 | load when the system is busy. |
| 1207 | .PP |
| 1208 | Intelligent experimentation with a few file system arrangements can |
| 1209 | pay off in much improved performance. It is particularly easy to |
| 1210 | move the root, the |
| 1211 | /tmp |
| 1212 | file system and the paging area. Place the |
| 1213 | user files and the |
| 1214 | /usr |
| 1215 | directory as space needs dictate and experiment |
| 1216 | with the other, more easily moved file systems. |
| 1217 | .PP |
| 1218 | Finally, when you have your configuration worked out, you should set |
| 1219 | the constant MAXPGIO based on the maximum number of transfers you can |
| 1220 | expect from your paging device. The system assumes that if more transfers |
| 1221 | than this occur, then the system is overloaded, and unless there is |
| 1222 | a reasonable amount of free memory, it then begins to swap. The swap scheduler |
| 1223 | also considers jobs small if their size when they were swapped out was less
| 1224 | than twice this constant. Such small jobs have a better chance of getting |
| 1225 | swapped back in when the core situation is tight, since they can be |
| 1226 | expected to be able to run in a small number of page frames. |
| 1227 | .SH |
| 1228 | Process size limitations |
| 1229 | .PP |
| 1230 | As distributed, the system provides for a maximum of 64M bytes of |
| 1231 | resident user virtual address space. The sizes of the text and data
| 1232 | segments of a single process are currently limited to 4M bytes each, and
| 1233 | the stack segment size is limited to 512K bytes. These |
| 1234 | can be increased by changing the constants MAXTSIZ, MAXDSIZ and MAXSSIZ |
| 1235 | in the file |
| 1236 | /usr/src/sys/h/vm.h. |
| 1237 | You must be careful in doing this that you have adequate paging space. |
| 1238 | As configured above, the system has only 16M bytes of paging area.\(dg |
| 1239 | .FS |
| 1240 | \(dg Recovery from running out of paging area is currently |
| 1241 | not handled gracefully: the system panics. |
| 1242 | .FE |
| 1243 | .PP |
| 1244 | To increase the amount of resident virtual space possible, |
| 1245 | you can alter the constant USRPTSIZE (in |
| 1246 | /usr/src/sys/h/vm.h) |
| 1247 | and correspondingly change the definitions of
| 1248 | .I Usrptmap |
| 1249 | and |
| 1250 | .I Syssize |
| 1251 | in |
| 1252 | /usr/src/sys/sys/locore.s.
| 1253 | .PP |
| 1254 | The 4M byte limitation on individual segment size is enforced by |
| 1255 | the constants MAXTSIZ, MAXDSIZ and MAXSSIZ for the text, data and stack |
| 1256 | segments respectively. These can be increased, given the availability of |
| 1257 | adequate amounts of swap space, up to 16M bytes. |
| 1258 | In order to increase the size of the stack or data |
| 1259 | segments beyond 16M, you will have to increase the amount of space which can be |
| 1260 | mapped by the corresponding disk map. To increase, for instance, the |
| 1261 | maximum segment size to 32M bytes it would be adequate to double both |
| 1262 | the initial and maximum sizes for the diskmap. Thus defining (in |
| 1263 | /usr/src/sys/h/dmap.h) |
| 1264 | .DS |
| 1265 | #define DMMIN 32 |
| 1266 | #define DMMAX 8192 |
| 1267 | .DE |
| 1268 | (i.e. a minimum segment size of 16K bytes and a maximum size of 4M bytes) |
| 1269 | would allow the 16 diskmap slots to map 32M bytes. |
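.PP
As a rough check of the arithmetic above, the capacity of a disk map can be
tallied with a short shell sketch.
It assumes (this is not stated above) that the first slot maps DMMIN
512-byte blocks and that each later slot maps twice as much as the one
before it until DMMAX is reached, after which every slot maps DMMAX blocks.
The function name
.I dmap_bytes
is purely illustrative.

```shell
#!/bin/sh
# Sketch: total bytes that a disk map can cover.
# ASSUMPTION: slot sizes start at dmmin 512-byte blocks and double per
# slot until they reach dmmax; remaining slots each map dmmax blocks.
# dmap_bytes is a hypothetical helper, not part of the system.
# usage: dmap_bytes DMMIN DMMAX NSLOTS
dmap_bytes() {
    dmmin=$1
    dmmax=$2
    slots=$3
    blk=$dmmin      # size of the current slot, in 512-byte blocks
    total=0         # running total, in 512-byte blocks
    i=0
    while [ "$i" -lt "$slots" ]; do
        total=$((total + blk))
        if [ "$blk" -lt "$dmmax" ]; then
            blk=$((blk * 2))
        fi
        i=$((i + 1))
    done
    echo $((total * 512))
}

dmap_bytes 16 4096 16    # half the values shown above
dmap_bytes 32 8192 16    # the doubled values shown above
```

Under that assumption, 16 slots cover somewhat more than 16M bytes with
values half of those shown above, and somewhat more than 32M bytes with
the doubled values.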
| 1270 | .SH |
| 1271 | Other limitations |
| 1272 | .PP |
| 1273 | Due to the fact that the file system block numbers are stored in |
| 1274 | page table |
| 1275 | .B pg_blkno |
| 1276 | entries, the maximum size of a file system is limited to |
| 1277 | 2^20 1024 byte blocks. Thus no file system can be larger than 1024M bytes. |
| 1278 | .PP |
| 1279 | The number of system buffers is limited to be less than 64 |
| 1280 | because of the way that the MASSBUS adaptor map registers are initialized. |
| 1281 | The construct ``(128<<9)'' appears in the |
| 1282 | /usr/src/sys/sys/mba.c |
| 1283 | and |
| 1284 | /usr/src/sys/sys/hp.c |
| 1285 | code, silently enforcing this restriction. |
| 1286 | .SH |
| 1287 | Scaling down |
| 1288 | .PP |
| 1289 | If you have less than 1M byte of memory you may wish to scale the
| 1290 | system down by reducing some fixed table sizes not directly
| 1291 | related to the paging system. |
| 1292 | For instance, we have raised NBUF from 32 to 48, NCLIST from 150 to 500, |
| 1293 | (also increasing the basic clist block size CBSIZE from 12 to 28) |
| 1294 | and NPROC, NINODE and NFILE for a fairly large system from the way |
| 1295 | they were distributed for \s-2UNIX/32V\s0. |
| 1296 | We also increased TTLOWAT (from 50 to 125) and TTHIWAT (from 150 to 650).
| 1297 | If you |
| 1298 | pull NCLIST down, you should adjust these also. |
| 1299 | You can use |
| 1300 | .IR pstat (1m) |
| 1301 | to find out how much of these structures are typically in use. |
| 1302 | Although the document is pretty much obsolete for the \s-2VAX\s0, |
| 1303 | you can see the last few pages of ``Regenerating System Software'' |
| 1304 | in Volume 2B of the programmer's manual for hints on setting some of these
| 1305 | constants. |
| 1306 | .SH |
| 1307 | Odds and Ends |
| 1308 | .PP |
| 1309 | The programs |
| 1310 | dump, |
| 1311 | icheck, quot, dcheck, ncheck, and df |
| 1312 | (source in /usr/src/cmd)
| 1313 | should be changed to |
| 1314 | reflect your default mounted filesystem devices. |
| 1315 | Print the first few lines of these |
| 1316 | programs and the changes will be obvious. |
| 1317 | You will probably want to amend some of /usr/lib/crontab, and will |
| 1318 | certainly want to add to /usr/lib/Mail.rc. |
| 1319 | .PP |
| 1320 | You should periodically examine the file /usr/adm/messages, which acts |
| 1321 | as a system error log (see \fIdmesg\fR\|(1m)).
| 1322 | In particular, memory controller errors are checked for every 10 minutes |
| 1323 | and a diagnostic is produced (printing memory controller register C) |
| 1324 | if there were any errors. |
| 1325 | .in +4i |
| 1326 | .sp 3 |
| 1327 | Good Luck |
| 1328 | .sp 1 |
| 1329 | William N. Joy |
| 1330 | .br |
| 1331 | Ozalp Babaoglu |