.TL
Changes in the Kernel in 4.1bsd
.sp
May 10, 1981
.br
Revised: September 1, 1981
.AU
Bill Joy
.AI
Computer Systems Research Group
University of California, Berkeley
.PP
This document summarizes the changes in the kernel
between the November 1980 4.0bsd release and the
April 1981 4.1bsd distribution. The information is presented
in both overall terms (e.g. organizational changes), and as
specific comments about individual changed files. See
the source code itself for more details.
.PP
The major changes fall in five categories:
.IP 1)
Changes in the VAX 11/780 specific portions of the code, so that
VAX 11/750's are supported also.
.IP 2)
Changes in the organization of the code, so that more than one
configuration of the system may be built from a single set of sources.
Each system is described by a single file which includes parameters
such as system size, devices on the machine, etc.
All ``magic numbers'' such as device register addresses are collected
in this single file.
.IP 3)
Extensive changes in the device subsystem to allow multiple UNIBUS
and MASSBUS adapters to be used, multiple instances of device controllers
to exist without duplicating driver code, and to provide the capability
of system configuration at boot time. The configuration capability
is used to produce a generic system which runs on all supported hardware,
and is used for distributions. Pattern matching in the configuration
capability also allows hardware redundancy to be used to good effect.
.IP 4)
Diagnostics of the system have been reworked to be in a standard and
readable format; file system diagnostics refer to the file systems by
name rather than device number. Device diagnostics refer to the devices
by name, and print error messages including device registers decoded
symbolically rather than simply in octal or hexadecimal.
DEC standard bad sector forwarding has been added to the drivers for DEC disks.
.IP 5)
Performance improvements, noticeably in the paging subsystem.
.PP
A number of enhancements and bug fixes have also been made.
.SH
Carrying over local software
.PP
The majority of local changes should carry over to the new system
quite easily. It is necessary to create a configuration file for each
machine from which a system will be built, but this is quite easy,
and such files are designed to be usable without change in future
releases of the system.
.PP
Locally written UNIBUS device drivers will need to be converted to
work in the new system. The new functions needed of the device drivers are:
.IP 1)
Forcing a device interrupt at bootstrap time, given a proposed
device register address. This is used by the configuration program
to decide if the device really exists.
.IP 2)
If buffered data paths are to be used, the driver must use routines
in the UNIBUS adapter subsystem which arrange for i/o requests to be
queued when there are no resources available.
.IP 3)
Drivers must not assume that only one instance of a device exists in the
system, but must rather be parameterized and use the information provided
by the bootstrap procedure to drive all available devices.
.PP
Of course, it is not necessary to make a driver ``fully supported'' for it
to be used. It suffices to handle 1) by pretending that the interrupt
occurred, returning the (for a single system) known UNIBUS vector information,
and assuming that the device exists on specific UNIBUS adapters.
Drivers which use UNIBUS resources only statically or not at all need
not be concerned with 2), and drivers can assume that there is only one
instance of the supported device on the system, and just not work
if more than one such device is really present.
.PP
In any case, more information about device driver changes is given in the
last section of this document;
also see \fIautoconf\fR\|(4) for information about the messages
printed out by the configuration code at bootstrap time.
Looking at the provided standard
supported drivers for examples of code is also a good idea.
.PP
There is also a new interface for MASSBUS devices. Since all MASSBUS
devices are already supported, there is no external documentation for writing
new MASSBUS drivers at the present. If you have questions or intend
to write a driver for a home-brew interface, you should read the MASSBUS
and MASSBUS device driver code, which is amply commented. In any case,
the MASSBUS interface is more stylized than the UNIBUS interface, and you
may have to extend the functionality of the MASSBUS driver to handle radically
different devices.
.SH
Organizational changes
.PP
On RK07 systems the source for the system lives in the root directory,
since there is so little space. The system otherwise lives where it
used to: the subdirectories of /usr/src/sys, with copies of the header
files for the installed system in /usr/include/sys.
.PP
The system compilation procedure has been changed so that more than
one set of binaries may be kept conveniently with a single copy of the
source code. The system sources are kept in the directories \fBsys/sys\fR
and \fBsys/dev\fR with the header files in \fBsys/h\fR. Source files
which were previously kept in \fBsys/conf\fR are now in \fBsys/dev\fR,
and no binaries are kept in any of these directories.
.PP
The directory
\fBsys/conf\fR contains a number of files related to system configuration.
For each machine to be configured, a single file is created in the
\fBsys/conf\fR directory; thus files \fBERNIE\fR and \fBBERT\fR might
exist there. Each such file describes all the parameters of the machine
to be used: the devices which are to be configured into the system, optional
parts of the system to be included, as well as the timezone in which
the machine lives and the maximum number of simultaneous active users; the
last is used to scale system tables.
The format of the configuration files is described in \fIconfig\fR\|(8).
.PP
Corresponding to each system to be configured there is a directory of
\fBsys\fR, thus \fBsys/BERT\fR and \fBsys/ERNIE\fR. These directories
are made with \fImkdir\fR and then the \fB/etc/config\fR command is run,
from the \fBsys/conf\fR directory, specifying \fBBERT\fR or \fBERNIE\fR
as argument. The configuration program processes the information in
the configuration files, and produces, in the directory \fB../BERT\fR or
\fB\&../ERNIE\fR respectively:
.IP 1)
A set of header files, e.g. \fBdz.h\fR, which contain the number of devices
and controllers to be available in the target system. These definitions
force conditional compilation of drivers resulting in the inclusion
or exclusion of driver code and the sizing of driver tables. This
technique, based on compilation, is more powerful than a loader-based
technique, since small sections of code may be easily conditionalized.
Similarly, dynamic loading of device drivers is not needed, as only
drivers which are needed are included in the resulting system.
.IP 2)
A small assembly language vector interface, which turns the
hardware generated UNIBUS interrupt sequences into C calls on the
driver interrupt routines. This \fBubglue.s\fR file glues the hardware
interrupt sequence into the UNIX interface.
.IP 3)
A table file \fBioconf.c\fR which initializes tables to be
used at bootstrap time by the system configuration routines. The configuration
routines interpret the contents of the table and determine which devices
are available on the system. They determine the vector addresses of UNIBUS
devices by forcing the devices to interrupt. Pattern matching in the tables
may be used to take advantage of hardware redundancy: the specifications
need not completely constrain device placement, so the system can be built
to bootstrap in several different configurations, locating the same devices
on different interconnects by the fact that their unit numbers have not
changed (for example). Thus two RP06 disks could be specified as:
.DS
disk hp0 at mba? drive 0
disk hp1 at mba? drive 1
.DE
and then the disks could be cabled to any available MASSBUS adapter; the
pattern matching in the configuration procedure would locate the drives.
Similarly, a tape formatter on the same system could be specified:
.DS
master ht0 at mba? drive ?
.DE
and then placed anywhere on any MBA.
Contrast this flexible specification with
.DS
disk hp0 at mba0 drive 0
disk hp1 at mba0 drive 1
master ht0 at mba1 drive 0
.DE
which is not reconfigurable. This latter specification corresponds
to the previous UNIX capabilities, which did not allow tapes and disks
on the same MASSBUS adapter.
.IP 4)
Finally the \fIconfig\fR program constructs a \fImakefile\fR
for the system which builds the drivers needed in the specified configuration,
and includes system loading sequences for the different root and swap
device configurations desired.
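.PP
The wildcard matching described in item 3) can be pictured with a small
user-level sketch. It is not the code in \fBioconf.c\fR or the autoconfiguration
routines; the structure layouts, the ANY encoding of ``?'' and the
function name \fImatch_spec\fR are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define ANY (-1)   /* stands for "?" in a specification (assumed encoding) */

/* One configuration line, e.g. "disk hp0 at mba? drive 0". */
struct spec {
    const char *name;   /* device name, e.g. "hp" */
    int unit;           /* logical unit number, e.g. 0 for hp0 */
    int adapter;        /* adapter number, or ANY for "mba?" */
    int drive;          /* drive number, or ANY for "drive ?" */
};

/* A device actually located while probing the hardware. */
struct found {
    const char *name;
    int adapter;
    int drive;
};

/* Return the first specification satisfied by the located device, or NULL. */
const struct spec *
match_spec(const struct spec *tab, size_t n, const struct found *f)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(tab[i].name, f->name) != 0)
            continue;
        if (tab[i].adapter != ANY && tab[i].adapter != f->adapter)
            continue;
        if (tab[i].drive != ANY && tab[i].drive != f->drive)
            continue;
        return &tab[i];
    }
    return NULL;
}
```

With the three flexible specifications above in the table, a drive found
as drive 1 on any adapter matches the hp1 entry, whichever MBA it is cabled to.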
.PP
It is now easy to include ``subsystems'' optionally. This is done
through the same mechanism which causes conditional inclusion of device
drivers. The file \fBconf/files\fR contains a palette of files which
build the system. Each line is either of the form:
.DS
filename standard
.DE
or
.DS
filename optional \fIxx\fR
.DE
where \fIxx\fR is the name of a device which requires the file, or a
\fIpseudo-device\fR. To define a subsystem to be added to the kernel
it suffices to add specifications to the \fBconf/files\fR file for the
newly optional files and to then place a specification
.DS
pseudo-device \fIxx\fR
.DE
in the system configuration file. A line
.DS
options \fIXX\fR
.DE
may also be added to the configuration
to have the symbol XX defined during compilation,
for use in conditional compilations in the standard part of the system.
Such conditional compilation is typically used to provide hooks in the
standard part of the kernel to switch out to subsystem functions.
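.PP
A minimal sketch of how such definitions are typically consumed follows.
The count NDZ stands for what a generated \fBdz.h\fR might define and XX
for a symbol set by an ``options'' line; the structure, the hook and the
function names are hypothetical and are defined here only so that the
fragment compiles on its own.

```c
/*
 * In a real build NDZ would come from the generated dz.h and XX from an
 * "options XX" line; both are defined here only so the sketch stands alone.
 */
#define NDZ 2
#define XX

#if NDZ > 0
/* Driver table sized by the configured device count (hypothetical fields). */
struct dz_softc {
    int sc_open;
} dz_softc[NDZ];
#endif

/* Number of dz units compiled into this system. */
int
dz_units(void)
{
#if NDZ > 0
    return (int)(sizeof dz_softc / sizeof dz_softc[0]);
#else
    return 0;
#endif
}

int xx_calls;   /* counts entries into the hypothetical XX subsystem */

#ifdef XX
static void
xx_hook(void)
{
    xx_calls++;
}
#endif

/* A routine in the standard part of the kernel with a hook for XX. */
void
standard_path(void)
{
#ifdef XX
    xx_hook();
#endif
}
```

Setting NDZ to 0 or removing the definition of XX drops the corresponding
code from the compiled result entirely, which is the effect the text describes.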
.PP
This completes the general description of organizational changes.
We now describe the changes in the system, file by file.
.SH
Header files: sys/h and /usr/include/sys
.PP
General changes: device drivers now have header files in these
directories, thus the ``up'' driver has a header file ``upreg.h''.
This is so the standalone code and the mainline UNIX code can share
the common definitions.
.PP
The ``.m'' files of the previous distribution
have been eliminated (with the sole exception of \fBpcb.m\fR); the
magic numbers which were manually entered in these files are instead
generated by a program from the definitions in the corresponding
\fB\&.h\fR files; a number of header files thus no longer warn about
correspondences that must be maintained.
.PP
The system tables are now described by pointers to their beginning
and end and a count, rather than compiled in constants. This allows
table sizes to be chosen at boot time (although the system currently
does this only for the file system buffer cache), and makes programs
such as \fBps\fR and \fBw\fR not compile in these constants. Note,
especially, that the symbols such as \fIproc\fR and \fIinode\fR are
now memory locations containing the addresses of these structures
rather than the base of the structures themselves. Programs which
access these structures have been changed and use the variables \fInproc\fR
and \fIninode\fR in core rather than the (now defunct) constants NPROC
and NINODE.
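.PP
The pointer-and-count description of a table can be sketched as follows.
Only \fIproc\fR and \fInproc\fR are names taken from the text; the end
pointer \fIproc_end\fR, the structure fields and the two functions are
assumptions, and the table is allocated with \fIcalloc\fR here simply to
stand in for whatever the kernel does at boot time.

```c
#include <stdlib.h>

struct proc {
    int p_stat;     /* 0 means the slot is unused (assumed convention) */
    int p_pid;
};

struct proc *proc;      /* address of the first entry, set at startup */
struct proc *proc_end;  /* one past the last entry (hypothetical name) */
int nproc;              /* number of entries */

/* Size the table at run time rather than with a compiled-in NPROC. */
int
table_init(int n)
{
    proc = calloc((size_t)n, sizeof *proc);
    if (proc == NULL)
        return -1;
    nproc = n;
    proc_end = proc + n;
    return 0;
}

/* Scan the table using the pointers; no constant size is compiled in. */
int
count_active(void)
{
    struct proc *p;
    int n = 0;

    for (p = proc; p < proc_end; p++)
        if (p->p_stat != 0)
            n++;
    return n;
}
```

A program such as \fBps\fR would, under the same assumptions, read the
value of \fIproc\fR and \fInproc\fR from the running system and then fetch
that many entries, instead of relying on a size fixed when it was compiled.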
.de BP
.IP \\fB\\$1\\fR 14n
..
.BP buf.h
Now declares three headers on which the in-core buffers are placed.
Buffers which are locked in the buffer cache are placed on the first
queue. Currently, only file system super blocks are locked in core,
and to good effect: it is now possible to rebuild the super-block of
the root file system with the system quiescent (without rebooting) if
the block device is used. It is no longer necessary to take a buffer
for the super-block of a file system and also make a copy of it at
each sync; the same buffer can be reused and simply released: since it
is locked it will remain in the buffer cache.
.IP
The other two queues implement the lru cache and the list of blocks
which have been discarded. By having queues for both of these rather
than using the end of a single queue, we achieve true fifo behavior for
blocks which we consider ``discarded''; previously rather strange behavior
resulted from pushing these blocks backwards on the front of the single
queue. (In particular pipes would behave badly on idle systems under
some circumstances.)
.IP
The number of pages paged is counted at pageout completion, as well
as the pageout event count. A bug in the \fBphysio\fR routine which
caused physical transfers of more than 60000 bytes to sometimes
fail to return an error indication has been fixed.
.IP
A flag has been added that marks a buffer as consisting only of a header
and also one which marks a buffer being used for bad-sector processing.
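.PP
The three-queue arrangement can be sketched in a few lines of C. This is
an illustration of the idea only: the queue names, the flag values, the
link field names and the functions \fIrelease\fR and \fIreuse\fR are all
assumptions rather than the declarations in \fBbuf.h\fR.

```c
#include <stddef.h>

enum { Q_LOCKED, Q_LRU, Q_AGE, NQUEUES };   /* hypothetical queue names */

#define B_LOCKED 0x1    /* keep in core, e.g. a super-block (assumed value) */
#define B_AGE    0x2    /* contents discarded, reuse soon (assumed value) */

struct buf {
    int b_flags;
    int b_blkno;
    struct buf *av_forw, *av_back;   /* links on one of the queues */
};

static struct buf qhead[NQUEUES];    /* one circular list head per queue */

void
queues_init(void)
{
    int i;

    for (i = 0; i < NQUEUES; i++)
        qhead[i].av_forw = qhead[i].av_back = &qhead[i];
}

static void
insert_tail(struct buf *head, struct buf *bp)
{
    bp->av_back = head->av_back;
    bp->av_forw = head;
    head->av_back->av_forw = bp;
    head->av_back = bp;
}

static struct buf *
remove_head(struct buf *head)
{
    struct buf *bp = head->av_forw;

    if (bp == head)
        return NULL;                 /* queue is empty */
    bp->av_back->av_forw = bp->av_forw;
    bp->av_forw->av_back = bp->av_back;
    bp->av_forw = bp->av_back = NULL;
    return bp;
}

/* Release a buffer onto the queue selected by its flags. */
void
release(struct buf *bp)
{
    if (bp->b_flags & B_LOCKED)
        insert_tail(&qhead[Q_LOCKED], bp);
    else if (bp->b_flags & B_AGE)
        insert_tail(&qhead[Q_AGE], bp);   /* fifo among discarded blocks */
    else
        insert_tail(&qhead[Q_LRU], bp);
}

/* Pick a buffer to reuse: discarded blocks first, then the lru queue. */
struct buf *
reuse(void)
{
    struct buf *bp = remove_head(&qhead[Q_AGE]);

    if (bp == NULL)
        bp = remove_head(&qhead[Q_LRU]);
    return bp;
}
```

Because discarded buffers go on the tail of their own queue and are taken
from its head, they are reused in the order in which they were discarded,
and locked buffers are never candidates for reuse.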
.BP callo.h
Is now called \fBcallout.h\fR, and the name of the structure is similarly
changed to make it consistent with the other structures in the kernel.
The structures are now linked together in linked lists, to prevent arcane
situations previously possible where only half of the structures would
be used, but the table space would be exhausted.
.BP clock.h
A botch in handling of leap years has been fixed. A macro is defined
here to queue a software interrupt for handling most of the clock
processing at an IPL lower than the clock IPL.
.BP cmap.h
This file, like a number of others, no longer warns that the size
of the structure is known elsewhere; such dependencies are the concern
of a C program and automated through makefile dependencies.
.BP conf.h
A \fId_dump\fR entry has been added to the block device table, and is
used as the system now normally does automatic dumps of core memory to disk
after a crash. The field \fId_tab\fR is now called \fId_flags\fR and
set to B_TAPE for tapes. For reasons not worth explaining here, there
are no ``tab'' structures to sensibly use in initializing this field now,
and in any case the only use of it was to tell which block devices were tapes.
.BP dkbad.h
Is a new file which defines the format of the bad sector forwarding
information according to DEC standard 144.
.BP dmap.h
The constant DMMIN has been increased to 32 to allow up to 16k bytes to be
paged to the paging devices in a clustered pageout.
.BP filsys.h
The two fields \fIs_fname\fR and \fIs_fpack\fR which were not implemented
before were merged together (into a single 12 character field) which
is called \fIs_fsmnt\fR. The system puts the ASCII path name where a file
system is mounted (e.g. /usr) in this field and uses it in printing error
messages on the console; (e.g. ``/usr: file system full'' rather than
``no space on dev 0/6'').
.BP inline.h
In order to reduce the number of conditional flags defined when compiling
the system, the conditional flag FASTVAX, which was always defined,
has been deleted. A conditional flag UNFAST, which is never defined, has been
added to take its converse's place.
.BP inode.h
The constant NINDEX has been reduced from 15 to 6. This limits the number
of files which may be join()'ed into a multiplexor (\fImpx\fR\|(2)) tree. You
may have to increase this if you use the multiplexor extensively, but it
saves a large amount of space in the kernel if you can use the smaller value,
since NINDEX of 15 causes 40 bytes of extra unused space to be allocated
to every inode.
.BP map.h
The \fBmalloc.c\fR routines have been rewritten to check for table
overflow and renamed \fBrmap.c\fR.
.BP mba.h
Is now split into \fBmbareg.h\fR and \fBmbavar.h\fR; the former contains
the definitions of device registers and is usable, e.g., in the standalone
version of the system. The latter contains system variables related to the
MASSBUS adapters.
.BP mem.h
Is a new file which contains information on the memory controller registers
in the form of macros which make the several VAX processors seem very
similar to the UNIX code. Note also that the system now uses interrupts
from the memory system to force error logging since the previous technique
(polling) works only on the 11/780.
.BP mscp.h
A new file which defines the DEC \fIMass Storage Control Protocol\fR
used by the UDA50 disk controller.
.BP mtpr.h
The register numbers are now given in hex, as in the DEC manuals; registers
for all VAX processors are included.
.BP msgbuf.h
Defines the structure of the error message buffer, which is now kept
in the last 1024 bytes of memory. This allows it to be preserved
across system crashes and lets messages such as machine check reports
be written conveniently into the error log.
.BP nexus.h
A new header file which defines the registers and constants related
to the interconnect architecture of VAXen.
.BP param.h
No longer defines the large number of constants related to system sizing;
a smaller number of rarely changed constants are given here. In particular,
constants which were typically changed to affect the maximum number
of supportable users are now controlled by the value given the \fBmaxusers\fR
keyword in the machine specification (as described in \fIconfig\fR\|(8)).
The \fIconfig\fR program turns this specification into parameters to the
\fBparam.c\fR file which uses formulae to compute the values for the
size of the process table, inode table, etc.
.IP
This file now includes the standard file <sys/types.h> to get system
types rather than replicating the definitions from that file. It also
defines a DELAY(n) macro which is used in device drivers to provide
roughly \fIn\fR microseconds of delay. Finally the definition of UPAGES,
the number of system control pages per-process has been increased from 6 to 8.
This is partially due to the fact that there is now a red-zone page between
the kernel stack and the kernel critical data in the \fIu.\fR area,
but also because the kernel stack
was precariously close to being too small before.
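.PP
The scaling of tables from a single \fBmaxusers\fR figure might look
something like the sketch below. The coefficients are illustrative
assumptions and are not taken from this document or from \fBparam.c\fR;
only the idea, that each table size is a formula in MAXUSERS evaluated at
compile time, is what the text describes. The variable \fInfile\fR is
likewise a hypothetical example of a further table size.

```c
/* MAXUSERS would be supplied by config from the "maxusers" keyword. */
#ifndef MAXUSERS
#define MAXUSERS 16
#endif

/* Illustrative scaling only; every coefficient here is an assumption. */
#define NPROC (20 + 8 * MAXUSERS)

int nproc  = NPROC;                                  /* process table */
int ninode = NPROC + 16 + MAXUSERS + 32;             /* inode table */
int nfile  = 8 * (NPROC + 16 + MAXUSERS) / 10 + 32;  /* open file table */
```

With these made-up formulae a 16 user system would get 148 process
slots, 212 inodes and 176 file table entries; changing the one figure in
the configuration file rescales all of them together.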
.BP pcb.h
Now includes definitions related to the use of AST's to implement user program
profiling and rescheduling. Because AST's are now used, it is no longer
necessary to take clock interrupts on the kernel stack; they now run on
the interrupt stack where they belong. Also rescheduling processing
is much cleaner, since the reschedule interrupts only go off when returning
to user mode, not in the kernel where they have to be ignored (because
UNIX cannot reschedule when running normally in the kernel.)
.BP proc.h
Now defines SOWEUPC, a new flag used to indicate that a profiling count
should be generated when the (already posted) AST for this process goes off.
Another new flag SSEQL indicates that the process has declared sequential
paging behavior for its data space. Finally the field \fBp_maxrss\fR
has been added, specifying the declared ``memoryuse'' limit in pages.
.BP psl.h
Has a bug fixed in the definitions of PSL_USERCLR. Now also declares
PSL_USERSET and PSL_MBZ (must-be-zero).
.BP system.h
Defines the variables \fIhz\fR, \fItimezone\fR and \fIdstflag\fR replacing
the old compile-time constants. No longer declares \fImsgbuf\fR as a variable
(see \fBmsgbuf.h\fR). Defines the \fIdumpdev\fR and \fIdumplo\fR variables
which specify where dumps are to take place. No longer defines the
debugging variables \fIprintsw\fR and \fIcoresw\fR which have been removed
in favor of more local debugging variables. No longer defines the field
\fIsy_nrarg\fR for the system call entry structures, since system calls
never take register arguments on the VAX.
.IP
A variable \fBwantin\fR has been added which is set each time a process
is woken up which wants to be swapped in. This is used so that
the code in \fBswapout\fR in \fBvmsched.c\fR does not run with elevated
priority.
.BP trap.h
Rearranges some codes previously used only internally so they would
be contiguous numerically. These are the finer machine traps which result
in SIGILL and are made available to a signal handling process and defined
in <signal.h>. Defines ASTFLT rather than RESCHED, since the VMS software
interrupt which is used for VMS rescheduling never was appropriate for UNIX
and is no longer used.
.BP uba.h
Has been split into \fBubareg.h\fR and \fBubavar.h\fR; see the description
of device driver changes below.
.BP user.h
Contains definitions related to the new \fB#!\fR exec facility. The field
\fIu.u_cfcode\fR has been renamed \fIu.u_code\fR since it is now used for
purposes other than compatibility mode (presenting the more precise
hardware reason for SIGILL and SIGFPE signals.)
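.PP
The syntax the kernel accepts for the \fB#!\fR line is not spelled out
here. As a rough illustration, the sketch below assumes the conventional
form: the two characters ``#!'', optional blanks, the path name of an
interpreter, and at most one argument. The limit SHSIZE and the function
name \fIparse_script_line\fR are assumptions.

```c
#include <string.h>

#define SHSIZE 32   /* assumed limit on each word of the "#!" line */

/*
 * Split the first line of a file into interpreter and optional argument.
 * Returns 1 if the line starts with "#!" and names an interpreter, else 0.
 */
int
parse_script_line(const char *line, char interp[SHSIZE], char arg[SHSIZE])
{
    size_t n;

    interp[0] = arg[0] = '\0';
    if (line[0] != '#' || line[1] != '!')
        return 0;
    line += 2;
    line += strspn(line, " \t");        /* skip blanks after "#!" */
    n = strcspn(line, " \t\n");         /* length of the interpreter name */
    if (n == 0 || n >= SHSIZE)
        return 0;
    memcpy(interp, line, n);
    interp[n] = '\0';
    line += n;
    line += strspn(line, " \t");
    n = strcspn(line, " \t\n");         /* length of the argument, if any */
    if (n >= SHSIZE)
        return 0;
    memcpy(arg, line, n);
    arg[n] = '\0';
    return 1;
}
```

Given the line ``#!/bin/csh -f'' the sketch yields the interpreter /bin/csh
and the argument -f; a file which does not begin with ``#!'' is rejected and
would be handled by the ordinary exec path.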
.BP vlimit.h
Now defines LIM_MAXRSS for the ``limit memoryuse'' feature.
.BP vm.h
The \fBvm*.h\fR headers have been compressed into a more sensible set
of files; the macros are all in \fBvmmac.h\fR (absorbing \fBvmclust.h\fR and
\fBvmklust.h\fR), metering stuff is all in \fBvmmeter.h\fR (absorbing
\fBvmmon.h\fR and \fBvmtotal.h\fR) and the parameters are all in
\fBvmparam.h\fR (absorbing \fBvmtune.h\fR, most of the parameters
of which are now adjusted at boot time in \fIsetupclock\fR in \fBvmsched.c\fR.)
.BP vmmeter.h
The structure \fBvmmeter\fR now computes the number of pages paged in
\fIv_pgpgin\fR and pages paged out \fIv_pgpgout\fR, as well as the
number of pages freed because of the behavior of programs which have
told the system they are sequential \fIv_seqfree\fR.
.BP vmparam.h
The values of MAXDSIZ and MAXSSIZ have increased due to the increase to
DMMIN in \fBdmap.h\fR.
The klustering constants have been changed: in-clustering is now in 4 page
(4k byte) chunks, and out-clustering is up to 16k bytes. Sequential programs
kluster in 8k bytes, and text segments kluster in 2k bytes. The gap
for the window into sequential programs is currently (primitively) defined
as a constant here in KLSDIST.
.BP vmsystm.h
Defines a new variable \fIavefree30\fR, which computes the average memory
like \fIavefree\fR, but averaged over a longer period of time. This
is used to put more hysteresis into swapping, and keep the system
from swapping immediately when memory drops low.
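.PP
The effect of the longer averaging period can be seen in a small sketch.
The smoothing macro and the periods of 5 and 30 samples are assumptions
chosen for illustration (the name \fIavefree30\fR merely suggests the
latter); \fIsample_free\fR is a hypothetical routine called once per
sampling interval.

```c
/* Running average of cnt into smooth over roughly `time' samples. */
#define ave(smooth, cnt, time) \
    ((smooth) = (((time) - 1) * (smooth) + (cnt)) / (time))

int avefree;    /* short-term average of free memory */
int avefree30;  /* same quantity averaged over a longer period */

/* Fold one sample of the free page count into both averages. */
void
sample_free(int freemem)
{
    ave(avefree, freemem, 5);      /* assumed short period */
    ave(avefree30, freemem, 30);   /* assumed long period */
}
```

If both averages stand at 1000 pages and one sample of 0 arrives, the
short average falls to 800 while the long one only falls to 966, so a
decision based on the long average reacts to sustained shortage rather
than to a momentary dip.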
.SH
System files: sys/sys
.PP
A number of files in the system have had minor changes made to them
to reduce the length of time the system runs with the interrupt
priority level raised; in particular, the times when the IPL is high
enough to block the clock have been severely limited, in hopes
of providing better real-time response (eventually) and possibly
being able to drive the 11/750 console cassette (soon) which has
severe interrupt latency constraints due to poor hardware interface design.
.BP acct.c
The code was tightened by using a register variable.
The \fIsysphys\fR routine was moved to \fBsys4.c\fR since
it had no business being here.
.BP alloc.c
Prints error messages relating to file system problems
using the name of the file system rather than the major/minor
device number of the device. Some code which attempted to prevent
``dups in free'' after a reboot, but could not prevent this completely,
has just been removed; the condition is not harmful in any case, as it
is normal and fixed by \fIfsck\fR\|(8). The system now prints directly
on a user's terminal if that user causes a file system to run out of free
space. The routines here also know how to deal with the fact that the
super-blocks are now kept locked in the buffer cache.
.BP asm.sed
No longer defines \fIspl1\fR which is now defunct; \fBspl7\fR is now
VAX IPL 0x1f rather than 0x18, blocking most processor aborts,
device interrupts from the console storage device, and a number
of other processor dependent interrupts. Deals with
a strange feature of the optimizer which converts ``$0'' into a register
which contains 0. Implements the \fIffs\fR routine of \fBsig.c\fR
in a much more efficient way (in just a couple of VAX instructions.)
Beware, however: UNIX's notion of \fIffs\fR returns 1 for the low
bit of a word, while the hardware \fIffs\fR would return 0.
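.PP
The difference in origin is easy to get wrong, so a portable C rendering
of the two conventions may help. This is not the inline expansion produced
by \fBasm.sed\fR; the function names are hypothetical and the zero-origin
routine only imitates the numbering the hardware uses.

```c
/*
 * UNIX convention: bits are numbered from 1, and 0 means that no bit
 * is set in the word.
 */
int
unix_ffs(unsigned long mask)
{
    int bit;

    if (mask == 0)
        return 0;
    for (bit = 1; (mask & 1) == 0; bit++)
        mask >>= 1;
    return bit;
}

/*
 * Zero-origin numbering, as the hardware reports it for a word with
 * some bit set; this sketch returns -1 for an empty word.
 */
int
zero_origin_ffs(unsigned long mask)
{
    return unix_ffs(mask) - 1;
}
```

Code which uses the result as a signal number therefore wants the first
form, while code which uses it as a shift count wants the second.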
.BP clock.c
Now runs only that code which is absolutely necessary when the processor
priority is very high, queueing a software interrupt at which priority
the rest of the clock processing is done. The conditional
(old and long unused) code which profiled the kernel in a static
buffer has been removed. The option of fishing characters out of the
\fIdz\fR and \fIdh\fR silo's less often than every clock tick (1/hz)
has been removed. Instead the silos are processed every clock tick if
the system includes the berknet (bk) line discipline, or not at all
(i.e. we take input interrupts) if ``bk'' is not included in the system.
.IP
The processing and watching of hung UNIBUS adapters has been moved from
here to the UNIBUS routines. Automatic niceing of long-running (more than
10 minutes of user-state time) processes is now the default here, rather
than being based on ``#if ERNIE''. A bug in the check for timeout table
overflow which would cause the table to overrun without overflow
being detected has been fixed. The timeout table is now implemented
as a linked list, so that the entries can be conveniently discarded
before calling the timeout routines. This prevents the anomalous
case where only half the entries are used but the table fills up.
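.PP
A linked timeout table of this kind might be organized as in the sketch
below: unused entries sit on a free list, pending entries are kept in
order of expiry with each holding the number of ticks after its
predecessor, and an entry is returned to the free list before its routine
is called. All names here are assumptions, and \fIrecord\fR exists only so
that the order of calls can be observed.

```c
#include <stddef.h>

#define NCALLOUT 4      /* deliberately small, to show the full-table case */

struct callout {
    int c_time;                 /* ticks after the previous pending entry */
    void (*c_func)(int);
    int c_arg;
    struct callout *c_next;
};

static struct callout callout[NCALLOUT];
static struct callout *callfree;    /* list of unused entries */
static struct callout calltodo;     /* dummy head of the pending list */

void
callout_init(void)
{
    int i;

    callfree = NULL;
    calltodo.c_next = NULL;
    for (i = NCALLOUT - 1; i >= 0; i--) {
        callout[i].c_next = callfree;
        callfree = &callout[i];
    }
}

/* Arrange for func(arg) after t ticks; returns -1 if the table is full. */
int
timeout_add(void (*func)(int), int arg, int t)
{
    struct callout *prev, *c;

    if ((c = callfree) == NULL)
        return -1;
    callfree = c->c_next;
    if (t <= 0)
        t = 1;
    /* Walk the list, converting t into a delay relative to prev. */
    for (prev = &calltodo;
         prev->c_next != NULL && prev->c_next->c_time <= t;
         prev = prev->c_next)
        t -= prev->c_next->c_time;
    c->c_time = t;
    c->c_func = func;
    c->c_arg = arg;
    c->c_next = prev->c_next;
    prev->c_next = c;
    if (c->c_next != NULL)
        c->c_next->c_time -= t;
    return 0;
}

/* One clock tick: free each expired entry, then call its routine. */
void
clock_tick(void)
{
    struct callout *c;

    if (calltodo.c_next != NULL)
        calltodo.c_next->c_time--;
    while ((c = calltodo.c_next) != NULL && c->c_time <= 0) {
        void (*func)(int) = c->c_func;
        int arg = c->c_arg;

        calltodo.c_next = c->c_next;
        c->c_next = callfree;       /* entry is reusable before the call */
        callfree = c;
        func(arg);
    }
}

/* Trivial routine used only to observe the order of calls. */
static int fired[8];
static int nfired;

void
record(int arg)
{
    if (nfired < 8)
        fired[nfired++] = arg;
}
```

Since every entry not on the pending list is on the free list, a request
fails only when all NCALLOUT entries are genuinely in use, and a timeout
routine may itself schedule a new timeout using the entry just released.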
.BP fio.c
Has been changed to do the correct thing when special files or mounted
file systems are closed: a flush is done at the last close and all blocks
are invalidated. The standard ``table full'' routines are called when
the file related tables fill up. These routines no longer pass
\fBstruct chan *\fR pointers down to called routines, passing, instead,
the more universal \fBstruct file *\fR pointers from which the \fBchan\fR
pointers are easily derived.
.BP iget.c
Now uses the standard \fItablefull\fR routine.
.BP ioctl.c
The last argument to \fId_ioctl\fR routines when called is now always 0.
.BP locore.s
Has been extensively changed to accommodate the new configurable system,
and to handle multiple UNIBUS and MASSBUS adapters. The code is now
written using macros and the C preprocessor, improving readability.
Complicated logic (such as the code to handle UNIBUS adapter errors)
has been migrated to C code.
.IP
MASSBUS and UNIBUS adapters are no longer initialized or mapped here;
this is the job of the configuration code in the system.
The locore code distinguishes, in handling UNIBUS interrupts, between
the machine being \fIcold\fR and not; when cold, UNIBUS interrupts are
handled so as to be suitable for determining device vectoring via probing.
Device interrupts on the UNIBUS are now vectored through the code in
a file \fBubglue.s\fR produced by the configuration program. To mask
as much as possible the differences between the different VAX processors,
the 11/780 uses the same \fBubglue.s\fR as the other processors which
directly vector UNIBUS interrupts.
.IP
Many more of the exceptional conditions
in the machine are caught now; only ``SBI alert'' and ``SBI fault'' remain
uncaught by UNIX. The system control block is now defined in a file
\fBscb.s\fR so that some symbols derived from C language header files by
a program (and printed into a format suitable for inclusion in an assembly)
may be stuck in after the system control block and before the mainline
\fBlocore.s\fR.
.IP
The primitive routines \fIcopyseg\fR and \fIclearseg\fR are no longer
run with the IPL raised very high.
Further minor bugs have been fixed in the primitives, notably
\fIaddupc\fR (a bug which caused 1/8 of the profiling ticks to be lost),
and \fIkernacc\fR (a bug which allowed a strange command to a certain
program to crash the system).
.BP machdep.c
.br
Now sets up the error message buffer (in the last 1024 bytes of core)
and the system data structures (such as the file and process table)
at boot time. Currently only the file system buffer cache is sized
at boot time, but all data structures are easily sized here. The startup
routine also calls the routine \fIconfigure\fR to configure the system
for the current hardware, locating available devices.
.IP
The \fIsendsig\fR routine passes a code back when a SIGFPE or SIGILL arrives,
letting the signal handler determine which of the several conditions mapped
to these two signals actually occurred. It uses the <frame.h> header file
rather than redefining it.
.IP
The routines which monitor memory errors are now driven by interrupts
(since the previous polling technique works only on 11/780). Extensive
use of macros is made to make the various VAXen look similar.
Instead of printing the raw contents of the memory controller registers,
an array address and a syndrome are printed. Multiple memory controllers
are supported.
.IP
The routines related to UNIBUS monitoring have been put with the rest of the
UNIBUS routines in \fB../dev/uba.c\fR. The reboot interface has been
improved, adding an automatic crash dump to a dump device (normally
a disk aimed at the back end of a paging area). The system no longer
``halts'' when you ask it to (since this can cause a reboot to occur);
rather it raises the IPL as high as it can and goes into a tight loop.
Routines have been added to handle machine checks and print out the
stack frame in a format which is readable by one who groks what
the fields mean.
.BP main.c
Now establishes a red zone between the stack and \fIu.\fR area in process 0;
further processes also have red zones, protecting the \fIu.\fR from too-large
stacks. The main routines also set up the super-blocks which are locked
565into the file system buffer cache, and copy the name of the root file system
566(/) into its super-block so that the name will be available if, e.g., the
567root file system becomes full.
568.BP malloc.c
569Has been renamed \fBrmap.c\fR.
570.BP nami.c
571Now respects the notion of \fB..\fR in a directory which is a virtual root
572directory after a \fIchroot\fR\|(2) call.
573.BP prf.c
574No longer implements the ascii in-core event tracing facility,
575which proved to be too slow to
576be useful; a binary facility replaces it, and is also conditionalized
577on TRACE, but implemented in \fBvmmon.c\fR.
578Implements the output of numbers non-recursively, since
579the recursive method occasionally caused the kernel stacks to overflow.
580Implements a new kernel routine \fIuprintf\fR which prints directly on a user's
581terminal for informing him/her of situations such as file systems which
582are full (because his/her program wrote to the file system when it was full.)
583Implements a new format ``%b'' which takes two arguments, a number and
584a second pattern. The pattern specifies a base to print the number in,
585and then a set of short strings separated by bit numbers (origin 1, escaped
586in octal into the string in the C compiler). The format prints the
587symbolic names for the bits which are in the string and set in the number
588within <>'s and separated by commas. This is extensively used to produce
589readable system error diagnostic messages on the console, decoding
590the bits of device registers symbolically.
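.IP
As a rough illustration of how such a pattern can be interpreted,
here is a user-level sketch in C. It is not the kernel's code:
the names \fIformat_bits\fR and \fIput\fR are invented for this example,
the result goes to a buffer rather than to the console, and only the
behaviour described above is modelled. The number is converted
without recursion, in the spirit of the change noted earlier.
.DS
```c
#include <assert.h>
#include <stddef.h>

/* Append one character to out[], keeping room for the terminating NUL. */
static size_t put(char *out, size_t pos, size_t cap, char c)
{
    if (pos + 1 < cap)
        out[pos++] = c;
    return pos;
}

/*
 * Hypothetical user-level model of the "%b" conversion.
 * pattern[0] is the base (8, 10 or 16); after it come groups made of
 * one byte holding a bit number (origin 1) followed by that bit's name.
 */
size_t format_bits(char *out, size_t cap, unsigned long value,
                   const char *pattern)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[8 * sizeof(unsigned long)];
    size_t pos = 0, n = 0;
    unsigned base = (unsigned char)*pattern++;
    unsigned long v = value;
    int any = 0;

    assert(cap > 0 && (base == 8 || base == 10 || base == 16));

    /* Non-recursive number output: collect digits, then emit in reverse. */
    do {
        tmp[n++] = digits[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0)
        pos = put(out, pos, cap, tmp[--n]);

    /* Walk the groups; a byte of 32 or less starts a new group. */
    while (*pattern != 0) {
        unsigned bit = (unsigned char)*pattern++;
        int set;

        assert(bit >= 1 && bit <= 32);
        set = (value & (1UL << (bit - 1))) != 0;
        if (set)
            pos = put(out, pos, cap, any ? ',' : '<');
        while ((unsigned char)*pattern > 32) {
            if (set)
                pos = put(out, pos, cap, *pattern);
            pattern++;
        }
        any |= set;
    }
    if (any)
        pos = put(out, pos, cap, '>');
    out[pos] = 0;
    return pos;
}
```
.DE
With a pattern whose bytes are 16 (the base), then 1 followed by GO,
7 followed by IE and 8 followed by RDY, the value 0xc1 is rendered by
this sketch as ``c1<GO,IE,RDY>''. In a kernel source file such a pattern
would be written as a string literal with the base and bit numbers
escaped in octal.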
591.IP
592The routines \fIprdev\fR and \fIdeverr\fR, which printed diagnostics
which were difficult to interpret, are deleted. There are two new
594routines: \fItablefull\fR which balks that a table is full, and
595\fIharderr\fR which begins a message about a hard (unrecoverable)
596error on a device.
597.BP prim.c
598Now maintains a count of free \fIclist\fR space.
599The code here now runs at \fBspl5\fR rather than \fBspl6\fR since
600there is no longer any need to block the clock.
601.BP rdwri.c
Reflects the change of the FASTVAX conditional to ``not UNFAST''. Also always clears the set-user-id
603bit when a file with the bit set is written on; previously this
604was done only ``#if UCB''. If you ``#define INSECURITY'' you get the
605code the old way.
606.BP scb.s
607Is a new file defining the system control block (as described above).
608.BP sig.c
609Has a bug fixed which caused processes to occasionally stop when the
610shell thought they were running.
611Processes are now given signals immediately when they are sent if the
612process is running.
613.BP slp.c
614A clumsiness which forced the swapout code to run with the IPL raised
615has been fixed by adding a variable \fIwantin\fR with which \fIwakeup\fR
616can communicate to the swapper that a swapped out process now wants
617to return to memory.
618The routine \fIsetpri\fR has been modified so that
619processes which are over their declared (soft) memory size limitation
620are assigned lower CPU priority when the system is very tight on memory.
621.BP sys.c
622No longer allows detached jobs to access /dev/tty; this was a security
623glitch.
624.BP sys1.c
625Implements the ``#!'' executable shell script scheme. No longer lets
626executable files be read by users using \fIptrace\fR unless the user
627has read access.
628Operates \fIexec\fR much more efficiently by avoiding copying argument
629lists unless the \fIexec\fR is going to succeed.
630.BP sys2.c
631The \fIopeni\fR routine passes both the FREAD and FWRITE flags to its
632callees; this is needed by the magnetic tape open routines.
633The \fImaknode\fR routine sticks the whole argument value in the ``rdev''
634field of the inode. This is used by the \fIbadblock\fR\|(8) program
635to store block numbers corresponding to bad sectors on the disk in otherwise
636apparently empty files.
637.BP sys3.c
638The \fImount\fR and \fIumount\fR calls have been changed to deal correctly
with buffer flushing and with simultaneous access by other programs to
the file system block devices. The \fImount\fR call also copies into the
super-block of the file system the name of the directory on which the file
642system is mounted (e.g. /usr).
643.BP sys4.c
644The \fIsyslock\fR routine has been moved here from \fBacct.c\fR.
The mechanism for sending signals to all processes, which is used in
646shutting down the system, has been changed so that the process which is
647broadcasting the signal does not receive it itself. This allows the
648\fIhalt\fR and \fIshutdown\fR programs to be written in a straightforward way.
649.BP trap.c
650Prints out the \fBpc\fR when an unexpected trap occurs.
651Handles AST's to implement profiling ticks and for rescheduling rather
652than the (older style) use of reschedule interrupts.
653Allows process reschedules after page faults.
654.BP vmmon.c
655Contains the internal routine \fBtrace1\fR which implements kernel
656event tracing in a circular buffer.
657.BP vmpage.c
658Tracing code, which is normally not compiled in, has been added.
An extra case was added to an \fIif\fR statement to allow implementation
660of the vlimit(LIM_MAXRSS) feature of the system, for limiting processes
661which consume more than a process specific amount of physical memory.
662A botch was fixed in the virtual memory pre-paging which put pre-paged
663pages in the clock loop rather than at the end of the free list.
Code has been added to implement an additional replacement algorithm
665for processes which are declared sequential: when a hard page fault
666occurs, the pages sequentially preceding the faulted page are returned
667to the free list.
668.BP vmproc.c
669Contains a number of small changes related to AST processing.
670.BP vmpt.c
671Also contains changes for handling AST's as well as the initialization
672of the red-zone separating the stack and \fBu.\fR area of newly created
673processes.
674A bug in translation buffer flushing which caused rare and mysterious
675kernel crashes with the kernel stack not valid has been fixed.
676.BP vmsched.c
677Code has been added here which initializes the parameters of the clock
678page replacement algorithm based on the size of the machine. The \fIswapout\fR
679routine has been changed so that it no longer runs entirely at a high
680interrupt priority level (see \fBslp.c\fR above). The algorithm for
681the choice of processes to swap in and out and the hysteresis in the
682swap algorithm has been adjusted to work reasonably in extreme conditions
when there are very large and/or very few processes active in the system.
684.BP vmsubr.c
685Contains the \fIsetredzone\fR routine definition.
686.BP vmsys.c
687Contains the user interface to the kernel tracing routines.
Code has been added to \fIvadvise\fR to set up VA_SEQL.
689.SH
690Device support: sys/dev
691.PP
692The major change to the device subsystem is the support of multiple
693MASSBUS and UNIBUS adapters, the support for multiple instances of each
694particular controller, and the support of system configuration at
695bootstrap time, investigating the interconnects, devices, and
696controllers available on the machine.
697These changes will be discussed in detail in the next section,
698which describes how to change existing drivers to work in the new system
699and gives pointers on style for writing new drivers.
700.PP
701Other changes in the device drivers affecting more than one driver:
702.IP *
703The input silos for DH-11's and DZ-11's are no longer serviced at clock IPL.
Rather the clock interrupt queues a software interrupt to
705service the silos. This means that the device interrupt routines
706are called from IPL 0x15, the IPL at which they normally interrupt. Thus
707it is no longer necessary to define \fBspl5\fR to be \fBspl6\fR (blocking
708the clock) in routines which handle asynchronous line input/output.
709.IP *
710The internal interface to the line discipline routines has been changed
711slightly by reordering parameters to make the arguments to the various
\fIioctl\fR interfaces more similar; in particular the \fIttioctl\fR routine
713call has been changed. If you have locally written line disciplines
714or asynchronous device drivers you should check the interfaces.
715.IP *
716The tty interface now provides full 8-bit output when the terminal is
717in LLITOUT mode; this requires support from the \fIxx\fR\|param routines
718in the device drivers (e.g. from \fIdhparam\fR and \fIdzparam\fR.)
719.IP *
720The UNIBUS adapter support routines have changed substantially, to allow
721for queueing of requests when resources are short and for support of
722multiple UNIBUS adapters. The interface now also allows devices which
723cannot function when other DMA is active on the UNIBUS to obtain
724exclusive transient use of UNIBUS resources; this is needed to successfully
725run RK07 disk controllers in the presence of other buffered data path DMA.
726In addition, it is used by 6250bpi tape drives supported on the UNIBUS.
727See the section on configuration and UNIBUS device drivers below
728for more information.
729.IP *
730DEC standard bad sector forwarding is provided for all standard DEC
731devices using the DEC formatters; the code which implements this is
732easily ported to the storage module drivers in the system, and this
733is planned soon.*
734.FS
735* The hard thing in providing bad sector handling for non-DEC drives
736is providing a formatter which produces the bad block information and
737flags the bad sectors appropriately.
738.FE
739.BP bio.c
740The hashing of buffers has been changed to use the existing device
741chain two way links. This means that unhashing is much easier, saves
742space, and uses the pointers which were otherwise little used.
743The buffers are now kept on one of three lists when not busy: a list
744of super-blocks which are locked in core, a list of good data blocks,
745which is kept fifo and used to implement the LRU buffer cache, and a list of
746data blocks for which further usage is not anticipated; this is also
747kept fifo.
748.IP
749Calls to some new tracing routines are conditionally included in \fBbio.c\fR;
750we are using them to do some performance measurement. The \fBd_tab\fR
751field of the block device table has been changed to a \fBd_flags\fR field,
752and that change is known here, where old field was checked before (to
753see if it was non-zero). Better messages are printed now when swap
754space is exhausted, and a user is told on his/her terminal that a process
755was killed before it started because there was no space.
756A subroutine has been added to purge the blocks from a specific device
757from the cache; this is used to fix some long standing buffer cache
758flushing problems which prevented removable media from being used
759reliably.
760.BP bk.c
761The definition of \fBspl5\fR as \fBspl6\fR has been removed from here.
762The line discipline is included only if the specification
763.DS
764pseudo-device bk
765.DE
766is included in the system configuration.
767The input silos on \fBdh\fR and \fBdz\fR devices are used only when
768this line discipline is included in the system.
769The comment about future implementation of 8-bit paths with this
770discipline has been deleted, since there is no longer any intention of
771doing this.
772.BP conf.c
773Has been moved to this directory from the directory \fB../conf\fR.
774\fBThis file should be changed only if you are adding support for
775a device not included on the standard distribution tape.\fR
776.BP ct.c
777Is a new driver, for a C/A/T phototypesetter interface.
778.BP dh.c
779No longer has to define \fBspl5\fR to be \fBspl6\fR.
780Incorporates the DM-11 driver standardly. A method is provided
781for specifying that lines are to be operated even though the hardware
782does not indicate that they are ready (using the flags word in the
configuration specification, see \fIdh\fR\|(4)). A reasonable message
784is printed when a \fBdh\fR silo overflows, replacing the old style of
785just printing a sequence of letter \fBo\fR's on the console.
786.BP dhfdm.c
787Has been incorporated into \fBdh.c\fR.
788.BP dn.c
789A driver for the DEC DN-11 autodialer interface.
790.BP dsort.c
791Has been rewritten to correct a bug which caused the elevators
792to be sorted incorrectly.
793.BP dz.c
No longer has to define \fBspl5\fR to be \fBspl6\fR.
795Has been changed to allow lines to be specified as not properly wired
796and brought up without the ready signals showing in the interface;
797see \fIdz\fR\|(4) for details. Prints reasonable diagnostics when
798the input silo overflows.
799.BP flp.c
800Knows that there is no floppy on an 11/750.
801.BP hp.c
802Is now a sub-driver to \fBmba.c\fR, which probes nexus space for the MASSBUS
803adaptors and device space on the MASSBUS's for disks, setting up the
804driver for each device which is in the configuration.
A number of minor bug fixes and enhancements have been made to the driver:
806The driver handles the new RM80 drive
807and its SSE (skip-sector-error) facility for bad sector handling,
808as well as the DEC standard bad block forwarding.
809Due to the bad block forwarding, the last three tracks of each disk
810are normally reserved to the system and available only through the
811use of a special file system partition.
812A further bug has been fixed in the initialization of the tables for RM05
813sectoring.
814The driver no longer (baroquely) turns on and off interrupts
815on the MASSBUS adapter. Basic dual-port drive handling code has
816been added to the driver.
817.IP
818The remaining remarks apply to all three supported disk drivers:
819the \fBhp\fR driver for MASSBUS disks, the \fBup\fR driver for
820UNIBUS storage modules, and the \fBhk\fR driver for RK07's:
821The drivers do
822not SEARCH or SEEK if there is only one drive on the MASSBUS. On a UNIBUS
823no SEARCHing or SEEKing is done if one drive is on the controller.
824The offset positions and recalibration of error recovery
825is now done with interrupts rather than by waiting for the operations
826to complete. This prevents the system from being tied up during the
827many recoveries of a disk operation, and is necessary in any case
828in at least one of the disk drivers (RK07).
829The iostat numbers for each MASSBUS and UNIBUS drive are calculated
830by the auto-configurator at boot time, not compiled into the drivers.
831Much cleaner handling of errors is done: the drivers realize which
832errors are not even potentially recoverable, handle drives spinning
833up and down with readable diagnostics, and print reasonable, legible
834error messages when hard errors and soft ecc's occur.
835Each driver includes a low-level non-interrupt driver used to take
836crash dumps at the end of a paging area on the device.
837The drivers include a raw i/o buffer per drive so that raw operations
838on separate devices can be overlapped (both seeks and transfers); previously
839only one raw device operation could be pending per device type.
840.BP ht.c
The tape driver is now a sub-driver of the MASSBUS driver. The following
842remarks apply to all supported tape drivers: \fBht\fR and \fBmt\fR
843for MASSBUS
844. \"tapes, \fBts\fR for the UNIBUS ts-11, \fBtm\fR for the UNIBUS
845. \"TM-11 emulations, and \fBut\fR for UNIBUS TU45 emulations.
846tapes, \fBts\fR for the UNIBUS ts-11, and \fBtm\fR for the UNIBUS
847TM-11 emulations.
848.IP
849Each driver implements a set of tape ioctl operations on raw tapes providing
850access to the functionality of the hardware such as skipping forward
851and backward records and files and writing end-of-file marks on the tape.
852Better error diagnostics are also given on tape errors.
853Multiple tape controllers and transports are supported.
854A dump routine is provided with each driver for taking a
855post-mortem crash dump on tape, although dumps are normally made
856to the paging area on the disk.
857.IP
858With the exception of the \fBts\fP driver,
859the drivers detect and reject
860attempts to switch tape density while writing a tape.
861.BP lp.c
862Is a fully supported driver for one or more line printer interfaces.
863It has been improved from the previous drivers (which were not supported)
864to take a small fraction of the number of interrupts that the previous
865drivers took. The user-level code driving the printers has been arranged
866to work on 1200 baud DECWRITER III terminals or true printers.
867.BP mba.c
868Has been rewritten. Now allows mixing of disks and tapes on the same
869and across multiple mba's, with the devices being driven from the routines
870here calling routines defined in the individual device drivers.
871.BP mem.c
872Has been fixed to not allow any access to nexus space, even by the
873super-users, since such access inevitably results in a machine check
874and a system crash.
875.BP mt.c
876A driver for the DEC TU78 tape drive.
877.BP mx?.c
A bug caused by a missing call to \fIchdrain\fR, which
caused multiplexor files to become clogged under certain circumstances,
has been fixed.
880.BP rk.c
881Is a new driver for RK07 disks. It uses the same logic as the storage
882module drive driver \fBup.c\fR whenever possible. It also makes use
883of the interlocking facilities of the UNIBUS device support because the
884\fBrk\fR controller cannot tolerate concurrent UNIBUS dma when it is operating
885due to a design flaw.
886.BP swap.c
887Now places only half of the first piece of the \fIswapmap\fR in the
888\fIargmap\fR.
889.BP swap??*.c
890Are the files for different swap configurations. Thus \fBswaphp.c\fR
891defines the root and swap devices for a UNIX based on a \fBhp\fR disk.
892The files such as \fBswaphphp.c\fR are for interleaved paging configurations,
893placing the swapping and paging activity on two disk arms. You can
894make additional such files and include them in your configuration
895files.
896.BP tdump.c
897Has been deleted, replaced by the dump routines in individual drivers.
898.BP tm.c
899Is a driver for UNIBUS tape drives on controllers such as the EMULEX TC-11.
900It has the same functionality as \fBht\fR (see \fBht.c\fR above.)
901.BP ts.c
902Is a driver for the UNIBUS TS-11 tape drive. It has full functionality
903except the transport itself only supports 1600 bpi.
904.BP tty.c
905No longer raises its IPL to \fBspl6\fR internally to block the clock.
906Has its internal interface to \fBioctl\fR entries changed slightly to
907be globally consistent (see, e.g. \fIttioctl\fR). The DIOC* ioctl
908entries have been deleted since they are not used in any standard
909UNIX line disciplines.
910.BP ttynew.c
911A bug is fixed which prevented echoing from occurring in raw mode.
912The dec-compatible method of ^S/^Q processing needed to support VT-100s
913in smooth scroll mode is implemented when the local mode ``decctlq''
914is specified.
915.BP ttyold.c
916Implements ``decctlq'' mode.
917.BP tu.c
918A driver for the 11/750 TU58 console cassette interface. \fBNote:
919this driver provides reliable service only on a quiescent system.\fP
920.BP uba.c
921Has a much more structured interface. All the basic routines for dealing
922with the UNIBUS specify a UNIBUS adapter number to use, since there
923are potentially several on a machine. When requesting allocation
924of UNIBUS map entries, the caller specifies whether he is willing
925to block in the allocation routines waiting for resources to come
926available. If he is not, and there are no resources available, a value
927of 0 is returned, and the caller must deal with this.
928The routine which frees UNIBUS resources now takes the address of the
929variable describing the resources to be freed rather than the value
930of this variable to eliminate a race condition (where the routine is
931called, a UNIBUS interrupt occurs causing a UNIBUS reset, and the
932resources are freed twice, causing a \fIpanic\fR\|).
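.IP
The following sketch models why passing the address closes the race.
It is a user-level illustration with invented names
(\fIrelease\fR, \fIreally_free\fR, \fIfree_count\fR); in the kernel
the read and the clearing of the variable would be done with the
interrupt priority raised, which is only indicated by comments here.
.DS
```c
#include <assert.h>

static int free_count;      /* how many times resources were really freed */

/* Stand-in for the code that returns map registers to the free pool. */
static void really_free(int info)
{
    assert(info != 0);
    free_count++;
}

/*
 * Hypothetical model of a release routine that takes the ADDRESS of
 * the variable describing the allocation.  The variable is read and
 * zeroed together, so a second caller finds 0 and does nothing.
 */
void release(int *infop)
{
    int info = *infop;      /* in the kernel: raise the IPL before this */

    *infop = 0;             /* in the kernel: lower the IPL after this */
    if (info == 0)
        return;             /* already released, e.g. by a bus reset */
    really_free(info);
}
```
.DE
Calling \fIrelease\fR twice on the same variable frees the resources
once; a routine that was handed the value instead would have no way
to notice that the first release had already happened.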
933.IP
934The normal interface for DMA operation is now to pass a pointer to
935a UNIBUS related structure to a routine \fIubago\fR, which allocates
936UNIBUS resources. If resources are not available, the structure
937is queued on a request queue, and processed when resources are available.
938When the requested resources are allocated, a driver specific \fIxxgo\fR
939routine is called, and can stuff the device registers with the
940address into which the operation is mapped and start the operation.
941The use of this interface is described in the next section.
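.IP
A minimal model of this queueing, written as ordinary C with invented
names (\fIadapter_go\fR, \fIadapter_done\fR, \fIdemo_go\fR) and with
map registers as the only resource, might look as follows.
It illustrates only the order of events, not the actual kernel
data structures.
.DS
```c
#include <assert.h>
#include <stddef.h>

struct request {
    int need;                       /* map registers wanted */
    void (*go)(struct request *);   /* the driver's "xxgo" routine */
    struct request *next;           /* link on the wait queue */
    int started;                    /* set by the demo go routine */
};

struct adapter {
    int nfree;                      /* map registers not in use */
    struct request *head, *tail;    /* requests waiting for resources */
};

/* Try to start a request; queue it when resources are short. */
int adapter_go(struct adapter *ua, struct request *rq)
{
    assert(rq->need > 0);
    if (ua->head == NULL && rq->need <= ua->nfree) {
        ua->nfree -= rq->need;
        rq->go(rq);                 /* resources granted: start the device */
        return 1;
    }
    rq->next = NULL;
    if (ua->tail != NULL)
        ua->tail->next = rq;
    else
        ua->head = rq;
    ua->tail = rq;
    return 0;
}

/* Give resources back, then start as many queued requests as now fit. */
void adapter_done(struct adapter *ua, struct request *rq)
{
    ua->nfree += rq->need;
    while (ua->head != NULL && ua->head->need <= ua->nfree) {
        struct request *first = ua->head;

        ua->head = first->next;
        if (ua->head == NULL)
            ua->tail = NULL;
        ua->nfree -= first->need;
        first->go(first);
    }
}

/* Demo go routine: a real one would load the device registers. */
void demo_go(struct request *rq)
{
    rq->started = 1;
}
```
.DE
In this model a request that does not fit is simply left on the queue,
and its go routine runs later from \fIadapter_done\fR when an earlier
transfer gives its resources back.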
942.IP
943Finally, we note that the error handling code which was written
944in assembly language is now written in C.
945.BP uda.c
946A driver for the UDA50 disk controller with RA80 Winchester
947storage modules.
948.BP up.c
949The UNIBUS storage module disk driver has been fixed up in the same
950way that the \fBhp\fR driver was, giving better error diagnostics
951and using interrupts during error recovery, etc. See \fBhp.c\fR
952above for details. The driver uses a feature of the EMULEX SC-21
953to determine the size of the disks in use, so that it can adapt
954to both 300M storage modules and the Fujitsu 160M drives which are
955popular. Other drive sizes can be added easily.
956. \".BP ut.c
957. \"A driver for the System Industries Model 9700 tape drive, emulating
958. \"a DEC TU45 on the UNIBUS.
959.BP va.c
960The \fBvarian\fR printer-plotter driver has been modified so that
961it can support more than one device, probes the devices so they
can be placed on different UNIBUS'es, and prints an error diagnostic
963when device errors are detected.
964.BP vaxcpu.c
965Is a new file which contains initializations of various CPU-type
966dependent structures.
967.BP vp.c
968Has been modified to handle multiple devices, and adapted to the
969auto-configuration code.
970.SH
971Configuration and UNIBUS device drivers
972.PP
973Someday this section will be a separate document.
974This section explains how to interface an existing UNIX
975device driver to the VAX system, especially to the UNIBUS
976routines and the autoconfiguration code.
977.PP
978A PDP-11, UNIX/32V or 3BSD or 4.0BSD
979driver on the VAX UNIBUS will need to be modified
980to run under 4.1BSD. There are three reasons why such a driver
981will need to be changed:
982.IP 1)
9834.1bsd supports multiple UNIBUS adapters.
984.IP 2)
9854.1bsd supports system configuration at boot time.
986.IP 3)
9874.1bsd manages the UNIBUS resources and does not crash when
988resources are not available; the resource allocation protocol must
989be honored. In addition, devices such as the RK07 which require
990everyone else to get off the UNIBUS when they are running
991need cooperation from other DMA devices if they are to work.
992.PP
993Each UNIBUS on a VAX has a set of resources:
994.IP *
995496 map registers which are used to convert from the 18 bit UNIBUS
996addresses into the much larger VAX address space.
997.IP *
998Some number of buffered data paths (3 on an 11/750, 15 on an 11/780)
999which are used by high speed devices to transfer
1000data using fewer bus cycles.
1001.LP
1002There is a structure of type \fBstruct uba_hd\fR in the system per UNIBUS
1003adapter used to manage these resources. This structure also contains
1004a linked list where devices waiting for resources to complete DMA UNIBUS
1005activity have requests waiting.
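.PP
The arithmetic behind the map registers can be sketched as follows,
assuming the VAX page size of 512 bytes, so that one map register covers
one page and an 18 bit UNIBUS address splits into a 9 bit register
number and a 9 bit byte offset. The function and macro names are
invented for the illustration.
.DS
```c
#include <assert.h>

#define UB_PAGE     512     /* bytes covered by one map register (assumed) */
#define UB_SHIFT    9       /* log2 of UB_PAGE */
#define UB_NMAPREG  496     /* usable map registers per adapter */

/* Map registers needed for count bytes starting at byte offset off. */
unsigned mapregs_needed(unsigned off, unsigned count)
{
    assert(off < UB_PAGE && count > 0);
    return (off + count + UB_PAGE - 1) / UB_PAGE;
}

/* 18 bit UNIBUS address: register number above a 9 bit byte offset. */
unsigned long unibus_addr(unsigned reg, unsigned off)
{
    assert(reg < UB_NMAPREG && off < UB_PAGE);
    return ((unsigned long)reg << UB_SHIFT) | off;
}
```
.DE
Under these assumptions a page-aligned 8192 byte transfer needs 16
map registers, and the highest address reachable through
map register 495 is 0x3dfff, just below the top 8K bytes of the
18 bit address space.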
1006.PP
1007There are three central structures in the writing of drivers for UNIBUS
1008controllers; devices which do not do DMA i/o can often use only two
1009of these structures. The structures are \fBstruct uba_ctlr\fR, the
1010UNIBUS controller structure, \fBstruct uba_device\fR the UNIBUS
1011device structure, and \fBstruct uba_driver\fR, the UNIBUS driver structure.
1012The \fBuba_ctlr\fR and \fBuba_device\fR structures are in
1013one-to-one correspondence with the definitions of controllers and
1014devices in the system configuration.
1015Each driver has a \fBstruct uba_driver\fR structure specifying an internal
1016interface to the rest of the system.
1017.PP
1018Thus a specification
1019.DS
1020controller sc0 at uba0 csr 0176700 vector upintr
1021.DE
1022would cause a \fBstruct uba_ctlr\fR to be declared and initialized in
1023the file \fBioconf.c\fR for the system configured from this description.
1024Similarly specifying
1025.DS
1026disk up0 at sc0 drive 0
1027.DE
1028would declare a related \fBuba_device\fR in the same file.
1029The \fBup.c\fR driver which implements this driver specifies in
1030its declarations:
1031.DS
1032int upprobe(), upslave(), upattach(), updgo(), upintr();
1033struct uba_ctlr *upminfo[NSC];
1034struct uba_device *updinfo[NUP];
1035u_short upstd[] = { 0776700, 0774400, 0776300, 0 };
1036struct uba_driver scdriver =
1037 { upprobe, upslave, upattach, updgo, upstd, "up", updinfo, "sc", upminfo };
1038.DE
1039initializing the \fBuba_driver\fR structure.
1040The driver will support some number of controllers named \fBsc0\fR, \fBsc1\fR,
1041etc, and some number of drives named \fBup0\fR, \fBup1\fR, etc. where the
1042drives may be on any of the controllers (that is there is a single
1043linear name space for devices, separate from the controllers.)
1044.PP
1045We now explain the fields in the various structures. It may help
1046to look at a copy of \fBh/ubareg.h\fR, \fBh/ubavar.h\fR and drivers
1047such as \fBup.c\fR and \fBdz.c\fR while reading the descriptions of
1048the various structure fields.
1049.SH
1050uba_driver structure
1051.PP
1052One of these structures exists per driver. It is initialized in
1053the driver and contains functions used by the configuration program
1054and by the UNIBUS resource routines. The fields of the structure are:
1055.BP ud_probe
1056A routine which is given a \fBcaddr_t\fR address as argument and
1057should cause an interrupt on the device whose control-status register
1058is at that address in virtual memory. It may be the case that the
1059device does not exist, so the probe routine should use delays (via
1060the DELAY(n) macro which delays for \fIn\fR microseconds) rather than
1061waiting for specific events to occur. The routine must \fBnot\fR
1062declare its argument as a \fBregister\fR parameter, but \fBmust\fR declare
1063.DS
1064\fBregister int br, cvec;\fR
1065.DE
as local variables. At boot time the system takes special measures
so that these variables are ``value-result'' parameters. The \fBbr\fR
1068is the IPL of the device when it interrupts, and the \fBcvec\fR
1069is the interrupt vector address on the UNIBUS. These registers
1070are actually filled in in the interrupt handler when an interrupt occurs.
1071.IP
1072As an example, here is the \fBup.c\fR
1073probe routine:
1074.DS
1075upprobe(reg)
1076 caddr_t reg;
1077{
1078 register int br, cvec;
1079
1080#ifdef lint
1081 br = 0; cvec = br; br = cvec;
1082#endif
1083 ((struct updevice *)reg)->upcs1 = UP_IE|UP_RDY;
1084 DELAY(10);
1085 ((struct updevice *)reg)->upcs1 = 0;
1086 return (1);
1087}
1088.DE
1089The definitions for \fIlint\fR serve to indicate to it that the
1090\fBbr\fR and \fBcvec\fR variables are value-result. The statements
1091here interrupt enable the device and write the ready bit UP_RDY.
1092The 10 microsecond delay insures that the interrupt enable will
1093not be cancelled before the interrupt can be posted. The return of
1094``1'' here indicates that the probe routine is satisfied that the device
1095is present. A probe routine may use the function ``badaddr'' to see
1096if certain other addresses are accessible on the UNIBUS (without generating
1097a machine check), or look at the contents of locations where certain
1098registers should be. If the registers contents are not acceptable or
1099the addresses don't respond, the probe routine can return 0 and the
1100device will not be considered to be there.
1101.IP
1102One other thing to note is that the action of different VAXen when illegal
1103addresses are accessed on the UNIBUS may differ. Some of the machines
1104may generate machine checks and some may cause UNIBUS errors. Such
1105considerations are handled by the configuration program and the driver
1106writer need not be concerned with them.
1107.IP
1108It is also possible to write a very simple probe routine for a one-of-a-kind
1109device if probing is difficult or impossible. Such a routine would include
1110statements of the form:
1111.DS
1112br = 0x15;
1113cvec = 0200;
1114.DE
1115for instance, to declare that the device ran at UNIBUS br5 and interrupted
1116through vector 0200 on the UNIBUS. The current TS-11 driver does
1117something similar to this because the device is so difficult to force
1118an interrupt on that it hardly seems worthwhile. (Besides, TS-11's are
1119usually present on small 11/750's which have only one UNIBUS, and TS-11's
1120can have only exactly one transport per-controller so little probing is
1121needed.)
1122.BP ud_slave
1123This routine is called with a \fBuba_device\fR structure (yet to
1124be described) and the address of the device controller. It should
1125determine whether a particular slave device of a controller is
1126present, returning 1 if it is and 0 if it is not.
1127As an example here is the slave routine for \fBup.c\fR.
1128.DS
1129upslave(ui, reg)
1130 struct uba_device *ui;
1131 caddr_t reg;
1132{
1133 register struct updevice *upaddr = (struct updevice *)reg;
1134
1135 upaddr->upcs1 = 0; /* conservative */
1136 upaddr->upcs2 = ui->ui_slave;
1137 if (upaddr->upcs2&UPCS2_NED) {
1138 upaddr->upcs1 = UP_DCLR|UP_GO;
1139 return (0);
1140 }
1141 return (1);
1142}
1143.DE
1144Here the code fetches the slave (disk unit) number from the \fBui_slave\fR
1145field of the \fBuba_device\fR structure, and sees if the controller
1146responds that that is a non-existant driver (NED). If the drive
1147a drive clear is issued to clean the state of the controller, and 0 is
1148returned indicating that the slave is not there. Otherwise a 1 is returned.
1149.BP ud_attach
1150The attach routine is called after the autoconfigure code and the driver concur
1151that a peripheral exists attached to a controller. This is the routine
1152where internal driver state about the peripheral can be initialized.
1153Here is the \fIattach\fR routine from the \fBup.c\fR driver:
1154.ID
1155.nf
1156upattach(ui)
1157 register struct uba_device *ui;
1158{
1159 register struct updevice *upaddr;
1160
1161 if (upwstart == 0) {
1162 timeout(upwatch, (caddr_t)0, hz);
1163 upwstart++;
1164 }
1165 if (ui->ui_dk >= 0)
1166 dk_mspw[ui->ui_dk] = .0000020345;
1167 upip[ui->ui_ctlr][ui->ui_slave] = ui;
1168 up_softc[ui->ui_ctlr].sc_ndrive++;
1169 upaddr = (struct updevice *)ui->ui_addr;
1170 upaddr->upcs1 = 0;
1171 upaddr->upcs2 = ui->ui_slave;
1172 upaddr->uphr = UPHR_MAXTRAK;
1173 if (upaddr->uphr == 9)
1174 ui->ui_type = 1; /* fujitsu hack */
1175 upaddr->upcs2 = UPCS2_CLR;
1176}
1177.DE
1178The attach routine here performs a number of functions. The first
1179time any drive is attached to the controller it starts the timeout
1180routine which watches the disk drives to make sure that interrupts
1181aren't lost. It also initializes, for devices which have been assigned
1182\fIiostat\fR numbers (when ui->ui_dk >= 0), the transfer rate of the
1183device in the array \fBdk_mspw\fR, the fraction of a second it takes
1184to transfer 16 bit word. It then initializes an inverting pointer
1185in the array \fBupip\fR which will be used later to determine, for a
1186particular \fBup\fR controller and slave number, the corresponding
1187\fBuba_device\fR. It increments the count of the number of devices
1188on this controller, so that search commands can later be avoided
1189if the count is exactly 1. It then uses a hardware feature of the
1190EMULEX SC-21 to ask if the number of tracks on the device is 9. If
1191it is, then the driver assumes that the type is ``1'', which corresponds
1192to a FUJITSU 160M drive. The alternative is the only other currently
1193supported device, a 300 Megabyte CDC or AMPEX drive, which has \fBui_type\fR
11940. Note that if the controller is not an SC-21 then attempting to find
1195out the maximum track in the device will yield an error, and a 300
1196Megabyte device will be assumed. In any case, any errors resulting from
1197the attempt to type the drive are cleared by a controller clear before
1198the routine returns.
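.IP
The constant stored in \fBdk_mspw\fR can be reproduced from disk
geometry. The sketch below assumes a drive turning at 3600 revolutions
per minute with 32 sectors of 512 bytes on each track; these figures
are not given in this document and are chosen because they yield the
constant used above. The function name is invented.
.DS
```c
#include <assert.h>

/*
 * Seconds needed to transfer one 16 bit word off a spinning disk.
 * All three parameters are assumptions supplied by the caller.
 */
double secs_per_word(double rpm, int sectors_per_track, int bytes_per_sector)
{
    double revs_per_sec, words_per_track;

    assert(rpm > 0 && sectors_per_track > 0 && bytes_per_sector > 0);
    revs_per_sec = rpm / 60.0;
    words_per_track = sectors_per_track * (bytes_per_sector / 2.0);
    return 1.0 / (revs_per_sec * words_per_track);
}
```
.DE
With those figures 60 x 8192 = 491520 words pass under the head each
second, and the reciprocal is about .0000020345.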
1199.BP ud_dgo
1200Is the routine which is called by the UNIBUS resource management
1201routines when an operation is ready to be started (because the required
1202resources have been allocated). The routine in \fBup.c\fR is:
1203.DS
1204updgo(um)
1205 struct uba_ctlr *um;
1206{
1207 register struct updevice *upaddr = (struct updevice *)um->um_addr;
1208
1209 upaddr->upba = um->um_ubinfo;
1210 upaddr->upcs1 = um->um_cmd|((um->um_ubinfo>>8)&0x300);
1211}
1212.DE
1213This routine uses the field \fBum_ubinfo\fR of the \fBuba_ctlr\fR
1214structure which is where the UNIBUS routines store the UNIBUS
1215map allocation information. In particluar, the low 18 bits of this
1216word give the UNIBUS address assigned to the transfer. The assignment
1217to \fIupba\fR in the go routine places the low 16 bits of the UNIBUS
1218address in the disk UNIBUS address register. The next assignment
1219places the disk operation command and the extended (high 2) address
1220bits in the device control-status register, starting the i/o operation.
1221The field \fBum_cmd\fR was initialized with the command to be stuffed
1222here in the driver code itself before the call to the \fBubago\fR
1223routine which eventually resulted in the call to \fBupdgo\fR.
1224.BP ud_addr
1225Are the conventional addresses for the device control registers in
1226UNIBUS space. This information is not used by the system in this
1227release, but may be used in future releases to look for instances of
1228the device supported by the driver. In the current system, the configuration
1229file specifies the control-status register addresses of all configured devices.
1230.BP ud_dname
1231Is the name of a \fIdevice\fR supported by this controller; thus the
1232disks on a SC-21 controller are called \fBup0\fR, \fBup1\fR, etc.
1233That is because this field contains \fBup\fR.
1234.BP ud_dinfo
1235Is an array of back pointers to the \fBuba_device\fR structures for
1236each device attached to the controller. Each driver defines a set of
1237controllers and a set of devices. The device address space is always
1238one-dimensional, so that the presence of extra controllers may be
1239masked away (e.g. by pattern matching) to take advantage of hardware
1240redundancy. This field is filled in by the configuration program,
1241and used by the driver.
1242.BP ud_mname
1243The name of a controller, e.g. \fBsc\fR for the \fBup.c\fR driver.
1244The first SC-21 is called \fBsc0\fR, etc.
1245.BP ud_minfo
1246The backpointer array to the structures for the controllers.
1247.BP ud_xclu
1248If non-zero, specifies that the controller requires exclusive
1249use of the UNIBUS when it is running. This is non-zero currently
1250only for the RK611 controller for the RK07 disks to map around a hardware
1251problem. It could also be used if 6250bpi tape drives are to be used
1252on the UNIBUS to ensure that they get the bandwidth that they need
1253(basically the whole bus).
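.PP
The address split performed by the \fBud_dgo\fR routine shown earlier
can be tried out in user space. The following is only a sketch:
\fBstruct fake_updevice\fR and \fBfake_updgo\fR are hypothetical stand-ins
for the real device registers and for \fBupdgo\fR, and the command value
used in any call is a placeholder rather than a real disk command.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device registers; not the real struct updevice. */
struct fake_updevice {
    uint16_t upcs1;   /* control-status: command plus address bits 16-17 */
    uint16_t upba;    /* UNIBUS address register: address bits 0-15 */
};

/*
 * Mirror the two assignments in updgo(): the low 16 bits of the UNIBUS
 * address go to upba, and the high 2 bits land in bits 8-9 of upcs1
 * alongside the command.
 */
void fake_updgo(struct fake_updevice *regs, uint32_t ubinfo, uint16_t cmd)
{
    regs->upba = (uint16_t)(ubinfo & 0xffff);
    regs->upcs1 = (uint16_t)(cmd | ((ubinfo >> 8) & 0x300));
}
```

Shifting right by 8 and masking with 0x300 moves address bits 16 and 17
down to bit positions 8 and 9, which is where the sketch assumes the
extended address bits live in the control-status register.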
1254.SH
1255uba_ctlr structure
1256.PP
1257One of these structures exists per-controller.
1258The fields link the controller to its UNIBUS adapter and contain the
1259state information about the devices on the controller. The fields are:
1260.BP um_driver
1261A pointer to the \fBstruct uba_driver\fR for this driver, which has
1262fields as defined above.
1263.BP um_ctlr
1264The controller number for this controller, e.g. the 0 in \fBsc0\fR.
1265.BP um_alive
1266Set to 1 if the controller is considered alive; currently, always set
1267for any structure encountered during normal operation. That is, the
1268driver will have a handle on a \fBuba_ctlr\fR structure only if the
1269configuration routines set this field to a 1 and entered it into the
1270driver tables.
1271.BP um_intr
1272The interrupt vector routines for this device. These are generated
1273by the \fIconfig\fR\|(8) program and this field is initialized in the
1274\fBioconf.c\fR file.
1275.BP um_hd
1276A back-pointer to the UNIBUS adapter to which this controller is attached.
1277.BP um_cmd
1278A place for the driver to store the command which is to be given to
1279the device before calling the routine \fIubago\fR with the device's
1280\fBuba_device\fR structure. This information is then retrieved when the
1281device go routine is called and stuffed in the device control status register
1282to start the i/o operation.
1283.BP um_ubinfo
1284Information about the UNIBUS resources allocated to the device. This is
1285normally used only in the device driver go routine (such as \fBupdgo\fR above)
1286and occasionally in exceptional condition handling such as ECC correction.
1287.BP um_tab
1288This buffer structure is a place where the driver hangs the device structures
1289which are ready to transfer. Each driver allocates a buf structure for each
1290device (e.g. \fBupdtab\fR in the \fBup.c\fR driver) for this purpose.
1291You can think of this structure as a device-control-block, and the
1292buf structures linked to it as the unit-control-blocks.
1293The code for dealing with this structure is stylized; see the \fBrk.c\fR
1294or \fBup.c\fR driver for the details. If the \fBubago\fR routine
1295is to be used, the structure attached to this \fBbuf\fR structure
1296must be:
1297.RS
1298.IP *
1299A chain of \fBbuf\fR structures for each waiting device on this controller.
1300.IP *
1301On each waiting \fBbuf\fR structure another \fBbuf\fR structure which is
1302the one containing the parameters of the i/o operation.
1303.RE
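.PP
The two-level queue hanging off \fBum_tab\fR can be pictured with the
following user-space sketch. It is not the kernel's \fBstruct buf\fR:
the type \fBstruct qbuf\fR and its field names are invented for
illustration, and the real drivers (see \fBup.c\fR) use the actual
\fBbuf\fR link fields.

```c
#include <stddef.h>

/* Simplified stand-in for struct buf; field names are illustrative only. */
struct qbuf {
    struct qbuf *next;    /* next entry at the same level */
    struct qbuf *first;   /* head of the queue hanging below this entry */
    int          blkno;   /* i/o parameter; meaningful only for requests */
};

/* Append entry e to the end of the queue hanging below parent. */
void q_append(struct qbuf *parent, struct qbuf *e)
{
    struct qbuf **pp = &parent->first;

    while (*pp != NULL)
        pp = &(*pp)->next;
    e->next = NULL;
    *pp = e;
}

/* Count the i/o requests reachable from a controller entry:
 * walk the waiting devices, then each device's requests. */
int q_count_requests(const struct qbuf *ctlr)
{
    int n = 0;

    for (const struct qbuf *dev = ctlr->first; dev != NULL; dev = dev->next)
        for (const struct qbuf *req = dev->first; req != NULL; req = req->next)
            n++;
    return n;
}
```

Here the controller entry plays the role of \fBum_tab\fR, the entries
directly below it are the per-device structures, and the entries below
those carry the parameters of individual i/o operations.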
1304.SH
1305uba_device structure
1306.PP
1307One of these structures exists for each device attached to a UNIBUS
1308controller. Devices which are not attached to controllers or which
1309perform no buffered data path
1310DMA i/o may have only a device structure. Thus \fBdz\fR
1311and \fBdh\fR devices have only \fBuba_device\fR structures.
1312The fields are:
1313.BP ui_driver
1314A pointer to the \fBstruct uba_driver\fR structure for this device type.
1315.BP ui_unit
1316The unit number of this device, e.g. 0 in \fBup0\fR, or 1 in \fBdh1\fR.
1317.BP ui_ctlr
1318The number of the controller on which this device is attached, or \-1
1319if this device is not on a controller.
1320.BP ui_ubanum
1321The number of the UNIBUS on which this device is attached.
1322.BP ui_slave
1323The slave number of this device on the controller which it is attached to,
1324or \-1 if the device is not a slave. Thus a disk which was unit 2 on
1325a SC-21 would have \fBui_slave\fR 2; it might or might not be \fBup2\fR,
1326depending on the system configuration specification.
1327.BP ui_intr
1328The interrupt vector entries for this device, copied into the UNIBUS
1329interrupt vector at boot time. The values of these fields are filled
1330in by the \fBconfig\fR\|(8) program to small code segments which it
1331generates in the file \fBubglue.s\fR.
1332.BP ui_addr
1333The control-status register address of this device.
1334.BP ui_dk
1335The iostat number assigned to this device. Numbers are assigned to
1336disks only, and are small positive integers which index the various
1337\fBdk_*\fR arrays in <sys/dk.h>.
1338.BP ui_flags
1339The optional ``\fBflags \fR\fIxxx\fR'' parameter from the configuration
1340specification is copied to this field, to be interpreted by the driver.
1341If \fBflags\fR was not specified, then this field will contain a 0.
1342.BP ui_alive
1343The device is really there. Presently set to 1 when a device is
1344determined to be alive, and left 1.
1345.BP ui_type
1346The device type, to be used by the driver internally. Thus the \fBup.c\fR
1347driver uses a \fBui_type\fR of 0 to mean a 300 Megabyte drive and a type
1348of 1 to mean a 160 Megabyte FUJITSU drive.
1349.BP ui_physaddr
1350The physical memory address of the device control-status register.
1351This is typically used in the device dump routines.
1352.BP ui_mi
1353A \fBstruct uba_ctlr\fR pointer to the controller (if any) on which
1354this device resides.
1355.BP ui_hd
1356A \fBstruct uba_hd\fR pointer to the UNIBUS on which this device resides.
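.PP
The relationship between \fBui_unit\fR, \fBui_ctlr\fR and \fBui_slave\fR,
and the kind of inverting pointer array that the \fBup.c\fR attach routine
keeps in \fBupip\fR, can be sketched as follows. This is a user-space
illustration under stated assumptions: \fBstruct mini_device\fR keeps only
three of the fields described above, and the table sizes and the names
\fBinv\fR and \fBmini_attach\fR are hypothetical.

```c
#include <stddef.h>

#define NCTLR  2   /* assumed number of controllers, for illustration */
#define NSLAVE 8   /* assumed slaves per controller, for illustration */

/* Reduced stand-in for struct uba_device. */
struct mini_device {
    short ui_unit;    /* unit number, e.g. the 0 in up0 */
    short ui_ctlr;    /* controller number, or -1 if not on a controller */
    short ui_slave;   /* slave number on the controller, or -1 */
};

/* Inverting table: (controller, slave) -> device structure. */
struct mini_device *inv[NCTLR][NSLAVE];

/* Record a device in the table; returns 0 on success, -1 if the
 * device is not a slave on a known controller. */
int mini_attach(struct mini_device *ui)
{
    if (ui->ui_ctlr < 0 || ui->ui_ctlr >= NCTLR ||
        ui->ui_slave < 0 || ui->ui_slave >= NSLAVE)
        return -1;
    inv[ui->ui_ctlr][ui->ui_slave] = ui;
    return 0;
}
```

With such a table an interrupt routine that knows only the controller and
slave number can find the device structure, and hence the unit number,
without assuming that slave 2 is unit 2.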
1357.SH
1358Changing drivers
1359.PP
1360If your driver does not do buffered data path DMA, conversion
1361to the new system should be straightforward; if it uses
1362buffered data paths more work will be required, but the
1363task is really mostly cosmetic.
1364.PP
1365In any case, first add a line to the file \fBconf/files\fR of the form
1366.DS
1367dev/zz.c optional zz device-driver
1368.DE
1369so that your driver will be included when you
1370specify it in a configuration.
1371Change the \fBdev/conf.c\fR file to include a block or character
1372device entry for your device. Note that the block device entries
1373now include a \fBd_dump\fR entry; if you are a block device but
1374don't have a dump entry point, just make one in your driver that
1375returns the value ENODEV.
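.PP
A do-nothing dump entry point might look like the following sketch.
The name \fBzzdump\fR follows the hypothetical \fBzz\fR driver used above,
and the single integer argument is an assumption made so that the example
compiles on its own; match the declaration to the other entries in
\fBdev/conf.c\fR.

```c
#include <errno.h>

/* Hypothetical dump entry for a block driver that cannot take dumps. */
int zzdump(int dev)
{
    (void)dev;          /* unused in this stub */
    return ENODEV;      /* tell the caller no dump device is available */
}
```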
1376.PP
1377Then build a system configuration including your driver so that
1378you have a compilation environment for your driver. You will
1379have to add a \fBstruct uba_driver\fR declaration for your driver,
1380and change its calls to UNIBUS routines to correspond to these
1381routines in the new system. Trouble spots will show up here.
1382In particular, notice that you must specify flags to \fBuballoc\fR
1383if you call it:
1384.BP NEEDBDP
1385if you need a buffered data path
1386.BP CANTWAIT
1387if you are calling (potentially) from interrupt level
1388.LP
1389You may discover that your driver ``cantwait'' but that you are calling
1390from interrupt level. This botch existed in most previous VAX UNIX
1391drivers, since there were no mechanisms for dealing with this.
1392We will describe some options shortly.
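.PP
The effect of the two flags can be illustrated with a mock allocator.
This sketch does not reproduce \fBuballoc\fR: the names \fBsk_alloc\fR and
\fBstruct sk_uba\fR, the flag values, and the return convention are all
assumptions chosen for the example.

```c
/* Illustrative flag values; the real definitions are in the system headers. */
#define SK_NEEDBDP  0x1   /* caller needs a buffered data path */
#define SK_CANTWAIT 0x2   /* caller may be at interrupt level; must not sleep */

struct sk_uba {
    int free_bdps;   /* buffered data paths currently free */
    int sleeps;      /* times a caller would have slept (mock bookkeeping) */
};

/*
 * Mock allocator: returns 1 on success, or 0 if the request cannot be
 * met without waiting and SK_CANTWAIT is set.
 */
int sk_alloc(struct sk_uba *uba, int flags)
{
    if (!(flags & SK_NEEDBDP))
        return 1;              /* nothing scarce was requested */
    if (uba->free_bdps == 0) {
        if (flags & SK_CANTWAIT)
            return 0;          /* fail at once; caller must retry later */
        uba->sleeps++;         /* a kernel would sleep here until a path */
        return 1;              /* is released; the mock pretends one was */
    }
    uba->free_bdps--;
    return 1;
}
```

A caller that passes the can't-wait flag has to be prepared for the
failure return, which is exactly the case that older drivers calling from
interrupt level did not handle.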
1393.PP
1394First, suppose your driver doesn't do buffered data path DMA.
1395What else is there for you to do? Very little really.
1396You should change your driver to print messages on the console
1397in the format now used by all device drivers; see section 4 of the
1398revised programmers manual for details.
1399To make more certain that your driver is ready for the new system
1400environment, look at some of the simple existing drivers
1401and mimic the style to create the portions of the driver which
1402are needed to interface with the configuration part of the system.
1403Useful drivers to look at may be:
1404.BP ct.c
1405Very simple driver which does programmed i/o to the C/A/T phototypesetter.
1406.BP dh.c
1407Communications line driver which uses non-buffered UNIBUS DMA
1408for output.
1409.BP dz.c
1410Communications line driver which does programmed i/o.
1411.PP
1412Basically all you have to do is write a \fBud_probe\fR and a \fBud_attach\fR
1413routine for the controller. It suffices to have a \fBud_probe\fR routine
1414which just initializes \fBbr\fR and \fBcvec\fR, and a \fBud_attach\fR
1415routine which does nothing.
1416Making the device fully configurable requires, of course, more work,
1417but is worth it if you expect the device to be in common usage
1418and want to share it with others.
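.PP
The shape of such a minimal pair of routines is sketched below.
In the kernel the probe routine sets \fBbr\fR and \fBcvec\fR itself; this
user-space sketch passes them through a hypothetical
\fBstruct probe_out\fR instead so that it can stand alone, and the values
assigned are placeholders, not the settings of any real device.
The names \fBzzprobe\fR and \fBzzattach\fR are likewise invented.

```c
/* Hypothetical holder for the two values a probe routine must set. */
struct probe_out {
    int br;     /* bus request (interrupt priority) level */
    int cvec;   /* interrupt vector */
};

/* Minimal probe: report fixed placeholder values and claim the device. */
int zzprobe(struct probe_out *out)
{
    out->br = 0x15;     /* placeholder value */
    out->cvec = 0300;   /* placeholder value (octal) */
    return 1;           /* non-zero: device considered present (sketch) */
}

/* Minimal attach: nothing to set up. */
void zzattach(void)
{
}
```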
1419.PP
1420If you managed to create all the needed hooks, then make sure you include
1421the necessary header files; the ones included by \fBct.c\fR are nearly
1422minimal. Order is important here; don't be surprised by undefined structure
1423complaints if you order the includes wrongly.
1424Finally, once you get the device configured in, you can try bootstrapping
1425and see if configuration messages print out about your device.
1426It is a good idea to have some messages in the probe routine so that
1427you can see that you are getting called and what is going on.
1428If you do not get called, then you probably have the control-status
1429register address wrong in your system configuration. The autoconfigure
1430code notices that the device doesn't exist in this case and you will never
1431get called.
1432.PP
1433Assuming that your probe routine works and you manage
1434to generate an interrupt, then you are basically back to where you
1435would have been under older versions of UNIX.
1436Just be sure to use the \fBui_addr\fR field of the \fBuba_device\fR
1437structures to address the device; compiling in funny constants
1438will make your driver only work on the CPU type you have (780 or 750).
1439.PP
1440Other bad things that might happen while you are setting up the configuration
1441stuff:
1442.IP *
1443You get ``nexus zero vector'' errors from the system. This will happen
1444if you cause a device to interrupt, but take away the interrupt enable
1445so fast that the UNIBUS adapter cancels the interrupt and confuses
1446the processor. The best thing to do is to put a modest delay in the
1447probe code between the instructions which should cause an interrupt and
1448the clearing of the interrupt enable. (You should clear interrupt
1449enable before you leave the probe routine so the device doesn't interrupt
1450more and confuse the system while it is configuring other devices.)
1451.IP *
1452The device refuses to interrupt or interrupts with a ``zero vector''.
1453This typically indicates a problem with the hardware or, for devices
1454which emulate other devices, that the emulation is incomplete. Devices
1455may fail to present interrupt vectors because they have configuration
1456switches set wrong, or because they are being accessed in inappropriate ways.
1457Incomplete emulation can cause ``maintenance mode'' features to not work
1458properly, and these features are often needed to force device interrupts.
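.PP
The ordering recommended in the first item above (cause the interrupt,
wait briefly, then clear interrupt enable) can be sketched with a mock
device that records what was done to it. Everything here is hypothetical:
\fBstruct mock_dev\fR, \fBshort_delay\fR and \fBprobe_sequence\fR are
invented names, and a real probe routine would touch device registers and
use whatever delay facility the system provides.

```c
#define EV_MAX 8

enum ev { EV_ENABLE, EV_DELAY, EV_DISABLE };

/* Mock device: records the order of operations instead of touching hardware. */
struct mock_dev {
    int ie;                 /* interrupt-enable bit */
    enum ev log[EV_MAX];    /* what was done, in order */
    int nlog;
};

void note(struct mock_dev *d, enum ev e)
{
    if (d->nlog < EV_MAX)
        d->log[d->nlog++] = e;
}

/* Stand-in for a short busy-wait. */
void short_delay(struct mock_dev *d)
{
    note(d, EV_DELAY);
}

/* Provoke an interrupt, wait briefly, then clear interrupt enable. */
void probe_sequence(struct mock_dev *d)
{
    d->ie = 1;              /* would cause the device to interrupt */
    note(d, EV_ENABLE);
    short_delay(d);         /* let the adapter deliver the interrupt */
    d->ie = 0;              /* leave the device quiet afterwards */
    note(d, EV_DISABLE);
}
```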
1459.SH
1460Adapting devices which do buffered data path DMA
1461.PP
1462These devices fall into two categories: those which are controllers
1463to which devices are attached, and those which are just single devices.
1464The interface for the former is very stylized and we recommend that
1465you simply mimic one of the existing tape or disk drivers in adapting
1466to the system. You will find that the existing tape and disk drivers
1467are all \fBvery\fR similar; this is deliberate so that it isn't
1468necessary to rewrite the whole driver for each device, since the available
1469devices are typically very similar.
1470.PP
1471Other devices which do buffered data path DMA can be adapted to the
1472new system in one of two ways:
1473.IP *
1474They can do their own data path allocation, calling the UNIBUS
1475allocation routines from the ``top-half'' (non-interrupt) code,
1476sleeping in the UNIBUS code when resources are not available.
1477For an example, see the code in the \fBvp.c\fR driver.
1478.IP *
1479They can set up a two-level structure like the tape and disk drivers
1480do, and call the \fIubago\fR routine and use the \fBud_dgo\fR interface
1481to start DMA operations.
1482For an example, see the code in the \fBup.c\fR driver.
1483.PP
1484Either way works acceptably well; the second (\fIubago\fR\|) interface
1485is preferable because it does not force a context switch per i/o operation
1486(to the routine driving the i/o from the ``top-half'').
1487.PP
1488If you have questions about converting drivers, feel free to call us
1489and ask or to send us mail. We hope (eventually) to write a more
1490complete paper for driver writers, but don't have the manpower to do this
1491just now.