.TL Changes in the Kernel in 4.1bsd .sp May 10, 1981 .br Revised: September 1, 1981 .AU Bill Joy .AI Computer Systems Research Group University of California, Berkeley .PP This document summarizes the changes in the kernel between the November 1980 4.0bsd release and the April 1981 4.1bsd distribution. The information is presented in both overall terms (e.g. organizational changes), and as specific comments about individual changed files. See the source code itself for more details. .PP The major changes fall in five categories: .IP 1) Changes in the VAX 11/780 specific portions of the code, so that VAX 11/750's are supported also. .IP 2) Changes in the organization of the code, so that more than one configuration of the system may be built from a single set of sources. Each system is described by a single file which includes parameters such as system size, devices on the machine, etc. All ``magic numbers'' such as device register addresses are collected in this single file. .IP 3) Extensive changes in the device subsystem to allow multiple UNIBUS and MASSBUS adapters to be used, multiple instances of device controllers to exist without duplicating driver code, and to provide the capability of system configuration at boot time. The configuration capability is used to produce a generic system which runs on all supported hardware, and is used for distributions. Pattern matching in the configuration capability also allows hardware redundancy to be used to good effect. .IP 4) Diagnostics of the system have been reworked to be in a standard and readable format; file system diagnostics refer to the file systems by name rather than device number. Device diagnostics refer to the devices by name, and print error messages including device registers decoded symbolically rather than simply in octal or hexadecimal. DEC standard bad sector forwarding has been added to the drivers for DEC disks. .IP 5) Performance improvements, notably in the paging subsystem. 
.PP A number of enhancements and bug fixes have also been made. .SH Carrying over local software .PP The majority of local changes should carry over to the new system quite easily. It is necessary to create a configuration file for each machine from which a system will be built, but this is quite easy, and such files are designed to be usable without change in future releases of the system. .PP Locally written UNIBUS device drivers will need to be converted to work in the new system. The new functions needed of the device drivers are: .IP 1) Forcing a device interrupt at bootstrap time, given a proposed device register address. This is used by the configuration program to decide if the device really exists. .IP 2) If buffered data paths are to be used, the driver must use routines in the UNIBUS adapter subsystem which arrange for i/o requests to be queued when there are no resources available. .IP 3) Drivers must not assume that only one instance of a device exists in the system, but must rather be parameterized and use the information provided by the bootstrap procedure to drive all available devices. .PP Of course, it is not necessary to make a driver ``fully supported'' for it to be used. It suffices to handle 1) by pretending that the interrupt occurred, returning the (for a single system) known UNIBUS vector information, and assuming that the device exists on specific UNIBUS adapters. Drivers which use UNIBUS resources only statically or not at all need not be concerned with 2), and drivers can assume that there is only one instance of the supported device on the system, and just not work if more than one such device is really present. .PP In any case, more information about device driver changes is given in the last section of this document; also see \fIautoconf\fR\|(4) for information about the messages printed out by the configuration code at bootstrap time. Looking at the provided standard supported drivers for examples of code is also a good idea. 
.PP There is also a new interface for MASSBUS devices. Since all MASSBUS devices are already supported, there is no external documentation for writing new MASSBUS drivers at the present. If you have questions or intend to write a driver for a home-brew interface, you should read the MASSBUS and MASSBUS device driver code, which is amply commented. In any case, the MASSBUS interface is more stylized than the UNIBUS interface, and you may have to extend the functionality of the MASSBUS driver to handle radically different devices. .SH Organizational changes .PP On RK07 systems the source for the system lives in the root directory, since there is so little space. The system otherwise lives where it used to: the subdirectories of /usr/src/sys, with copies of the header files for the installed system in /usr/include/sys. .PP The system compilation procedure has been changed so that more than one set of binaries may be kept conveniently with a single copy of the source code. The system sources are kept in the directories \fBsys/sys\fR and \fBsys/dev\fR with the header files in \fBsys/h\fR. Source files which were previously kept in \fBsys/conf\fR are now in \fBsys/dev\fR, and no binaries are kept in any of these directories. .PP The directory \fBsys/conf\fR contains a number of files related to system configuration. For each machine to be configured, a single file is created in the \fBsys/conf\fR directory; thus files \fBERNIE\fR and \fBBERT\fR might exist there. Each such file describes all the parameters of the machine to be used: the devices which are to be configured into the system, optional parts of the system to be included, as well as the timezone in which the machine lives and the maximum number of simultaneous active users; the last is used to scale system tables. The format of the configuration files is described in \fIconfig\fR\|(8). .PP Corresponding to each system to be configured there is a directory of \fBsys\fR, thus \fBsys/BERT\fR and \fBsys/ERNIE\fR. 
These directories are made with \fImkdir\fR and then the \fB/etc/config\fR command is run, from the \fBsys/conf\fR directory, specifying \fBBERT\fR or \fBERNIE\fR as argument. The configuration program processes the information in the configuration files, and produces, in the directory \fB../BERT\fR or \fB\&../ERNIE\fR respectively: .IP 1) A set of header files, e.g. \fBdz.h\fR, which contain the number of devices and controllers to be available in the target system. These definitions force conditional compilation of drivers resulting in the inclusion or exclusion of driver code and the sizing of driver tables. This technique, based on compilation, is more powerful than a loader-based technique, since small sections of code may be easily conditionalized. Similarly, dynamic loading of device drivers is not needed, as only drivers which are needed are included in the resulting system. .IP 2) A small assembly language vector interface, which turns the hardware generated UNIBUS interrupt sequences into C calls on the driver interrupt routines. This \fBubglue.s\fR file glues the hardware interrupt sequence into the UNIX interface. .IP 3) A table file \fBioconf.c\fR which initializes tables to be used at bootstrap time by the system configuration routines. The configuration routines interpret the contents of the table and determine which devices are available on the system. They determine the vector addresses of UNIBUS devices by forcing the devices to interrupt. Pattern matching in the tables may be used to take advantage of hardware redundancy: the specifications need not completely constrain device placement, so the system can be built to bootstrap in several different configurations, locating the same devices on different interconnects by the fact that their unit numbers have not changed (for example). Thus two RP06 disks could be specified as: .DS disk hp0 at mba? drive 0 disk hp1 at mba? 
drive 1 .DE and then the disks could be cabled to any available MASSBUS adapter; the pattern matching in the configuration procedure would locate the drives. Similarly, a tape formatter on the same system could be specified: .DS master ht0 at mba? drive ? .DE and then placed anywhere on any MBA. Contrast this flexible specification with .DS disk hp0 at mba0 drive 0 disk hp1 at mba0 drive 1 master ht0 at mba1 drive 0 .DE which is not reconfigurable. This latter specification corresponds to the previous UNIX capabilities, which did not allow tapes and disks on the same MASSBUS adapter. .IP 4) Finally the \fIconfig\fR program constructs a \fImakefile\fR for the system which builds the drivers needed in the specified configuration, and includes system loading sequences for the different root and swap device configurations desired. .PP It is now easy to include ``subsystems'' optionally. This is done through the same mechanism which causes conditional inclusion of device drivers. The file \fBconf/files\fR contains a palette of files from which the system is built. Each line is either of the form: .DS filename standard .DE or .DS filename optional \fIxx\fR .DE where \fIxx\fR is the name of a device which requires the file, or a \fIpseudo-device\fR. To define a subsystem to be added to the kernel it suffices to add specifications to the \fBconf/files\fR file for the newly optional files and to then place a specification .DS pseudo-device \fIxx\fR .DE in the system configuration file. A line .DS options \fIXX\fR .DE may also be added to the configuration to have the symbol XX defined during compilation, for use in conditional compilations in the standard part of the system. Such conditional compilation is typically used to provide hooks in the standard part of the kernel to switch out to subsystem functions. .PP This completes the general description of organizational changes. We now describe the changes in the system, file by file. 
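.PP The conditional compilation driven by the generated header files can be pictured with a minimal sketch. It is not taken from the distributed drivers: the macro name NDZ, the value given to it, and the routines \fIdz_units\fR and \fIdz_open\fR are assumptions chosen for illustration; in a configured system the count would come from a header such as \fBdz.h\fR written by \fIconfig\fR.

```c
/*
 * Sketch only.  In a configured system the count would come from a
 * header written by config (the text names dz.h); the macro name NDZ
 * and the value 2 are assumptions made to keep this self-contained.
 */
#ifndef NDZ
#define NDZ 2
#endif

#if NDZ > 0
/* Driver table sized by the configured count. */
static struct dz_softc {
	int sc_open;
} dz_softc[NDZ];

/* Number of units compiled into this (hypothetical) driver. */
int
dz_units(void)
{
	return (NDZ);
}

/* Returns 0 on success, -1 if the unit was not configured. */
int
dz_open(int unit)
{
	if (unit < 0 || unit >= NDZ)
		return (-1);
	dz_softc[unit].sc_open = 1;
	return (0);
}
#else
/* No units configured: only stubs are compiled, no table space is used. */
int dz_units(void) { return (0); }
int dz_open(int unit) { (void)unit; return (-1); }
#endif
```

.PP Compiling the same source with the count set to 0 drops both the table and the driver body, which is the effect the text attributes to the compilation-based technique.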
.SH Header files: sys/h and /usr/include/sys .PP General changes: device drivers now have header files in these directories, thus the ``up'' driver has a header file ``upreg.h''. This is so that the standalone code and the mainline UNIX code can share the common definitions. .PP The ``.m'' files of the previous distribution have been eliminated (with the sole exception of \fBpcb.m\fR); the magic numbers which were manually entered in these files are instead generated by a program from the definitions in the corresponding \fB\&.h\fR files; a number of header files thus no longer warn about correspondences that must be maintained. .PP The system tables are now described by pointers to their beginning and end and a count, rather than compiled in constants. This allows table sizes to be chosen at boot time (although the system currently does this only for the file system buffer cache), and makes programs such as \fBps\fR and \fBw\fR not compile in these constants. Note, especially, that the symbols such as \fIproc\fR and \fIinode\fR are now memory locations containing the addresses of these structures rather than the base of the structures themselves. Programs which access these structures have been changed and use the variables \fInproc\fR and \fIninode\fR in core rather than the (now defunct) constants NPROC and NINODE. .de BP .IP \\fB\\$1\\fR 14n .. .BP buf.h Now declares three headers on which the in-core buffers are placed. Buffers which are locked in the buffer cache are placed on the first queue. Currently, only file system super blocks are locked in core, and to good effect: it is now possible to rebuild the super-block of the root file system with the system quiescent (without rebooting) if the block device is used. It is no longer necessary to take a buffer for the super-block of a file system and also make a copy of it at each sync; the same buffer can be reused and simply released: since it is locked it will remain in the buffer cache. 
.IP The other two queues implement the lru cache and the list of blocks which have been discarded. By having queues for both of these rather than using the end of a single queue, we achieve true fifo behavior for blocks which we consider ``discarded''; previously rather strange behavior resulted from pushing these blocks backwards on the front of the single queue. (In particular pipes would behave badly on idle systems under some circumstances.) .IP The number of pages paged is counted at pageout completion, as well as the pageout event count. A bug in the \fBphysio\fR routine which caused physical transfers of more than 60000 bytes to sometimes fail to return an error indication has been fixed. .IP A flag has been added that marks a buffer as consisting only of a header and also one which marks a buffer being used for bad-sector processing. .BP callo.h Is now called \fBcallout.h\fR, and the name of the structure is similarly changed to make it consistent with the other structures in the kernel. The structures are now linked together in linked lists, to prevent arcane situations previously possible where only half of the structures would be used, but the table space would be exhausted. .BP clock.h A botch in handling of leap years has been fixed. A macro is defined here to queue a software interrupt for handling most of the clock processing at an IPL lower than the clock IPL. .BP cmap.h This file, like a number of others, no longer warns that the size of the structure is known elsewhere; such dependencies are the concern of a C program and automated through makefile dependencies. .BP conf.h A \fId_dump\fR entry has been added to the block device table, and is used as the system now normally does automatic dumps of core memory to disk after a crash. The field \fId_tab\fR is now called \fId_flags\fR and set to B_TAPE for tapes. 
For reasons not worth explaining here, there are no ``tab'' structures to sensibly use in initializing this field now, and in any case the only use of it was to tell which block devices were tapes. .BP dkbad.h Is a new file which defines the format of the bad sector forwarding information according to DEC standard 144. .BP dmap.h The constant DMMIN has been increased to 32 to allow up to 16k bytes to be paged to the paging devices in a clustered pageout. .BP filsys.h The two fields \fIs_fname\fR and \fIs_fpack\fR which were not implemented before were merged together (into a single 12 character field) which is called \fIs_fsmnt\fR. The system puts the ASCII path name where a file system is mounted (e.g. /usr) in this field and uses it in printing error messages on the console; (e.g. ``/usr: file system full'' rather than ``no space on dev 0/6''). .BP inline.h In order to reduce the number of conditional flags defined when compiling the system, the conditional flag FASTVAX, which was always defined, has been deleted. A conditional flag UNFAST, which is never defined, has been added to take its converse's place. .BP inode.h The constant NINDEX has been reduced from 15 to 6. This limits the number of files which may be join()'ed into a multiplexor (\fImpx\fR\|(2)) tree. You may have to increase this if you use the multiplexor extensively, but it saves a large amount of space in the kernel if you can use the smaller value, since NINDEX of 15 causes 40 bytes of extra unused space to be allocated to every inode. .BP map.h The \fBmalloc.c\fR routines have been rewritten to check for table overflow and renamed \fBrmap.c\fR. .BP mba.h Is now split into \fBmbareg.h\fR and \fBmbavar.h\fR; the former contains the definitions of device registers and is usable, e.g., in the standalone version of the system. The latter contains system variables related to the MASSBUS adapters. 
.BP mem.h Is a new file which contains information on the memory controller registers in the form of macros which make the several VAX processors seem very similar to the UNIX code. Note also that the system now uses interrupts from the memory system to force error logging since the previous technique (polling) works only on the 11/780. .BP mscp.h A new file which defines the DEC \fIMass Storage Control Protocol\fR used by the UDA50 disk controller. .BP mtpr.h The register numbers are now given in hex, as in the DEC manuals; registers for all VAX processors are included. .BP msgbuf.h Defines the structure of the error message buffer, which is now kept in the last 1024 bytes of memory. This allows it to be preserved across system crashes and lets messages such as machine check reports be written conveniently into the error log. .BP nexus.h A new header file which defines the registers and constants related to the interconnect architecture of VAXen. .BP param.h No longer defines the large number of constants related to system sizing; a smaller number of rarely changed constants are given here. In particular, constants which were typically changed to affect the maximum number of supportable users are now controlled by the value given the \fBmaxusers\fR keyword in the machine specification (as described in \fIconfig\fR\|(8)). The \fIconfig\fR program turns this specification into parameters to the \fBparam.c\fR file which uses formulae to compute the values for the size of the process table, inode table, etc. .IP This file now includes the standard file to get system types rather than replicating the definitions from that file. It also defines a DELAY(n) macro which is used in device drivers to provide roughly \fIn\fR microseconds of delay. Finally the definition of UPAGES, the number of system control pages per-process has been increased from 6 to 8. 
This is partially due to the fact that there is now a red-zone page between the kernel stack and the kernel critical data in the \fIu.\fR area, but also because the kernel stack was precariously close to being too small before. .BP pcb.h Now includes definitions related to the use of AST's to implement user program profiling and rescheduling. Because AST's are now used, it is no longer necessary to take clock interrupts on the kernel stack; they now run on the interrupt stack where they belong. Also rescheduling processing is much cleaner, since the reschedule interrupts only go off when returning to user mode, not in the kernel where they have to be ignored (because UNIX cannot reschedule when running normally in the kernel.) .BP proc.h Now defines SOWEUPC, a new flag used to indicate that a profiling count should be generated when the (already posted) AST for this process goes off. Another new flag SSEQL indicates that the process has declared sequential paging behavior for its data space. Finally the field \fBp_maxrss\fR has been added, specifying the declared ``memoryuse'' limit in pages. .BP psl.h Has a bug fixed in the definitions of PSL_USERCLR. Now also declares PSL_USERSET and PSL_MBZ (must-be-zero). .BP system.h Defines the variables \fIhz\fR, \fItimezone\fR and \fIdstflag\fR replacing the old compile-time constants. No longer declares \fImsgbuf\fR as a variable (see \fBmsgbuf.h\fR). Defines the \fIdumpdev\fR and \fIdumplo\fR variables which specify where dumps are to take place. No longer defines the debugging variables \fIprintsw\fR and \fIcoresw\fR which have been removed in favor of more local debugging variables. No longer defines the field \fIsy_nrarg\fR for the system call entry structures, since system calls never take register arguments on the VAX. .IP A variable \fBwantin\fR has been added which is set each time a process is woken up which wants to be swapped in. 
This is used so that the code in \fBswapout\fR in \fBvmsched.c\fR does not run with elevated priority. .BP trap.h Rearranges some codes previously used only internally so they would be contiguous numerically. These are the finer machine traps which result in SIGILL and are made available to a signal handling process and defined in . Defines ASTFLT rather than RESCHED, since the VMS software interrupt which is used for VMS rescheduling never was appropriate for UNIX and is no longer used. .BP uba.h Has been split into \fBubareg.h\fR and \fBubavar.h\fR; see the description of device driver changes below. .BP user.h Contains definitions related to the new \fB#!\fR exec facility. The field \fIu.u_cfcode\fR has been renamed \fIu.u_code\fR since it is now used for purposes other than compatibility mode (presenting the more precise hardware reason for SIGILL and SIGFPE signals.) .BP vlimit.h Now defines LIM_MAXRSS for the ``limit memoryuse'' feature. .BP vm.h The \fBvm*.h\fR headers have been compressed into a more sensible set of files; the macros are all in \fBvmmac.h\fR (absorbing \fBvmclust.h\fR and \fBvmklust.h\fR), metering stuff is all in \fBvmmeter.h\fR (absorbing \fBvmmon.h\fR and \fBvmtotal.h\fR) and the parameters are all in \fBvmparam.h\fR (absorbing \fBvmtune.h\fR, most of the parameters of which are now adjusted at boot time in \fIsetupclock\fR in \fBvmsched.c\fR.) .BP vmmeter.h The structure \fBvmmeter\fR now computes the number of pages paged in \fIv_pgpgin\fR and pages paged out \fIv_pgpgout\fR, as well as the number of pages freed because of the behavior of programs which have told the system they are sequential \fIv_seqfree\fR. .BP vmparam.h The values of MAXDSIZ and MAXSSIZ have increased due to the increase to DMMIN in \fBdmap.h\fR. The klustering constants have been changed: in-clustering is now in 4 page (4k byte) chunks, and out-clustering is up to 16k bytes. Sequential programs kluster in 8k bytes, and text segments kluster in 2k bytes. 
The gap for the window into sequential programs is currently (primitively) defined as a constant here in KLSDIST. .BP vmsystm.h Defines a new variable \fIavefree30\fR, which computes the average memory like \fIavefree\fR, but averaged over a longer period of time. This is used to put more hysteresis into swapping, and keep the system from swapping immediately when memory drops low. .SH System files: sys/sys .PP A number of files in the system have had minor changes made to them to reduce the length of time the system runs with the interrupt priority level raised; in particular, the times when the IPL is high enough to block the clock have been severely limited, in hopes of providing better real-time response (eventually) and possibly being able to drive the 11/750 console cassette (soon) which has severe interrupt latency constraints due to poor hardware interface design. .BP acct.c The code was tightened by using a register variable. The \fIsysphys\fR routine was moved to \fBsys4.c\fR since it had no business being here. .BP alloc.c Prints error messages relating to file system problems using the name of the file system rather than the major/minor device number of the device. Some code which attempted to prevent ``dups in free'' after a reboot, but could not prevent this completely, has just been removed; the condition is not harmful in any case, as it is normal and fixed by \fIfsck\fR\|(8). The system now prints directly on a user's terminal if that user causes a file system to run out of free space. The routines here also know how to deal with the fact that the super-blocks are now kept locked in the buffer cache. .BP asm.sed No longer defines \fIspl1\fR which is now defunct; \fBspl7\fR is now VAX IPL 0x1f rather than 0x18, blocking most processor aborts, device interrupts from the console storage device, and a number of other processor dependent interrupts. Deals with a strange feature of the optimizer which converts ``$0'' into a register which contains 0. 
Implements the \fIffs\fR routine of \fBsig.c\fR in a much more efficient way (in just a couple of VAX instructions.) Beware, however: UNIX's notion of \fIffs\fR returns 1 for the low bit of a word, while the hardware \fIffs\fR would return 0. .BP clock.c Now runs only that code which is absolutely necessary when the processor priority is very high, queueing a software interrupt at which priority the rest of the clock processing is done. The conditional (old and long unused) code which profiled the kernel in a static buffer has been removed. The option of fishing characters out of the \fIdz\fR and \fIdh\fR silo's less often than every clock tick (1/hz) has been removed. Instead the silos are processed every clock tick if the system includes the berknet (bk) line discipline, or not at all (i.e. we take input interrupts) if ``bk'' is not included in the system. .IP The processing and watching of hung UNIBUS adapters has been moved from here to the UNIBUS routines. Automatic niceing of long-running (more than 10 minutes of user-state time) processes is now the default here, rather than being based on ``#if ERNIE''. A bug in the check for timeout table overflow which would cause the table to overrun without overflow being detected has been fixed. The timeout table is now implemented as a linked list, so that the entries can be conveniently discarded before calling the timeout routines. This prevents the anomalous case where only half the entries are used but the table fills up. .BP fio.c Has been changed to do the correct thing when special files or mounted file systems are closed: a flush is done at the last close and all blocks are invalidated. The standard ``table full'' routines are called when the file related tables fill up. These routines no longer pass \fBstruct chan *\fR pointers down to called routines, passing, instead, the more universal \fBstruct file *\fR pointers from which the \fBchan\fR pointers are easily derived. 
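.PP The warning about \fIffs\fR given under \fBasm.sed\fR above concerns an origin-0 versus origin-1 convention, which a small portable sketch makes concrete. This is not the inline VAX expansion; both function names are hypothetical, and the choice of \-1 for an empty mask in the origin-0 version is an assumption of the sketch rather than a description of the hardware.

```c
/*
 * Illustrative only: portable C, not the inline VAX expansion.
 * Both function names are hypothetical.
 */

/* Origin-0 position of the lowest set bit; -1 if mask is zero. */
int
lowbit0(unsigned long mask)
{
	int pos;

	if (mask == 0)
		return (-1);
	for (pos = 0; (mask & 1) == 0; pos++)
		mask >>= 1;
	return (pos);
}

/* UNIX convention: 1 for the low bit of a word, 0 if no bit is set. */
int
ffs1(unsigned long mask)
{
	return (lowbit0(mask) + 1);
}
```

.PP Code which uses the result as a signal or bit number must therefore be clear about which of the two conventions it is receiving.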
.BP iget.c Now uses the standard \fItablefull\fR routine. .BP ioctl.c The last argument to \fId_ioctl\fR routines when called is now always 0. .BP locore.s Has been extensively changed to accommodate the new configurable system, and to handle multiple UNIBUS and MASSBUS adapters. The code is now written using macros and the C preprocessor, improving readability. Complicated logic (such as the code to handle UNIBUS adapter errors) has been migrated to C code. .IP MASSBUS and UNIBUS adapters are no longer initialized or mapped here; this is the job of the configuration code in the system. The locore code distinguishes, in handling UNIBUS interrupts, between the machine being \fIcold\fR and not; when cold, UNIBUS interrupts are handled so as to be suitable for determining device vectoring via probing. Device interrupts on the UNIBUS are now vectored through the code in a file \fBubglue.s\fR produced by the configuration program. To mask as much as possible the differences between the different VAX processors, the 11/780 uses the same \fBubglue.s\fR as the other processors which directly vector UNIBUS interrupts. .IP Many more of the exceptional conditions in the machine are caught now; only ``SBI alert'' and ``SBI fault'' remain uncaught by UNIX. The system control block is now defined in a file \fBscb.s\fR so that some symbols derived from C language header files by a program (and printed into a format suitable for inclusion in an assembly) may be stuck in after the system control block and before the mainline \fBlocore.s\fR. .IP The primitive routines \fIcopyseg\fR and \fIclearseg\fR are no longer run with the IPL raised very high. Further minor bugs have been fixed in the primitives, notably \fIaddupc\fR (a bug which caused 1/8 of the profiling ticks to be lost), and \fIkernacc\fR (a bug which allowed a strange command to a certain program to crash the system). 
.BP machdep.c .br Now sets up the error message buffer (in the last 1024 bytes of core) and the system data structures (such as the file and process table) at boot time. Currently only the file system buffer cache is sized at boot time, but all data structures are easily sized here. The startup routine also calls the routine \fIconfigure\fR to configure the system for the current hardware, locating available devices. .IP The \fIsendsig\fR routine passes a code back when a SIGFPE or SIGILL arrives, letting the signal handler determine which of the several conditions mapped to these two signals actually occurred. It uses the header file rather than redefining it. .IP The routines which monitor memory errors are now driven by interrupts (since the previous polling technique works only on 11/780). Extensive use of macros is made to make the various VAXen look similar. Instead of printing the raw contents of the memory controller registers, an array address and a syndrome are printed. Multiple memory controllers are supported. .IP The routines related to UNIBUS monitoring have been put with the rest of the UNIBUS routines in \fB../dev/uba.c\fR. The reboot interface has been improved, adding an automatic crash dump to a dump device (normally a disk aimed at the back end of a paging area). The system no longer ``halts'' when you ask it to (since this can cause a reboot to occur); rather it raises the IPL as high as it can and goes into a tight loop. Routines have been added to handle machine checks and print out the stack frame in a format which is readable by one who groks what the fields mean. .BP main.c Now establishes a red zone between the stack and \fIu.\fR area in process 0; further processes also have red zones, protecting the \fIu.\fR from too-large stacks. 
The main routines also set up the super-blocks which are locked into the file system buffer cache, and copy the name of the root file system (/) into its super-block so that the name will be available if, e.g., the root file system becomes full. .BP malloc.c Has been renamed \fBrmap.c\fR. .BP nami.c Now respects the notion of \fB..\fR in a directory which is a virtual root directory after a \fIchroot\fR\|(2) call. .BP prf.c No longer implements the ascii in-core event tracing facility, which proved to be too slow to be useful; a binary facility replaces it, and is also conditionalized on TRACE, but implemented in \fBvmmon.c\fR. Implements the output of numbers non-recursively, since the recursive method occasionally caused the kernel stacks to overflow. Implements a new kernel routine \fIuprintf\fR which prints directly on a user's terminal for informing him/her of situations such as file systems which are full (because his/her program wrote to the file system when it was full.) Implements a new format ``%b'' which takes two arguments, a number and a second pattern. The pattern specifies a base to print the number in, and then a set of short strings separated by bit numbers (origin 1, escaped in octal into the string in the C compiler). The format prints the symbolic names for the bits which are in the string and set in the number within <>'s and separated by commas. This is extensively used to produce readable system error diagnostic messages on the console, decoding the bits of device registers symbolically. .IP The routines \fIprdev\fR and \fIdeverr\fR, which printed diagnostics which were difficult to interpret, are deleted. There are two new routines: \fItablefull\fR which balks that a table is full, and \fIharderr\fR which begins a message about a hard (unrecoverable) error on a device. .BP prim.c Now maintains a count of free \fIclist\fR space. The code here now runs at \fBspl5\fR rather than \fBspl6\fR since there is no longer any need to block the clock. 
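.PP The ``%b'' format described under \fBprf.c\fR above can be illustrated by a small user-level sketch. It is not the kernel's code: the function name \fIfmt_bits\fR, its buffer-based interface and its error return are assumptions made for illustration, and the bit names in the examples are invented. The sketch assumes the first character of the pattern is the base (8, 10 or 16) and that each name is preceded by its origin-1 bit number stored as a single character.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-level helper, not the kernel's printf code.
 * pattern[0] is the base; the rest is a series of names, each
 * preceded by an origin-1 bit number stored as one character.
 * Returns the length written, or -1 if the buffer is too small.
 */
int
fmt_bits(char *out, size_t outlen, unsigned long value, const char *pattern)
{
	int base = *pattern++;
	int any = 0;
	int n;
	size_t used;

	if (base == 8)
		n = snprintf(out, outlen, "%lo", value);
	else if (base == 16)
		n = snprintf(out, outlen, "%lx", value);
	else
		n = snprintf(out, outlen, "%lu", value);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	used = (size_t)n;

	while (*pattern != '\0') {
		int bit = *pattern++;		/* origin-1 bit number */
		const char *name = pattern;
		size_t len = 0;

		/* A name runs up to the next bit-number character or the end. */
		while (name[len] > ' ')
			len++;
		pattern += len;

		if (bit >= 1 && bit <= 32 && (value & (1UL << (bit - 1)))) {
			/* Room for separator, name, closing '>' and NUL. */
			if (used + 1 + len + 2 > outlen)
				return (-1);
			out[used++] = any ? ',' : '<';
			memcpy(out + used, name, len);
			used += len;
			any = 1;
		}
	}
	if (any)
		out[used++] = '>';
	out[used] = '\0';
	return ((int)used);
}
```

.PP Under these assumptions, formatting the value 3 with the pattern "\e10\e2BITTWO\e1BITONE" yields ``3<BITTWO,BITONE>'': the number in octal, followed by the names of the bits that are both listed in the pattern and set in the number.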
.BP rdwri.c Sees the change FASTVAX to not UNFAST. Also always clears the set-user-id bit when a file with the bit set is written on; previously this was done only ``#if UCB''. If you ``#define INSECURITY'' you get the code the old way. .BP scb.s Is a new file defining the system control block (as described above). .BP sig.c Has a bug fixed which caused processes to occasionally stop when the shell thought they were running. Processes are now given signals immediately when they are sent if the process is running. .BP slp.c A clumsiness which forced the swapout code to run with the IPL raised has been fixed by adding a variable \fIwantin\fR with which \fIwakeup\fR can communicate to the swapper that a swapped out process now wants to return to memory. The routine \fIsetpri\fR has been modified so that processes which are over their declared (soft) memory size limitation are assigned lower CPU priority when the system is very tight on memory. .BP sys.c No longer allows detached jobs to access /dev/tty; this was a security glitch. .BP sys1.c Implements the ``#!'' executable shell script scheme. No longer lets executable files be read by users using \fIptrace\fR unless the user has read access. Operates \fIexec\fR much more efficiently by avoiding copying argument lists unless the \fIexec\fR is going to succeed. .BP sys2.c The \fIopeni\fR routine passes both the FREAD and FWRITE flags to its callees; this is needed by the magnetic tape open routines. The \fImaknode\fR routine sticks the whole argument value in the ``rdev'' field of the inode. This is used by the \fIbadblock\fR\|(8) program to store block numbers corresponding to bad sectors on the disk in otherwise apparently empty files. .BP sys3.c The \fImount\fR and \fIumount\fR calls have been changed to deal correctly with buffer flushing and with simultaneous access by other programs to the file system block devices. 
The \fImount\fR call also copies into the super-block of the file system the name of the device on which the file system is mounted (e.g. /usr). .BP sys4.c The \fIsyslock\fR routine has been moved here from \fBacct.c\fR. The mechanism for sending signals to all processes, which is used in shutting down the system, has been changed so that the process which is broadcasting the signal does not receive it itself. This allows the \fIhalt\fR and \fIshutdown\fR programs to be written in a straightforward way. .BP trap.c Prints out the \fBpc\fR when an unexpected trap occurs. Handles AST's to implement profiling ticks and for rescheduling rather than the (older style) use of reschedule interrupts. Allows process reschedules after page faults. .BP vmmon.c Contains the internal routine \fBtrace1\fR which implements kernel event tracing in a circular buffer. .BP vmpage.c Tracing code, which is normally not compiled in, has been added. An extra case was added to an \fIif\fR statement to allow implementation of the vlimit(LIM_MAXRSS) feature of the system, for limiting processes which consume more than a process specific amount of physical memory. A botch was fixed in the virtual memory pre-paging which put pre-paged pages in the clock loop rather than at the end of the free list. Code has been added to implement an additional replacement algorithm for processes which are declared sequential: when a hard page fault occurs, the pages sequentially preceding the faulted page are returned to the free list. .BP vmproc.c Contains a number of small changes related to AST processing. .BP vmpt.c Also contains changes for handling AST's as well as the initialization of the red-zone separating the stack and \fBu.\fR area of newly created processes. A bug in translation buffer flushing which caused rare and mysterious kernel crashes with the kernel stack not valid has been fixed.
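.PP
The circular trace buffer mentioned under \fBvmmon.c\fR above can be pictured with the following user-space sketch. It only illustrates the general technique of a fixed-size ring in which new records overwrite the oldest; the structure layout, the names \fItrace_put\fR, \fItrace_count\fR and \fItrace_get\fR, and the buffer size are all assumptions, and say nothing about how \fBtrace1\fR itself is written.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_SLOTS 8		/* hypothetical size */

struct trace_rec {
	int	tr_event;	/* event code */
	long	tr_arg;		/* one argument word */
};

struct trace_buf {
	struct trace_rec tb_rec[TRACE_SLOTS];
	unsigned long	tb_next;	/* total records ever written */
};

/* Record one event, overwriting the oldest record once the ring is full. */
void
trace_put(struct trace_buf *tb, int event, long arg)
{
	struct trace_rec *tr = &tb->tb_rec[tb->tb_next % TRACE_SLOTS];

	tr->tr_event = event;
	tr->tr_arg = arg;
	tb->tb_next++;
}

/* Number of records currently held (never more than TRACE_SLOTS). */
size_t
trace_count(const struct trace_buf *tb)
{
	return tb->tb_next < TRACE_SLOTS ? tb->tb_next : TRACE_SLOTS;
}

/* Return the i'th oldest surviving record; i must be < trace_count(tb). */
const struct trace_rec *
trace_get(const struct trace_buf *tb, size_t i)
{
	unsigned long first;

	assert(i < trace_count(tb));
	first = tb->tb_next - trace_count(tb);
	return &tb->tb_rec[(first + i) % TRACE_SLOTS];
}
```

The cost of recording an event is a couple of stores and an increment, which is the attraction of a binary ring over formatting text at trace time.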
.BP vmsched.c Code has been added here which initializes the parameters of the clock page replacement algorithm based on the size of the machine. The \fIswapout\fR routine has been changed so that it no longer runs entirely at a high interrupt priority level (see \fBslp.c\fR above). The algorithm for the choice of processes to swap in and out and the hysteresis in the swap algorithm have been adjusted to work reasonably in extreme conditions when there are very large and/or very few processes active in the system. .BP vmsubr.c Contains the \fIsetredzone\fR routine definition. .BP vmsys.c Contains the user interface to the kernel tracing routines. Code has been added to \fIvadvise\fR to set up VA_SEQL. .SH Device support: sys/dev .PP The major change to the device subsystem is the support of multiple MASSBUS and UNIBUS adapters, the support for multiple instances of each particular controller, and the support of system configuration at bootstrap time, investigating the interconnects, devices, and controllers available on the machine. These changes will be discussed in detail in the next section, which describes how to change existing drivers to work in the new system and gives pointers on style for writing new drivers. .PP Other changes in the device drivers affecting more than one driver: .IP * The input silos for DH-11's and DZ-11's are no longer serviced at clock IPL. Rather the clock interrupt queues a software interrupt to service the silos. This means that the device interrupt routines are called from IPL 0x15, the IPL at which they normally interrupt. Thus it is no longer necessary to define \fBspl5\fR to be \fBspl6\fR (blocking the clock) in routines which handle asynchronous line input/output. .IP * The internal interface to the line discipline routines has been changed slightly by reordering parameters to make the arguments to the various \fIioctl\fR interfaces more similar; in particular the \fIttioctl\fR routine call has been changed.
If you have locally written line disciplines or asynchronous device drivers you should check the interfaces. .IP * The tty interface now provides full 8-bit output when the terminal is in LLITOUT mode; this requires support from the \fIxx\fR\|param routines in the device drivers (e.g. from \fIdhparam\fR and \fIdzparam\fR.) .IP * The UNIBUS adapter support routines have changed substantially, to allow for queueing of requests when resources are short and for support of multiple UNIBUS adapters. The interface now also allows devices which cannot function when other DMA is active on the UNIBUS to obtain exclusive transient use of UNIBUS resources; this is needed to successfully run RK07 disk controllers in the presence of other buffered data path DMA. In addition, it is used by 6250bpi tape drives supported on the UNIBUS. See the section on configuration and UNIBUS device drivers below for more information. .IP * DEC standard bad sector forwarding is provided for all standard DEC devices using the DEC formatters; the code which implements this is easily ported to the storage module drivers in the system, and this is planned soon.* .FS * The hard thing in providing bad sector handling for non-DEC drives is providing a formatter which produces the bad block information and flags the bad sectors appropriately. .FE .BP bio.c The hashing of buffers has been changed to use the existing device chain two way links. This means that unhashing is much easier, saves space, and uses the pointers which were otherwise little used. The buffers are now kept on one of three lists when not busy: a list of super-blocks which are locked in core, a list of good data blocks, which is kept fifo and used to implement the LRU buffer cache, and a list of data blocks for which further usage is not anticipated; this is also kept fifo. .IP Calls to some new tracing routines are conditionally included in \fBbio.c\fR; we are using them to do some performance measurement. 
The \fBd_tab\fR field of the block device table has been changed to a \fBd_flags\fR field, and that change is known here, where the old field was checked before (to see if it was non-zero). Better messages are printed now when swap space is exhausted, and a user is told on his/her terminal that a process was killed before it started because there was no space. A subroutine has been added to purge the blocks from a specific device from the cache; this is used to fix some long standing buffer cache flushing problems which prevented removable media from being used reliably. .BP bk.c The definition of \fBspl5\fR as \fBspl6\fR has been removed from here. The line discipline is included only if the specification .DS pseudo-device bk .DE is included in the system configuration. The input silos on \fBdh\fR and \fBdz\fR devices are used only when this line discipline is included in the system. The comment about future implementation of 8-bit paths with this discipline has been deleted, since there is no longer any intention of doing this. .BP conf.c Has been moved to this directory from the directory \fB../conf\fR. \fBThis file should be changed only if you are adding support for a device not included on the standard distribution tape.\fR .BP ct.c Is a new driver, for a C/A/T phototypesetter interface. .BP dh.c No longer has to define \fBspl5\fR to be \fBspl6\fR. Incorporates the DM-11 driver standardly. A method is provided for specifying that lines are to be operated even though the hardware does not indicate that they are ready (using the flags word in the configuration specification, see \fIdh\fR\|(4)). A reasonable message is printed when a \fBdh\fR silo overflows, replacing the old style of just printing a sequence of letter \fBo\fR's on the console. .BP dhfdm.c Has been incorporated into \fBdh.c\fR. .BP dn.c A driver for the DEC DN-11 autodialer interface. .BP dsort.c Has been rewritten to correct a bug which caused the elevators to be sorted incorrectly.
.BP dz.c No longer has to define \fBspl5\fR to be \fBspl6\fR. Has been changed to allow lines to be specified as not properly wired and brought up without the ready signals showing in the interface; see \fIdz\fR\|(4) for details. Prints reasonable diagnostics when the input silo overflows. .BP flp.c Knows that there is no floppy on an 11/750. .BP hp.c Is now a sub-driver to \fBmba.c\fR, which probes nexus space for the MASSBUS adaptors and device space on the MASSBUS's for disks, setting up the driver for each device which is in the configuration. A number of minor bug fixes and enhancements have been made to the driver: The driver handles the new RM80 drive and its SSE (skip-sector-error) facility for bad sector handling, as well as the DEC standard bad block forwarding. Due to the bad block forwarding, the last three tracks of each disk are normally reserved to the system and available only through the use of a special file system partition. A further bug has been fixed in the initialization of the tables for RM05 sectoring. The driver no longer (baroquely) turns on and off interrupts on the MASSBUS adapter. Basic dual-port drive handling code has been added to the driver. .IP The remaining remarks apply to all three supported disk drivers: the \fBhp\fR driver for MASSBUS disks, the \fBup\fR driver for UNIBUS storage modules, and the \fBhk\fR driver for RK07's: The drivers do not SEARCH or SEEK if there is only one drive on the MASSBUS. On a UNIBUS no SEARCHing or SEEKing is done if one drive is on the controller. The offset positions and recalibration of error recovery is now done with interrupts rather than by waiting for the operations to complete. This prevents the system from being tied up during the many recoveries of a disk operation, and is necessary in any case in at least one of the disk drivers (RK07). The iostat numbers for each MASSBUS and UNIBUS drive are calculated by the auto-configurator at boot time, not compiled into the drivers.
Much cleaner handling of errors is done: the drivers realize which errors are not even potentially recoverable, handle drives spinning up and down with readable diagnostics, and print reasonable, legible error messages when hard errors and soft ecc's occur. Each driver includes a low-level non-interrupt driver used to take crash dumps at the end of a paging area on the device. The drivers include a raw i/o buffer per drive so that raw operations on separate devices can be overlapped (both seeks and transfers); previously only one raw device operation could be pending per device type. .BP ht.c The tape driver is now a sub-driver of the MASSBUS driver. The following remarks apply to all supported tape drivers: \fBht\fR and \fBmt\fR for MASSBUS tapes, \fBts\fR for the UNIBUS ts-11, and \fBtm\fR for the UNIBUS TM-11 emulations. .IP Each driver implements a set of tape ioctl operations on raw tapes providing access to the functionality of the hardware such as skipping forward and backward records and files and writing end-of-file marks on the tape. Better error diagnostics are also given on tape errors. Multiple tape controllers and transports are supported. A dump routine is provided with each driver for taking a post-mortem crash dump on tape, although dumps are normally made to the paging area on the disk. .IP With the exception of the \fBts\fP driver, the drivers detect and reject attempts to switch tape density while writing a tape. .BP lp.c Is a fully supported driver for one or more line printer interfaces. It has been improved from the previous drivers (which were not supported) to take a small fraction of the number of interrupts that the previous drivers took. The user-level code driving the printers has been arranged to work on 1200 baud DECWRITER III terminals or true printers. .BP mba.c Has been rewritten.
Now allows mixing of disks and tapes on the same and across multiple mba's, with the devices being driven from the routines here calling routines defined in the individual device drivers. .BP mem.c Has been fixed to not allow any access to nexus space, even by the super-users, since such access inevitably results in a machine check and a system crash. .BP mt.c A driver for the DEC TU78 tape drive. .BP mx?.c A bug has been fixed which, due to a missing call to \fIchdrain\fR, caused multiplexor files to become clogged under certain circumstances. .BP rk.c Is a new driver for RK07 disks. It uses the same logic as the storage module drive driver \fBup.c\fR whenever possible. It also makes use of the interlocking facilities of the UNIBUS device support because the \fBrk\fR controller cannot tolerate concurrent UNIBUS dma when it is operating due to a design flaw. .BP swap.c Now places only half of the first piece of the \fIswapmap\fR in the \fIargmap\fR. .BP swap??*.c Are the files for different swap configurations. Thus \fBswaphp.c\fR defines the root and swap devices for a UNIX based on a \fBhp\fR disk. The files such as \fBswaphphp.c\fR are for interleaved paging configurations, placing the swapping and paging activity on two disk arms. You can make additional such files and include them in your configuration files. .BP tdump.c Has been deleted, replaced by the dump routines in individual drivers. .BP tm.c Is a driver for UNIBUS tape drives on controllers such as the EMULEX TC-11. It has the same functionality as \fBht\fR (see \fBht.c\fR above.) .BP ts.c Is a driver for the UNIBUS TS-11 tape drive. It has full functionality except that the transport itself only supports 1600 bpi. .BP tty.c No longer raises its IPL to \fBspl6\fR internally to block the clock. Has its internal interface to \fBioctl\fR entries changed slightly to be globally consistent (see, e.g. \fIttioctl\fR).
The DIOC* ioctl entries have been deleted since they are not used in any standard UNIX line disciplines. .BP ttynew.c A bug is fixed which prevented echoing from occurring in raw mode. The dec-compatible method of ^S/^Q processing needed to support VT-100s in smooth scroll mode is implemented when the local mode ``decctlq'' is specified. .BP ttyold.c Implements ``decctlq'' mode. .BP tu.c A driver for the 11/750 TU58 console cassette interface. \fBNote: this driver provides reliable service only on a quiescent system.\fP .BP uba.c Has a much more structured interface. All the basic routines for dealing with the UNIBUS specify a UNIBUS adapter number to use, since there are potentially several on a machine. When requesting allocation of UNIBUS map entries, the caller specifies whether he is willing to block in the allocation routines waiting for resources to come available. If he is not, and there are no resources available, a value of 0 is returned, and the caller must deal with this. The routine which frees UNIBUS resources now takes the address of the variable describing the resources to be freed rather than the value of this variable to eliminate a race condition (where the routine is called, a UNIBUS interrupt occurs causing a UNIBUS reset, and the resources are freed twice, causing a \fIpanic\fR\|). .IP The normal interface for DMA operation is now to pass a pointer to a UNIBUS related structure to a routine \fIubago\fR, which allocates UNIBUS resources. If resources are not available, the structure is queued on a request queue, and processed when resources are available. When the requested resources are allocated, a driver specific \fIxxgo\fR routine is called, and can stuff the device registers with the address into which the operation is mapped and start the operation. The use of this interface is described in the next section. .IP Finally, we note that the error handling code which was written in assembly language is now written in C. 
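.PP
The two ideas described under \fBuba.c\fR above, queueing a request until resources are available and freeing resources through the address of the variable which describes them, can be sketched in user space as follows. This is an illustration only, not the code in \fBuba.c\fR: the structures, the routines \fIpool_go\fR and \fIpool_rele\fR, and the callback \fIdemo_go\fR (standing in for a driver's \fIxxgo\fR routine) are hypothetical, resources are reduced to a simple count, and there is no interrupt masking.

```c
#include <assert.h>
#include <stddef.h>

struct request {
	int		rq_need;	/* resource units wanted */
	int		rq_info;	/* units held, 0 if none */
	int		rq_started;	/* times the go routine ran */
	void		(*rq_go)(struct request *);	/* plays the xxgo role */
	struct request	*rq_next;	/* link on the wait queue */
};

struct pool {
	int		p_free;		/* units not allocated */
	struct request	*p_head, *p_tail;	/* FIFO of waiters */
};

/* Example go routine: a real driver would load device registers here. */
void
demo_go(struct request *rq)
{
	rq->rq_started++;
}

static int
pool_alloc(struct pool *p, struct request *rq)
{
	if (rq->rq_need > p->p_free)
		return (0);
	p->p_free -= rq->rq_need;
	rq->rq_info = rq->rq_need;
	return (1);
}

/* Start rq now if possible, else queue it; returns 1 if started. */
int
pool_go(struct pool *p, struct request *rq)
{
	/* Do not overtake requests which are already waiting. */
	if (p->p_head == NULL && pool_alloc(p, rq)) {
		rq->rq_go(rq);
		return (1);
	}
	rq->rq_next = NULL;
	if (p->p_tail != NULL)
		p->p_tail->rq_next = rq;
	else
		p->p_head = rq;
	p->p_tail = rq;
	return (0);
}

/*
 * Release through a pointer: the count is fetched and cleared in one
 * place, so releasing the same variable twice is harmless.
 */
void
pool_rele(struct pool *p, int *infop)
{
	struct request *rq;
	int n = *infop;

	*infop = 0;
	p->p_free += n;
	while ((rq = p->p_head) != NULL && pool_alloc(p, rq)) {
		p->p_head = rq->rq_next;
		if (p->p_head == NULL)
			p->p_tail = NULL;
		rq->rq_go(rq);
	}
}
```

Serving waiters strictly in arrival order is a choice made for the sketch. In a kernel the fetch-and-clear in the release routine would also have to be done with interrupts blocked for the protection to be complete.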
.BP uda.c A driver for the UDA50 disk controller with RA80 Winchester storage modules. .BP up.c The UNIBUS storage module disk driver has been fixed up in the same way that the \fBhp\fR driver was, giving better error diagnostics and using interrupts during error recovery, etc. See \fBhp.c\fR above for details. The driver uses a feature of the EMULEX SC-21 to determine the size of the disks in use, so that it can adapt to both 300M storage modules and the Fujitsu 160M drives which are popular. Other drive sizes can be added easily. . \".BP ut.c . \"A driver for the System Industries Model 9700 tape drive, emulating . \"a DEC TU45 on the UNIBUS. .BP va.c The \fBvarian\fR printer-plotter driver has been modified so that it can support more than one device, probes the devices so they can be placed on different UNIBUS'es, and prints an error diagnostic when device errors are detected. .BP vaxcpu.c Is a new file which contains initializations of various CPU-type dependent structures. .BP vp.c Has been modified to handle multiple devices, and adapted to the auto-configuration code. .SH Configuration and UNIBUS device drivers .PP Someday this section will be a separate document. This section explains how to interface an existing UNIX device driver to the VAX system, especially to the UNIBUS routines and the autoconfiguration code. .PP A PDP-11, UNIX/32V or 3BSD or 4.0BSD driver on the VAX UNIBUS will need to be modified to run under 4.1BSD. There are three reasons why such a driver will need to be changed: .IP 1) 4.1bsd supports multiple UNIBUS adapters. .IP 2) 4.1bsd supports system configuration at boot time. .IP 3) 4.1bsd manages the UNIBUS resources and does not crash when resources are not available; the resource allocation protocol must be honored. In addition, devices such as the RK07 which require everyone else to get off the UNIBUS when they are running need cooperation from other DMA devices if they are to work.
.PP Each UNIBUS on a VAX has a set of resources: .IP * 496 map registers which are used to convert from the 18 bit UNIBUS addresses into the much larger VAX address space. .IP * Some number of buffered data paths (3 on an 11/750, 15 on an 11/780) which are used by high speed devices to transfer data using fewer bus cycles. .LP There is a structure of type \fBstruct uba_hd\fR in the system per UNIBUS adapter used to manage these resources. This structure also contains a linked list where devices waiting for resources to complete DMA UNIBUS activity have requests waiting. .PP There are three central structures in the writing of drivers for UNIBUS controllers; devices which do not do DMA i/o can often use only two of these structures. The structures are \fBstruct uba_ctlr\fR, the UNIBUS controller structure, \fBstruct uba_device\fR the UNIBUS device structure, and \fBstruct uba_driver\fR, the UNIBUS driver structure. The \fBuba_ctlr\fR and \fBuba_device\fR structures are in one-to-one correspondence with the definitions of controllers and devices in the system configuration. Each driver has a \fBstruct uba_driver\fR structure specifying an internal interface to the rest of the system. .PP Thus a specification .DS controller sc0 at uba0 csr 0176700 vector upintr .DE would cause a \fBstruct uba_ctlr\fR to be declared and initialized in the file \fBioconf.c\fR for the system configured from this description. Similarly specifying .DS disk up0 at sc0 drive 0 .DE would declare a related \fBuba_device\fR in the same file. The \fBup.c\fR driver which implements this driver specifies in its declarations: .DS int upprobe(), upslave(), upattach(), updgo(), upintr(); struct uba_ctlr *upminfo[NSC]; struct uba_device *updinfo[NUP]; u_short upstd[] = { 0776700, 0774400, 0776300, 0 }; struct uba_driver scdriver = { upprobe, upslave, upattach, updgo, upstd, "up", updinfo, "sc", upminfo }; .DE initializing the \fBuba_driver\fR structure. 
The driver will support some number of controllers named \fBsc0\fR, \fBsc1\fR, etc, and some number of drives named \fBup0\fR, \fBup1\fR, etc. where the drives may be on any of the controllers (that is there is a single linear name space for devices, separate from the controllers.) .PP We now explain the fields in the various structures. It may help to look at a copy of \fBh/ubareg.h\fR, \fBh/ubavar.h\fR and drivers such as \fBup.c\fR and \fBdz.c\fR while reading the descriptions of the various structure fields. .SH uba_driver structure .PP One of these structures exists per driver. It is initialized in the driver and contains functions used by the configuration program and by the UNIBUS resource routines. The fields of the structure are: .BP ud_probe A routine which is given a \fBcaddr_t\fR address as argument and should cause an interrupt on the device whose control-status register is at that address in virtual memory. It may be the case that the device does not exist, so the probe routine should use delays (via the DELAY(n) macro which delays for \fIn\fR microseconds) rather than waiting for specific events to occur. The routine must \fBnot\fR declare its argument as a \fBregister\fR parameter, but \fBmust\fR declare .DS \fBregister int br, cvec;\fR .DE as local variables. At boot time the system takes special measures that these variables are ``value-result'' parameters. The \fBbr\fR is the IPL of the device when it interrupts, and the \fBcvec\fR is the interrupt vector address on the UNIBUS. These registers are actually filled in in the interrupt handler when an interrupt occurs. 
.IP As an example, here is the \fBup.c\fR probe routine: .DS upprobe(reg) caddr_t reg; { register int br, cvec; #ifdef lint br = 0; cvec = br; br = cvec; #endif ((struct updevice *)reg)->upcs1 = UP_IE|UP_RDY; DELAY(10); ((struct updevice *)reg)->upcs1 = 0; return (1); } .DE The definitions for \fIlint\fR serve to indicate to it that the \fBbr\fR and \fBcvec\fR variables are value-result. The statements here interrupt enable the device and write the ready bit UP_RDY. The 10 microsecond delay insures that the interrupt enable will not be cancelled before the interrupt can be posted. The return of ``1'' here indicates that the probe routine is satisfied that the device is present. A probe routine may use the function ``badaddr'' to see if certain other addresses are accessible on the UNIBUS (without generating a machine check), or look at the contents of locations where certain registers should be. If the register contents are not acceptable or the addresses don't respond, the probe routine can return 0 and the device will not be considered to be there. .IP One other thing to note is that the action of different VAXen when illegal addresses are accessed on the UNIBUS may differ. Some of the machines may generate machine checks and some may cause UNIBUS errors. Such considerations are handled by the configuration program and the driver writer need not be concerned with them. .IP It is also possible to write a very simple probe routine for a one-of-a-kind device if probing is difficult or impossible. Such a routine would include statements of the form: .DS br = 0x15; cvec = 0200; .DE for instance, to declare that the device ran at UNIBUS br5 and interrupted through vector 0200 on the UNIBUS. The current TS-11 driver does something similar to this because the device is so difficult to force an interrupt on that it hardly seems worthwhile.
(Besides, TS-11's are usually present on small 11/750's which have only one UNIBUS, and TS-11's can have only exactly one transport per-controller so little probing is needed.) .BP ud_slave This routine is called with a \fBuba_device\fR structure (yet to be described) and the address of the device controller. It should determine whether a particular slave device of a controller is present, returning 1 if it is and 0 if it is not. As an example here is the slave routine for \fBup.c\fR. .DS upslave(ui, reg) struct uba_device *ui; caddr_t reg; { register struct updevice *upaddr = (struct updevice *)reg; upaddr->upcs1 = 0; /* conservative */ upaddr->upcs2 = ui->ui_slave; if (upaddr->upcs2&UPCS2_NED) { upaddr->upcs1 = UP_DCLR|UP_GO; return (0); } return (1); } .DE Here the code fetches the slave (disk unit) number from the \fBui_slave\fR field of the \fBuba_device\fR structure, and sees if the controller responds that that is a non-existent drive (NED). If the drive does not exist, a drive clear is issued to clean the state of the controller, and 0 is returned indicating that the slave is not there. Otherwise a 1 is returned. .BP ud_attach The attach routine is called after the autoconfigure code and the driver concur that a peripheral exists attached to a controller. This is the routine where internal driver state about the peripheral can be initialized. Here is the \fIattach\fR routine from the \fBup.c\fR driver: .ID .nf upattach(ui) register struct uba_device *ui; { register struct updevice *upaddr; if (upwstart == 0) { timeout(upwatch, (caddr_t)0, hz); upwstart++; } if (ui->ui_dk >= 0) dk_mspw[ui->ui_dk] = .0000020345; upip[ui->ui_ctlr][ui->ui_slave] = ui; up_softc[ui->ui_ctlr].sc_ndrive++; upaddr = (struct updevice *)ui->ui_addr; upaddr->upcs1 = 0; upaddr->upcs2 = ui->ui_slave; upaddr->uphr = UPHR_MAXTRAK; if (upaddr->uphr == 9) ui->ui_type = 1; /* fujitsu hack */ upaddr->upcs2 = UPCS2_CLR; } .DE The attach routine here performs a number of functions.
The first time any drive is attached to the controller it starts the timeout routine which watches the disk drives to make sure that interrupts aren't lost. It also initializes, for devices which have been assigned \fIiostat\fR numbers (when ui->ui_dk >= 0), the transfer rate of the device in the array \fBdk_mspw\fR, the fraction of a second it takes to transfer a 16 bit word. It then initializes an inverting pointer in the array \fBupip\fR which will be used later to determine, for a particular \fBup\fR controller and slave number, the corresponding \fBuba_device\fR. It increments the count of the number of devices on this controller, so that search commands can later be avoided if the count is exactly 1. It then uses a hardware feature of the EMULEX SC-21 to ask if the number of tracks on the device is 9. If it is, then the driver assumes that the type is ``1'', which corresponds to a FUJITSU 160M drive. The alternative is the only other currently supported device, a 300 Megabyte CDC or AMPEX drive, which has \fBui_type\fR 0. Note that if the controller is not an SC-21 then attempting to find out the maximum track in the device will yield an error, and a 300 Megabyte device will be assumed. In any case, any errors resulting from the attempt to type the drive are cleared by a controller clear before the routine returns. .BP ud_dgo Is the routine which is called by the UNIBUS resource management routines when an operation is ready to be started (because the required resources have been allocated). The routine in \fBup.c\fR is: .DS updgo(um) struct uba_ctlr *um; { register struct updevice *upaddr = (struct updevice *)um->um_addr; upaddr->upba = um->um_ubinfo; upaddr->upcs1 = um->um_cmd|((um->um_ubinfo>>8)&0x300); } .DE This routine uses the field \fBum_ubinfo\fR of the \fBuba_ctlr\fR structure which is where the UNIBUS routines store the UNIBUS map allocation information. In particular, the low 18 bits of this word give the UNIBUS address assigned to the transfer.
The assignment to \fIupba\fR in the go routine places the low 16 bits of the UNIBUS address in the disk UNIBUS address register. The next assignment places the disk operation command and the extended (high 2) address bits in the device control-status register, starting the i/o operation. The field \fBum_cmd\fR was initialized with the command to be stuffed here in the driver code itself before the call to the \fBubago\fR routine which eventually resulted in the call to \fBupdgo\fR. .BP ud_addr Are the conventional addresses for the device control registers in UNIBUS space. This information is not used by the system in this release, but may be used in future releases to look for instances of the device supported by the driver. In the current system, the configuration file specifies the control-status register addresses of all configured devices. .BP ud_dname Is the name of a \fIdevice\fR supported by this controller; thus the disks on a SC-21 controller are called \fBup0\fR, \fBup1\fR, etc. That is because this field contains \fBup\fR. .BP ud_dinfo Is an array of back pointers to the \fBuba_device\fR structures for each device attached to the controller. Each driver defines a set of controllers and a set of devices. The device address space is always one-dimensional, so that the presence of extra controllers may be masked away (e.g. by pattern matching) to take advantage of hardware redundancy. This field is filled in by the configuration program, and used by the driver. .BP ud_mname The name of a controller, e.g. \fBsc\fR for the \fBup.c\fR driver. The first SC-21 is called \fBsc0\fR, etc. .BP ud_minfo The backpointer array to the structures for the controllers. .BP ud_xclu If non-zero specifies that the controller requires exclusive use of the UNIBUS when it is running. This is non-zero currently only for the RK611 controller for the RK07 disks to map around a hardware problem. 
It could also be used if 6250bpi tape drives are to be used on the UNIBUS to insure that they get the bandwidth that they need (basically the whole bus). .SH uba_ctlr structure .PP One of these structures exists per-controller. The fields link the controller to its UNIBUS adaptor and contain the state information about the devices on the controller. The fields are: .BP um_driver A pointer to the \fBstruct uba_driver\fR for this driver, which has fields as defined above. .BP um_ctlr The controller number for this controller, e.g. the 0 in \fBsc0\fR. .BP um_alive Set to 1 if the controller is considered alive; currently, always set for any structure encountered during normal operation. That is, the driver will have a handle on a \fBuba_ctlr\fR structure only if the configuration routines set this field to a 1 and entered it into the driver tables. .BP um_intr The interrupt vector routines for this device. These are generated by the \fIconfig\fR\|(8) program and this field is initialized in the \fBioconf.c\fR file. .BP um_hd A back-pointer to the UNIBUS adapter to which this controller is attached. .BP um_cmd A place for the driver to store the command which is to be given to the device before calling the routine \fIubago\fR with the device's \fBuba_device\fR structure. This information is then retrieved when the device go routine is called and stuffed in the device control status register to start the i/o operation. .BP um_ubinfo Information about the UNIBUS resources allocated to the device. This is normally only used in the device driver go routine (as \fBupdgo\fR above) and occasionally in exceptional condition handling such as ECC correction. .BP um_tab This buffer structure is a place where the driver hangs the device structures which are ready to transfer. Each driver allocates a buf structure for each device (e.g. \fBupdtab\fR in the \fBup.c\fR driver) for this purpose.
You can think of this structure as a device-control-block, and the buf structures linked to it as the unit-control-blocks. The code for dealing with this structure is stylized; see the \fBrk.c\fR or \fBup.c\fR driver for the details. If the \fBubago\fR routine is to be used, the structure attached to this \fBbuf\fR structure must be: .RS .IP * A chain of \fBbuf\fR structures for each waiting device on this controller. .IP * On each waiting \fBbuf\fR structure another \fBbuf\fR structure which is the one containing the parameters of the i/o operation. .RE .SH uba_device structure .PP One of these structures exists for each device attached to a UNIBUS controller. Devices which are not attached to controllers or which perform no buffered data path DMA i/o may have only a device structure. Thus \fBdz\fR and \fBdh\fR devices have only \fBuba_device\fR structures. The fields are: .BP ui_driver A pointer to the \fBstruct uba_driver\fR structure for this device type. .BP ui_unit The unit number of this device, e.g. 0 in \fBup0\fR, or 1 in \fBdh1\fR. .BP ui_ctlr The number of the controller on which this device is attached, or \-1 if this device is not on a controller. .BP ui_ubanum The number of the UNIBUS on which this device is attached. .BP ui_slave The slave number of this device on the controller which it is attached to, or \-1 if the device is not a slave. Thus a disk which was unit 2 on a SC-21 would have \fBui_slave\fR 2; it might or might not be \fBup2\fR, that depends on the system configuration specification. .BP ui_intr The interrupt vector entries for this device, copied into the UNIBUS interrupt vector at boot time. The values of these fields are filled in by the \fBconfig\fR\|(8) program to small code segments which it generates in the file \fBubglue.s\fR. .BP ui_addr The control-status register address of this device. .BP ui_dk The iostat number assigned to this device.
Numbers are assigned to disks only, and are small positive integers which index the various \fBdk_*\fR arrays in . .BP ui_flags The optional ``\fBflags \fR\fIxxx\fR'' parameter from the configuration specification is copied to this field, to be interpreted by the driver. If \fBflags\fR was not specified, then this field will contain a 0. .BP ui_alive The device is really there. Presently set to 1 when a device is determined to be alive, and left 1. .BP ui_type The device type, to be used by the driver internally. Thus the \fBup.c\fR driver uses a \fBui_type\fR of 0 to mean a 300 Megabyte drive and a type of 1 to mean a 160 Megabyte FUJITSU drive. .BP ui_physaddr The physical memory address of the device control-status register. This is typically used in the device dump routines. .BP ui_mi A \fBstruct uba_ctlr\fR pointer to the controller (if any) on which this device resides. .BP ui_hd A \fBstruct uba_hd\fR pointer to the UNIBUS on which this device resides. .SH Changing drivers .PP If your driver does not do buffered data path DMA, conversion to the new system should be straightforward; if it uses buffered data paths more work will be required, but the task is really mostly cosmetic. .PP In any case, first add a line to the file \fBconf/files\fR of the form .DS dev/zz.c optional zz device-driver .DE so that your driver will be included when you specify it in a configuration. Change the \fBdev/conf.c\fR file to include a block or character device entry for your device. Note that the block device entries now include a \fBd_dump\fR entry; if you are a block device but don't have a dump entry point, just make one in your driver that returns the value ENODEV. .PP Then build a system configuration including your driver so that you have a compilation environment for your driver. You will have to add a \fBstruct uba_driver\fR declaration for your driver, and change its calls to UNIBUS routines to correspond to these routines in the new system.
Trouble spots will show up here. In particular, notice that you must specify flags to \fBuballoc\fR if you call it: .BP NEEDBDP if you need a buffered data path .BP CANTWAIT if you are calling (potentially) from interrupt level .LP You may discover that your driver ``cantwait'' but that you are calling from interrupt level. This botch existed in most previous VAX UNIX drivers, since there were no mechanisms for dealing with this. We will describe some options shortly. .PP First, suppose your driver doesn't do buffered data path DMA. What else is there for you to do? Very little really. You should change your driver to print messages on the console in the format now used by all device drivers; see section 4 of the revised programmer's manual for details. To make more certain that your driver is ready for the new system environment, look at some of the simple existing drivers and mimic the style to create the portions of the driver which are needed to interface with the configuration part of the system. Useful drivers to look at may be: .BP ct.c Very simple driver which does programmed i/o to the C/A/T phototypesetter. .BP dh.c Communications line driver which uses non-buffered UNIBUS DMA for output. .BP dz.c Communications line driver which does programmed i/o. .PP Basically all you have to do is write a \fBud_probe\fR and a \fBud_attach\fR routine for the controller. It suffices to have a \fBud_probe\fR routine which just initializes \fBbr\fR and \fBcvec\fR, and a \fBud_attach\fR routine which does nothing. Making the device fully configurable requires, of course, more work, but is worth it if you expect the device to be in common usage and want to share it with others. .PP If you managed to create all the needed hooks, then make sure you include the necessary header files; the ones included by \fBct.c\fR are nearly minimal. Order is important here; don't be surprised at undefined structure complaints if you order the includes wrongly.
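.PP
The minimal \fBud_probe\fR and \fBud_attach\fR routines described above can be sketched roughly as follows. This is a user-space model only, not kernel code: the names \fBzzprobe\fR, \fBzzattach\fR, \fBstruct zz_probe_result\fR and \fBstruct zz_device\fR, and the constants \fBZZ_BR\fR and \fBZZ_CVEC\fR, are all hypothetical, and \fBbr\fR and \fBcvec\fR are modelled as ordinary structure fields because the way the kernel passes them back is not shown here. See \fBct.c\fR for the real conventions.

```c
/* User-space model only; every name in this block is hypothetical. */
#include <stddef.h>

/* Stand-in for the br and cvec values that a probe routine reports. */
struct zz_probe_result {
    int br;     /* bus request (interrupt priority) level of the device */
    int cvec;   /* interrupt vector the device presents */
};

/* Stand-in for the per-device information handed to attach. */
struct zz_device {
    int unit;
    int alive;
};

#define ZZ_BR   0x15    /* assumed priority, known for this one machine */
#define ZZ_CVEC 0300    /* assumed vector, known for this one machine */

/*
 * Minimal probe: pretend the interrupt occurred and report the
 * locally known values.  A fully supported driver would force a
 * real device interrupt through the register address instead.
 */
int zzprobe(void *reg, struct zz_probe_result *res)
{
    (void)reg;          /* register address is unused in this model */
    res->br = ZZ_BR;
    res->cvec = ZZ_CVEC;
    return 1;           /* nonzero means "device found" in this model */
}

/* Minimal attach: nothing to set up. */
void zzattach(struct zz_device *dev)
{
    (void)dev;
}
```

A driver written this way works only on the machine whose vector and priority were compiled in, which is the trade-off the text describes for drivers that are not fully configurable.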
.PP Finally, if you get the device configured in, you can try bootstrapping and see if configuration messages print out about your device. It is a good idea to have some messages in the probe routine so that you can see that you are getting called and what is going on. If you do not get called, then you probably have the control-status register address wrong in your system configuration. The autoconfigure code notices that the device doesn't exist in this case and you will never get called. .PP Assuming that your probe routine works and you manage to generate an interrupt, then you are basically back to where you would have been under older versions of UNIX. Just be sure to use the \fBui_addr\fR field of the \fBuba_device\fR structures to address the device; compiling in funny constants will make your driver only work on the CPU type you have (780 or 750). .PP Other bad things that might happen while you are setting up the configuration stuff: .IP * You get ``nexus zero vector'' errors from the system. This will happen if you cause a device to interrupt, but take away the interrupt enable so fast that the UNIBUS adapter cancels the interrupt and confuses the processor. The best thing to do is to put a modest delay in the probe code between the instructions which should cause an interrupt and the clearing of the interrupt enable. (You should clear interrupt enable before you leave the probe routine so the device doesn't interrupt more and confuse the system while it is configuring other devices.) .IP * The device refuses to interrupt or interrupts with a ``zero vector''. This typically indicates a problem with the hardware or, for devices which emulate other devices, that the emulation is incomplete. Devices may fail to present interrupt vectors because they have configuration switches set wrong, or because they are being accessed in inappropriate ways.
Incomplete emulation can cause ``maintenance mode'' features to not work properly, and these features are often needed to force device interrupts. .SH Adapting devices which do buffered data path dma .PP These devices fall into two categories: those which are controllers to which devices are attached, and those which are just single devices. The interface for the former is very stylized and we recommend that you simply mimic one of the existing tape or disk drivers in adapting to the system. You will find that the existing tape and disk drivers are all \fBvery\fR similar; this is deliberate so that it isn't necessary to rewrite the whole driver for each device, since the available devices are typically very similar. .PP Other devices which do buffered data path DMA can be adapted to the new system in one of two ways: .IP * They can do their own data path allocation, calling the UNIBUS allocation routines from the ``top-half'' (non-interrupt) code, sleeping in the UNIBUS code when resources are not available. See for an example the code in the \fBvp.c\fR driver. .IP * They can set up a two-level structure like the tape and disk drivers do, and call the \fIubago\fR routine and use the \fBud_dgo\fR interface to start DMA operations. See for an example the code in the \fBup.c\fR driver. .PP Either way works acceptably well; the second (\fIubago\fR\|) interface is preferable because it does not force a context switch per i/o operation (to the routine driving the i/o from the ``top-half''). .PP If you have questions about converting drivers, feel free to call us and ask or to send us mail. We hope (eventually) to write a more complete paper for driver writers, but don't have the manpower to do this just now.
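.PP
As a rough illustration of the two-level structure that the tape and disk drivers set up, the following user-space sketch models a controller queue holding a chain of waiting devices, each of which holds the i/o requests queued for it. The structure \fBstruct qbuf\fR, its field names, and the routines \fBq_request\fR, \fBq_next\fR and \fBq_done\fR are hypothetical simplifications and are not the kernel's \fBbuf\fR structure or its link fields; see the \fBup.c\fR driver for the real code.

```c
/* User-space model only; every name in this block is hypothetical. */
#include <stddef.h>

struct qbuf {
    struct qbuf *next;  /* next entry at the same level */
    struct qbuf *head;  /* first entry one level down */
    struct qbuf *tail;  /* last entry one level down */
    int blkno;          /* stands in for the i/o parameters */
};

/* Append b to the list hanging off parent. */
static void q_append(struct qbuf *parent, struct qbuf *b)
{
    b->next = NULL;
    if (parent->head == NULL)
        parent->head = b;
    else
        parent->tail->next = b;
    parent->tail = b;
}

/*
 * Queue an i/o request on a device.  If the device had nothing
 * queued, also put it on the controller's chain of waiting devices.
 */
void q_request(struct qbuf *ctlr, struct qbuf *dev, struct qbuf *io)
{
    int was_idle = (dev->head == NULL);

    q_append(dev, io);
    if (was_idle)
        q_append(ctlr, dev);
}

/* The transfer to start next: first request of the first waiting device. */
struct qbuf *q_next(const struct qbuf *ctlr)
{
    return ctlr->head ? ctlr->head->head : NULL;
}

/* Retire the current transfer; drop the device from the chain once it is empty. */
void q_done(struct qbuf *ctlr)
{
    struct qbuf *dev = ctlr->head;

    if (dev == NULL || dev->head == NULL)
        return;
    dev->head = dev->head->next;
    if (dev->head == NULL) {
        dev->tail = NULL;
        ctlr->head = dev->next;
        if (ctlr->head == NULL)
            ctlr->tail = NULL;
    }
}
```

In this model the controller entry plays the part of the device-control-block and each device entry the part of a unit-control-block; a real driver would start the transfer returned by \fBq_next\fR through \fIubago\fR rather than simply walking the lists.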