From 41b28288eb0e6568db5c5e95f7e13c5e65e8bf82 Mon Sep 17 00:00:00 2001
From: Keith Bostic <bostic@ucbvax.Berkeley.EDU>
Date: Fri, 23 Oct 1987 19:32:51 -0800
Subject: [PATCH] new version; (Chris Torek)

SCCS-vsn: share/man/man4/man4.vax/uda.4 6.3
---
 usr/src/share/man/man4/man4.vax/uda.4 | 476 ++++++++++++++++++++------
 1 file changed, 369 insertions(+), 107 deletions(-)

diff --git a/usr/src/share/man/man4/man4.vax/uda.4 b/usr/src/share/man/man4/man4.vax/uda.4
index b52593893a..0b9756d15e 100644
--- a/usr/src/share/man/man4/man4.vax/uda.4
+++ b/usr/src/share/man/man4/man4.vax/uda.4
@@ -1,83 +1,58 @@
-.\" Copyright (c) 1980 Regents of the University of California.
+.\" Copyright (c) 1980, 1987 Regents of the University of California.
 .\" All rights reserved.  The Berkeley software License Agreement
 .\" specifies the terms and conditions for redistribution.
 .\"
-.\"	@(#)uda.4	6.2 (Berkeley) %G%
+.\"	@(#)uda.4	6.3 (Berkeley) %G%
 .\"
 .TH UDA 4 ""
 .UC 4
 .SH NAME
-uda \- UDA-50 disk controller interface
+uda \- UDA50 disk controller interface
 .SH SYNOPSIS
 .B "controller uda0 at uba0 csr 0172150 vector udaintr"
 .br
-.B "disk ra0 at uda0 drive 0"
+.B "disk ra0 at uda0 drive 0"
 .SH DESCRIPTION
-This is a driver for the DEC UDA-50 disk controller
-and for other compatible controllers.
-The UDA-50 communicates with the host through a packet
-oriented protocol termed the Mass Storage Control Protocol (MSCP).
+This is a driver for the DEC UDA50 disk controller and other
+compatible controllers.  The UDA50 communicates with the host through
+a packet protocol known as the Mass Storage Control Protocol (MSCP).
 Consult the file
 .RI < vax/mscp.h >
 for a detailed description of this protocol.
 .PP
 Files with minor device numbers 0 through 7 refer to various portions
-of drive 0;
-minor devices 8 through 15 refer to drive 1, etc.
-The standard device names begin with ``ra'' followed by
-the drive number and then a letter a-h for partitions 0-7 respectively.
+of drive 0; minor devices 8 through 15 refer to drive 1, etc.  The
+standard device names begin with `ra' followed by the drive number
+and then a letter a-h for partitions 0-7 respectively.
 The character ? stands here for a drive number in the range 0-7.
 .PP
-The block files access the disk via the system's normal
-buffering mechanism and may be read and written without regard to
-physical disk records.  There is also a `raw' interface
-which provides for direct transmission between the disk
-and the user's read or write buffer.
-A single read or write call results in exactly one I/O operation
-and therefore raw I/O is considerably more efficient when
-many words are transmitted.  The names of the raw files
-conventionally begin with an extra `r.'
+The block files access the disk via the system's normal buffering
+mechanism and may be read and written without regard to
+physical disk records.  There is also a `raw' interface which provides
+for direct transmission between the disk and the user's read or write
+buffer.  A single read or write call results in exactly one I/O
+operation and therefore raw I/O is considerably more efficient when
+many words are transmitted.  The names of the raw files conventionally
+begin with an extra `r'.
 .PP
 In raw I/O counts should be a multiple of 512 bytes (a disk sector).
 Likewise
 .I seek
 calls should specify a multiple of 512 bytes.
 .SH "DISK SUPPORT"
-This driver configures the drive type of each drive
-when it is first opened.
-A partition table in the driver is required for each type of disk.
-The origin and size (in sectors) of the pseudo-disks
-on each drive are shown below.
-Not all partitions begin on
-cylinder boundaries, as on other drives, because previous drivers
-used one partition table for all drive types.
-Variants of the partition tables are common;
-check the driver and the file
+This driver configures the type of each drive when it is first
+encountered.  A partition table in the driver is required for each type
+of disk.  The origin and size (in sectors) of the pseudo-disks on each
+drive are shown below.  Not all partitions begin on cylinder
+boundaries, as on other drives, because previous drivers used one
+partition table for all drive types.  Variants of the partition tables
+are common; check the driver and the file
 .IR /etc/disktab ( disktab (5))
 for other possibilities.
 .PP
 .nf
 .ta .5i +\w'000000    'u +\w'000000    'u +\w'000000    'u +\w'000000    'u
 .PP
-RC25 partitions
-	disk	start	length
-	ra?a	0	15884
-	ra?b	15884	10032
-	ra?c	0	50902
-	ra?g	25916	24986
-RD52 partitions
-	disk	start	length
-	ra?a	0	15884
-	ra?b	15884	9766
-	ra?c	0	60480
-	ra?g	25650	34830
-RD53 partitions
-	disk	start	length
-	ra?a	0	15884
-	ra?b	15884	33440
-	ra?c	0	138672
-	ra?g	49324	89348
-	ra?h	15884	122788
 RA60 partitions
 	disk	start	length
 	ra?a	0	15884
@@ -123,66 +98,353 @@ RA81 partitions with 4.2BSD-compatible partitions
 .DT
 .fi
 .PP
-The ra?a partition is normally used for the root file system,
-the ra?b partition as a paging area,
-and the ra?c partition for pack-pack copying (it maps the entire disk).
+The ra?a partition is normally used for the root file system, the ra?b
+partition as a paging area, and the ra?c partition for pack-pack
+copying (it maps the entire disk).
 .SH FILES
 /dev/ra[0-9][a-f]
 .br
 /dev/rra[0-9][a-f]
 .SH DIAGNOSTICS
-.BR "uda: ubinfo %x" .
-(VAX 11/750 only.)
-When allocating UNIBUS resources, the driver found it already
-had resources previously allocated.  This indicates a bug in
-the driver.
-.PP
-.BR "udasa %o, state %d" .
-(Additional status information given after a hard i/o error.)
-The values of the UDA-50 status register and the internal
-driver state are printed.
-.PP
-.BR "uda%d: random interrupt ignored" .
-An unexpected interrupt was received (e.g. when no i/o was
-pending).  The interrupt is ignored.
-.PP
-.BR "uda%d: interrupt in unknown state %d ignored" .
-An interrupt was received when the driver was in an unknown
-internal state.  Indicates a hardware problem or a driver bug.
-.PP
-.BR "uda%d: fatal error (%o)" .
-The UDA-50 indicated a ``fatal error'' in the status returned
-to the host.  The contents of the status register are displayed.
-.PP
-.BR OFFLINE .
-(Additional status information given after a hard i/o error.)
-A hard i/o error occurred because the drive was not on-line.
-.PP
-.BR "status %o" .
-(Additional status information given after a hard i/o error.)
-The status information returned from the UDA-50 is tacked onto
-the end of the hard error message printed on the console.
-.PP
-.BR "uda: unknown packet" .
-An MSCP packet of unknown type was received from the UDA-50.
-Check the cabling to the controller.
-.PP
-The following errors are interpretations of MSCP error messages
-returned by the UDA-50 to the host.
-.PP
-.BR "uda%d: %s error, controller error, event 0%o" .
-.PP
-.BR "uda%d: %s error, host memory access error, event 0%o, addr 0%o" .
-.PP
-.BR "uda%d: %s error, disk transfer error, unit %d" .
-.PP
-.BR "uda%d: %s error, SDI error, unit %d, event 0%o" .
-.PP
-.BR "uda%d: %s error, small disk error, unit %d, event 0%o, cyl %d" .
+.TP
+panic: udaslave
+No command packets were available while the driver was looking
+for disk drives.  The controller is not extending enough credits
+to use the drives.
+.TP
+uda%d: no response to Get Unit Status request
+A disk drive was found, but did not respond to a status request.
+This is either a hardware problem or someone pulling unit number
+plugs very fast.
+.TP
+uda%d: unit %d off line
+While searching for drives, the controller found one that
+seems to be manually disabled.  It is ignored.
+.TP
+uda%d: unable to get unit status
+Something went wrong while trying to determine the status of
+a disk drive.  This is followed by an error detail.
+.TP
+uda%d: unit %d, next %d
+This probably never happens, but I wanted to know if it did.  I
+have no idea what one should do about it.
+.TP
+uda%d: cannot handle unit number %d (max is %d)
+The controller found a drive whose unit number is too large.
+Valid unit numbers are those in the range [0..7].
+.TP
+uda%d: unit %d (media ID `%s') is of unknown type %d; ignored
+The controller found a drive whose type is not known, and thus has
+no partitioning.  The drive has been ignored.  You can add the type
+to the udatypes[] table, now that you know what it is:  The media
+ID will be something like `DU RA25'.
+.TP
+uda%d: uballoc map failed
+Unibus resource map allocation failed during initialisation.  This
+can only happen if you have 496 devices on a Unibus.
+.TP
+uda%d: timeout during init
+The controller did not initialise within ten seconds.  A hardware
+problem, but it sometimes goes away if you try again.
+.TP
+uda%d: init failed, sa=%b
+The controller refused to initialise.
+.TP
+uda%d: controller hung
+The controller never finished initialisation.  Retrying may sometimes
+fix it.
+.TP
+ra%d: drive will not come on line
+The drive will not come on line, probably because it is spun down.
+This should be preceded by a message giving details as to why the
+drive stayed off line.
+.TP
+uda%d: still hung
+When the controller hangs, the driver occasionally tries to reinitialise
+it.  This means it just tried, without success.
+.TP
+panic: udastart: bp==NULL
+A bug in the driver has put an empty drive queue on a controller queue.
+.TP
+uda%d: command ring too small
+If you increase NCMDL2, you may see a performance improvement.
+(See /sys/vaxuba/uda.c.)
+.TP
+panic: udastart
+A drive was found marked for status or on-line functions while performing
+status or on-line functions.  This indicates a bug in the driver.
+.TP
+uda%d: controller error, sa=%b
+The controller reported an error.  The driver will reset it and retry
+pending I/O.
+.TP
+uda%d: stray intr
+The controller interrupted when it should have stayed quiet.  The
+interrupt has been ignored.
+.TP
+uda%d: init step %d failed, sa=%b
+The controller reported an error during the named initialisation step.
+The driver will retry initialisation later.
+.TP
+uda%d: version %d model %d
+An informational message giving the revision level of the controller.
+.TP
+uda%d: DMA burst size set to %d
+An informational message showing the DMA burst size, in words.
+.TP
+panic: udaintr
+Indicates a bug in the generic MSCP code.
+.TP
+uda%d: driver bug, state %d
+The driver has a bogus value for the controller state.  Something
+is quite wrong.  This is immediately followed by a `panic: udastate'.
+.TP
+uda%d: purge bdp %d
+A benign message tracing BDP purges.  I have been trying to figure
+out what BDP purges are for.  You might want to comment out this
+call to log() in /sys/vaxuba/uda.c.
+.TP
+.RI "uda%d: SETCTLRC failed: " detail
+The Set Controller Characteristics command (the last part of the
+controller initialisation sequence) failed.  The
+.I detail
+message tells why.
+.TP
+.RI "uda%d: attempt to bring ra%d on line failed: " detail
+The drive could not be brought on line.  The
+.I detail
+message tells why.
+.TP
+uda%d: ra%d: unknown type %d
+The type index of the named drive is not known to the driver, so the
+drive will be ignored.
+.TP
+ra%d: changed types! was %s
+A drive somehow changed from one kind to another, e.g., from an RA80
+to an RA60.  The driver believes the new type.
+.TP
+ra%d: %s, size = %d sectors
+The named drive is of the given type, and has that many sectors of
+user-file area.  This is printed during configuration.
+.TP
+.RI "uda%d: attempt to get status for ra%d failed: " detail
+A status request failed.  The
+.I detail
+message should tell why.
+.TP
+ra%d: unit %d, nspt %d, group %d, ntpc %d, rctsize %d,
+.br
+.ti -5
+nrpt %d, nrct %d
+.br
+Information about the geometry of the named drive.  This is not
+used by the driver, but can help one setting up
+.I disktab
+entries, e.g.  Note that the sectors per track, group, and tracks per
+cylinder values are those after bad blocking is accounted for, and will
+differ slightly from the actual hardware setup.  This message also
+reports the MSCP unit number for the drive.  Errors tend to include
+only the MSCP unit number, rather than the drive number, since that
+is all the driver can tell at the time.
+.TP
+ra%d: bad block report: %d
+The drive has reported the given block as bad.  If there are multiple
+bad blocks, the drive will report only the first; in this case this
+message will be followed by `+ others'.  Get DEC to forward the
+block with EVRLK.
+.TP
+ra%d: serious exception reported
+I have no idea what this really means.
+.TP
+panic: udareplace
+The controller reported completion of a REPLACE operation.  The
+driver never issues any REPLACEs, so something is wrong.
+.TP
+panic: udabb
+The controller reported completion of bad block related I/O.  The
+driver never issues any such, so something is wrong.
+.TP
+uda%d: lost interrupt
+The controller has gone out to lunch, and is being reset to try to bring
+it back.
+.TP
+panic: mscp_go: AEB_MAX_BP too small
+You defined AVOID_EMULEX_BUG and increased NCMDL2 and Emulex has
+new firmware.  Raise AEB_MAX_BP or turn off AVOID_EMULEX_BUG.
+.TP
+uda%d: unit %d: unknown message type 0x%x ignored
+The controller responded with a mysterious message type.  See
+/sys/vax/mscp.h for a list of known message types.  This is probably
+a controller hardware problem.
+.TP
+uda%d: unit %d out of range
+The disk drive unit number (the unit plug) is higher than the
+maximum number the driver allows (currently 7).
+.TP
+uda%d: unit %d not configured, \fImessage\fP ignored
+The named disk drive has announced its presence to the controller,
+but was not, or cannot now be, configured into the running system.
+.I Message
+is one of `available attention' (an `I am here' message) or
+`stray response op 0x%x status 0x%x' (anything else).
+.TP
+ra%d: bad lbn (%d)?
+The drive has reported an invalid command error, probably due to an
+invalid block number.  If the lbn value is very much greater than the
+size reported by the drive, this is the problem.  It is probably due to
+an improperly configured partition table.  Other invalid commands
+indicate a bug in the driver, or hardware trouble.
+.TP
+ra%d: duplicate ONLINE ignored
+The drive has come on-line while already on-line.  This condition
+can probably be ignored (and has been).
+.TP
+ra%d: io done, but no buffer?
+Hardware trouble, or a bug; the drive has finished an I/O request,
+but the response has an invalid (zero) command reference number.
+.TP
+Emulex SC41/MS screwup: uda%d, got %d correct, then
+.br
+.ti -5
+changed 0x%x to 0x%x
+.br
+You turned on AVOID_EMULEX_BUG, and the driver successfully
+avoided the bug.  The number of correctly-handled requests is
+reported, along with the expected and actual values relating to
+the bug being avoided.
+.TP
+panic: unrecoverable Emulex screwup
+You turned on AVOID_EMULEX_BUG, but Emulex was too clever and
+avoided the avoidance.  Try turning on MSCP_PARANOIA instead.
+.TP
+uda%d: bad response packet ignored
+You turned on MSCP_PARANOIA, and the driver caught the controller in
+a lie.  The lie has been ignored, and the controller will soon be
+reset (after a `lost' interrupt).  This is followed by a hex dump of
+the offending packet.
+.TP
+ra%d: bogus REPLACE end
+The drive has reported finishing a bad sector replacement, but the
+driver never issues bad sector replacement commands.  The report
+is ignored.  This is likely a hardware problem.
+.TP
+ra%d: unknown opcode 0x%x status 0x%x ignored
+The drive has reported something that the driver cannot understand.
+Perhaps DEC has been inventive, or perhaps your hardware is ill.
+This is followed by a hex dump of the offending packet.
+.TP
+uda%d: %s error datagram
+The controller has reported some kind of error, either `hard'
+(unrecoverable) or `soft' (recoverable).  If the controller is going on
+(attempting to fix the problem), this message includes the remark
+`(continuing)'.  Emulex controllers wrongly claim that all soft errors
+are hard errors.  This message may be followed by
+one of the following 5 messages, depending on its type, and will always
+be followed by a failure detail message (also listed below).
+.RS
+.TP
+memory addr 0x%x
+A host memory access error; this is the address that could not be
+read.
+.TP
+unit %d: level %d retry %d, %s %d
+A typical disk error; the retry count and error recovery levels are
+printed, along with the block type (`lbn', or logical block; or `rbn',
+or replacement block) and number.  If the string is something else, DEC
+has been clever, or your hardware has gone to Australia for vacation
+(unless you live there; then it might be in New Zealand, or Brazil).
+.TP
+unit %d: %s %d
+Also a disk error, but an `SDI' error, whatever that is.  (I doubt
+it has anything to do with Ronald Reagan.)  This lists the block
+type (`lbn' or `rbn') and number.
+.TP
+unit %d: small disk error, cyl %d
+Yet another kind of disk error, but for small disks.  (`That's what
+it says, guv'nor.  Dunnask me what it means.')
+.TP
+unit %d: unknown error, format 0x%x
+A mysterious error: the given format code is not known.
+.RE
 .PP
-.BR "uda%d: %s error, unknown error, unit %d, format 0%o, event 0%o" .
-.SH BUGS
-The partition tables attempt to combine compatibility
-with previous drivers and functionality; this is impossible.
-The best solution would be to read the partition tables
-off the drive.
+The detail messages are as follows:
+.RS
+.TP
+success (%s) (code 0, subcode %d)
+Everything worked, but the controller thought it would let you know
+that something went wrong.  No matter what subcode, this can probably
+be ignored.
+.TP
+invalid command (%s) (code 1, subcode %d)
+This probably cannot occur unless the hardware is out; %s should be
+`invalid msg length', meaning some command was too short or too long.
+.TP
+command aborted (unknown subcode) (code 2, subcode %d)
+This should never occur, as the driver never aborts commands.
+.TP
+unit offline (%s) (code 3, subcode %d)
+The drive is offline, either because it is not around (`unknown
+drive'), stopped (`not mounted'), out of order (`inoperative'), has the
+same unit number as some other drive (`duplicate'), or has been
+disabled for diagnostics (`in diagnosis').
+.TP
+unit available (unknown subcode) (code 4, subcode %d)
+The controller has decided to report a perfectly normal event as
+an error.  (Why?)
+.TP
+media format error (%s) (code 5, subcode %d)
+The drive cannot be used without reformatting.  The Format Control
+Table cannot be read (`fct unread - edc'), there is a bad sector
+header (`invalid sector header'), the drive is not set for 512-byte
+sectors (`not 512 sectors'), the drive is not formatted (`not formatted'),
+or the FCT has an uncorrectable ECC error (`fct ecc').
+.TP
+write protected (%s) (code 6, subcode %d)
+The drive is write protected, either by the front panel switch
+(`hardware') or via the driver (`software').  The driver never
+sets software write protect.
+.TP
+compare error (unknown subcode) (code 7, subcode %d)
+A compare operation showed some sort of difference.  The driver
+never uses compare operations.
+.TP
+data error (%s) (code 8, subcode %d)
+Something went wrong reading or writing a data sector.  A `forced
+error' is a software-asserted error used to mark a sector that contains
+suspect data.  Rewriting the sector will clear the forced error.  This
+is normally set only during bad block replacement, and the driver does
+no bad block replacement, so these should not occur.  A `header
+compare' error probably means the block is shot.  A `sync timeout'
+presumably has something to do with sector synchronisation.
+An `uncorrectable ecc' error is an ordinary data error that cannot
+be fixed via ECC logic.  A `%d symbol ecc' error is a data error
+that can be (and presumably has been) corrected by the ECC logic.
+It might indicate a sector that is imperfect but usable, or that
+is starting to go bad.  If any of these errors recur, the sector
+may need to be replaced.
+.TP
+host buffer access error (%s) (code %d, subcode %d)
+Something went wrong while trying to copy data to or from the host
+(Vax).  The subcode is one of `odd xfer addr', `odd xfer count',
+`non-exist. memory', or `memory parity'.  The first two could be a
+software glitch; the last two indicate hardware problems.
+.TP
+controller error (%s) (code %d, subcode %d)
+The controller has detected a hardware error in itself.  A
+`serdes overrun' is a serialiser / deserialiser overrun; `edc'
+probably stands for `error detection code'; and `inconsistent
+internal data struct' is obvious.
+.TP
+drive error (%s) (code %d, subcode %d)
+Either the controller or the drive has detected a hardware error
+in the drive.  I am not sure what an `sdi command timeout' is, but
+these seem to occur benignly on occasion.  A `ctlr detected protocol'
+error means that the controller and drive do not agree on a protocol;
+this could be a cabling problem, or a version mismatch.  A `positioner'
+error means the drive seek hardware is ailing; `lost rd/wr ready'
+means the drive read/write logic is sick; and `drive clock dropout'
+means that the drive clock logic is bad, or the media is hopelessly
+scrambled.  I have no idea what `lost recvr ready' means.  A `drive
+detected error' is a catch-all for drive hardware trouble; `ctlr
+detected pulse or parity' errors are often caused by cabling problems.
+.RE
-- 
2.20.1