.TH CRASH 8V "1 September 1981"
.UC 4
.SH NAME
crash \- what happens when the system crashes
.SH DESCRIPTION
This section explains what happens when the system crashes and how
you can analyze crash dumps.
.PP
When the system crashes voluntarily it prints a message of the form
.IP
panic: why i gave up the ghost
.LP
on the console, takes a dump on a mass storage peripheral,
and then invokes an automatic reboot procedure as
described in
.IR reboot (8).
(If auto-reboot is disabled on the front panel of the machine the system
will simply halt at this point.)
Unless some unexpected inconsistency is encountered in the state
of the file systems due to hardware or software failure the system
will then resume multi-user operations.
.PP
The system has a large number of internal consistency checks; if one
of these fails, then it will panic with a very short message indicating
which one failed.
.PP
The most common cause of system failures is hardware failure, which
can manifest itself in different ways. Here are the messages which
you are likely to encounter, with some hints as to causes.
Left unstated in all cases is the possibility that hardware or software
error produced the message in some unexpected way.
.TP
.B IO err in push
.ns
.TP
.B hard IO err in swap
The system encountered an error trying to write to the paging device
or an error in reading critical information from a disk drive.
You should fix your disk if it is broken or unreliable.
.TP
.B timeout table overflow
.ns
This really shouldn't be a panic, but until we fix up the data structure
involved, running out of entries causes a crash. If this happens,
you should make the timeout table bigger.
.TP
.B KSP not valid
.ns
.TP
.B SBI fault
.ns
.TP
.B CHM? in kernel
These indicate either a serious bug in the system or, more often,
a glitch or failing hardware.
If SBI faults recur, check out the hardware or call
field service. If the other faults recur, there is likely a bug somewhere
in the system, although these can be caused by a flaky processor.
Run processor microdiagnostics.
.TP
.B machine check %x:
.I description
.ns
.TP
.I \0\0\0machine dependent machine-check information
.ns
We should describe machine checks, and will someday.
For now, ask someone who knows (like your friendly field service people).
.TP
.B trap type %d, code=%d, pc=%x
An unexpected trap has occurred within the system; the trap types are:
.sp
.nf
0	reserved addressing fault
1	privileged instruction fault
2	reserved operand fault
3	bpt instruction fault
4	xfc instruction fault
5	system call trap
6	arithmetic trap
7	ast delivery trap
8	segmentation fault
9	protection fault
10	trace trap
11	compatibility mode fault
12	page fault
13	page table fault
.fi
.sp
The favorite trap types in system crashes are trap types 8 and 9,
which indicate
a wild reference. The code is the referenced address, and the pc at the
time of the fault is printed. These problems tend to be easy to track
down if they are kernel bugs since the processor stops cold, but random
flakiness seems to cause this sometimes.
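.IP
As an aid to reading such a message, the trap type list above can be
turned into a small lookup table.
The following is only a sketch: the helper names
trap_name and parse_trap_line are hypothetical, the table is copied
from the list above rather than from the kernel sources, and the
parser assumes the console line has exactly the format shown in the
message heading.
.nf
```c
#include <stdio.h>

/* Trap type names, indexed by the number printed in the panic message.
 * Copied from the list in this manual page, not from kernel sources. */
static const char *const trap_names[] = {
    "reserved addressing fault",    /* 0 */
    "privileged instruction fault", /* 1 */
    "reserved operand fault",       /* 2 */
    "bpt instruction fault",        /* 3 */
    "xfc instruction fault",        /* 4 */
    "system call trap",             /* 5 */
    "arithmetic trap",              /* 6 */
    "ast delivery trap",            /* 7 */
    "segmentation fault",           /* 8 */
    "protection fault",             /* 9 */
    "trace trap",                   /* 10 */
    "compatibility mode fault",     /* 11 */
    "page fault",                   /* 12 */
    "page table fault",             /* 13 */
};

#define NTRAPS (sizeof(trap_names) / sizeof(trap_names[0]))

/* Hypothetical helper: map a trap type number to its name.
 * Numbers outside the documented range yield a fixed placeholder. */
const char *trap_name(int type)
{
    if (type < 0 || (size_t)type >= NTRAPS)
        return "unknown trap type";
    return trap_names[type];
}

/* Hypothetical helper: pull the three fields out of a console line of
 * the assumed form "trap type %d, code=%d, pc=%x".
 * Returns 1 if all three fields were read, 0 otherwise. */
int parse_trap_line(const char *line, int *type, int *code, unsigned *pc)
{
    return sscanf(line, "trap type %d, code=%d, pc=%x", type, code, pc) == 3;
}
```
.fi
.IP
Passing the text that follows ``panic: '' to parse_trap_line and then
handing the resulting type to trap_name gives the name of the trap;
for types 8 and 9 the code field is the wild address that was referenced.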
.TP
.B init died
The system initialization process has exited. This is bad news, as no new
users will then be able to log in. Rebooting is the only fix, so the
system just does it right away.
.PP
That completes the list of panic types you are likely to see.
.PP
When the system crashes it writes (or at least attempts to write)
an image of memory into the back end of the primary swap
area. After the system is rebooted, the program
.IR savecore (8)
runs and preserves a copy of this core image and the current
system in a specified directory for later perusal. See
.IR savecore (8)
for details.
.PP
To analyze a dump you should begin by running
.IR adb (1)
with the
.B \-k
flag on the core dump.
Normally the command
``*(intstack-4)$c''
will provide a stack trace from the point of
the crash and this will provide a clue as to
what went wrong.
A more complete discussion
of system debugging is impossible here.
See, however,
``Using ADB to Debug the UNIX Kernel''.
.SH "SEE ALSO"
adb(1),
analyze(8),
reboot(8),
savecore(8)
.br
.I "VAX 11/780 System Maintenance Guide"
for more information about machine checks.
.br
.I "Using ADB to Debug the UNIX Kernel"