Commit | Line | Data |
---|---|---|
3922de1d C |
1 | |
2 | ||
3 | ||
4 | CRASH(8V) 1990 CRASH(8V) | |
5 | ||
6 | ||
7 | ||
8 | NAME | |
9 | crash - what happens when the system crashes | |
10 | ||
11 | DESCRIPTION | |
12 | This section explains what happens when the system crashes | |
13 | and (very briefly) how to analyze crash dumps. | |
14 | ||
15 | When the system crashes voluntarily it prints a message of | |
16 | the form | |
17 | ||
18 | panic: why i gave up the ghost | |
19 | ||
20 | on the console, takes a dump on a mass storage peripheral, | |
21 | and then invokes an automatic reboot procedure as described | |
22 | in reboot(8). (If auto-reboot is disabled on the front | |
23 | panel of the machine the system will simply halt at this | |
24 | point.) Unless some unexpected inconsistency is encountered | |
25 | in the state of the file systems due to hardware or software | |
26 | failure, the system will then resume multi-user operations. | |
27 | ||
28 | The system has a large number of internal consistency | |
29 | checks; if one of these fails, then it will panic with a | |
30 | very short message indicating which one failed. In many | |
31 | instances, this will be the name of the routine which | |
32 | detected the error, or a two-word description of the incon- | |
33 | sistency. A full understanding of most panic messages | |
34 | requires perusal of the source code for the system. | |
35 | ||
36 | The most common cause of system failures is hardware | |
37 | failure, which can reflect itself in different ways. Here | |
38 | are the messages which are most likely, with some hints as | |
39 | to causes. Left unstated in all cases is the possibility | |
40 | that hardware or software error produced the message in some | |
41 | unexpected way. | |
42 | ||
43 | iinit | |
44 | This cryptic panic message results from a failure to | |
45 | mount the root filesystem during the bootstrap process. | |
46 | Either the root filesystem has been corrupted, or the | |
47 | system is attempting to use the wrong device as root | |
48 | filesystem. Usually, an alternate copy of the system | |
49 | binary or an alternate root filesystem can be used to | |
50 | bring up the system to investigate. | |
51 | ||
52 | Can't exec /sbin/init | |
53 | This is not a panic message, as reboots are likely to | |
54 | be futile. Late in the bootstrap procedure, the system | |
55 | was unable to locate and execute the initialization | |
56 | process, init(8). The root filesystem is incorrect or | |
57 | has been corrupted, or the mode or type of /sbin/init | |
58 | forbids execution. | |
59 | ||
60 | ||
61 | ||
62 | ||
63 | Printed 7/27/90 June 1 | |
74 | IO err in push | |
75 | hard IO err in swap | |
76 | The system encountered an error trying to write to the | |
77 | paging device or an error in reading critical informa- | |
78 | tion from a disk drive. The offending disk should be | |
79 | fixed if it is broken or unreliable. | |
80 | ||
81 | realloccg: bad optim | |
82 | ialloc: dup alloc | |
83 | alloccgblk: cyl groups corrupted | |
84 | ialloccg: map corrupted | |
85 | free: freeing free block | |
86 | free: freeing free frag | |
87 | ifree: freeing free inode | |
88 | alloccg: map corrupted | |
89 | These panic messages are among those that may be pro- | |
90 | duced when filesystem inconsistencies are detected. | |
91 | The problem generally results from a failure to repair | |
92 | damaged filesystems after a crash, hardware failures, | |
93 | or other condition that should not normally occur. A | |
94 | filesystem check will normally correct the problem. | |
95 | ||
96 | timeout table overflow | |
97 | This really shouldn't be a panic, but until the data | |
98 | structure involved is made to be extensible, running | |
99 | out of entries causes a crash. If this happens, make | |
100 | the timeout table bigger. | |
101 | ||
102 | KSP not valid | |
103 | SBI fault | |
104 | CHM? in kernel | |
105 | These indicate either a serious bug in the system or, | |
106 | more often, a glitch or failing hardware. If SBI | |
107 | faults recur, check out the hardware or call field ser- | |
108 | vice. If the other faults recur, there is likely a bug | |
109 | somewhere in the system, although these can be caused | |
110 | by a flakey processor. Run processor microdiagnostics. | |
111 | ||
112 | machine check %x: | |
113 | description | |
114 | ||
115 | machine dependent machine-check information | |
116 | Machine checks are different on each type of CPU. Most | |
117 | of the internal processor registers are saved at the | |
118 | time of the fault and are printed on the console. For | |
119 | most processors, there is one line that summarizes the | |
120 | type of machine check. Often, the nature of the problem | |
121 | is apparent from this message and/or the contents | |
122 | of key registers. The VAX Hardware Handbook should be | |
123 | consulted, and, if necessary, your friendly field ser- | |
124 | vice people should be informed of the problem. | |
125 | ||
140 | trap type %d, code=%x, pc=%x | |
141 | An unexpected trap has occurred within the system; the | |
142 | trap types are: | |
143 | ||
144 | 0 reserved addressing fault | |
145 | 1 privileged instruction fault | |
146 | 2 reserved operand fault | |
147 | 3 bpt instruction fault | |
148 | 4 xfc instruction fault | |
149 | 5 system call trap | |
150 | 6 arithmetic trap | |
151 | 7 ast delivery trap | |
152 | 8 segmentation fault | |
153 | 9 protection fault | |
154 | 10 trace trap | |
155 | 11 compatibility mode fault | |
156 | 12 page fault | |
157 | 13 page table fault | |
158 | ||
159 | The favorite trap types in system crashes are trap | |
160 | types 8 and 9, indicating a wild reference. The code | |
161 | is the referenced address, and the pc at the time of | |
162 | the fault is printed. These problems tend to be easy | |
163 | to track down if they are kernel bugs since the proces- | |
164 | sor stops cold, but random flakiness seems to cause | |
165 | this sometimes. The debugger can be used to locate the | |
166 | instruction and subroutine corresponding to the PC | |
167 | value. If that is insufficient to suggest the nature | |
168 | of the problem, more detailed examination of the system | |
169 | status at the time of the trap usually can produce an | |
170 | explanation. | |
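The trap-type table above lends itself to a small lookup when scanning saved console output. The following is a minimal sketch in Python and is not part of the system: the names TRAP_TYPES, TRAP_LINE, and decode_trap are hypothetical, and it assumes the console line matches the "trap type %d, code=%x, pc=%x" format exactly, with the hexadecimal fields printed without a 0x prefix.

```python
import re

# Trap type numbers and names, copied from the table in this manual page.
TRAP_TYPES = {
    0: "reserved addressing fault",
    1: "privileged instruction fault",
    2: "reserved operand fault",
    3: "bpt instruction fault",
    4: "xfc instruction fault",
    5: "system call trap",
    6: "arithmetic trap",
    7: "ast delivery trap",
    8: "segmentation fault",
    9: "protection fault",
    10: "trace trap",
    11: "compatibility mode fault",
    12: "page fault",
    13: "page table fault",
}

# Assumed console format: "trap type %d, code=%x, pc=%x".
TRAP_LINE = re.compile(
    r"trap type (\d+), code=([0-9a-fA-F]+), pc=([0-9a-fA-F]+)"
)


def decode_trap(line):
    """Return the decoded fields of a trap message, or None if no match."""
    match = TRAP_LINE.search(line)
    if match is None:
        return None
    trap = int(match.group(1))
    return {
        "type": trap,
        "name": TRAP_TYPES.get(trap, "unknown trap type"),
        "code": int(match.group(2), 16),  # the referenced address
        "pc": int(match.group(3), 16),    # pc at the time of the fault
        # Types 8 and 9 are the ones this page associates with a wild reference.
        "wild_reference": trap in (8, 9),
    }
```

The returned pc value is what one would then look up in the debugger to find the instruction and subroutine involved.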
171 | ||
172 | init died | |
173 | The system initialization process has exited. This is | |
174 | bad news, as no new users will then be able to log in. | |
175 | Rebooting is the only fix, so the system just does it | |
176 | right away. | |
177 | ||
178 | out of mbufs: map full | |
179 | The network has exhausted its private page map for net- | |
180 | work buffers. This usually indicates that buffers are | |
181 | being lost, and rather than allow the system to slowly | |
182 | degrade, it reboots immediately. The map may be made | |
183 | larger if necessary. | |
184 | ||
185 | That completes the list of panic types you are likely to | |
186 | see. | |
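As an illustration only, the messages and hints listed above can be collected into a lookup table for use when reading a saved console log. This is a sketch in Python under stated assumptions: the names PANIC_HINTS and hint_for are hypothetical, the hint strings paraphrase this page, and the optional "panic: " prefix is stripped before matching.

```python
# Hint strings paraphrase this manual page; the table itself is illustrative.
DISK_HINT = "error on the paging device or disk: fix the offending disk"
FSCK_HINT = "filesystem inconsistency: run a filesystem check"
HARDWARE_HINT = "system bug or failing hardware: check hardware, run microdiagnostics"

PANIC_HINTS = [
    ("iinit", "root filesystem could not be mounted: try an alternate system binary or root filesystem"),
    ("IO err in push", DISK_HINT),
    ("hard IO err in swap", DISK_HINT),
    ("realloccg: bad optim", FSCK_HINT),
    ("ialloc: dup alloc", FSCK_HINT),
    ("alloccgblk: cyl groups corrupted", FSCK_HINT),
    ("ialloccg: map corrupted", FSCK_HINT),
    ("free: freeing free block", FSCK_HINT),
    ("free: freeing free frag", FSCK_HINT),
    ("ifree: freeing free inode", FSCK_HINT),
    ("alloccg: map corrupted", FSCK_HINT),
    ("timeout table overflow", "make the timeout table bigger"),
    ("KSP not valid", HARDWARE_HINT),
    ("SBI fault", HARDWARE_HINT),
    ("CHM? in kernel", HARDWARE_HINT),
    ("machine check", "consult the VAX Hardware Handbook or field service"),
    ("trap type", "unexpected trap: locate the pc with the debugger"),
    ("init died", "initialization process exited: the system reboots"),
    ("out of mbufs: map full", "network buffer map exhausted: the map may be made larger"),
]


def hint_for(console_line):
    """Return the hint for a console message, or None if it is not listed."""
    text = console_line.strip()
    if text.startswith("panic:"):
        text = text[len("panic:"):].strip()
    for prefix, hint in PANIC_HINTS:
        if text.startswith(prefix):
            return hint
    return None
```

Messages not listed on this page return None; as noted earlier, understanding those generally requires reading the system source.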
187 | ||
188 | When the system crashes it writes (or at least attempts to | |
189 | write) an image of memory into the back end of the dump dev- | |
190 | ice, usually the same as the primary swap area. After the | |
191 | system is rebooted, the program savecore(8) runs and | |
206 | preserves a copy of this core image and the current system | |
207 | in a specified directory for later perusal. See savecore(8) | |
208 | for details. | |
209 | ||
210 | To analyze a dump you should begin by running adb(1) with | |
211 | the -k flag on the system load image and core dump. If the | |
212 | core image is the result of a panic, the panic message is | |
213 | printed. Normally the command ``$c'' will provide a stack | |
214 | trace from the point of the crash and this will provide a | |
215 | clue as to what went wrong. A more complete discussion of | |
216 | system debugging is impossible here. See, however, ``Using | |
217 | ADB to Debug the UNIX Kernel''. | |
218 | ||
219 | SEE ALSO | |
220 | adb(1), reboot(8) | |
221 | VAX 11/780 System Maintenance Guide and VAX Hardware | |
222 | Handbook for more information about machine checks. | |
223 | Using ADB to Debug the UNIX Kernel | |