Commit | Line | Data |
---|---|---|
6dc518d4 OB |
1 | .TH ANALYZE 1M |
2 | .UC | |
3 | .SH NAME | |
4 | analyze \- Virtual UNIX postmortem crash analyzer | |
5 | .SH SYNOPSIS | |
6 | .B analyze | |
7 | [ | |
8 | .B \-s | |
9 | swapfile | |
10 | ] [ | |
11 | .B \-fmdDv | |
12 | ] | |
13 | corefile | |
14 | [ system ] | |
15 | .SH DESCRIPTION | |
16 | .I Analyze | |
17 | is the post-mortem analyzer for the state of the paging system. | |
18 | The following procedure should be followed when the paging system | |
19 | crashes: | |
20 | .IP 1) | |
21 | Following normal crash dump procedures, dump the PSL, registers | |
22 | and the top 40 or so locations of the kernel stack on the console, | |
23 | and then take a core image dump on tape. | |
24 | .IP 2) | |
25 | Boot a version of the system which pages or swaps in an area distinct | |
26 | from the paging system. | |
27 | .IP 3) | |
28 | Fix the root file system and then copy the mag tape into the file | |
29 | .I /vmcore | |
30 | (cp will do fine). | |
31 | .IP 4) | |
32 | Run the command ``analyze \-s /dev/drum /vmcore /vmunix'' and save | |
33 | the output for a systems programmer. | |
34 | .IP 5) | |
35 | Follow the normal reboot procedure to resume operations. | |
36 | .IP 6) | |
37 | When the system is up again, run the command ``pstat \-pxk /vmcore /vmunix'' | |
38 | and route the output to a hardcopy device. | |
39 | .PP | |
40 | The outputs from the above procedure will get you started investigating | |
41 | the cause of the crash, if it was paging related. For details | |
42 | on the internal data structures of the system see the document | |
43 | .IR "Data Structures added in the Berkeley Virtual Memory Extensions to the UNIX System" . | |
44 | A listing of the system will also be handy while examining the core dump. | |
45 | It is suggested that you save the file | |
46 | .I /vmcore | |
47 | in a less volatile place if you want to make sure it is not clobbered. | |
48 | .PP | |
49 | The | |
50 | .I analyze | |
51 | program reads the relevant system data structures from the core | |
52 | image file and coordinates these with the information on the disk | |
53 | to determine the state of the paging subsystem at the point of crash. | |
54 | It looks at each process in the system and the resources each is | |
55 | using, in an attempt to detect inconsistencies in the paging system | |
56 | state. Normally, the output consists of a sequence of lines showing | |
57 | each active process, its state (whether swapped in or not), its | |
58 | .I p0br, | |
59 | and the number and location of its page table pages. | |
60 | Any pages which are locked while raw i/o is in progress, or which | |
61 | are locked because they are | |
62 | .I intransit | |
63 | are also printed. (Intransit text pages often diagnose as duplicated; | |
64 | you will have to weed these out by hand.) | |
65 | .PP | |
66 | The program checks that any pages in core which are marked as not | |
67 | modified are, in fact, identical to the swap space copies. | |
68 | It also checks for non-overlap of the swap space, and that the core | |
69 | map entries correspond to the page tables. | |
70 | The state of the free list is also checked. | |
71 | .PP | |
72 | Options to | |
73 | .I analyze | |
74 | include | |
75 | .B \-m | |
76 | which causes the entire coremap state to be dumped, | |
77 | .B \-D | |
78 | which causes the diskmap for each process to be printed, | |
79 | .B \-d | |
80 | which causes the (sorted) paging area usage to be printed, | |
81 | .B \-v | |
82 | (long unused) which causes a hugely verbose output format to be used, | |
83 | and | |
84 | .B \-f | |
85 | which causes the free list to be dumped. | |
86 | .PP | |
87 | In general, the output from this program can be confused by processes | |
88 | which were forking, swapping, or exiting, or which | |
89 | happened to be in unusual states when the | |
90 | crash occurred. You should examine the flags field of the output of | |
91 | .I pstat | |
92 | to help filter out such processes. | |
93 | .PP | |
94 | You can look at the core dump with | |
95 | .I adb | |
96 | if you do | |
97 | .IP | |
98 | adb /vmunix /vmcore | |
99 | .br | |
100 | .lg 0 | |
101 | /m 80000000 #ffffffff | |
102 | .LP | |
103 | which fixes the map of | |
104 | .I vmcore | |
105 | so that symbols in data space will work. | |
106 | It will be necessary to do | |
107 | .IP | |
108 | .lg 0 | |
109 | ?m 80000000 #ffffffff | |
110 | .LP | |
111 | to get text symbols to work also. | |
112 | You can then look at the | |
113 | .I Umap | |
114 | array to see the | |
115 | .I u. | |
116 | pages of the process which was running at the point of the crash. | |
117 | Find the first number in the output of | |
118 | .I pstat. | |
119 | You should be able to find the kernel stack at the point of crash | |
120 | either as the stack in the | |
121 | .I u. | |
122 | area for this process or in the interrupt stack | |
123 | .I intstack. | |
124 | .PP | |
125 | Note that the debugger is looking at the physical memory at the point | |
126 | of crash; you will have to determine which pages of physical memory | |
127 | the stack is in if you wish to look at it by computing | |
128 | 0x80000000 + Umap[i] * 0x200. | |
129 | A similar computation will give addresses of page table pages. Thus | |
130 | if | |
131 | .I analyze | |
132 | says that a process's page tables are in page 218 (hex of course), then | |
133 | you can look at them by looking at address 0x80043000 in the dump, i.e. | |
134 | ``80043000,80/X'' will print the page of page tables. | |
135 | .SH FILES | |
136 | /vmunix default system namelist | |
137 | .SH SEE ALSO | |
138 | ``Data Structures added in the Berkeley Virtual Memory Extensions to the UNIX System'', | |
139 | crash(8), pstat(1m), ps(1) | |
140 | .SH AUTHORS | |
141 | Ozalp Babaoglu and William Joy | |
142 | .SH DIAGNOSTICS | |
143 | Various diagnostics about overlaps in swap mappings, missing swap mappings, | |
144 | page table entries inconsistent with the core map, incore pages which | |
145 | are marked clean but differ from disk-image copies, pages which are | |
146 | locked or intransit, and inconsistencies in the free list. | |
147 | .PP | |
148 | It would be nice if this program analyzed the system in general, rather | |
149 | than just the paging system in particular. |
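
The address arithmetic described in the page (0x80000000 + page * 0x200) is easy to get wrong by hand, so here is a minimal sketch of it in Python. The function and constant names are hypothetical and are not part of analyze or adb. The count of 80 is taken from the example in the text and is assumed to be a hexadecimal number of longwords, which would cover one 512-byte page.

```python
# Illustrative helper only; names are hypothetical, not part of analyze or adb.
KERNEL_BASE = 0x80000000  # offset added to physical addresses, per the text
PAGE_SIZE = 0x200         # 512-byte pages, per the text


def page_to_dump_address(page_number: int) -> int:
    """Return the address at which a physical page appears in the dump."""
    if page_number < 0:
        raise ValueError("page number must be non-negative")
    return KERNEL_BASE + page_number * PAGE_SIZE


def adb_page_command(page_number: int, count: int = 0x80) -> str:
    """Build an adb request of the form 'address,count/X' for one page.

    Assumption: adb reads both numbers as hex, so they are printed without
    a 0x prefix, matching the example given in the manual page.
    """
    return f"{page_to_dump_address(page_number):x},{count:x}/X"


if __name__ == "__main__":
    # Page 218 (hex) is the example used in the manual page.
    print(adb_page_command(0x218))  # 80043000,80/X
```

The same function applies to an entry of the Umap array: pass Umap[i] as the page number to get the address of the corresponding page of the u. area.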