.\" Copyright (c) 1982 The Regents of the University of California.
.\" All rights reserved.
.\"
.\" %sccs.include.redist.roff%
.\"
.\" @(#)3.t 4.5 (Berkeley) %G%
.\"
.ds RH Fixing corrupted file systems
.NH
Fixing corrupted file systems
.PP
A file system
can become corrupted in several ways.
The most common of these ways are
improper shutdown procedures
and hardware failures.
.PP
File systems may become corrupted during an
.I "unclean halt" .
This happens when proper shutdown
procedures are not observed,
when a mounted file system is physically write-protected,
or when a mounted file system is taken off-line.
The most common operator procedural failure is forgetting to
.I sync
the system before halting the CPU.
.PP
File systems may become further corrupted if proper startup
procedures are not observed, e.g.,
not checking a file system for inconsistencies,
and not repairing inconsistencies.
Allowing a corrupted file system to be used (and, thus, to be modified
further) can be disastrous.
.PP
Any piece of hardware can fail at any time.
Failures
can be as subtle as a bad block
on a disk pack, or as blatant as a non-functional disk-controller.
.NH 2
Detecting and correcting corruption
.PP
Normally
.I fsck
is run non-interactively.
In this mode it will only fix
corruptions that are expected to occur from an unclean halt.
These actions are a proper subset of the actions that
.I fsck
will take when it is running interactively.
Throughout this paper we assume that
.I fsck
is being run interactively,
and all possible errors can be encountered.
When an inconsistency is discovered in this mode,
.I fsck
reports the inconsistency so that the operator can
choose a corrective action.
.PP
A quiescent\(dd
.FS
\(dd I.e., unmounted and not being written on.
.FE
file system may be checked for structural integrity
by performing consistency checks on the
redundant data intrinsic to a file system.
The redundant data is either read from
the file system,
or computed from other known values.
The file system
.B must
be in a quiescent state when
.I fsck
is run,
since
.I fsck
is a multi-pass program.
.PP
In the following sections,
we discuss methods to discover inconsistencies
and possible corrective actions
for the cylinder group blocks, the inodes, the indirect blocks, and
the data blocks containing directory entries.
.NH 2
Super-block checking
.PP
The most commonly corrupted item in a file system
is the summary information
associated with the super-block.
The summary information is prone to corruption
because it is modified with every change to the file
system's blocks or inodes,
and is usually corrupted
after an unclean halt.
.PP
The super-block is checked for inconsistencies
involving file-system size, number of inodes,
free-block count, and the free-inode count.
The file-system size must be larger than the
number of blocks used by the super-block
and the number of blocks used by the list of inodes.
The file-system size and layout information
are the most critical pieces of information for
.I fsck .
While there is no way to actually check these sizes,
since they are statically determined by
.I newfs ,
.I fsck
can check that these sizes are within reasonable bounds.
All other file system checks require that these sizes be correct.
If
.I fsck
detects corruption in the static parameters of the default super-block,
.I fsck
asks the operator to specify the location of an alternate super-block.
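.PP
The size bound described above can be sketched as follows.
This is an illustrative Python model, not the
.I fsck
source: the function name and its parameters are hypothetical,
and the sketch assumes the bound applies to the sum of the
super-block and inode-list block counts.

```python
def superblock_sizes_plausible(fs_blocks, sb_blocks, inode_list_blocks):
    """Return True if the recorded sizes are within reasonable bounds.

    All three arguments are block counts (hypothetical names).
    """
    # Negative counts can never be valid.
    if min(fs_blocks, sb_blocks, inode_list_blocks) < 0:
        return False
    # The file system must be larger than the space taken by the
    # super-block and the list of inodes.
    return fs_blocks > sb_blocks + inode_list_blocks
```

A check that fails here would be the point at which the operator
is asked for an alternate super-block.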
.NH 2
Free block checking
.PP
.I Fsck
checks that all the blocks
marked as free in the cylinder group block maps
are not claimed by any files.
When all the blocks have been initially accounted for,
.I fsck
checks that
the number of free blocks
plus the number of blocks claimed by the inodes
equals the total number of blocks in the file system.
.PP
If anything is wrong with the block allocation maps,
.I fsck
will rebuild them,
based on the list it has computed of allocated blocks.
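.PP
A minimal sketch of the free-block accounting, under simplifying assumptions:
blocks are numbered from 0 to the total minus one,
every block is either free or claimed,
and the function names are hypothetical rather than taken from
.I fsck .

```python
def check_free_map(total_blocks, free_map, claimed):
    """Return (conflicts, balanced) for a toy block-allocation model.

    free_map: block numbers marked free in the allocation maps.
    claimed:  block numbers claimed by inodes.
    """
    free_map, claimed = set(free_map), set(claimed)
    # A block marked free must not also be claimed by a file.
    conflicts = sorted(free_map & claimed)
    # Free blocks plus claimed blocks must equal the total.
    balanced = len(free_map) + len(claimed) == total_blocks
    return conflicts, balanced


def rebuild_free_map(total_blocks, claimed):
    """Rebuild the free map from the computed list of allocated blocks."""
    return set(range(total_blocks)) - set(claimed)
```

When either test fails, the rebuilt map is derived only from the
blocks found to be allocated, never from the damaged map.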
.PP
The summary information associated with the super-block
counts the total number of free blocks within the file system.
.I Fsck
compares this count to the
number of free blocks it found within the file system.
If the two counts do not agree, then
.I fsck
replaces the incorrect count in the summary information
by the actual free-block count.
.PP
The summary information
counts the total number of free inodes within the file system.
.I Fsck
compares this count to the number
of free inodes it found within the file system.
If the two counts do not agree, then
.I fsck
replaces the incorrect count in the
summary information by the actual free-inode count.
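.PP
The two summary-count repairs follow the same pattern,
which can be sketched as below.
The dictionary keys and the function name are hypothetical,
chosen only for illustration.

```python
def reconcile_summary(summary, found_free_blocks, found_free_inodes):
    """Return (corrected_summary, changed_keys).

    summary: dict with hypothetical keys "free_blocks" and "free_inodes".
    The found_* arguments are the counts computed during the check.
    """
    corrected = dict(summary)
    changed = []
    for key, found in (("free_blocks", found_free_blocks),
                       ("free_inodes", found_free_inodes)):
        # The computed count always wins over the stored count.
        if corrected.get(key) != found:
            corrected[key] = found
            changed.append(key)
    return corrected, changed
```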
.NH 2
Checking the inode state
.PP
An individual inode is not as likely to be corrupted as
the allocation information.
However, because of the great number of active inodes,
a few of the inodes are usually corrupted.
.PP
The list of inodes in the file system
is checked sequentially starting with inode 2
(inode 0 marks unused inodes;
inode 1 is saved for future generations)
and progressing through the last inode in the file system.
The state of each inode is checked for
inconsistencies involving format and type,
link count,
duplicate blocks,
bad blocks,
and inode size.
.PP
Each inode contains a mode word.
This mode word describes the type and state of the inode.
Inodes must be one of six types:
regular inode, directory inode, symbolic link inode,
special block inode, special character inode, or socket inode.
Inodes may be found in one of three allocation states:
unallocated, allocated, and neither unallocated nor allocated.
This last state suggests an incorrectly formatted inode.
An inode can get in this state if
bad data is written into the inode list.
The only possible corrective action is for
.I fsck
to clear the inode.
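.PP
One way to picture the three allocation states is the sketch below.
It is an assumption-laden model:
it treats a mode word of zero as unallocated,
uses the type constants from the Python
.I stat
module as stand-ins for the on-disk type bits,
and accepts only the six types listed above.

```python
import stat

# The six inode types named in the text.
VALID_TYPES = {
    stat.S_IFREG, stat.S_IFDIR, stat.S_IFLNK,
    stat.S_IFBLK, stat.S_IFCHR, stat.S_IFSOCK,
}


def inode_state(mode):
    """Classify a mode word (assumption: 0 means unallocated)."""
    if mode == 0:
        return "unallocated"
    if stat.S_IFMT(mode) in VALID_TYPES:
        return "allocated"
    # Neither unallocated nor allocated: the inode must be cleared.
    return "clear"
```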
.NH 2
Inode links
.PP
Each inode counts the
total number of directory entries
linked to the inode.
.I Fsck
verifies the link count of each inode
by starting at the root of the file system,
and descending through the directory structure.
The actual link count for each inode
is calculated during the descent.
.PP
If the stored link count is non-zero and the actual
link count is zero,
then no directory entry appears for the inode.
If this happens,
.I fsck
will place the disconnected file in the
.I lost+found
directory.
If the stored and actual link counts are non-zero and unequal,
a directory entry may have been added or removed without the inode being
updated.
If this happens,
.I fsck
replaces the incorrect stored link count by the actual link count.
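.PP
The descent and the two repair cases can be sketched as follows.
The data layout (a dictionary from directory inode number to a list of
name and inode-number pairs), the function names,
and the use of inode 2 as the root are assumptions made for the sketch.

```python
from collections import Counter


def count_links(directories, root_ino=2):
    """Count directory entries per inode while descending from the root."""
    counts = Counter()
    seen, stack = set(), [root_ino]
    while stack:
        d = stack.pop()
        if d in seen:
            continue
        seen.add(d)
        for name, ino in directories.get(d, []):
            counts[ino] += 1
            # Descend into subdirectories, but not through "." or "..".
            if ino in directories and name not in (".", ".."):
                stack.append(ino)
    return counts


def check_link_counts(stored, actual):
    """Map inode number to the corrective action described in the text."""
    actions = {}
    for ino, stored_count in stored.items():
        found = actual.get(ino, 0)
        if stored_count != 0 and found == 0:
            actions[ino] = "reconnect in lost+found"
        elif stored_count != 0 and found != 0 and stored_count != found:
            actions[ino] = f"set link count to {found}"
    return actions
```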
.PP
Each inode contains a list,
or pointers to
lists (indirect blocks),
of all the blocks claimed by the inode.
Since indirect blocks are owned by an inode,
inconsistencies in indirect blocks directly
affect the inode that owns them.
.PP
.I Fsck
compares each block number claimed by an inode
against a list of already allocated blocks.
If another inode already claims a block number,
then the block number is added to a list of
.I "duplicate blocks" .
Otherwise, the list of allocated blocks
is updated to include the block number.
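.PP
A minimal sketch of this first pass,
assuming the block numbers claimed by each inode
(including those reached through indirect blocks)
have already been gathered into a dictionary.
The names are hypothetical.

```python
def find_duplicate_blocks(inode_blocks):
    """Return (allocated, duplicates).

    inode_blocks: dict mapping inode number -> list of claimed block numbers.
    allocated:    dict mapping block number -> first inode that claimed it.
    duplicates:   block numbers claimed again by a later inode.
    """
    allocated = {}
    duplicates = []
    for ino in sorted(inode_blocks):      # scan the inode list in order
        for blk in inode_blocks[ino]:
            if blk in allocated:
                duplicates.append(blk)    # already claimed by another inode
            else:
                allocated[blk] = ino
    return allocated, duplicates
```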
.PP
If there are any duplicate blocks,
.I fsck
will perform a partial second
pass over the inode list
to find the inode of the duplicated block.
The second pass is needed,
since without examining the files associated with
these inodes for correct content,
not enough information is available
to determine which inode is corrupted and should be cleared.
If this condition does arise
(only hardware failure will cause it),
then the inode with the earliest
modify time is usually incorrect,
and should be cleared.
If this happens,
.I fsck
prompts the operator to clear both inodes.
The operator must decide which one should be kept
and which one should be cleared.
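.PP
The second pass can be pictured as a scan that reports every inode
claiming a block on the duplicate list.
The sketch below is illustrative only;
the input layout and the function name are assumptions.

```python
def inodes_claiming(inode_blocks, duplicates):
    """Map each duplicated block number to the inodes that claim it.

    inode_blocks: dict mapping inode number -> list of claimed block numbers.
    duplicates:   block numbers found to be claimed more than once.
    """
    wanted = set(duplicates)
    claimants = {}
    for ino in sorted(inode_blocks):
        for blk in inode_blocks[ino]:
            if blk in wanted:
                claimants.setdefault(blk, []).append(ino)
    return claimants
```

The result only identifies the candidates;
choosing which inode to keep is left to the operator.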
.PP
.I Fsck
checks the range of each block number claimed by an inode.
If the block number is
lower than the first data block in the file system,
or greater than the last data block,
then the block number is a
.I "bad block number" .
Many bad blocks in an inode are usually caused by
an indirect block that was not written to the file system,
a condition which can only occur if there has been a hardware failure.
If an inode contains bad block numbers,
.I fsck
prompts the operator to clear it.
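.PP
The range test is simple enough to state directly.
In this sketch the function and parameter names are hypothetical.

```python
def bad_block_numbers(blocks, first_data_block, last_data_block):
    """Return the claimed block numbers that fall outside the data area."""
    return [b for b in blocks
            if b < first_data_block or b > last_data_block]
```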
.NH 2
Inode data size
.PP
Each inode contains a count of the number of data blocks
that it contains.
The number of actual data blocks
is the sum of the allocated data blocks
and the indirect blocks.
.I Fsck
computes the actual number of data blocks
and compares that block count against
the number of blocks the inode claims.
If an inode contains an incorrect count,
.I fsck
prompts the operator to fix it.
.PP
Each inode contains a thirty-two bit size field.
The size is the number of data bytes
in the file associated with the inode.
The consistency of the byte size field is roughly checked
by computing from the size field the maximum number of blocks
that should be associated with the inode,
and comparing that expected block count against
the actual number of blocks the inode claims.
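.PP
A rough sketch of the size check follows.
It is a simplification: it ignores indirect blocks and fragments,
the block size of 4096 bytes is an assumed default,
and the names are hypothetical.

```python
def size_consistent(size_bytes, claimed_data_blocks, block_size=4096):
    """Return True if the claimed block count fits the byte size."""
    # Ceiling division: the most data blocks a file of this size can need.
    expected_max = -(-size_bytes // block_size)
    return claimed_data_blocks <= expected_max
```

A file with holes may legitimately claim fewer blocks than the maximum,
which is why the comparison is an upper bound rather than an equality.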
.NH 2
Checking the data associated with an inode
.PP
An inode can directly or indirectly
reference three kinds of data blocks.
All referenced blocks must be the same kind.
The three types of data blocks are:
plain data blocks, symbolic link data blocks, and directory data blocks.
Plain data blocks
contain the information stored in a file;
symbolic link data blocks
contain the path name stored in a link.
Directory data blocks contain directory entries.
.I Fsck
can only check the validity of directory data blocks.
.PP
Each directory data block is checked for
several types of inconsistencies.
These inconsistencies include
directory inode numbers pointing to unallocated inodes,
directory inode numbers that are greater than
the number of inodes in the file system,
incorrect directory inode numbers for ``\fB.\fP'' and ``\fB..\fP'',
and directories that are not attached to the file system.
If the inode number in a directory data block
references an unallocated inode,
then
.I fsck
will remove that directory entry.
Again,
this condition can only arise when there has been a hardware failure.
.PP
If a directory entry inode number references
outside the inode list, then
.I fsck
will remove that directory entry.
This condition occurs if bad data is written into a directory data block.
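.PP
Both removal cases can be sketched with one filter.
The entry layout (name and inode-number pairs),
the set of allocated inode numbers,
and the function name are assumptions made for illustration.

```python
def entries_to_remove(entries, max_inode, allocated):
    """Return the directory entries that should be removed.

    entries:   list of (name, inode_number) pairs.
    max_inode: number of inodes in the file system.
    allocated: set of inode numbers known to be allocated.
    """
    return [(name, ino) for name, ino in entries
            if ino > max_inode or ino not in allocated]
```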
.PP
The directory inode number entry for ``\fB.\fP''
must be the first entry in the directory data block.
The inode number for ``\fB.\fP''
must reference itself;
i.e., it must equal the inode number
for the directory data block.
The directory inode number entry
for ``\fB..\fP'' must be
the second entry in the directory data block.
Its value must equal the inode number for the
parent of the directory entry
(or the inode number of the directory
data block if the directory is the
root directory).
If the directory inode numbers are
incorrect,
.I fsck
will replace them with the correct values.
If there are multiple hard links to a directory,
the first one encountered is considered the real parent
to which ``\fB..\fP'' should point;
\fIfsck\fP recommends deletion for the subsequently discovered names.
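.PP
The rules for the first two entries can be sketched as below.
The function reports which positions need to be rewritten
and with what values.
Its name, its arguments, and the use of inode 2 as the root
are assumptions of the sketch.

```python
def check_dot_entries(dir_ino, parent_ino, entries, root_ino=2):
    """Return a list of (position, name, correct_inode) repairs.

    entries: list of (name, inode_number) pairs in directory order.
    """
    # The root directory is its own parent.
    parent = dir_ino if dir_ino == root_ino else parent_ino
    expected = [(".", dir_ino), ("..", parent)]
    fixes = []
    for pos, (name, ino) in enumerate(expected):
        if pos >= len(entries) or entries[pos] != (name, ino):
            fixes.append((pos, name, ino))
    return fixes
```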
.NH 2
File system connectivity
.PP
.I Fsck
checks the general connectivity of the file system.
If directories are not linked into the file system, then
.I fsck
links the directory back into the file system in the
.I lost+found
directory.
This condition only occurs when there has been a hardware failure.
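.PP
The connectivity check amounts to a reachability search from the root.
A minimal sketch, assuming directories are held in a dictionary from
directory inode number to a list of name and inode-number pairs,
with inode 2 as the root; the names are hypothetical.

```python
def unreachable_directories(directories, root_ino=2):
    """Return directory inodes that cannot be reached from the root."""
    seen, stack = set(), [root_ino]
    while stack:
        d = stack.pop()
        if d in seen or d not in directories:
            continue
        seen.add(d)
        # Follow every entry except "." and "..".
        stack.extend(ino for name, ino in directories[d]
                     if name not in (".", ".."))
    return sorted(set(directories) - seen)
```

Each directory returned would be a candidate for relinking under
.I lost+found .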
.ds RH References
.SH
\s+2Acknowledgements\s0
.PP
I thank Bill Joy, Sam Leffler, Robert Elz and Dennis Ritchie
for their suggestions and help in implementing the new file system.
Thanks also to Robert Henry for his editorial input to
get this document together.
Finally we thank our sponsors,
the National Science Foundation under grant MCS80-05144,
and the Defense Advanced Research Projects Agency (DoD) under
Arpa Order No. 4031 monitored by Naval Electronic Systems Command under
Contract No. N00039-82-C-0235. (Kirk McKusick, July 1983)
.PP
I would like to thank Larry A. Wehr for advice that led
to the first version of
.I fsck
and Rick B. Brandt for adapting
.I fsck
to
UNIX/TS. (T. Kowalski, July 1979)
.sp 2
.SH
\s+2References\s0
.LP
.IP [Dolotta78] 20
Dolotta, T. A., and Olsson, S. B., eds.,
.I "UNIX User's Manual, Edition 1.1\^" ,
January 1978.
.IP [Joy83] 20
Joy, W., Cooper, E., Fabry, R., Leffler, S., McKusick, M., and Mosher, D.
4.2BSD System Manual,
.I "University of California at Berkeley" ,
.I "Computer Systems Research Group Technical Report"
#4, 1982.
.IP [McKusick84] 20
McKusick, M., Joy, W., Leffler, S., and Fabry, R.
A Fast File System for UNIX,
\fIACM Transactions on Computer Systems 2\fP, 3.
pp. 181-197, August 1984.
.IP [Ritchie78] 20
Ritchie, D. M., and Thompson, K.,
The UNIX Time-Sharing System,
.I "The Bell System Technical Journal"
.B 57 ,
6 (July-August 1978, Part 2), pp. 1905-29.
.IP [Thompson78] 20
Thompson, K.,
UNIX Implementation,
.I "The Bell System Technical Journal\^"
.B 57 ,
6 (July-August 1978, Part 2), pp. 1931-46.
.ds RH Appendix A \- Fsck Error Conditions
.bp