Commit | Line | Data |
---|---|---|
1 | .\" This module is believed to contain source code proprietary to AT&T. |
2 | .\" Use and redistribution is subject to the Berkeley Software License | |
3 | .\" Agreement and your Software Agreement with AT&T (Western Electric). | |
4 | .\" | |
5 | .\" @(#)lint.ms 6.2 (Berkeley) 4/17/91 | |
6 | .\" | |
7 | .EH 'PS1:9-%''Lint, a C Program Checker' |
8 | .OH 'Lint, a C Program Checker''PS1:9-%' | |
9 | .\".RP | |
10 | .ND "July 26, 1978" |
11 | .OK | |
12 | .\"Program Portability |
13 | .\"Strong Type Checking | |
14 | .TL |
15 | Lint, a C Program Checker | |
16 | .AU "MH 2C-559" 3968 | |
17 | S. C. Johnson | |
18 | .AI | |
19 | .MH | |
20 | .AB | |
21 | .PP | |
22 | .I Lint | |
23 | is a command which examines C source programs, | |
24 | detecting | |
25 | a number of bugs and obscurities. | |
26 | It enforces the type rules of C more strictly than | |
27 | the C compilers. | |
28 | It may also be used to enforce a number of portability | |
29 | restrictions involved in moving | |
30 | programs between different machines and/or operating systems. | |
31 | Another option detects a number of wasteful, or error prone, constructions | |
32 | which nevertheless are, strictly speaking, legal. | |
33 | .PP | |
34 | .I Lint | |
35 | accepts multiple input files and library specifications, and checks them for consistency. | |
36 | .PP | |
37 | The separation of function between | |
38 | .I lint | |
39 | and the C compilers has both historical and practical | |
40 | rationale. | |
41 | The compilers turn C programs into executable files rapidly | |
42 | and efficiently. | |
43 | This is possible in part because the | |
44 | compilers do not do sophisticated | |
45 | type checking, especially between | |
46 | separately compiled programs. | |
47 | .I Lint | |
48 | takes a more global, leisurely view of the program, | |
49 | looking much more carefully at the compatibilities. | |
50 | .PP | |
51 | This document discusses the use of | |
52 | .I lint , | |
53 | gives an overview of the implementation, and gives some hints on the | |
54 | writing of machine independent C code. | |
55 | .AE | |
56 | .CS 10 2 12 0 0 5 | |
57 | .SH | |
58 | Introduction and Usage | |
59 | .PP | |
60 | Suppose there are two C | |
61 | .[ | |
62 | Kernighan Ritchie Programming Prentice 1978 | |
63 | .] | |
64 | source files, | |
65 | .I file1.c | |
66 | and | |
67 | .I file2.c , | |
68 | which are ordinarily compiled and loaded together. | |
69 | Then the command | |
70 | .DS | |
71 | lint file1.c file2.c | |
72 | .DE | |
73 | produces messages describing inconsistencies and inefficiencies | |
74 | in the programs. | |
75 | The program enforces the typing rules of C | |
76 | more strictly than the C compilers | |
77 | (for both historical and practical reasons) | |
78 | enforce them. | |
79 | The command | |
80 | .DS | |
81 | lint \-p file1.c file2.c | |
82 | .DE | |
83 | will produce, in addition to the above messages, additional messages | |
84 | which relate to the portability of the programs to other operating | |
85 | systems and machines. | |
86 | Replacing the | |
87 | .B \-p | |
88 | by | |
89 | .B \-h | |
90 | will produce messages about various error-prone or wasteful constructions | |
91 | which, strictly speaking, are not bugs. | |
92 | Saying | |
93 | .B \-hp | |
94 | gets the whole works. | |
95 | .PP | |
96 | The next several sections describe the major messages; | |
97 | the document closes with sections | |
98 | discussing the implementation and giving suggestions | |
99 | for writing portable C. | |
100 | An appendix gives a summary of the | |
101 | .I lint | |
102 | options. | |
103 | .SH | |
104 | A Word About Philosophy | |
105 | .PP | |
106 | Many of the facts which | |
107 | .I lint | |
108 | needs may be impossible to | |
109 | discover. | |
110 | For example, whether a given function in a program ever gets called | |
111 | may depend on the input data. | |
112 | Deciding whether | |
113 | .I exit | |
114 | is ever called is equivalent to solving the famous ``halting problem,'' known to be | |
115 | recursively undecidable. | |
116 | .PP | |
117 | Thus, most of the | |
118 | .I lint | |
119 | algorithms are a compromise. | |
120 | If a function is never mentioned, it can never be called. | |
121 | If a function is mentioned, | |
122 | .I lint | |
123 | assumes it can be called; this is not necessarily so, but in practice is quite reasonable. | |
124 | .PP | |
125 | .I Lint | |
126 | tries to give information with a high degree of relevance. | |
127 | Messages of the form ``\fIxxx\fR might be a bug'' | |
128 | are easy to generate, but are acceptable only in proportion | |
129 | to the fraction of real bugs they uncover. | |
130 | If this fraction of real bugs is too small, the messages lose their credibility | |
131 | and serve merely to clutter up the output, | |
132 | obscuring the more important messages. | |
133 | .PP | |
134 | Keeping these issues in mind, we now consider in more detail | |
135 | the classes of messages which | |
136 | .I lint | |
137 | produces. | |
138 | .SH | |
139 | Unused Variables and Functions | |
140 | .PP | |
141 | As sets of programs evolve and develop, | |
142 | previously used variables and arguments to | |
143 | functions may become unused; | |
144 | it is not uncommon for external variables, or even entire | |
145 | functions, to become unnecessary, and yet | |
146 | not be removed from the source. | |
147 | These ``errors of commission'' rarely cause working programs to fail, but they are a source | |
148 | of inefficiency, and make programs harder to understand | |
149 | and change. | |
150 | Moreover, information about such unused variables and functions can occasionally | |
151 | serve to discover bugs; if a function does a necessary job, and | |
152 | is never called, something is wrong! | |
153 | .PP | |
154 | .I Lint | |
155 | complains about variables and functions which are defined but not otherwise | |
156 | mentioned. | |
157 | An exception is variables which are declared through explicit | |
158 | .B extern | |
159 | statements but are never referenced; thus the statement | |
160 | .DS | |
161 | extern float sin(\|); | |
162 | .DE | |
163 | will evoke no comment if | |
164 | .I sin | |
165 | is never used. | |
166 | Note that this agrees with the semantics of the C compiler. | |
167 | In some cases, these unused external declarations might be of some interest; they | |
168 | can be discovered by adding the | |
169 | .B \-x | |
170 | flag to the | |
171 | .I lint | |
172 | invocation. | |
173 | .PP | |
174 | Certain styles of programming | |
175 | require many functions to be written with similar interfaces; | |
176 | frequently, some of the arguments may be unused | |
177 | in many of the calls. | |
178 | The | |
179 | .B \-v | |
180 | option is available to suppress the printing of | |
181 | complaints about unused arguments. | |
182 | When | |
183 | .B \-v | |
184 | is in effect, no messages are produced about unused | |
185 | arguments except for those | |
186 | arguments which are unused and also declared as | |
187 | register arguments; this can be considered | |
188 | an active (and preventable) waste of the register | |
189 | resources of the machine. | |
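.PP
As an illustration of the programming style just described, the following
sketch defines two functions with the same argument list, one of which
ignores its second argument.
The names \fIhandler\fR, \fIdouble_it\fR, \fIadd_context\fR, and \fIapply\fR
are invented for this example and are not part of
.I lint .

```c
/* Hypothetical handler interface: every handler takes the same two arguments. */
typedef int (*handler)(int value, int context);

/* Uses only `value`; `context` is the kind of unused argument
 * that the -v option is described as keeping quiet about. */
int double_it(int value, int context)
{
    return value * 2;
}

/* Uses both arguments, so there is nothing unused to report here. */
int add_context(int value, int context)
{
    return value + context;
}

/* Calls whichever handler is passed in, always with both arguments. */
int apply(handler h, int value, int context)
{
    return h(value, context);
}
```

Because both functions must fit the same interface, the unused argument
in \fIdouble_it\fR cannot simply be deleted.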
190 | .PP | |
191 | There is one case where information about unused, or | |
192 | undefined, variables is more distracting | |
193 | than helpful. | |
194 | This is when | |
195 | .I lint | |
196 | is applied to some, but not all, files out of a collection | |
197 | which are to be loaded together. | |
198 | In this case, many of the functions and variables defined | |
199 | may not be used, and, conversely, | |
200 | many functions and variables defined elsewhere may be used. | |
201 | The | |
202 | .B \-u | |
203 | flag may be used to suppress the spurious messages which might otherwise appear. | |
204 | .SH | |
205 | Set/Used Information | |
206 | .PP | |
207 | .I Lint | |
208 | attempts to detect cases where a variable is used before it is set. | |
209 | This is very difficult to do well; | |
210 | many algorithms take a good deal of time and space, | |
211 | and still produce messages about perfectly valid programs. | |
212 | .I Lint | |
213 | detects local variables (automatic and register storage classes) | |
214 | whose first use appears physically earlier in the input file than the first assignment to the variable. | |
215 | It assumes that taking the address of a variable constitutes a ``use,'' since the actual use | |
216 | may occur at any later time, in a data dependent fashion. | |
217 | .PP | |
218 | The restriction to the physical appearance of variables in the file makes the | |
219 | algorithm very simple and quick to implement, | |
220 | since the true flow of control need not be discovered. | |
221 | It does mean that | |
222 | .I lint | |
223 | can complain about some programs which are legal, | |
224 | but these programs would probably be considered bad on stylistic grounds (e.g. might | |
225 | contain at least two \fBgoto\fR's). | |
226 | Because static and external variables are initialized to 0, | |
227 | no meaningful information can be discovered about their uses. | |
228 | The algorithm deals correctly, however, with initialized automatic variables, and variables | |
229 | which are used in the expression which first sets them. | |
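.PP
The following sketch shows a legal function in which the first use of a
local variable appears textually before its first assignment.
By the rule described above it would be expected to draw a complaint,
even though the assignment is always executed first.
The function name \fIsum_with_gotos\fR is invented for this example.

```c
/* Returns n.  Control reaches `init` before `add`, so `total` is always
 * set before it is read, but the read appears earlier in the file. */
int sum_with_gotos(int n)
{
    int total;
    int result;

    goto init;
add:
    result = total + n;   /* textually the first use of total */
    return result;
init:
    total = 0;            /* textually the first assignment, executed first */
    goto add;
}
```

Note that the function needs two \fBgoto\fR's to arrange this, which is
the stylistic objection raised above.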
230 | .PP | |
231 | The set/used information also permits recognition of those local variables which are set | |
232 | and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs. | |
233 | .SH | |
234 | Flow of Control | |
235 | .PP | |
236 | .I Lint | |
237 | attempts to detect unreachable portions of the programs which it processes. | |
238 | It will complain about unlabeled statements immediately following | |
239 | \fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements. | |
240 | An attempt is made to detect loops which can never be left at the bottom, detecting the | |
241 | special cases | |
242 | \fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops. | |
243 | .I Lint | |
244 | also complains about loops which cannot be entered at the top; | |
245 | some valid programs may have such loops, but at best they are bad style, | |
246 | at worst bugs. | |
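.PP
A small sketch of the first kind of unreachable code mentioned above,
an unlabeled statement immediately following a \fBreturn\fR.
The function \fIfind_first_negative\fR is invented for this example.

```c
/* Returns the index of the first negative element of v, or -1 if none. */
int find_first_negative(const int *v, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (v[i] < 0) {
            return i;
            i = n;   /* never executed: unlabeled statement after return */
        }
    }
    return -1;
}
```

The function still behaves correctly; the dead assignment is simply
clutter of the sort the flow of control check is meant to expose.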
247 | .PP | |
248 | .I Lint | |
249 | has an important area of blindness in the flow of control algorithm: | |
250 | it has no way of detecting functions which are called and never return. | |
251 | Thus, a call to | |
252 | .I exit | |
253 | may cause unreachable code which | |
254 | .I lint | |
255 | does not detect; the most serious effects of this are in the | |
256 | determination of returned function values (see the next section). | |
257 | .PP | |
258 | One form of unreachable statement is not usually complained about by | |
259 | .I lint ; | |
260 | a | |
261 | .B break | |
262 | statement that cannot be reached causes no message. | |
263 | Programs generated by | |
264 | .I yacc , | |
265 | .[ | |
266 | Johnson Yacc 1975 | |
267 | .] | |
268 | and especially | |
269 | .I lex , | |
270 | .[ | |
271 | Lesk Lex | |
272 | .] | |
273 | may have literally hundreds of unreachable | |
274 | .B break | |
275 | statements. | |
276 | The | |
277 | .B \-O | |
278 | flag in the C compiler will often eliminate the resulting object code inefficiency. | |
279 | Thus, these unreached statements are of little importance, | |
280 | there is typically nothing the user can do about them, and the | |
281 | resulting messages would clutter up the | |
282 | .I lint | |
283 | output. | |
284 | If these messages are desired, | |
285 | .I lint | |
286 | can be invoked with the | |
287 | .B \-b | |
288 | option. | |
289 | .SH | |
290 | Function Values | |
291 | .PP | |
292 | Sometimes functions return values which are never used; | |
293 | sometimes programs incorrectly use function ``values'' | |
294 | which have never been returned. | |
295 | .I Lint | |
296 | addresses this problem in a number of ways. | |
297 | .PP | |
298 | Locally, within a function definition, | |
299 | the appearance of both | |
300 | .DS | |
301 | return( \fIexpr\fR ); | |
302 | .DE | |
303 | and | |
304 | .DS | |
305 | return ; | |
306 | .DE | |
307 | statements is cause for alarm; | |
308 | .I lint | |
309 | will give the message | |
310 | .DS | |
311 | function \fIname\fR contains return(e) and return | |
312 | .DE | |
313 | The most serious difficulty with this is detecting when a function return is implied | |
314 | by flow of control reaching the end of the function. | |
315 | This can be seen with a simple example: | |
316 | .DS | |
317 | .ta .5i 1i 1.5i | |
318 | \fRf ( a ) { | |
319 | if ( a ) return ( 3 ); | |
320 | g (\|); | |
321 | } | |
322 | .DE | |
323 | Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return | |
324 | with no defined return value; this will trigger a complaint from | |
325 | .I lint . | |
326 | If \fIg\fR, like \fIexit\fR, never returns, | |
327 | the message will still be produced when in fact nothing is wrong. | |
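.PP
One way to remove the cause of the complaint in the example above is to
end every path with an explicit value, as in the sketch below.
The body of \fIg\fR and the value 0 returned on the second path are
assumptions made for illustration; the original example specifies neither.

```c
/* Stand-in for the g of the example; assumed here to return normally. */
void g(void)
{
}

/* Every path now ends in return(expr), so the function no longer mixes
 * a valued return with falling off the end. */
int f(int a)
{
    if (a)
        return 3;
    g();
    return 0;   /* assumed default value; not given in the original */
}
```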
328 | .PP | |
329 | In practice, some potentially serious bugs have been discovered by this feature; | |
330 | it also accounts for a substantial fraction of the ``noise'' messages produced | |
331 | by | |
332 | .I lint . | |
333 | .PP | |
334 | On a global scale, | |
335 | .I lint | |
336 | detects cases where a function returns a value, but this value is sometimes, | |
337 | or always, unused. | |
338 | When the value is always unused, it may constitute an inefficiency in the function definition. | |
339 | When the value is sometimes unused, it may represent bad style (e.g., not testing for | |
340 | error conditions). | |
341 | .PP | |
342 | The dual problem, using a function value when the function does not return one, | |
343 | is also detected. | |
344 | This is a serious problem. | |
345 | Amazingly, this bug has been observed on a couple of occasions | |
346 | in ``working'' programs; the desired function value just happened to have been computed | |
347 | in the function return register! | |
348 | .SH | |
349 | Type Checking | |
350 | .PP | |
351 | .I Lint | |
352 | enforces the type checking rules of C more strictly than the compilers do. | |
353 | The additional checking is in four major areas: | |
354 | across certain binary operators and implied assignments, | |
355 | at the structure selection operators, | |
356 | between the definition and uses of functions, | |
357 | and in the use of enumerations. | |
358 | .PP | |
359 | There are a number of operators which have an implied balancing between types of the operands. | |
360 | The assignment, conditional ( ?\|: ), and relational operators | |
361 | have this property; the argument | |
362 | of a \fBreturn\fR statement, | |
363 | and expressions used in initialization also suffer similar conversions. | |
364 | In these operations, | |
365 | \fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed. | |
366 | The types of pointers must agree exactly, | |
367 | except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's. | |
368 | .PP | |
369 | The type checking rules also require that, in structure references, the | |
370 | left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR | |
371 | be a structure, and the right operand of these operators be a member | |
372 | of the structure implied by the left operand. | |
373 | Similar checking is done for references to unions. | |
374 | .PP | |
375 | Strict rules apply to function argument and return value | |
376 | matching. | |
377 | The types \fBfloat\fR and \fBdouble\fR may be freely matched, | |
378 | as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR. | |
379 | Also, pointers can be matched with the associated arrays. | |
380 | Aside from this, all actual arguments must agree in type with their declared counterparts. | |
381 | .PP | |
382 | With enumerations, checks are made that enumeration variables or members are not mixed | |
383 | with other types, or other enumerations, | |
384 | and that the only operations applied are =, initialization, ==, !=, and function arguments and return values. | |
385 | .SH | |
386 | Type Casts | |
387 | .PP | |
388 | The type cast feature in C was introduced largely as an aid | |
389 | to producing more portable programs. | |
390 | Consider the assignment | |
391 | .DS | |
392 | p = 1 ; | |
393 | .DE | |
394 | where | |
395 | .I p | |
396 | is a character pointer. | |
397 | .I Lint | |
398 | will quite rightly complain. | |
399 | Now, consider the assignment | |
400 | .DS | |
401 | p = (char \(**)1 ; | |
402 | .DE | |
403 | in which a cast has been used to | |
404 | convert the integer to a character pointer. | |
405 | The programmer obviously had a strong motivation | |
406 | for doing this, and has clearly signaled his intentions. | |
407 | It seems harsh for | |
408 | .I lint | |
409 | to continue to complain about this. | |
410 | On the other hand, if this code is moved to another | |
411 | machine, such code should be looked at carefully. | |
412 | The | |
413 | .B \-c | |
414 | flag controls the printing of comments about casts. | |
415 | When | |
416 | .B \-c | |
417 | is in effect, casts are treated as though they were assignments | |
418 | subject to complaint; otherwise, all legal casts are passed without comment, | |
419 | no matter how strange the type mixing seems to be. | |
420 | .SH | |
421 | Nonportable Character Use | |
422 | .PP | |
423 | On the PDP-11, characters are signed quantities, with a range | |
424 | from \-128 to 127. | |
425 | On most of the other C implementations, characters take on only positive | |
426 | values. | |
427 | Thus, | |
428 | .I lint | |
429 | will flag certain comparisons and assignments as being | |
430 | illegal or nonportable. | |
431 | For example, the fragment | |
432 | .DS | |
433 | char c; | |
434 | ... | |
435 | if( (c = getchar(\|)) < 0 ) .... | |
436 | .DE | |
437 | works on the PDP-11, but | |
438 | will fail on machines where characters always take | |
439 | on positive values. | |
440 | The real solution is to declare | |
441 | .I c | |
442 | an integer, since | |
443 | .I getchar | |
444 | is actually returning | |
445 | integer values. | |
446 | In any case, | |
447 | .I lint | |
448 | will say | |
449 | ``nonportable character comparison''. | |
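.PP
The remedy described above, declaring the variable as an integer, can be
sketched as follows.
The sketch reads from an arbitrary stream with \fIgetc\fR rather than
from the standard input with \fIgetchar\fR, and the function name
\fIcount_chars\fR is invented for this example.

```c
#include <stdio.h>

/* Counts the bytes remaining on fp. */
long count_chars(FILE *fp)
{
    int c;        /* int, not char: must hold every byte value and EOF */
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

Because \fIc\fR is an integer, the end of file test does not depend on
whether characters are signed on the machine at hand.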
450 | .PP | |
451 | A similar issue arises with bitfields; when assignments | |
452 | of constant values are made to bitfields, the field may | |
453 | be too small to hold the value. | |
454 | This is especially true because | |
455 | on some machines bitfields are considered as signed | |
456 | quantities. | |
457 | While it may seem unintuitive to consider | |
458 | that a two bit field declared of type | |
459 | .B int | |
460 | cannot hold the value 3, the problem disappears | |
461 | if the bitfield is declared to have type | |
462 | .B unsigned . | |
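.PP
A sketch of the unsigned declaration suggested above.
The structure \fImode_bits\fR and the function \fIstore_mode\fR are
invented for this example.

```c
struct mode_bits {
    unsigned int mode : 2;   /* two-bit unsigned field: holds 0 through 3 */
};

/* Stores v in the field and returns what the field then contains. */
unsigned int store_mode(unsigned int v)
{
    struct mode_bits b;

    b.mode = v;   /* values too large for the field are reduced modulo 4 */
    return b.mode;
}
```

With the field declared \fBunsigned\fR, the value 3 is stored and read
back unchanged.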
463 | .SH | |
464 | Assignments of longs to ints | |
465 | .PP | |
466 | Bugs may arise from the assignment of | |
467 | .B long | |
468 | to | |
469 | an | |
470 | .B int , | |
471 | which loses accuracy. | |
472 | This may happen in programs | |
473 | which have been incompletely converted to use | |
474 | .B typedefs . | |
475 | When a | |
476 | .B typedef | |
477 | variable | |
478 | is changed from \fBint\fR to \fBlong\fR, | |
479 | the program can stop working because | |
480 | some intermediate results may be assigned | |
481 | to \fBints\fR, losing accuracy. | |
482 | Since there are a number of legitimate reasons for | |
483 | assigning \fBlongs\fR to \fBints\fR, the detection | |
484 | of these assignments is enabled | |
485 | by the | |
486 | .B \-a | |
487 | flag. | |
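.PP
Where a \fBlong\fR really must be stored in an \fBint\fR, the loss of
accuracy can be guarded against explicitly.
The following is one possible sketch; the function name
\fIlong_to_int\fR is invented for this example and is not something
.I lint
provides.

```c
#include <limits.h>

/* Stores v in *out and returns 1 if v fits in an int;
 * otherwise returns 0 and leaves *out untouched. */
int long_to_int(long v, int *out)
{
    if (v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int)v;   /* the cast marks the narrowing as deliberate */
    return 1;
}
```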
488 | .SH | |
489 | Strange Constructions | |
490 | .PP | |
491 | Several perfectly legal, but somewhat strange, constructions | |
492 | are flagged by | |
493 | .I lint ; | |
494 | the messages are meant to encourage better code quality and clearer style, and | |
495 | may even point out bugs. | |
496 | The | |
497 | .B \-h | |
498 | flag is used to enable these checks. | |
499 | For example, in the statement | |
500 | .DS | |
501 | \(**p++ ; | |
502 | .DE | |
503 | the \(** does nothing; this provokes the message ``null effect'' from | |
504 | .I lint . | |
505 | The program fragment | |
506 | .DS | |
507 | unsigned x ; | |
508 | if( x < 0 ) ... | |
509 | .DE | |
510 | is clearly somewhat strange; the | |
511 | test will never succeed. | |
512 | Similarly, the test | |
513 | .DS | |
514 | if( x > 0 ) ... | |
515 | .DE | |
516 | is equivalent to | |
517 | .DS | |
518 | if( x != 0 ) | |
519 | .DE | |
520 | which may not be the intended action. | |
521 | .I Lint | |
522 | will say ``degenerate unsigned comparison'' in these cases. | |
523 | If one says | |
524 | .DS | |
525 | if( 1 != 0 ) .... | |
526 | .DE | |
527 | .I lint | |
528 | will report | |
529 | ``constant in conditional context'', since the comparison | |
530 | of 1 with 0 gives a constant result. | |
531 | .PP | |
532 | Another construction | |
533 | detected by | |
534 | .I lint | |
535 | involves | |
536 | operator precedence. | |
537 | Bugs which arise from misunderstandings about the precedence | |
538 | of operators can be accentuated by spacing and formatting, | |
539 | making such bugs extremely hard to find. | |
540 | For example, the statements | |
541 | .DS | |
542 | if( x&077 == 0 ) ... | |
543 | .DE | |
544 | or | |
545 | .DS | |
546 | x<\h'-.3m'<2 + 40 | |
547 | .DE | |
548 | probably do not do what was intended. | |
549 | The best solution is to parenthesize such expressions, | |
550 | and | |
551 | .I lint | |
552 | encourages this by an appropriate message. | |
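.PP
The first of the two expressions above can be made concrete with a short
sketch.
Since == binds more tightly than &, the test as written compares 077
with 0 first.
The function names are invented for this example, and the first function
writes out the compiler's grouping explicitly.

```c
/* How `x&077 == 0` is grouped by the compiler: x & (077 == 0).
 * The comparison 077 == 0 is 0, so the result is 0 for every x. */
int test_as_written(unsigned x)
{
    return (x & (077 == 0)) != 0;
}

/* What was presumably intended: are the low six bits of x all zero? */
int low_six_bits_clear(unsigned x)
{
    return (x & 077) == 0;
}
```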
553 | .PP | |
554 | Finally, when the | |
555 | .B \-h | |
556 | flag is in force | |
557 | .I lint | |
558 | complains about variables which are redeclared in inner blocks | |
559 | in a way that conflicts with their use in outer blocks. | |
560 | This is legal, but is considered by many (including the author) to | |
561 | be bad style, usually unnecessary, and frequently a bug. | |
562 | .SH | |
563 | Ancient History | |
564 | .PP | |
565 | There are several forms of older syntax which are being officially | |
566 | discouraged. | |
567 | These fall into two classes, assignment operators and initialization. | |
568 | .PP | |
569 | The older forms of assignment operators (e.g., =+, =\-, . . . ) | |
570 | could cause ambiguous expressions, such as | |
571 | .DS | |
572 | a =\-1 ; | |
573 | .DE | |
574 | which could be taken as either | |
575 | .DS | |
576 | a =\- 1 ; | |
577 | .DE | |
578 | or | |
579 | .DS | |
580 | a = \-1 ; | |
581 | .DE | |
582 | The situation is especially perplexing if this | |
583 | kind of ambiguity arises as the result of a macro substitution. | |
584 | The newer, and preferred, operators (+=, \-=, etc.) | |
585 | have no such ambiguities. | |
586 | To spur the abandonment of the older forms, | |
587 | .I lint | |
588 | complains about these old fashioned operators. | |
589 | .PP | |
590 | A similar issue arises with initialization. | |
591 | The older language allowed | |
592 | .DS | |
593 | int x \fR1 ; | |
594 | .DE | |
595 | to initialize | |
596 | .I x | |
597 | to 1. | |
598 | This also caused syntactic difficulties: for example, | |
599 | .DS | |
600 | int x ( \-1 ) ; | |
601 | .DE | |
602 | looks somewhat like the beginning of a function declaration: | |
603 | .DS | |
604 | int x ( y ) { . . . | |
605 | .DE | |
606 | and the compiler must read a fair ways past | |
607 | .I x | |
608 | in order to be sure what the declaration really is. | |
609 | Again, the problem is even more perplexing when the | |
610 | initializer involves a macro. | |
611 | The current syntax places an equals sign between the | |
612 | variable and the initializer: | |
613 | .DS | |
614 | int x = \-1 ; | |
615 | .DE | |
616 | This is free of any possible syntactic ambiguity. | |
617 | .SH | |
618 | Pointer Alignment | |
619 | .PP | |
620 | Certain pointer assignments may be reasonable on some machines, | |
621 | and illegal on others, due entirely to | |
622 | alignment restrictions. | |
623 | For example, on the PDP-11, it is reasonable | |
624 | to assign integer pointers to double pointers, since | |
625 | double precision values may begin on any integer boundary. | |
626 | On the Honeywell 6000, double precision values must begin | |
627 | on even word boundaries; | |
628 | thus, not all such assignments make sense. | |
629 | .I Lint | |
630 | tries to detect cases where pointers are assigned to other | |
631 | pointers, and such alignment problems might arise. | |
632 | The message ``possible pointer alignment problem'' | |
633 | results from this situation whenever either the | |
634 | .B \-p | |
635 | or | |
636 | .B \-h | |
637 | flags are in effect. | |
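.PP
One common way to avoid the alignment question altogether is to copy the
bytes of the value instead of converting the pointer.
This is offered only as a sketch of an alternative; it is not a technique
described in this paper, and the function name \fIload_double\fR is
invented for this example.

```c
#include <string.h>

/* Reads a double stored at an arbitrary byte address without ever
 * forming a double pointer to that address. */
double load_double(const unsigned char *bytes)
{
    double d;

    memcpy(&d, bytes, sizeof d);
    return d;
}
```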
638 | .SH | |
639 | Multiple Uses and Side Effects | |
640 | .PP | |
641 | In complicated expressions, the best order in which to evaluate | |
642 | subexpressions may be highly machine dependent. | |
643 | For example, on machines (like the PDP-11) in which the stack | |
644 | runs backwards, function arguments will probably be best evaluated | |
645 | from right-to-left; on machines with a stack running forward, | |
646 | left-to-right seems most attractive. | |
647 | Function calls embedded as arguments of other functions | |
648 | may or may not be treated similarly to ordinary arguments. | |
649 | Similar issues arise with other operators which have side effects, | |
650 | such as the assignment operators and the increment and decrement operators. | |
651 | .PP | |
652 | In order that the efficiency of C on a particular machine not be | |
653 | unduly compromised, the C language leaves the order | |
654 | of evaluation of complicated expressions up to the | |
655 | local compiler, and, in fact, the various C compilers have considerable | |
656 | differences in the order in which they will evaluate complicated | |
657 | expressions. | |
658 | In particular, if any variable is changed by a side effect, and | |
659 | also used elsewhere in the same expression, the result is explicitly undefined. | |
660 | .PP | |
661 | .I Lint | |
662 | checks for the important special case where | |
663 | a simple scalar variable is affected. | |
664 | For example, the statement | |
665 | .DS | |
666 | \fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ; | |
667 | .DE | |
668 | will draw the complaint: | |
669 | .DS | |
670 | warning: \fIi\fR evaluation order undefined | |
671 | .DE | |
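.PP
The usual cure is to move the side effect into a statement of its own,
so that \fIi\fR is not both changed and used in one expression.
A sketch, with the invented function name \fIcopy_ints\fR:

```c
/* Copies the first n elements of b into a. */
void copy_ints(int *a, const int *b, int n)
{
    int i = 0;

    while (i < n) {
        a[i] = b[i];   /* i is only read in this expression */
        i++;           /* and changed in a separate statement */
    }
}
```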
672 | .SH | |
673 | Implementation | |
674 | .PP | |
675 | .I Lint | |
676 | consists of two programs and a driver. | |
677 | The first program is a version of the | |
678 | Portable C Compiler | |
679 | .[ | |
680 | Johnson Ritchie BSTJ Portability Programs System | |
681 | .] | |
682 | .[ | |
683 | Johnson portable compiler 1978 | |
684 | .] | |
685 | which is the basis of the | |
686 | IBM 370, Honeywell 6000, and Interdata 8/32 C compilers. | |
687 | This compiler does lexical and syntax analysis on the input text, | |
688 | constructs and maintains symbol tables, and builds trees for expressions. | |
689 | Instead of writing an intermediate file which is passed to | |
690 | a code generator, as the other compilers | |
691 | do, | |
692 | .I lint | |
693 | produces an intermediate file which consists of lines of ASCII text. | |
694 | Each line contains an external variable name, | |
695 | an encoding of the context in which it was seen (use, definition, declaration, etc.), | |
696 | a type specifier, and a source file name and line number. | |
697 | The information about variables local to a function or file | |
698 | is collected | |
699 | by accessing the symbol table, and examining the expression trees. | |
700 | .PP | |
701 | Comments about local problems are produced as detected. | |
702 | The information about external names is collected | |
703 | onto an intermediate file. | |
704 | After all the source files and library descriptions have | |
705 | been collected, the intermediate file is sorted | |
706 | to bring all information collected about a given external | |
707 | name together. | |
708 | The second, rather small, program then reads the lines | |
709 | from the intermediate file and compares all of the | |
710 | definitions, declarations, and uses for consistency. | |
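.PP
The sort and compare step can be sketched in a few lines.
The record layout below is hypothetical, since the actual format of the
intermediate file is not given here, and the sketch checks only that the
type specifiers recorded for a name agree; the names \fIrecord\fR,
\fIby_name\fR, and \fIcount_inconsistent_names\fR are invented for this
example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory form of one line of the intermediate file. */
enum context { CTX_DEFINITION, CTX_DECLARATION, CTX_USE };

struct record {
    const char *name;   /* external name */
    enum context ctx;   /* context in which the name was seen */
    const char *type;   /* type specifier, kept as text in this sketch */
    const char *file;   /* source file name */
    int line;           /* source line number */
};

/* qsort comparison: order records by external name. */
static int by_name(const void *pa, const void *pb)
{
    const struct record *a = pa, *b = pb;

    return strcmp(a->name, b->name);
}

/* Sorts the records so that all entries for a name are adjacent, then
 * returns the number of names whose entries disagree about the type. */
int count_inconsistent_names(struct record *recs, size_t n)
{
    size_t i, j;
    int bad = 0;

    qsort(recs, n, sizeof recs[0], by_name);
    for (i = 0; i < n; i = j) {
        int mismatch = 0;

        /* Walk the run of records that share recs[i].name. */
        for (j = i + 1; j < n && strcmp(recs[j].name, recs[i].name) == 0; j++)
            if (strcmp(recs[j].type, recs[i].type) != 0)
                mismatch = 1;
        bad += mismatch;
    }
    return bad;
}
```

Sorting first means that each name can be checked in a single pass over
adjacent records.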
711 | .PP | |
712 | The driver controls this | |
713 | process, and is also responsible for making the options available | |
714 | to both passes of | |
715 | .I lint . | |
716 | .SH | |
717 | Portability | |
718 | .PP | |
719 | C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system. | |
720 | This means that the implementation of C tends to follow local conventions rather than | |
721 | adhere strictly to | |
722 | .UX | |
723 | system conventions. | |
724 | Despite these differences, many C programs have been successfully moved to GCOS and the various IBM | |
725 | installations with little effort. | |
726 | This section describes some of the differences between the implementations, and | |
727 | discusses the | |
728 | .I lint | |
729 | features which encourage portability. | |
730 | .PP | |
731 | Uninitialized external variables are treated differently in different | |
732 | implementations of C. | |
733 | Suppose two files both contain a declaration without initialization, such as | |
734 | .DS | |
735 | int a ; | |
736 | .DE | |
737 | outside of any function. | |
738 | The | |
739 | .UX | |
740 | loader will resolve these declarations, and cause only a single word of storage | |
741 | to be set aside for \fIa\fR. | |
742 | Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!) | |
743 | so each such declaration causes a word of storage to be set aside and called \fIa\fR. | |
744 | When loading or library editing takes place, this causes fatal conflicts which prevent | |
745 | the proper operation of the program. | |
746 | If | |
747 | .I lint | |
748 | is invoked with the \fB\-p\fR flag, | |
749 | it will detect such multiple definitions. | |
750 | .PP | |
751 | A related difficulty comes from the amount of information retained about external names during the | |
752 | loading process. | |
753 | On the | |
754 | .UX | |
755 | system, externally known names have seven significant characters, with the upper/lower | |
756 | case distinction kept. | |
757 | On the IBM systems, there are eight significant characters, but the case distinction | |
758 | is lost. | |
759 | On GCOS, there are only six characters, of a single case. | |
760 | This leads to situations where programs run on the | |
761 | .UX | |
762 | system, but encounter loader | |
763 | problems on the IBM or GCOS systems. | |
764 | .I Lint | |
765 | .B \-p | |
766 | causes all external symbols to be mapped to one case and truncated to six characters, | |
767 | providing a worst-case analysis. | |
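.PP
The worst-case rule just described, one case and six characters, can be
sketched as a comparison function.
The name \fInames_collide\fR is invented for this example; it is not
taken from the
.I lint
source.

```c
#include <ctype.h>

/* Returns 1 if a and b become the same external name once mapped to a
 * single case and truncated to six characters, and 0 otherwise. */
int names_collide(const char *a, const char *b)
{
    int i;

    for (i = 0; i < 6; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);

        if (ca != cb)
            return 0;
        if (ca == '\0')
            return 1;   /* both names ended before six characters */
    }
    return 1;           /* first six characters agree */
}
```

Two names such as \fIcounter1\fR and \fIcounter2\fR are distinct on the
.UX
system but collide under this rule.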
768 | .PP | |
769 | A number of differences arise in the area of character handling: characters in the | |
770 | .UX | |
771 | system are eight bit ASCII, while they are eight bit EBCDIC on the IBM, and | |
772 | nine bit ASCII on GCOS. | |
773 | Moreover, character strings go from high to low bit positions (``left to right'') | |
774 | on GCOS and IBM, and low to high (``right to left'') on the PDP-11. | |
775 | This means that code attempting to construct strings | |
776 | out of character constants, or attempting to use characters as indices | |
777 | into arrays, must be looked at with great suspicion. | |
778 | .I Lint | |
779 | is of little help here, except to flag multi-character character constants. | |
780 | .PP | |
781 | Of course, the word sizes are different! | |
782 | This causes less trouble than might be expected, at least when | |
783 | moving from the | |
784 | .UX | |
785 | system (16 bit words) to the IBM (32 bits) or GCOS (36 bits). | |
786 | The main problems are likely to arise in shifting or masking. | |
787 | C now supports a bit-field facility, which can be used to write much of | |
788 | this code in a reasonably portable way. | |
789 | Frequently, portability of such code can be enhanced by | |
790 | slight rearrangements in coding style. | |
791 | Many of the incompatibilities seem to have the flavor of writing | |
792 | .DS | |
793 | x &= 0177700 ; | |
794 | .DE | |
795 | to clear the low order six bits of \fIx\fR. | |
796 | This suffices on the PDP-11, but fails badly on GCOS and IBM. | |
797 | If the bit field feature cannot be used, the same effect can be obtained by | |
798 | writing | |
799 | .DS | |
800 | x &= \(ap 077 ; | |
801 | .DE | |
802 | which will work on all these machines. | |
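.PP
The difference between the two masks shows up as soon as the operand is wider than 16 bits.
The sketch below uses
.B "unsigned long"
(at least 32 bits in present-day C) to stand in for a wider word; the function names are hypothetical and chosen only for this illustration.
```c
/* Mask written for a 16-bit word: it also clears every bit above bit 15. */
unsigned long clear_low6_16bit_mask(unsigned long x)
{
    return x & 0177700UL;
}

/* Mask built by complement: it clears only the low six bits at any width. */
unsigned long clear_low6_portable(unsigned long x)
{
    return x & ~077UL;
}
```
The two functions agree for any value that fits in 16 bits, and differ only in what happens to the higher-order bits.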
803 | .PP | |
804 | The right shift operator is an arithmetic shift on the PDP-11, and a logical shift on most | |
805 | other machines. | |
806 | To obtain a logical shift on all machines, the left operand can be | |
807 | typed \fBunsigned\fR. | |
808 | Characters are considered signed integers on the PDP-11, and unsigned on the other machines. | |
809 | This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware | |
810 | which has infiltrated itself into the C language. | |
811 | If there were a good way to discover the programs which would be affected, C could be changed; | |
812 | in any case, | |
813 | .I lint | |
814 | is no help here. | |
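.PP
Since
.I lint
gives no warning in these cases, the burden falls on coding convention.
The following sketch shows the unsigned operand suggested above for shifts, together with a cast to
.B "unsigned char"
for characters used as indices.
The cast is a common idiom rather than something this paper prescribes, and both function names are hypothetical.
```c
/* Shift right with zero fill, whatever the machine does for signed ints. */
unsigned int logical_shift_right(int x, unsigned int n)
{
    return (unsigned int)x >> n;    /* n must be less than the width of int */
}

/* Turn a char into a non-negative array index, signed chars or not. */
unsigned int char_index(char c)
{
    return (unsigned char)c;
}
```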
815 | .PP | |
816 | The above discussion may have made the problem of portability seem | |
817 | bigger than it in fact is. | |
818 | The issues involved here are rarely subtle or mysterious, at least to the | |
819 | implementor of the program, although they can involve some work to straighten out. | |
820 | The most serious bar to the portability of | |
821 | .UX | |
822 | system utilities has been the inability to mimic | |
823 | essential | |
824 | .UX | |
825 | system functions on the other systems. | |
826 | The inability to seek to a random character position in a text file, or to establish a pipe | |
827 | between processes, has involved far more rewriting | |
828 | and debugging than any of the differences in C compilers. | |
829 | On the other hand, | |
830 | .I lint | |
831 | has been very helpful | |
832 | in moving the | |
833 | .UX | |
834 | operating system and associated | |
835 | utility programs to other machines. | |
836 | .SH | |
837 | Shutting Lint Up | |
838 | .PP | |
839 | There are occasions when | |
840 | the programmer is smarter than | |
841 | .I lint . | |
842 | There may be valid reasons for ``illegal'' type casts, | |
843 | functions with a variable number of arguments, etc. | |
844 | Moreover, as specified above, the flow of control information | |
845 | produced by | |
846 | .I lint | |
847 | often has blind spots, causing occasional spurious | |
848 | messages about perfectly reasonable programs. | |
849 | Thus, some way of communicating with | |
850 | .I lint , | |
851 | typically to shut it up, is desirable. | |
852 | .PP | |
853 | The form which this mechanism should take is not at all clear. | |
854 | New keywords would require current and old compilers to | |
855 | recognize these keywords, if only to ignore them. | |
856 | This has both philosophical and practical problems. | |
857 | New preprocessor syntax suffers from similar problems. | |
858 | .PP | |
859 | What was finally done was to cause a number of words | |
860 | to be recognized by | |
861 | .I lint | |
862 | when they were embedded in comments. | |
863 | This required minimal preprocessor changes; | |
864 | the preprocessor just had to agree to pass comments | |
865 | through to its output, instead of deleting them | |
866 | as had been previously done. | |
867 | Thus, | |
868 | .I lint | |
869 | directives are invisible to the compilers, and | |
870 | the effect on systems with the older preprocessors | |
871 | is merely that the | |
872 | .I lint | |
873 | directives don't work. | |
874 | .PP | |
875 | The first directive is concerned with flow of control information; | |
876 | if a particular place in the program cannot be reached, | |
877 | but this is not apparent to | |
878 | .I lint , | |
879 | this can be asserted by the directive | |
880 | .DS | |
881 | /* NOTREACHED */ | |
882 | .DE | |
883 | at the appropriate spot in the program. | |
884 | Similarly, if it is desired to turn off | |
885 | strict type checking for | |
886 | the next expression, the directive | |
887 | .DS | |
888 | /* NOSTRICT */ | |
889 | .DE | |
890 | can be used; the situation reverts to the | |
891 | previous default after the next expression. | |
892 | The | |
893 | .B \-v | |
894 | flag can be turned on for one function by the directive | |
895 | .DS | |
896 | /* ARGSUSED */ | |
897 | .DE | |
898 | Complaints about a variable number of arguments in calls to a function | |
899 | can be turned off by the directive | |
900 | .DS | |
901 | /* VARARGS */ | |
902 | .DE | |
903 | preceding the function definition. | |
904 | In some cases, it is desirable to check the | |
905 | first several arguments, and leave the later arguments unchecked. | |
906 | This can be done by following the VARARGS keyword immediately | |
907 | with a digit giving the number of arguments which should be checked; thus, | |
908 | .DS | |
909 | /* VARARGS2 */ | |
910 | .DE | |
911 | will cause the first two arguments to be checked, the others unchecked. | |
912 | Finally, the directive | |
913 | .DS | |
914 | /* LINTLIBRARY */ | |
915 | .DE | |
916 | at the head of a file identifies this file as | |
917 | a library declaration file; this topic is worth a | |
918 | section by itself. | |
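.PP
A short sketch may help show where these directives sit in a source file.
It is written in present-day C, using
.I <stdarg.h> ,
which postdates this paper; the functions
.I sum_ints ,
.I ignore_signal ,
and
.I fatal
are hypothetical.
A compiler treats the directives as ordinary comments; how a particular version of
.I lint
responds to them should be checked against its own documentation.
```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Only the first argument (count) is to be checked in calls. */
/* VARARGS1 */
int sum_ints(int count, ...)
{
    va_list ap;
    int i, total = 0;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* The argument is required by the interface but deliberately unused. */
/* ARGSUSED */
int ignore_signal(int signo)
{
    return 0;
}

/* exit() never returns, so control cannot fall off the end. */
int fatal(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(1);
    /* NOTREACHED */
}
```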
919 | .SH | |
920 | Library Declaration Files | |
921 | .PP | |
922 | .I Lint | |
923 | accepts certain library directives, such as | |
924 | .DS | |
925 | \-ly | |
926 | .DE | |
927 | and tests the source files for compatibility with these libraries. | |
928 | This is done by accessing library description files whose | |
929 | names are constructed from the library directives. | |
930 | These files all begin with the directive | |
931 | .DS | |
932 | /* LINTLIBRARY */ | |
933 | .DE | |
934 | which is followed by a series of dummy function | |
935 | definitions. | |
936 | The critical parts of these definitions | |
937 | are the declaration of the function return type, | |
938 | whether the dummy function returns a value, and | |
939 | the number and types of arguments to the function. | |
940 | The VARARGS and ARGSUSED directives can | |
941 | be used to specify features of the library functions. | |
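.PP
A library description file along these lines might look like the sketch below.
The library and its functions
.I xlib_length , (
.I xlib_copy ,
and
.I xlib_log )
are invented for the example, and the definitions are written with present-day prototypes rather than in the style of the files distributed with
.I lint .
Only the return types and argument lists matter; the bodies are dummies.
```c
/* LINTLIBRARY */
#include <stddef.h>

/* ARGSUSED */
size_t xlib_length(const char *s) { return 0; }

/* ARGSUSED */
char *xlib_copy(char *dst, const char *src) { return dst; }

/* Only the format argument is to be checked in calls. */
/* VARARGS1 */
/* ARGSUSED */
int xlib_log(const char *fmt, ...) { return 0; }
```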
942 | .PP | |
943 | .I Lint | |
944 | library files are processed almost exactly like ordinary | |
945 | source files. | |
946 | The only difference is that functions which are defined in a library file, | |
947 | but are not used in a source file, draw no complaints. | |
948 | .I Lint | |
949 | does not simulate a full library search algorithm, | |
950 | and complains if the source files contain a redefinition of | |
951 | a library routine (this is a feature!). | |
952 | .PP | |
953 | By default, | |
954 | .I lint | |
955 | checks the programs it is given against a standard library | |
956 | file, which contains descriptions of the programs which | |
957 | are normally loaded when | |
958 | a C program | |
959 | is run. | |
960 | When the | |
961 | .B \-p | |
962 | flag is in effect, another file is checked containing | |
963 | descriptions of the standard I/O library routines | |
964 | which are expected to be portable across various machines. | |
965 | The | |
966 | .B \-n | |
967 | flag can be used to suppress all library checking. | |
968 | .SH | |
969 | Bugs, etc. | |
970 | .PP | |
971 | .I Lint | |
972 | was a difficult program to write, partially | |
973 | because it is closely connected with matters of programming style, | |
974 | and partially because users usually don't notice bugs which cause | |
975 | .I lint | |
976 | to miss errors which it should have caught. | |
977 | (By contrast, if | |
978 | .I lint | |
979 | incorrectly complains about something that is correct, the | |
980 | programmer reports that immediately!) | |
981 | .PP | |
982 | A number of areas remain to be further developed. | |
983 | The checking of structures and arrays is rather inadequate; | |
984 | size | |
985 | incompatibilities go unchecked, | |
986 | and no attempt is made to match up structure and union | |
987 | declarations across files. | |
988 | Some stricter checking of the use of the | |
989 | .B typedef | |
990 | is clearly desirable, but what checking is appropriate, and how | |
991 | to carry it out, is still to be determined. | |
992 | .PP | |
993 | .I Lint | |
994 | shares the preprocessor with the C compiler. | |
995 | At some point it may be appropriate for a | |
996 | special version of the preprocessor to be constructed | |
997 | which checks for things such as unused macro definitions, | |
998 | and macro arguments with side effects which are | |
999 | not expanded at all, or are expanded more than once. | |
1000 | .PP | |
1001 | The central problem with | |
1002 | .I lint | |
1003 | is the packaging of the information which it collects. | |
1004 | There are many options which | |
1005 | serve only to turn off, or slightly modify, | |
1006 | certain features. | |
1007 | There are pressures to add even more of these options. | |
1008 | .PP | |
1009 | In conclusion, it appears that the general notion of having two | |
1010 | programs is a good one. | |
1011 | The compiler concentrates on quickly and accurately turning the | |
1012 | program text into bits which can be run; | |
1013 | .I lint | |
1014 | concentrates on issues | |
1015 | of portability, style, and efficiency. | |
1016 | .I Lint | |
1017 | can afford to be wrong, since incorrectness and over-conservatism | |
1018 | are merely annoying, not fatal. | |
1019 | The compiler can be fast since it knows that | |
1020 | .I lint | |
1021 | will cover its flanks. | |
1022 | Finally, the programmer can | |
1023 | concentrate at one stage | |
1024 | of the programming process solely on the algorithms, | |
1025 | data structures, and correctness of the | |
1026 | program, and then later retrofit, | |
1027 | with the aid of | |
1028 | .I lint , | |
1029 | the desirable properties of universality and portability. | |
1030 | .SG MH-1273-SCJ-unix | |
1031 | .\".bp | |
1032 | .[ |
1033 | $LIST$ | |
1034 | .] | |
1035 | .bp | |
1036 | .SH | |
1037 | Appendix: Current Lint Options | |
1038 | .PP | |
1039 | The command currently has the form | |
1040 | .DS | |
1041 | \fBlint\fR [\fB\-\fRoptions ] files... library-descriptors... | |
1042 | .DE | |
1043 | The options are | |
1044 | .IP \fBh\fR | |
1045 | Perform heuristic checks | |
1046 | .IP \fBp\fR | |
1047 | Perform portability checks | |
1048 | .IP \fBv\fR | |
1049 | Don't report unused arguments | |
1050 | .IP \fBu\fR | |
1051 | Don't report unused or undefined externals | |
1052 | .IP \fBb\fR | |
1053 | Report unreachable | |
1054 | .B break | |
1055 | statements. | |
1056 | .IP \fBx\fR | |
1057 | Report unused external declarations | |
1058 | .IP \fBa\fR | |
1059 | Report assignments of | |
1060 | .B long | |
1061 | to | |
1062 | .B int | |
1063 | or shorter. | |
1064 | .IP \fBc\fR | |
1065 | Complain about questionable casts | |
1066 | .IP \fBn\fR | |
1067 | No library checking is done | |
1068 | .IP \fBs\fR | |
1069 | Same as | |
1070 | .B h | |
1071 | (for historical reasons) |