.\" This module is believed to contain source code proprietary to AT&T.
.\" Use and redistribution is subject to the Berkeley Software License
.\" Agreement and your Software Agreement with AT&T (Western Electric).
.\"
.\" @(#)lint.ms 6.2 (Berkeley) 4/17/91
.\"
.EH 'PS1:9-%''Lint, a C Program Checker'
.OH 'Lint, a C Program Checker''PS1:9-%'
.\".RP
.ND "July 26, 1978"
.OK
.\"Program Portability
.\"Strong Type Checking
.TL
Lint, a C Program Checker
.AU "MH 2C-559" 3968
S. C. Johnson
.AI
.MH
.AB
.PP
.I Lint
is a command which examines C source programs,
detecting
a number of bugs and obscurities.
It enforces the type rules of C more strictly than
the C compilers.
It may also be used to enforce a number of portability
restrictions involved in moving
programs between different machines and/or operating systems.
Another option detects a number of wasteful, or error prone, constructions
which nevertheless are, strictly speaking, legal.
.PP
.I Lint
accepts multiple input files and library specifications, and checks them for consistency.
.PP
The separation of function between
.I lint
and the C compilers has both historical and practical
rationale.
The compilers turn C programs into executable files rapidly
and efficiently.
This is possible in part because the
compilers do not do sophisticated
type checking, especially between
separately compiled programs.
.I Lint
takes a more global, leisurely view of the program,
looking much more carefully at the compatibilities.
.PP
This document discusses the use of
.I lint ,
gives an overview of the implementation, and gives some hints on the
writing of machine independent C code.
.AE
.CS 10 2 12 0 0 5
.SH
Introduction and Usage
.PP
Suppose there are two C
.[
Kernighan Ritchie Programming Prentice 1978
.]
source files,
.I file1.c
and
.I file2.c ,
which are ordinarily compiled and loaded together.
Then the command
.DS
lint file1.c file2.c
.DE
produces messages describing inconsistencies and inefficiencies
in the programs.
The program enforces the typing rules of C
more strictly than the C compilers
(for both historical and practical reasons)
enforce them.
The command
.DS
lint \-p file1.c file2.c
.DE
will produce, in addition to the above messages, additional messages
which relate to the portability of the programs to other operating
systems and machines.
Replacing the
.B \-p
by
.B \-h
will produce messages about various error-prone or wasteful constructions
which, strictly speaking, are not bugs.
Saying
.B \-hp
gets the whole works.
.PP
The next several sections describe the major messages;
the document closes with sections
discussing the implementation and giving suggestions
for writing portable C.
An appendix gives a summary of the
.I lint
options.
.SH
A Word About Philosophy
.PP
Many of the facts which
.I lint
needs may be impossible to
discover.
For example, whether a given function in a program ever gets called
may depend on the input data.
Deciding whether
.I exit
is ever called is equivalent to solving the famous ``halting problem,'' known to be
recursively undecidable.
.PP
Thus, most of the
.I lint
algorithms are a compromise.
If a function is never mentioned, it can never be called.
If a function is mentioned,
.I lint
assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
.PP
.I Lint
tries to give information with a high degree of relevance.
Messages of the form ``\fIxxx\fR might be a bug''
are easy to generate, but are acceptable only in proportion
to the fraction of real bugs they uncover.
If this fraction of real bugs is too small, the messages lose their credibility
and serve merely to clutter up the output,
obscuring the more important messages.
.PP
Keeping these issues in mind, we now consider in more detail
the classes of messages which
.I lint
produces.
.SH
Unused Variables and Functions
.PP
As sets of programs evolve and develop,
previously used variables and arguments to
functions may become unused;
it is not uncommon for external variables, or even entire
functions, to become unnecessary, and yet
not be removed from the source.
These ``errors of commission'' rarely cause working programs to fail, but they are a source
of inefficiency, and make programs harder to understand
and change.
Moreover, information about such unused variables and functions can occasionally
serve to discover bugs; if a function does a necessary job, and
is never called, something is wrong!
.PP
.I Lint
complains about variables and functions which are defined but not otherwise
mentioned.
An exception is variables which are declared through explicit
.B extern
statements but are never referenced; thus the statement
.DS
extern float sin(\|);
.DE
will evoke no comment if
.I sin
is never used.
Note that this agrees with the semantics of the C compiler.
In some cases, these unused external declarations might be of some interest; they
can be discovered by adding the
.B \-x
flag to the
.I lint
invocation.
.PP
Certain styles of programming
require many functions to be written with similar interfaces;
frequently, some of the arguments may be unused
in many of the calls.
The
.B \-v
option is available to suppress the printing of
complaints about unused arguments.
When
.B \-v
is in effect, no messages are produced about unused
arguments except for those
arguments which are unused and also declared as
register arguments; this can be considered
an active (and preventable) waste of the register
resources of the machine.
.PP
There is one case where information about unused, or
undefined, variables is more distracting
than helpful.
This is when
.I lint
is applied to some, but not all, files out of a collection
which are to be loaded together.
In this case, many of the functions and variables defined
may not be used, and, conversely,
many functions and variables defined elsewhere may be used.
The
.B \-u
flag may be used to suppress the spurious messages which might otherwise appear.
.SH
Set/Used Information
.PP
.I Lint
attempts to detect cases where a variable is used before it is set.
This is very difficult to do well;
many algorithms take a good deal of time and space,
and still produce messages about perfectly valid programs.
.I Lint
detects local variables (automatic and register storage classes)
whose first use appears physically earlier in the input file than the first assignment to the variable.
It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
may occur at any later time, in a data dependent fashion.
.PP
The restriction to the physical appearance of variables in the file makes the
algorithm very simple and quick to implement,
since the true flow of control need not be discovered.
It does mean that
.I lint
can complain about some programs which are legal,
but these programs would probably be considered bad on stylistic grounds (e.g. might
contain at least two \fBgoto\fR's).
Because static and external variables are initialized to 0,
no meaningful information can be discovered about their uses.
The algorithm deals correctly, however, with initialized automatic variables, and variables
which are used in the expression which first sets them.
.PP
The set/used information also permits recognition of those local variables which are set
and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
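.PP
As a hedged illustration of the rule based on physical appearance, the sketch below
is a legal function in which
.I n
is read earlier in the text of the file than it is assigned, by way of two \fBgoto\fR's.
The function name is invented for this example, and it is written in present-day C.
.DS
```c
/* Legal C: at run time n is assigned before it is read,
   but the read appears first in the text of the file. */
int textually_early_use(void)
{
    int n;

    goto set;
use:
    return n + 1;   /* first textual use of n */
set:
    n = 9;          /* first textual assignment to n */
    goto use;
}
```
.DE
A checker that works from textual order alone, as described above, would be expected
to complain about this function even though it always returns 10.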
.SH
Flow of Control
.PP
.I Lint
attempts to detect unreachable portions of the programs which it processes.
It will complain about unlabeled statements immediately following
\fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
An attempt is made to detect loops which can never be left at the bottom, detecting the
special cases
\fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
.I Lint
also complains about loops which cannot be entered at the top;
some valid programs may have such loops, but at best they are bad style,
at worst bugs.
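.PP
The following sketch, with an invented function name, shows a loop of the
\fBfor\fR(;;) kind that can never be left at the bottom, only by a \fBreturn\fR.
It is meant as an illustration of the constructs named above, not as a record of what any
particular version of
.I lint
prints.
.DS
```c
/* Index of the first negative element of v[0..n-1], or -1 if none. */
int first_negative(const int *v, int n)
{
    int i = 0;

    for (;;) {              /* never left at the bottom, only by return */
        if (i >= n)
            return -1;
        if (v[i] < 0)
            return i;
        i++;
    }
    /* An unlabeled statement placed here could never be reached. */
}
```
.DE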
.PP
.I Lint
has an important area of blindness in the flow of control algorithm:
it has no way of detecting functions which are called and never return.
Thus, a call to
.I exit
may cause unreachable code which
.I lint
does not detect; the most serious effects of this are in the
determination of returned function values (see the next section).
.PP
One form of unreachable statement is not usually complained about by
.I lint ;
a
.B break
statement that cannot be reached causes no message.
Programs generated by
.I yacc ,
.[
Johnson Yacc 1975
.]
and especially
.I lex ,
.[
Lesk Lex
.]
may have literally hundreds of unreachable
.B break
statements.
The
.B \-O
flag in the C compiler will often eliminate the resulting object code inefficiency.
Thus, these unreached statements are of little importance,
there is typically nothing the user can do about them, and the
resulting messages would clutter up the
.I lint
output.
If these messages are desired,
.I lint
can be invoked with the
.B \-b
option.
.SH
Function Values
.PP
Sometimes functions return values which are never used;
sometimes programs incorrectly use function ``values''
which have never been returned.
.I Lint
addresses this problem in a number of ways.
.PP
Locally, within a function definition,
the appearance of both
.DS
return( \fIexpr\fR );
.DE
and
.DS
return ;
.DE
statements is cause for alarm;
.I lint
will give the message
.DS
function \fIname\fR contains return(e) and return
.DE
The most serious difficulty with this is detecting when a function return is implied
by flow of control reaching the end of the function.
This can be seen with a simple example:
.DS
.ta .5i 1i 1.5i
\fRf ( a ) {
	if ( a ) return ( 3 );
	g (\|);
	}
.DE
Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
with no defined return value; this will trigger a complaint from
.I lint .
If \fIg\fR, like \fIexit\fR, never returns,
the message will still be produced when in fact nothing is wrong.
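.PP
One way to keep such a function free of the mixed-return complaint is to give every
path an explicit value.
The sketch below is only an illustration:
.I give_up
is a hypothetical stand-in for a function which, like \fIexit\fR, never returns,
and the code is written in present-day C.
.DS
```c
#include <stdlib.h>

/* Hypothetical helper that never returns, in the manner of exit. */
static void give_up(void)
{
    exit(1);
}

/* Every path out of f now carries a value. */
int f(int a)
{
    if (a)
        return 3;
    give_up();
    return 0;   /* not reached at run time; present so all returns agree */
}
```
.DE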
.PP
In practice, some potentially serious bugs have been discovered by this feature;
it also accounts for a substantial fraction of the ``noise'' messages produced
by
.I lint .
.PP
On a global scale,
.I lint
detects cases where a function returns a value, but this value is sometimes,
or always, unused.
When the value is always unused, it may constitute an inefficiency in the function definition.
When the value is sometimes unused, it may represent bad style (e.g., not testing for
error conditions).
.PP
The dual problem, using a function value when the function does not return one,
is also detected.
This is a serious problem.
Amazingly, this bug has been observed on a couple of occasions
in ``working'' programs; the desired function value just happened to have been computed
in the function return register!
.SH
Type Checking
.PP
.I Lint
enforces the type checking rules of C more strictly than the compilers do.
The additional checking is in four major areas:
across certain binary operators and implied assignments,
at the structure selection operators,
between the definition and uses of functions,
and in the use of enumerations.
.PP
There are a number of operators which have an implied balancing between types of the operands.
The assignment, conditional ( ?\|: ), and relational operators
have this property; the argument
of a \fBreturn\fR statement,
and expressions used in initialization also suffer similar conversions.
In these operations,
\fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
The types of pointers must agree exactly,
except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
.PP
The type checking rules also require that, in structure references, the
left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
be a structure, and the right operand of these operators be a member
of the structure implied by the left operand.
Similar checking is done for references to unions.
.PP
Strict rules apply to function argument and return value
matching.
The types \fBfloat\fR and \fBdouble\fR may be freely matched,
as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
Also, pointers can be matched with the associated arrays.
Aside from this, all actual arguments must agree in type with their declared counterparts.
.PP
With enumerations, checks are made that enumeration variables or members are not mixed
with other types, or other enumerations,
and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
.SH
Type Casts
.PP
The type cast feature in C was introduced largely as an aid
to producing more portable programs.
Consider the assignment
.DS
p = 1 ;
.DE
where
.I p
is a character pointer.
.I Lint
will quite rightly complain.
Now, consider the assignment
.DS
p = (char \(**)1 ;
.DE
in which a cast has been used to
convert the integer to a character pointer.
The programmer obviously had a strong motivation
for doing this, and has clearly signaled his intentions.
It seems harsh for
.I lint
to continue to complain about this.
On the other hand, if this code is moved to another
machine, such code should be looked at carefully.
The
.B \-c
flag controls the printing of comments about casts.
When
.B \-c
is in effect, casts are treated as though they were assignments
subject to complaint; otherwise, all legal casts are passed without comment,
no matter how strange the type mixing seems to be.
.SH
Nonportable Character Use
.PP
On the PDP-11, characters are signed quantities, with a range
from \-128 to 127.
On most of the other C implementations, characters take on only positive
values.
Thus,
.I lint
will flag certain comparisons and assignments as being
illegal or nonportable.
For example, the fragment
.DS
char c;
	...
if( (c = getchar(\|)) < 0 ) ....
.DE
works on the PDP-11, but
will fail on machines where characters always take
on positive values.
The real solution is to declare
.I c
an integer, since
.I getchar
is actually returning
integer values.
In any case,
.I lint
will say
``nonportable character comparison''.
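.PP
A minimal sketch of the recommended form follows.
It assumes present-day C and, so that it can be exercised on any stream, uses
.I getc
on a file pointer in place of \fIgetchar\fR; the function name is invented for the example.
.DS
```c
#include <stdio.h>

/* Count the characters remaining on fp.  c is an int, so the
   end-of-file value cannot be confused with any character. */
long count_chars(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```
.DE
Had
.I c
been declared \fBchar\fR, the loop could stop early on a machine with signed characters
when a character with all bits set is read, and could fail to stop at all on a machine
where characters are never negative.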
.PP
A similar issue arises with bitfields; when assignments
of constant values are made to bitfields, the field may
be too small to hold the value.
This is especially true because
on some machines bitfields are considered as signed
quantities.
While it may seem unintuitive to consider
that a two bit field declared of type
.B int
cannot hold the value 3, the problem disappears
if the bitfield is declared to have type
.B unsigned .
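.PP
The point about the two bit field can be sketched as follows, assuming present-day C;
the structure and function names are invented for the example.
.DS
```c
struct mode_bits {
    unsigned mode : 2;   /* two-bit unsigned field: holds 0 through 3 */
};

/* Store v in the two-bit field and read it back. */
unsigned store_mode(unsigned v)
{
    struct mode_bits b;

    b.mode = v;          /* excess high-order bits are discarded */
    return b.mode;
}
```
.DE
With the field declared \fBunsigned\fR, the value 3 is stored and read back intact,
while a value such as 5, which needs three bits, comes back as 1.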
.SH
Assignments of longs to ints
.PP
Bugs may arise from the assignment of
.B long
to
an
.B int ,
which loses accuracy.
This may happen in programs
which have been incompletely converted to use
.B typedefs .
When a
.B typedef
variable
is changed from \fBint\fR to \fBlong\fR,
the program can stop working because
some intermediate results may be assigned
to \fBints\fR, losing accuracy.
Since there are a number of legitimate reasons for
assigning \fBlongs\fR to \fBints\fR, the detection
of these assignments is enabled
by the
.B \-a
flag.
.SH
Strange Constructions
.PP
Several perfectly legal, but somewhat strange, constructions
are flagged by
.I lint ;
the messages hopefully encourage better code quality, clearer style, and
may even point out bugs.
The
.B \-h
flag is used to enable these checks.
For example, in the statement
.DS
\(**p++ ;
.DE
the \(** does nothing; this provokes the message ``null effect'' from
.I lint .
The program fragment
.DS
unsigned x ;
if( x < 0 ) ...
.DE
is clearly somewhat strange; the
test will never succeed.
Similarly, the test
.DS
if( x > 0 ) ...
.DE
is equivalent to
.DS
if( x != 0 )
.DE
which may not be the intended action.
.I Lint
will say ``degenerate unsigned comparison'' in these cases.
If one says
.DS
if( 1 != 0 ) ....
.DE
.I lint
will report
``constant in conditional context'', since the comparison
of 1 with 0 gives a constant result.
.PP
Another construction
detected by
.I lint
involves
operator precedence.
Bugs which arise from misunderstandings about the precedence
of operators can be accentuated by spacing and formatting,
making such bugs extremely hard to find.
For example, the statements
.DS
if( x&077 == 0 ) ...
.DE
or
.DS
x<\h'-.3m'<2 + 40
.DE
probably do not do what was intended.
The best solution is to parenthesize such expressions,
and
.I lint
encourages this by an appropriate message.
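.PP
The two readings of each expression can be spelled out with parentheses.
The sketch below uses invented function names and, as an assumption made for the example,
a smaller addend than the 40 in the text so that the shift count stays within the word.
.DS
```c
/* How the compiler groups  x&077 == 0 : the comparison binds first. */
int low_bits_as_parsed(int x)
{
    return x & (077 == 0);
}

/* What was presumably meant: are the low six bits all zero? */
int low_bits_clear(int x)
{
    return (x & 077) == 0;
}

/* How the compiler groups  x<<2 + 1 : the addition binds first. */
unsigned shift_as_parsed(unsigned x)
{
    return x << (2 + 1);
}

/* The other reading, with the shift done first. */
unsigned shift_then_add(unsigned x)
{
    return (x << 2) + 1;
}
```
.DE
Since 077 is never equal to 0, the first function yields 0 for every argument.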
.PP
Finally, when the
.B \-h
flag is in force,
.I lint
complains about variables which are redeclared in inner blocks
in a way that conflicts with their use in outer blocks.
This is legal, but is considered by many (including the author) to
be bad style, usually unnecessary, and frequently a bug.
.SH
Ancient History
.PP
There are several forms of older syntax which are being officially
discouraged.
These fall into two classes, assignment operators and initialization.
.PP
The older forms of assignment operators (e.g., =+, =\-, . . . )
could cause ambiguous expressions, such as
.DS
a =\-1 ;
.DE
which could be taken as either
.DS
a =\- 1 ;
.DE
or
.DS
a = \-1 ;
.DE
The situation is especially perplexing if this
kind of ambiguity arises as the result of a macro substitution.
The newer, and preferred operators (+=, \-=, etc. )
have no such ambiguities.
To spur the abandonment of the older forms,
.I lint
complains about these old fashioned operators.
.PP
A similar issue arises with initialization.
The older language allowed
.DS
int x \fR1 ;
.DE
to initialize
.I x
to 1.
This also caused syntactic difficulties: for example,
.DS
int x ( \-1 ) ;
.DE
looks somewhat like the beginning of a function declaration:
.DS
int x ( y ) { . . .
.DE
and the compiler must read a fair ways past
.I x
in order to be sure what the declaration really is.
Again, the problem is even more perplexing when the
initializer involves a macro.
The current syntax places an equals sign between the
variable and the initializer:
.DS
int x = \-1 ;
.DE
This is free of any possible syntactic ambiguity.
.SH
Pointer Alignment
.PP
Certain pointer assignments may be reasonable on some machines,
and illegal on others, due entirely to
alignment restrictions.
For example, on the PDP-11, it is reasonable
to assign integer pointers to double pointers, since
double precision values may begin on any integer boundary.
On the Honeywell 6000, double precision values must begin
on even word boundaries;
thus, not all such assignments make sense.
.I Lint
tries to detect cases where pointers are assigned to other
pointers, and such alignment problems might arise.
The message ``possible pointer alignment problem''
results from this situation whenever either the
.B \-p
or
.B \-h
flags are in effect.
.SH
Multiple Uses and Side Effects
.PP
In complicated expressions, the best order in which to evaluate
subexpressions may be highly machine dependent.
For example, on machines (like the PDP-11) in which the stack
runs backwards, function arguments will probably be best evaluated
from right-to-left; on machines with a stack running forward,
left-to-right seems most attractive.
Function calls embedded as arguments of other functions
may or may not be treated similarly to ordinary arguments.
Similar issues arise with other operators which have side effects,
such as the assignment operators and the increment and decrement operators.
.PP
In order that the efficiency of C on a particular machine not be
unduly compromised, the C language leaves the order
of evaluation of complicated expressions up to the
local compiler, and, in fact, the various C compilers have considerable
differences in the order in which they will evaluate complicated
expressions.
In particular, if any variable is changed by a side effect, and
also used elsewhere in the same expression, the result is explicitly undefined.
.PP
.I Lint
checks for the important special case where
a simple scalar variable is affected.
For example, the statement
.DS
\fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
.DE
will draw the complaint:
.DS
warning: \fIi\fR evaluation order undefined
.DE
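.PP
The statement above can be rewritten so that the side effect stands in a statement of its own,
which makes the intended order explicit.
The sketch below shows both possible intentions; the function names are invented,
and the index is passed by address only so that the example is self-contained.
.DS
```c
/* Copy b[i] to a[i], then advance i. */
void copy_then_advance(int *a, const int *b, int *i)
{
    a[*i] = b[*i];
    (*i)++;
}

/* Advance i first, then copy the old element of b into the new slot of a. */
void advance_then_copy(int *a, const int *b, int *i)
{
    int old = (*i)++;

    a[*i] = b[old];
}
```
.DE
Which of the two was meant cannot be recovered from the original statement,
which is the reason for the complaint.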
.SH
Implementation
.PP
.I Lint
consists of two programs and a driver.
The first program is a version of the
Portable C Compiler
.[
Johnson Ritchie BSTJ Portability Programs System
.]
.[
Johnson portable compiler 1978
.]
which is the basis of the
IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
This compiler does lexical and syntax analysis on the input text,
constructs and maintains symbol tables, and builds trees for expressions.
Instead of writing an intermediate file which is passed to
a code generator, as the other compilers
do,
.I lint
produces an intermediate file which consists of lines of ascii text.
Each line contains an external variable name,
an encoding of the context in which it was seen (use, definition, declaration, etc.),
a type specifier, and a source file name and line number.
The information about variables local to a function or file
is collected
by accessing the symbol table, and examining the expression trees.
.PP
Comments about local problems are produced as detected.
The information about external names is collected
onto an intermediate file.
After all the source files and library descriptions have
been collected, the intermediate file is sorted
to bring all information collected about a given external
name together.
The second, rather small, program then reads the lines
from the intermediate file and compares all of the
definitions, declarations, and uses for consistency.
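.PP
The sort-and-compare step can be sketched as follows.
This is an illustration only, not the code of the second pass:
the record layout, the names, and the use of a text string for the type are all
assumptions made for the example, and the records are held in memory rather than read from a file.
.DS
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one line of the intermediate file. */
enum context { CTX_DEFINITION, CTX_DECLARATION, CTX_USE };

struct record {
    const char *name;      /* external name */
    enum context ctx;      /* how the name was seen */
    const char *type;      /* type specifier, kept as text here */
    const char *file;      /* source file name */
    int line;              /* source line number */
};

static int by_name(const void *a, const void *b)
{
    const struct record *ra = a, *rb = b;

    return strcmp(ra->name, rb->name);
}

/* Sort the records by name, then count the names whose
   records do not all carry the same type. */
int count_type_conflicts(struct record *recs, size_t n)
{
    size_t i, start = 0;
    int conflicts = 0, bad = 0;

    qsort(recs, n, sizeof recs[0], by_name);
    for (i = 1; i <= n; i++) {
        if (i < n && strcmp(recs[i].name, recs[start].name) == 0) {
            if (strcmp(recs[i].type, recs[start].type) != 0)
                bad = 1;
            continue;
        }
        conflicts += bad;   /* close the group that began at start */
        bad = 0;
        start = i;
    }
    return conflicts;
}
```
.DE
Sorting first means that each name is examined in a single sequential scan,
with no table of names held by the comparing program.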
.PP
The driver controls this
process, and is also responsible for making the options available
to both passes of
.I lint .
.SH
Portability
.PP
C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
This means that the implementation of C tends to follow local conventions rather than
adhere strictly to
.UX
system conventions.
Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
installations with little effort.
This section describes some of the differences between the implementations, and
discusses the
.I lint
features which encourage portability.
.PP
Uninitialized external variables are treated differently in different
implementations of C.
Suppose two files both contain a declaration without initialization, such as
.DS
int a ;
.DE
outside of any function.
The
.UX
loader will resolve these declarations, and cause only a single word of storage
to be set aside for \fIa\fR.
Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
so each such declaration causes a word of storage to be set aside and called \fIa\fR.
When loading or library editing takes place, this causes fatal conflicts which prevent
the proper operation of the program.
If
.I lint
is invoked with the \fB\-p\fR flag,
it will detect such multiple definitions.
.PP
A related difficulty comes from the amount of information retained about external names during the
loading process.
On the
.UX
system, externally known names have seven significant characters, with the upper/lower
case distinction kept.
On the IBM systems, there are eight significant characters, but the case distinction
is lost.
On GCOS, there are only six characters, of a single case.
This leads to situations where programs run on the
.UX
system, but encounter loader
problems on the IBM or GCOS systems.
.I Lint
.B \-p
causes all external symbols to be mapped to one case and truncated to six characters,
providing a worst-case analysis.
.PP
A number of differences arise in the area of character handling: characters in the
.UX
system are eight bit ascii, while they are eight bit ebcdic on the IBM, and
nine bit ascii on GCOS.
Moreover, character strings go from high to low bit positions (``left to right'')
on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
This means that code attempting to construct strings
out of character constants, or attempting to use characters as indices
into arrays, must be looked at with great suspicion.
.I Lint
is of little help here, except to flag multi-character character constants.
.PP
Of course, the word sizes are different!
This causes less trouble than might be expected, at least when
moving from the
.UX
system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
The main problems are likely to arise in shifting or masking.
C now supports a bit-field facility, which can be used to write much of
this code in a reasonably portable way.
Frequently, portability of such code can be enhanced by
slight rearrangements in coding style.
Many of the incompatibilities seem to have the flavor of writing
.DS
x &= 0177700 ;
.DE
to clear the low order six bits of \fIx\fR.
This suffices on the PDP-11, but fails badly on GCOS and IBM.
If the bit field feature cannot be used, the same effect can be obtained by
writing
.DS
x &= \(ap 077 ;
.DE
which will work on all these machines.
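.PP
The difference between the two masks shows up as soon as the operand is wider than 16 bits.
The sketch below assumes present-day C, where \fBunsigned long\fR is at least 32 bits wide;
the function names are invented for the example.
.DS
```c
/* Mask written for a 16-bit word: also clears every bit above bit 15. */
unsigned long clear_low6_pdp11(unsigned long x)
{
    return x & 0177700UL;
}

/* Portable form: the complement adapts to the width of the operand. */
unsigned long clear_low6(unsigned long x)
{
    return x & ~077UL;
}
```
.DE
For a value that fits in 16 bits the two functions agree; for a larger value the first
one silently discards the high order bits as well.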
.PP
The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
other machines.
To obtain a logical shift on all machines, the left operand can be
typed \fBunsigned\fR.
Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
which has infiltrated itself into the C language.
If there were a good way to discover the programs which would be affected, C could be changed;
in any case,
.I lint
is no help here.
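.PP
The advice about right shifts can be sketched as a one-line function, with an invented name,
in which the left operand is typed \fBunsigned\fR.
The shift count is assumed to be smaller than the number of bits in the operand.
.DS
```c
/* Right shift with the operand typed unsigned: zeros enter from the left. */
unsigned logical_shift_right(unsigned x, unsigned n)
{
    return x >> n;
}
```
.DE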
.PP
The above discussion may have made the problem of portability seem
bigger than it in fact is.
The issues involved here are rarely subtle or mysterious, at least to the
implementor of the program, although they can involve some work to straighten out.
The most serious bar to the portability of
.UX
system utilities has been the inability to mimic
essential
.UX
system functions on the other systems.
The inability to seek to a random character position in a text file, or to establish a pipe
between processes, has involved far more rewriting
and debugging than any of the differences in C compilers.
On the other hand,
.I lint
has been very helpful
in moving the
.UX
operating system and associated
utility programs to other machines.
.SH
Shutting Lint Up
.PP
There are occasions when
the programmer is smarter than
.I lint .
There may be valid reasons for ``illegal'' type casts,
functions with a variable number of arguments, etc.
Moreover, as specified above, the flow of control information
produced by
.I lint
often has blind spots, causing occasional spurious
messages about perfectly reasonable programs.
Thus, some way of communicating with
.I lint ,
typically to shut it up, is desirable.
.PP
The form which this mechanism should take is not at all clear.
New keywords would require current and old compilers to
recognize these keywords, if only to ignore them.
This has both philosophical and practical problems.
New preprocessor syntax suffers from similar problems.
.PP
What was finally done was to cause a number of words
to be recognized by
.I lint
when they were embedded in comments.
This required minimal preprocessor changes;
the preprocessor just had to agree to pass comments
through to its output, instead of deleting them
as had been previously done.
Thus,
.I lint
directives are invisible to the compilers, and
the effect on systems with the older preprocessors
is merely that the
.I lint
directives don't work.
.PP
The first directive is concerned with flow of control information;
if a particular place in the program cannot be reached,
but this is not apparent to
.I lint ,
this can be asserted by the directive
.DS
/* NOTREACHED */
.DE
at the appropriate spot in the program.
Similarly, if it is desired to turn off
strict type checking for
the next expression, the directive
.DS
/* NOSTRICT */
.DE
can be used; the situation reverts to the
previous default after the next expression.
The
.B \-v
flag can be turned on for one function by the directive
.DS
/* ARGSUSED */
.DE
Complaints about variable number of arguments in calls to a function
can be turned off by the directive
.DS
/* VARARGS */
.DE
preceding the function definition.
In some cases, it is desirable to check the
first several arguments, and leave the later arguments unchecked.
This can be done by following the VARARGS keyword immediately
with a digit giving the number of arguments which should be checked; thus,
.DS
/* VARARGS2 */
.DE
will cause the first two arguments to be checked, the others unchecked.
Finally, the directive
.DS
/* LINTLIBRARY */
.DE
at the head of a file identifies this file as
a library declaration file; this topic is worth a
section by itself.
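.PP
Since the directives are ordinary comments, a source file that carries them still compiles unchanged.
The sketch below shows three of them in place; the function names are invented,
the code is written in present-day C, and placing ARGSUSED directly before the
function definition is an assumption of the example.
.DS
```c
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical error routine that does not return. */
static void die(void)
{
    exit(1);
}

/* ARGSUSED */
int first_of_two(int a, int b)
{
    return a;               /* b is deliberately unused */
}

/* VARARGS1 */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    while (count-- > 0)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

int checked_half(int n)
{
    if (n % 2 != 0) {
        die();
        /* NOTREACHED */
    }
    return n / 2;
}
```
.DE
With VARARGS1, only the first argument of each call to
.I sum_ints
would be checked against the definition.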
.SH
Library Declaration Files
.PP
.I Lint
accepts certain library directives, such as
.DS
\-ly
.DE
and tests the source files for compatibility with these libraries.
This is done by accessing library description files whose
names are constructed from the library directives.
These files all begin with the directive
.DS
/* LINTLIBRARY */
.DE
which is followed by a series of dummy function
definitions.
The critical parts of these definitions
are the declaration of the function return type,
whether the dummy function returns a value, and
the number and types of arguments to the function.
The VARARGS and ARGSUSED directives can
be used to specify features of the library functions.
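.PP
A library description file along these lines might look like the sketch below.
The routines demo_open, demo_close, and demo_printf are hypothetical and stand in for real library entries; the bodies are placeholders, since only the return type, the presence of a return value, and the argument list matter.

```c
/* LINTLIBRARY */

/* Returns a value; takes a string and an int. */
/* ARGSUSED */
int demo_open(const char *path, int mode)
{
    return 0;
}

/* Returns no value; takes one int. */
/* ARGSUSED */
void demo_close(int fd)
{
}

/* Only the first argument is checked; the rest may vary. */
/* VARARGS1 */
int demo_printf(const char *format, ...)
{
    return 0;
}
```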
.PP
.I Lint
library files are processed almost exactly like ordinary
source files.
The only difference is that functions which are defined in a library file,
but are not used in a source file, draw no complaints.
.I Lint
does not simulate a full library search algorithm,
and complains if the source files contain a redefinition of
a library routine (this is a feature!).
.PP
By default,
.I lint
checks the programs it is given against a standard library
file, which contains descriptions of the programs which
are normally loaded when
a C program
is run.
When the
.B \-p
flag is in effect, another file is checked containing
descriptions of the standard I/O library routines
which are expected to be portable across various machines.
The
.B \-n
flag can be used to suppress all library checking.
.SH
Bugs, etc.
.PP
.I Lint
was a difficult program to write, partially
because it is closely connected with matters of programming style,
and partially because users usually don't notice bugs which cause
.I lint
to miss errors which it should have caught.
(By contrast, if
.I lint
incorrectly complains about something that is correct, the
programmer reports that immediately!)
.PP
A number of areas remain to be further developed.
The checking of structures and arrays is rather inadequate;
size
incompatibilities go unchecked,
and no attempt is made to match up structure and union
declarations across files.
Some stricter checking of the use of
.B typedef
is clearly desirable, but what checking is appropriate, and how
to carry it out, is still to be determined.
.PP
.I Lint
shares the preprocessor with the C compiler.
At some point it may be appropriate for a
special version of the preprocessor to be constructed
which checks for things such as unused macro definitions,
macro arguments with side effects which are
not expanded at all, or are expanded more than once, etc.
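.PP
The macro hazards mentioned above can be sketched as follows.
The macros TWICE and IGNORE and the counter functions are hypothetical; they show an argument with a side effect being evaluated twice in one case and never in the other.

```c
static int calls;                   /* number of times next_value ran */

/* Has a side effect: every call bumps the counter. */
static int next_value(void)
{
    return ++calls;
}

#define TWICE(x)  ((x) + (x))       /* expands its argument twice */
#define IGNORE(x) (0)               /* never expands its argument */

/* Returns how many times the argument of TWICE was evaluated. */
int calls_after_twice(void)
{
    calls = 0;
    (void)TWICE(next_value());
    return calls;                   /* 2: the side effect happened twice */
}

/* Returns how many times the argument of IGNORE was evaluated. */
int calls_after_ignore(void)
{
    calls = 0;
    (void)IGNORE(next_value());
    return calls;                   /* 0: the side effect never happened */
}
```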
.PP
The central problem with
.I lint
is the packaging of the information which it collects.
There are many options which
serve only to turn off, or slightly modify,
certain features.
There are pressures to add even more of these options.
.PP
In conclusion, it appears that the general notion of having two
programs is a good one.
The compiler concentrates on quickly and accurately turning the
program text into bits which can be run;
.I lint
concentrates on issues
of portability, style, and efficiency.
.I Lint
can afford to be wrong, since incorrectness and over-conservatism
are merely annoying, not fatal.
The compiler can be fast since it knows that
.I lint
will cover its flanks.
Finally, the programmer can
concentrate at one stage
of the programming process solely on the algorithms,
data structures, and correctness of the
program, and then later retrofit,
with the aid of
.I lint ,
the desirable properties of universality and portability.
.SG MH-1273-SCJ-unix
.\".bp
.[
$LIST$
.]
.bp
.SH
Appendix: Current Lint Options
.PP
The command currently has the form
.DS
\fBlint\fR [\fB\-\fRoptions ] files... library-descriptors...
.DE
The options are
.IP \fBh\fR
Perform heuristic checks.
.IP \fBp\fR
Perform portability checks.
.IP \fBv\fR
Don't report unused arguments.
.IP \fBu\fR
Don't report unused or undefined externals.
.IP \fBb\fR
Report unreachable
.B break
statements.
.IP \fBx\fR
Report unused external declarations.
.IP \fBa\fR
Report assignments of
.B long
to
.B int
or shorter.
.IP \fBc\fR
Complain about questionable casts.
.IP \fBn\fR
No library checking is done.
.IP \fBs\fR
Same as
.B h
(for historical reasons).