.RP
.ND "July 26, 1978"
.OK
Program Portability
Strong Type Checking
.TL
Lint, a C Program Checker
.AU "MH 2C-559" 3968
S. C. Johnson
.AI
.MH
.AB
.PP
.I Lint
is a command which examines C source programs,
detecting
a number of bugs and obscurities.
It enforces the type rules of C more strictly than
the C compilers.
It may also be used to enforce a number of portability
restrictions involved in moving
programs between different machines and/or operating systems.
Another option detects a number of wasteful, or error prone, constructions
which nevertheless are, strictly speaking, legal.
.PP
.I Lint
accepts multiple input files and library specifications, and checks them for consistency.
.PP
The separation of function between
.I lint
and the C compilers has both historical and practical
rationale.
The compilers turn C programs into executable files rapidly
and efficiently.
This is possible in part because the
compilers do not do sophisticated
type checking, especially between
separately compiled programs.
.I Lint
takes a more global, leisurely view of the program,
looking much more carefully at the compatibilities.
.PP
This document discusses the use of
.I lint ,
gives an overview of the implementation, and gives some hints on the
writing of machine independent C code.
.AE
.CS 10 2 12 0 0 5
.SH
Introduction and Usage
.PP
Suppose there are two C
.[
Kernighan Ritchie Programming Prentice 1978
.]
source files,
.I file1.c
and
.I file2.c ,
which are ordinarily compiled and loaded together.
Then the command
.DS
lint file1.c file2.c
.DE
produces messages describing inconsistencies and inefficiencies
in the programs.
The program enforces the typing rules of C
more strictly than the C compilers
(for both historical and practical reasons)
enforce them.
The command
.DS
lint \-p file1.c file2.c
.DE
will produce, in addition to the above messages, additional messages
which relate to the portability of the programs to other operating
systems and machines.
Replacing the
.B \-p
by
.B \-h
will produce messages about various error-prone or wasteful constructions
which, strictly speaking, are not bugs.
Saying
.B \-hp
gets the whole works.
.PP
The next several sections describe the major messages;
the document closes with sections
discussing the implementation and giving suggestions
for writing portable C.
An appendix gives a summary of the
.I lint
options.
.SH
A Word About Philosophy
.PP
Many of the facts which
.I lint
needs may be impossible to
discover.
For example, whether a given function in a program ever gets called
may depend on the input data.
Deciding whether
.I exit
is ever called is equivalent to solving the famous ``halting problem,'' known to be
recursively undecidable.
.PP
Thus, most of the
.I lint
algorithms are a compromise.
If a function is never mentioned, it can never be called.
If a function is mentioned,
.I lint
assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
.PP
.I Lint
tries to give information with a high degree of relevance.
Messages of the form ``\fIxxx\fR might be a bug''
are easy to generate, but are acceptable only in proportion
to the fraction of real bugs they uncover.
If this fraction of real bugs is too small, the messages lose their credibility
and serve merely to clutter up the output,
obscuring the more important messages.
.PP
Keeping these issues in mind, we now consider in more detail
the classes of messages which
.I lint
produces.
.SH
Unused Variables and Functions
.PP
As sets of programs evolve and develop,
previously used variables and arguments to
functions may become unused;
it is not uncommon for external variables, or even entire
functions, to become unnecessary, and yet
not be removed from the source.
These ``errors of commission'' rarely cause working programs to fail, but they are a source
of inefficiency, and make programs harder to understand
and change.
Moreover, information about such unused variables and functions can occasionally
serve to discover bugs; if a function does a necessary job, and
is never called, something is wrong!
.PP
.I Lint
complains about variables and functions which are defined but not otherwise
mentioned.
An exception is variables which are declared through explicit
.B extern
statements but are never referenced; thus the statement
.DS
extern float sin(\|);
.DE
will evoke no comment if
.I sin
is never used.
Note that this agrees with the semantics of the C compiler.
In some cases, these unused external declarations might be of some interest; they
can be discovered by adding the
.B \-x
flag to the
.I lint
invocation.
.PP
Certain styles of programming
require many functions to be written with similar interfaces;
frequently, some of the arguments may be unused
in many of the calls.
The
.B \-v
option is available to suppress the printing of
complaints about unused arguments.
When
.B \-v
is in effect, no messages are produced about unused
arguments except for those
arguments which are unused and also declared as
register arguments; this can be considered
an active (and preventable) waste of the register
resources of the machine.
.PP
There is one case where information about unused, or
undefined, variables is more distracting
than helpful.
This is when
.I lint
is applied to some, but not all, files out of a collection
which are to be loaded together.
In this case, many of the functions and variables defined
may not be used, and, conversely,
many functions and variables defined elsewhere may be used.
The
.B \-u
flag may be used to suppress the spurious messages which might otherwise appear.
.SH
Set/Used Information
.PP
.I Lint
attempts to detect cases where a variable is used before it is set.
This is very difficult to do well;
many algorithms take a good deal of time and space,
and still produce messages about perfectly valid programs.
.I Lint
detects local variables (automatic and register storage classes)
whose first use appears physically earlier in the input file than the first assignment to the variable.
It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
may occur at any later time, in a data dependent fashion.
.PP
The restriction to the physical appearance of variables in the file makes the
algorithm very simple and quick to implement,
since the true flow of control need not be discovered.
It does mean that
.I lint
can complain about some programs which are legal,
but these programs would probably be considered bad on stylistic grounds (e.g. might
contain at least two \fBgoto\fR's).
Because static and external variables are initialized to 0,
no meaningful information can be discovered about their uses.
The algorithm deals correctly, however, with initialized automatic variables, and variables
which are used in the expression which first sets them.
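.PP
As a hedged illustration of the physical-order rule, the sketch below is legal C in which the first use of \fIx\fR appears earlier in the file than its first assignment, although control always reaches the assignment first.
The function name \fIlate_set\fR is hypothetical and the code is written in present-day C rather than the 1978 dialect; under the rule described above, \fIlint\fR would be expected to complain even though the program is valid.

```c
/* Legal, but the use of x precedes its assignment in the text. */
int late_set(void)
{
    int x;

    goto set;
use:
    return x + 1;   /* first use, physically earlier in the file */
set:
    x = 41;         /* first assignment, physically later */
    goto use;
}
```

Note that the example needs two \fBgoto\fR statements to get into this state, which is the stylistic objection raised above.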
.PP
The set/used information also permits recognition of those local variables which are set
and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
.SH
Flow of Control
.PP
.I Lint
attempts to detect unreachable portions of the programs which it processes.
It will complain about unlabeled statements immediately following
\fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
An attempt is made to detect loops which can never be left at the bottom, detecting the
special cases
\fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
.I Lint
also complains about loops which cannot be entered at the top;
some valid programs may have such loops, but at best they are bad style,
at worst bugs.
.PP
.I Lint
has an important area of blindness in the flow of control algorithm:
it has no way of detecting functions which are called and never return.
Thus, a call to
.I exit
may cause unreachable code which
.I lint
does not detect; the most serious effects of this are in the
determination of returned function values (see the next section).
.PP
One form of unreachable statement is not usually complained about by
.I lint ;
a
.B break
statement that cannot be reached causes no message.
Programs generated by
.I yacc ,
.[
Johnson Yacc 1975
.]
and especially
.I lex ,
.[
Lesk Lex
.]
may have literally hundreds of unreachable
.B break
statements.
The
.B \-O
flag in the C compiler will often eliminate the resulting object code inefficiency.
Thus, these unreached statements are of little importance,
there is typically nothing the user can do about them, and the
resulting messages would clutter up the
.I lint
output.
If these messages are desired,
.I lint
can be invoked with the
.B \-b
option.
.SH
Function Values
.PP
Sometimes functions return values which are never used;
sometimes programs incorrectly use function ``values''
which have never been returned.
.I Lint
addresses this problem in a number of ways.
.PP
Locally, within a function definition,
the appearance of both
.DS
return( \fIexpr\fR );
.DE
and
.DS
return ;
.DE
statements is cause for alarm;
.I lint
will give the message
.DS
function \fIname\fR contains return(e) and return
.DE
The most serious difficulty with this is detecting when a function return is implied
by flow of control reaching the end of the function.
This can be seen with a simple example:
.DS
.ta .5i 1i 1.5i
\fRf ( a ) {
	if ( a ) return ( 3 );
	g (\|);
	}
.DE
Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
with no defined return value; this will trigger a complaint from
.I lint .
If \fIg\fR, like \fIexit\fR, never returns,
the message will still be produced when in fact nothing is wrong.
.PP
In practice, some potentially serious bugs have been discovered by this feature;
it also accounts for a substantial fraction of the ``noise'' messages produced
by
.I lint .
.PP
On a global scale,
.I lint
detects cases where a function returns a value, but this value is sometimes,
or always, unused.
When the value is always unused, it may constitute an inefficiency in the function definition.
When the value is sometimes unused, it may represent bad style (e.g., not testing for
error conditions).
.PP
The dual problem, using a function value when the function does not return one,
is also detected.
This is a serious problem.
Amazingly, this bug has been observed on a couple of occasions
in ``working'' programs; the desired function value just happened to have been computed
in the function return register!
.SH
Type Checking
.PP
.I Lint
enforces the type checking rules of C more strictly than the compilers do.
The additional checking is in four major areas:
across certain binary operators and implied assignments,
at the structure selection operators,
between the definition and uses of functions,
and in the use of enumerations.
.PP
There are a number of operators which have an implied balancing between types of the operands.
The assignment, conditional ( ?\|: ), and relational operators
have this property; the argument
of a \fBreturn\fR statement,
and expressions used in initialization also suffer similar conversions.
In these operations,
\fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
The types of pointers must agree exactly,
except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
.PP
The type checking rules also require that, in structure references, the
left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
be a structure, and the right operand of these operators be a member
of the structure implied by the left operand.
Similar checking is done for references to unions.
.PP
Strict rules apply to function argument and return value
matching.
The types \fBfloat\fR and \fBdouble\fR may be freely matched,
as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
Also, pointers can be matched with the associated arrays.
Aside from this, all actual arguments must agree in type with their declared counterparts.
.PP
With enumerations, checks are made that enumeration variables or members are not mixed
with other types, or other enumerations,
and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
.SH
Type Casts
.PP
The type cast feature in C was introduced largely as an aid
to producing more portable programs.
Consider the assignment
.DS
p = 1 ;
.DE
where
.I p
is a character pointer.
.I Lint
will quite rightly complain.
Now, consider the assignment
.DS
p = (char \(**)1 ;
.DE
in which a cast has been used to
convert the integer to a character pointer.
The programmer obviously had a strong motivation
for doing this, and has clearly signaled his intentions.
It seems harsh for
.I lint
to continue to complain about this.
On the other hand, if this code is moved to another
machine, such code should be looked at carefully.
The
.B \-c
flag controls the printing of comments about casts.
When
.B \-c
is in effect, casts are treated as though they were assignments
subject to complaint; otherwise, all legal casts are passed without comment,
no matter how strange the type mixing seems to be.
.SH
Nonportable Character Use
.PP
On the PDP-11, characters are signed quantities, with a range
from \-128 to 127.
On most of the other C implementations, characters take on only positive
values.
Thus,
.I lint
will flag certain comparisons and assignments as being
illegal or nonportable.
For example, the fragment
.DS
char c;
	...
if( (c = getchar(\|)) < 0 ) ....
.DE
works on the PDP-11, but
will fail on machines where characters always take
on positive values.
The real solution is to declare
.I c
an integer, since
.I getchar
is actually returning
integer values.
In any case,
.I lint
will say
``nonportable character comparison''.
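.PP
The difference between the two declarations can be sketched as follows.
This is an illustration only: both function names are hypothetical, the code is present-day C, and an \fBunsigned char\fR is used to model a machine on which characters take on only positive values.

```c
#include <stdio.h>

/* Models "char c" on a machine where characters are never negative:
   the value from getchar is stored in a character before the test. */
int eof_seen_via_unsigned_char(int value_from_getchar)
{
    unsigned char c = (unsigned char)value_from_getchar;

    return c < 0;   /* can never be true */
}

/* Models the recommended "int c": the value is tested unchanged. */
int eof_seen_via_int(int value_from_getchar)
{
    int c = value_from_getchar;

    return c < 0;
}
```

With the integer declaration the end of file value survives the assignment, so the test behaves the same way on every machine.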
.PP
A similar issue arises with bitfields; when assignments
of constant values are made to bitfields, the field may
be too small to hold the value.
This is especially true because
on some machines bitfields are considered as signed
quantities.
While it may seem unintuitive to consider
that a two bit field declared of type
.B int
cannot hold the value 3, the problem disappears
if the bitfield is declared to have type
.B unsigned .
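.PP
A small sketch of the unsigned declaration follows; the structure and function names are hypothetical and the code is present-day C.
It shows only the unsigned case, since what a two bit \fBint\fR field holds depends on the machine.

```c
struct mode_bits {
    unsigned int mode : 2;   /* two bit field declared unsigned */
};

/* Store v in the two bit field and read it back. */
unsigned int store_mode(unsigned int v)
{
    struct mode_bits b;

    b.mode = v;      /* values too large for the field are reduced modulo 4 */
    return b.mode;
}
```

The value 3 survives the round trip; a constant such as 5 does not fit, which is the kind of assignment the paragraph above warns about.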
.SH
Assignments of longs to ints
.PP
Bugs may arise from the assignment of
.B long
to
an
.B int ,
which loses accuracy.
This may happen in programs
which have been incompletely converted to use
.B typedefs .
When a
.B typedef
variable
is changed from \fBint\fR to \fBlong\fR,
the program can stop working because
some intermediate results may be assigned
to \fBints\fR, losing accuracy.
Since there are a number of legitimate reasons for
assigning \fBlongs\fR to \fBints\fR, the detection
of these assignments is enabled
by the
.B \-a
flag.
.SH
Strange Constructions
.PP
Several perfectly legal, but somewhat strange, constructions
are flagged by
.I lint ;
the messages are intended to encourage better code quality and clearer style, and
may even point out bugs.
The
.B \-h
flag is used to enable these checks.
For example, in the statement
.DS
\(**p++ ;
.DE
the \(** does nothing; this provokes the message ``null effect'' from
.I lint .
The program fragment
.DS
unsigned x ;
if( x < 0 ) ...
.DE
is clearly somewhat strange; the
test will never succeed.
Similarly, the test
.DS
if( x > 0 ) ...
.DE
is equivalent to
.DS
if( x != 0 )
.DE
which may not be the intended action.
.I Lint
will say ``degenerate unsigned comparison'' in these cases.
If one says
.DS
if( 1 != 0 ) ....
.DE
.I lint
will report
``constant in conditional context'', since the comparison
of 1 with 0 gives a constant result.
.PP
Another construction
detected by
.I lint
involves
operator precedence.
Bugs which arise from misunderstandings about the precedence
of operators can be accentuated by spacing and formatting,
making such bugs extremely hard to find.
For example, the statements
.DS
if( x&077 == 0 ) ...
.DE
or
.DS
x<\h'-.3m'<2 + 40
.DE
probably do not do what was intended.
The best solution is to parenthesize such expressions,
and
.I lint
encourages this by an appropriate message.
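.PP
The first of these statements can be made concrete with a short sketch in present-day C.
The function names are hypothetical, and the ``intended'' reading is an assumption about what the programmer meant.

```c
/* What was probably intended: are the low six bits of x all zero? */
int low_six_clear(unsigned int x)
{
    return (x & 077) == 0;
}

/* What the unparenthesized form means: == binds tighter than &,
   so this is x & (077 == 0), that is, x & 0. */
int low_six_clear_as_written(unsigned int x)
{
    return x & 077 == 0;
}
```

The second function yields 0 for every argument, so a test written that way can never succeed.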
.PP
Finally, when the
.B \-h
flag is in force
.I lint
complains about variables which are redeclared in inner blocks
in a way that conflicts with their use in outer blocks.
This is legal, but is considered by many (including the author) to
be bad style, usually unnecessary, and frequently a bug.
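.PP
A minimal sketch of such a redeclaration, in present-day C with a hypothetical function name, shows why it is so often a bug: the update inside the block never reaches the outer variable.

```c
int outer_after_inner_block(void)
{
    int n = 10;

    {
        int n = 0;   /* redeclaration hides the outer n */
        n += 5;      /* changes only the inner n */
    }
    return n;        /* still the outer n, unchanged */
}
```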
.SH
Ancient History
.PP
There are several forms of older syntax which are being officially
discouraged.
These fall into two classes, assignment operators and initialization.
.PP
The older forms of assignment operators (e.g., =+, =\-, . . . )
could cause ambiguous expressions, such as
.DS
a =\-1 ;
.DE
which could be taken as either
.DS
a =\- 1 ;
.DE
or
.DS
a = \-1 ;
.DE
The situation is especially perplexing if this
kind of ambiguity arises as the result of a macro substitution.
The newer, and preferred operators (+=, \-=, etc. )
have no such ambiguities.
To spur the abandonment of the older forms,
.I lint
complains about these old fashioned operators.
.PP
A similar issue arises with initialization.
The older language allowed
.DS
int x \fR1 ;
.DE
to initialize
.I x
to 1.
This also caused syntactic difficulties: for example,
.DS
int x ( \-1 ) ;
.DE
looks somewhat like the beginning of a function declaration:
.DS
int x ( y ) { . . .
.DE
and the compiler must read a fair ways past
.I x
in order to be sure what the declaration really is.
Again, the problem is even more perplexing when the
initializer involves a macro.
The current syntax places an equals sign between the
variable and the initializer:
.DS
int x = \-1 ;
.DE
This is free of any possible syntactic ambiguity.
.SH
Pointer Alignment
.PP
Certain pointer assignments may be reasonable on some machines,
and illegal on others, due entirely to
alignment restrictions.
For example, on the PDP-11, it is reasonable
to assign integer pointers to double pointers, since
double precision values may begin on any integer boundary.
On the Honeywell 6000, double precision values must begin
on even word boundaries;
thus, not all such assignments make sense.
.I Lint
tries to detect cases where pointers are assigned to other
pointers, and such alignment problems might arise.
The message ``possible pointer alignment problem''
results from this situation whenever either the
.B \-p
or
.B \-h
flag is in effect.
.SH
Multiple Uses and Side Effects
.PP
In complicated expressions, the best order in which to evaluate
subexpressions may be highly machine dependent.
For example, on machines (like the PDP-11) in which the stack
runs backwards, function arguments will probably be best evaluated
from right-to-left; on machines with a stack running forward,
left-to-right seems most attractive.
Function calls embedded as arguments of other functions
may or may not be treated similarly to ordinary arguments.
Similar issues arise with other operators which have side effects,
such as the assignment operators and the increment and decrement operators.
.PP
In order that the efficiency of C on a particular machine not be
unduly compromised, the C language leaves the order
of evaluation of complicated expressions up to the
local compiler, and, in fact, the various C compilers have considerable
differences in the order in which they will evaluate complicated
expressions.
In particular, if any variable is changed by a side effect, and
also used elsewhere in the same expression, the result is explicitly undefined.
.PP
.I Lint
checks for the important special case where
a simple scalar variable is affected.
For example, the statement
.DS
\fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
.DE
will draw the complaint:
.DS
warning: \fIi\fR evaluation order undefined
.DE
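.PP
One way to avoid the complaint is to separate the use of \fIi\fR from its change.
The sketch below is present-day C with a hypothetical function name, and it assumes that the intended meaning was ``copy the element at the old index, then advance the index''; the other reading would need the two statements in the opposite order.

```c
/* Copies b[i] to a[i], then advances the index, in a fixed order. */
void copy_then_advance(int *a, const int *b, int *ip)
{
    int i = *ip;

    a[i] = b[i];     /* both subscripts use the old index */
    *ip = i + 1;     /* the side effect happens in its own statement */
}
```

Because the increment stands alone, no compiler is free to choose a different order of evaluation.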
.SH
Implementation
.PP
.I Lint
consists of two programs and a driver.
The first program is a version of the
Portable C Compiler
.[
Johnson Ritchie BSTJ Portability Programs System
.]
.[
Johnson portable compiler 1978
.]
which is the basis of the
IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
This compiler does lexical and syntax analysis on the input text,
constructs and maintains symbol tables, and builds trees for expressions.
Instead of writing an intermediate file which is passed to
a code generator, as the other compilers
do,
.I lint
produces an intermediate file which consists of lines of ascii text.
Each line contains an external variable name,
an encoding of the context in which it was seen (use, definition, declaration, etc.),
a type specifier, and a source file name and line number.
The information about variables local to a function or file
is collected
by accessing the symbol table, and examining the expression trees.
.PP
Comments about local problems are produced as detected.
The information about external names is collected
onto an intermediate file.
After all the source files and library descriptions have
been collected, the intermediate file is sorted
to bring all information collected about a given external
name together.
The second, rather small, program then reads the lines
from the intermediate file and compares all of the
definitions, declarations, and uses for consistency.
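.PP
The sort-then-compare idea can be sketched in a few lines of present-day C.
This is not the code of \fIlint\fR itself: the record layout, the names \fIext_record\fR and \fIcount_type_conflicts\fR, and the use of text strings for types are all assumptions made for illustration, since the encoding actually used in the intermediate file is not given here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: the fields listed above, in an invented layout. */
enum context { CTX_DEFINITION, CTX_DECLARATION, CTX_USE };

struct ext_record {
    const char *name;     /* external name */
    enum context ctx;     /* context in which the name was seen */
    const char *type;     /* type specifier, kept as text in this sketch */
    const char *file;     /* source file name */
    int line;             /* source line number */
};

/* Comparison function for qsort: order records by external name. */
static int by_name(const void *a, const void *b)
{
    const struct ext_record *ra = a, *rb = b;

    return strcmp(ra->name, rb->name);
}

/* Sort the records so that all information about one name is adjacent,
   then count neighbouring records that share a name but not a type. */
int count_type_conflicts(struct ext_record *recs, size_t n)
{
    size_t i;
    int conflicts = 0;

    qsort(recs, n, sizeof recs[0], by_name);
    for (i = 1; i < n; i++) {
        if (strcmp(recs[i].name, recs[i - 1].name) == 0 &&
            strcmp(recs[i].type, recs[i - 1].type) != 0)
            conflicts++;
    }
    return conflicts;
}
```

Sorting first means the comparison step only ever looks at neighbouring records, which fits the description of the second program as rather small.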
.PP
The driver controls this
process, and is also responsible for making the options available
to both passes of
.I lint .
.SH
Portability
.PP
C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
This means that the implementation of C tends to follow local conventions rather than
adhere strictly to
.UX
system conventions.
Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
installations with little effort.
This section describes some of the differences between the implementations, and
discusses the
.I lint
features which encourage portability.
.PP
Uninitialized external variables are treated differently in different
implementations of C.
Suppose two files both contain a declaration without initialization, such as
.DS
int a ;
.DE
outside of any function.
The
.UX
loader will resolve these declarations, and cause only a single word of storage
to be set aside for \fIa\fR.
Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
so each such declaration causes a word of storage to be set aside and called \fIa\fR.
When loading or library editing takes place, this causes fatal conflicts which prevent
the proper operation of the program.
If
.I lint
is invoked with the \fB\-p\fR flag,
it will detect such multiple definitions.
.PP
A related difficulty comes from the amount of information retained about external names during the
loading process.
On the
.UX
system, externally known names have seven significant characters, with the upper/lower
case distinction kept.
On the IBM systems, there are eight significant characters, but the case distinction
is lost.
On GCOS, there are only six characters, of a single case.
This leads to situations where programs run on the
.UX
system, but encounter loader
problems on the IBM or GCOS systems.
.I Lint
.B \-p
causes all external symbols to be mapped to one case and truncated to six characters,
providing a worst-case analysis.
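.PP
The worst-case mapping can be sketched as below, in present-day C.
The function names are hypothetical, and the choice of lower case is an assumption; the text above says only that the symbols are mapped to one case.

```c
#include <ctype.h>
#include <string.h>

#define WORST_CASE_LEN 6   /* six significant characters, the smallest limit quoted above */

/* Map a name to a single case and keep only the first six characters. */
void worst_case_name(const char *name, char out[WORST_CASE_LEN + 1])
{
    size_t i;

    for (i = 0; i < WORST_CASE_LEN && name[i] != '\0'; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    out[i] = '\0';
}

/* Two external names collide if their mapped forms are identical. */
int names_collide(const char *a, const char *b)
{
    char ma[WORST_CASE_LEN + 1], mb[WORST_CASE_LEN + 1];

    worst_case_name(a, ma);
    worst_case_name(b, mb);
    return strcmp(ma, mb) == 0;
}
```

Two names that differ only in case, or only after the sixth character, are distinct under the seven character rule but collide under this mapping.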
.PP
A number of differences arise in the area of character handling: characters in the
.UX
system are eight bit ascii, while they are eight bit ebcdic on the IBM, and
nine bit ascii on GCOS.
Moreover, character strings go from high to low bit positions (``left to right'')
on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
This means that code attempting to construct strings
out of character constants, or attempting to use characters as indices
into arrays, must be looked at with great suspicion.
.I Lint
is of little help here, except to flag multi-character character constants.
.PP
Of course, the word sizes are different!
This causes less trouble than might be expected, at least when
moving from the
.UX
system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
The main problems are likely to arise in shifting or masking.
C now supports a bit-field facility, which can be used to write much of
this code in a reasonably portable way.
Frequently, portability of such code can be enhanced by
slight rearrangements in coding style.
Many of the incompatibilities seem to have the flavor of writing
.DS
x &= 0177700 ;
.DE
to clear the low order six bits of \fIx\fR.
This suffices on the PDP-11, but fails badly on GCOS and IBM.
If the bit field feature cannot be used, the same effect can be obtained by
writing
.DS
x &= \(ap 077 ;
.DE
which will work on all these machines.
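.PP
The two masks can be compared directly in a short sketch.
The function names are hypothetical, the code is present-day C, and \fBunsigned long\fR stands in for a word wider than 16 bits.

```c
/* Clears the low six bits using a mask that assumes a 16 bit word;
   on a wider word it also clears every bit above bit 15. */
unsigned long clear_low_six_16bit(unsigned long x)
{
    return x & 0177700UL;
}

/* Clears the low six bits whatever the width of the word. */
unsigned long clear_low_six(unsigned long x)
{
    return x & ~077UL;
}
```

On a 16 bit quantity the two agree; on anything wider, only the complemented mask leaves the high order bits alone.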
.PP
The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
other machines.
To obtain a logical shift on all machines, the left operand can be
typed \fBunsigned\fR.
Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
which has infiltrated itself into the C language.
If there were a good way to discover the programs which would be affected, C could be changed;
in any case,
.I lint
is no help here.
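.PP
The advice about the right shift operator amounts to the following sketch, in present-day C with a hypothetical function name; the shift count is assumed to be non-negative and smaller than the width of the word.

```c
/* Logical right shift: the left operand is made unsigned first,
   so zeros rather than copies of the sign bit are shifted in. */
unsigned int logical_shift_right(int x, int n)
{
    return (unsigned int)x >> n;
}
```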
.PP
The above discussion may have made the problem of portability seem
bigger than it in fact is.
The issues involved here are rarely subtle or mysterious, at least to the
implementor of the program, although they can involve some work to straighten out.
The most serious bar to the portability of
.UX
system utilities has been the inability to mimic
essential
.UX
system functions on the other systems.
The inability to seek to a random character position in a text file, or to establish a pipe
between processes, has involved far more rewriting
and debugging than any of the differences in C compilers.
On the other hand,
.I lint
has been very helpful
in moving the
.UX
operating system and associated
utility programs to other machines.
.SH
Shutting Lint Up
.PP
There are occasions when
the programmer is smarter than
.I lint .
There may be valid reasons for ``illegal'' type casts,
functions with a variable number of arguments, etc.
Moreover, as specified above, the flow of control information
produced by
.I lint
often has blind spots, causing occasional spurious
messages about perfectly reasonable programs.
Thus, some way of communicating with
.I lint ,
typically to shut it up, is desirable.
.PP
The form which this mechanism should take is not at all clear.
New keywords would require current and old compilers to
recognize these keywords, if only to ignore them.
This has both philosophical and practical problems.
New preprocessor syntax suffers from similar problems.
.PP
What was finally done was to cause a number of words
to be recognized by
.I lint
when they were embedded in comments.
This required minimal preprocessor changes;
the preprocessor just had to agree to pass comments
through to its output, instead of deleting them
as had been previously done.
Thus,
.I lint
directives are invisible to the compilers, and
the effect on systems with the older preprocessors
is merely that the
.I lint
directives don't work.
.PP
The first directive is concerned with flow of control information;
if a particular place in the program cannot be reached,
but this is not apparent to
.I lint ,
this can be asserted by the directive
.DS
/* NOTREACHED */
.DE
at the appropriate spot in the program.
Similarly, if it is desired to turn off
strict type checking for
the next expression, the directive
.DS
/* NOSTRICT */
.DE
can be used; the situation reverts to the
previous default after the next expression.
The
.B \-v
flag can be turned on for one function by the directive
.DS
/* ARGSUSED */
.DE
Complaints about variable number of arguments in calls to a function
can be turned off by the directive
.DS
/* VARARGS */
.DE
preceding the function definition.
In some cases, it is desirable to check the
first several arguments, and leave the later arguments unchecked.
This can be done by following the VARARGS keyword immediately
with a digit giving the number of arguments which should be checked; thus,
.DS
/* VARARGS2 */
.DE
will cause the first two arguments to be checked, the others unchecked.
Finally, the directive
.DS
/* LINTLIBRARY */
.DE
at the head of a file identifies this file as
a library declaration file; this topic is worth a
section by itself.
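.PP
A hedged sketch of two of these directives in use follows.
All function names are hypothetical, the code is present-day C, and placing ARGSUSED immediately before the function definition is an assumption; only the placement of VARARGS is stated above.

```c
#include <stdio.h>
#include <stdlib.h>

/* Never returns, which lint as described cannot discover. */
void fatal(const char *msg)
{
    fputs(msg, stderr);
    exit(1);
}

/* ARGSUSED */
int first_of(int a, int b)      /* b is deliberately unused */
{
    return a;
}

int half_of_even(int n)
{
    if (n % 2 == 0)
        return n / 2;
    fatal("odd argument\n");
    /* NOTREACHED */
}
```

Without the NOTREACHED comment, the end of \fIhalf_of_even\fR looks like a return with no value, which is the spurious complaint discussed under Function Values.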
.SH
Library Declaration Files
.PP
.I Lint
accepts certain library directives, such as
.DS
\-ly
.DE
and tests the source files for compatibility with these libraries.
This is done by accessing library description files whose
names are constructed from the library directives.
These files all begin with the directive
.DS
/* LINTLIBRARY */
.DE
which is followed by a series of dummy function
definitions.
The critical parts of these definitions
are the declaration of the function return type,
whether the dummy function returns a value, and
the number and types of arguments to the function.
The VARARGS and ARGSUSED directives can
be used to specify features of the library functions.
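.PP
A library description file might therefore look something like the sketch below.
The routines \fIcountch\fR and \fIreport\fR are invented for illustration and do not describe any real library; the bodies are dummies, since only the return type, the presence of a returned value, and the arguments matter.

```c
/* LINTLIBRARY */

/* Hypothetical routine: takes a string, returns an int. */
int countch(char *s)
{
    return 0;
}

/* Hypothetical routine: only the first argument is to be checked. */
/* VARARGS1 */
int report(char *fmt)
{
    return 0;
}
```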
.PP
.I Lint
library files are processed almost exactly like ordinary
source files.
The only difference is that functions which are defined on a library file,
but are not used on a source file, draw no complaints.
.I Lint
does not simulate a full library search algorithm,
and complains if the source files contain a redefinition of
a library routine (this is a feature!).
.PP
By default,
.I lint
checks the programs it is given against a standard library
file, which contains descriptions of the programs which
are normally loaded when
a C program
is run.
When the
.B \-p
flag is in effect, another file is checked containing
descriptions of the standard I/O library routines
which are expected to be portable across various machines.
The
.B \-n
flag can be used to suppress all library checking.
.SH
Bugs, etc.
.PP
.I Lint
was a difficult program to write, partially
because it is closely connected with matters of programming style,
and partially because users usually don't notice bugs which cause
.I lint
to miss errors which it should have caught.
(By contrast, if
.I lint
incorrectly complains about something that is correct, the
programmer reports that immediately!)
.PP
A number of areas remain to be further developed.
The checking of structures and arrays is rather inadequate;
size
incompatibilities go unchecked,
and no attempt is made to match up structure and union
declarations across files.
Some stricter checking of the use of the
.B typedef
is clearly desirable, but what checking is appropriate, and how
to carry it out, is still to be determined.
.PP
.I Lint
shares the preprocessor with the C compiler.
At some point it may be appropriate for a
special version of the preprocessor to be constructed
which checks for things such as unused macro definitions,
macro arguments which have side effects which are
not expanded at all, or are expanded more than once, etc.
.PP
The central problem with
.I lint
is the packaging of the information which it collects.
There are many options which
serve only to turn off, or slightly modify,
certain features.
There are pressures to add even more of these options.
.PP
In conclusion, it appears that the general notion of having two
programs is a good one.
The compiler concentrates on quickly and accurately turning the
program text into bits which can be run;
.I lint
concentrates on issues
of portability, style, and efficiency.
.I Lint
can afford to be wrong, since incorrectness and over-conservatism
are merely annoying, not fatal.
The compiler can be fast since it knows that
.I lint
will cover its flanks.
Finally, the programmer can
concentrate at one stage
of the programming process solely on the algorithms,
data structures, and correctness of the
program, and then later retrofit,
with the aid of
.I lint ,
the desirable properties of universality and portability.
.SG MH-1273-SCJ-unix
.bp
.[
$LIST$
.]
.bp
.SH
Appendix: Current Lint Options
.PP
The command currently has the form
.DS
lint\fR [\fB\-\fRoptions ] files... library-descriptors...
.DE
The options are
.IP \fBh\fR
Perform heuristic checks.
.IP \fBp\fR
Perform portability checks.
.IP \fBv\fR
Don't report unused arguments.
.IP \fBu\fR
Don't report unused or undefined externals.
.IP \fBb\fR
Report unreachable
.B break
statements.
.IP \fBx\fR
Report unused external declarations.
.IP \fBa\fR
Report assignments of
.B long
to
.B int
or shorter.
.IP \fBc\fR
Complain about questionable casts.
.IP \fBn\fR
No library checking is done.
.IP \fBs\fR
Same as
.B h
(for historical reasons).
1059No library checking is done
1060.IP \fBs\fR
1061Same as
1062.B h
1063(for historical reasons)