.\" document distributed with 4.2BSD
.\" @(#)lint.ms 5.1 (Berkeley) %G%
.\"
.RP
.ND "July 26, 1978"
.OK
Program Portability
Strong Type Checking
.TL
Lint, a C Program Checker
.AU "MH 2C-559" 3968
S. C. Johnson
.AI
.MH
.AB
.PP
.I Lint
is a command which examines C source programs,
detecting
a number of bugs and obscurities.
It enforces the type rules of C more strictly than
the C compilers.
It may also be used to enforce a number of portability
restrictions involved in moving
programs between different machines and/or operating systems.
Another option detects a number of wasteful, or error prone, constructions
which nevertheless are, strictly speaking, legal.
.PP
.I Lint
accepts multiple input files and library specifications, and checks them for consistency.
.PP
The separation of function between
.I lint
and the C compilers has both historical and practical
rationale.
The compilers turn C programs into executable files rapidly
and efficiently.
This is possible in part because the
compilers do not do sophisticated
type checking, especially between
separately compiled programs.
.I Lint
takes a more global, leisurely view of the program,
looking much more carefully at the compatibilities.
.PP
This document discusses the use of
.I lint ,
gives an overview of the implementation, and gives some hints on the
writing of machine independent C code.
.AE
.CS 10 2 12 0 0 5
.SH
Introduction and Usage
.PP
Suppose there are two C
.[
Kernighan Ritchie Programming Prentice 1978
.]
source files,
.I file1.c
and
.I file2.c ,
which are ordinarily compiled and loaded together.
Then the command
.DS
lint file1.c file2.c
.DE
produces messages describing inconsistencies and inefficiencies
in the programs.
The program enforces the typing rules of C
more strictly than the C compilers
(for both historical and practical reasons)
enforce them.
The command
.DS
lint \-p file1.c file2.c
.DE
will produce, in addition to the above messages, further messages
which relate to the portability of the programs to other operating
systems and machines.
Replacing the
.B \-p
by
.B \-h
will produce messages about various error-prone or wasteful constructions
which, strictly speaking, are not bugs.
Saying
.B \-hp
gets the whole works.
.PP
The next several sections describe the major messages;
the document closes with sections
discussing the implementation and giving suggestions
for writing portable C.
An appendix gives a summary of the
.I lint
options.
.SH
A Word About Philosophy
.PP
Many of the facts which
.I lint
needs may be impossible to
discover.
For example, whether a given function in a program ever gets called
may depend on the input data.
Deciding whether
.I exit
is ever called is equivalent to solving the famous ``halting problem,'' known to be
recursively undecidable.
.PP
Thus, most of the
.I lint
algorithms are a compromise.
If a function is never mentioned, it can never be called.
If a function is mentioned,
.I lint
assumes it can be called; this is not necessarily so, but in practice is quite reasonable.
.PP
.I Lint
tries to give information with a high degree of relevance.
Messages of the form ``\fIxxx\fR might be a bug''
are easy to generate, but are acceptable only in proportion
to the fraction of real bugs they uncover.
If this fraction of real bugs is too small, the messages lose their credibility
and serve merely to clutter up the output,
obscuring the more important messages.
.PP
Keeping these issues in mind, we now consider in more detail
the classes of messages which
.I lint
produces.
.SH
Unused Variables and Functions
.PP
As sets of programs evolve and develop,
previously used variables and arguments to
functions may become unused;
it is not uncommon for external variables, or even entire
functions, to become unnecessary, and yet
not be removed from the source.
These ``errors of commission'' rarely cause working programs to fail, but they are a source
of inefficiency, and make programs harder to understand
and change.
Moreover, information about such unused variables and functions can occasionally
serve to discover bugs; if a function does a necessary job, and
is never called, something is wrong!
.PP
.I Lint
complains about variables and functions which are defined but not otherwise
mentioned.
An exception is variables which are declared through explicit
.B extern
statements but are never referenced; thus the statement
.DS
extern float sin(\|);
.DE
will evoke no comment if
.I sin
is never used.
Note that this agrees with the semantics of the C compiler.
In some cases, these unused external declarations might be of some interest; they
can be discovered by adding the
.B \-x
flag to the
.I lint
invocation.
.PP
Certain styles of programming
require many functions to be written with similar interfaces;
frequently, some of the arguments may be unused
in many of the calls.
The
.B \-v
option is available to suppress the printing of
complaints about unused arguments.
When
.B \-v
is in effect, no messages are produced about unused
arguments except for those
arguments which are unused and also declared as
register arguments; this can be considered
an active (and preventable) waste of the register
resources of the machine.
.PP
There is one case where information about unused, or
undefined, variables is more distracting
than helpful.
This is when
.I lint
is applied to some, but not all, files out of a collection
which are to be loaded together.
In this case, many of the functions and variables defined
may not be used, and, conversely,
many functions and variables defined elsewhere may be used.
The
.B \-u
flag may be used to suppress the spurious messages which might otherwise appear.
.SH
Set/Used Information
.PP
.I Lint
attempts to detect cases where a variable is used before it is set.
This is very difficult to do well;
many algorithms take a good deal of time and space,
and still produce messages about perfectly valid programs.
.I Lint
detects local variables (automatic and register storage classes)
whose first use appears physically earlier in the input file than the first assignment to the variable.
It assumes that taking the address of a variable constitutes a ``use,'' since the actual use
may occur at any later time, in a data dependent fashion.
.PP
The restriction to the physical appearance of variables in the file makes the
algorithm very simple and quick to implement,
since the true flow of control need not be discovered.
It does mean that
.I lint
can complain about some programs which are legal,
but these programs would probably be considered bad on stylistic grounds (e.g., they might
contain at least two \fBgoto\fR's).
Because static and external variables are initialized to 0,
no meaningful information can be discovered about their uses.
The algorithm deals correctly, however, with initialized automatic variables, and variables
which are used in the expression which first sets them.
.PP
The set/used information also permits recognition of those local variables which are set
and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs.
.SH
Flow of Control
.PP
.I Lint
attempts to detect unreachable portions of the programs which it processes.
It will complain about unlabeled statements immediately following
\fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements.
An attempt is made to detect loops which can never be left at the bottom, detecting the
special cases
\fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops.
.I Lint
also complains about loops which cannot be entered at the top;
some valid programs may have such loops, but at best they are bad style,
at worst bugs.
.PP
.I Lint
has an important area of blindness in the flow of control algorithm:
it has no way of detecting functions which are called and never return.
Thus, a call to
.I exit
may cause unreachable code which
.I lint
does not detect; the most serious effects of this are in the
determination of returned function values (see the next section).
.PP
One form of unreachable statement is not usually complained about by
.I lint ;
a
.B break
statement that cannot be reached causes no message.
Programs generated by
.I yacc ,
.[
Johnson Yacc 1975
.]
and especially
.I lex ,
.[
Lesk Lex
.]
may have literally hundreds of unreachable
.B break
statements.
The
.B \-O
flag in the C compiler will often eliminate the resulting object code inefficiency.
Thus, these unreached statements are of little importance;
there is typically nothing the user can do about them, and the
resulting messages would clutter up the
.I lint
output.
If these messages are desired,
.I lint
can be invoked with the
.B \-b
option.
.SH
Function Values
.PP
Sometimes functions return values which are never used;
sometimes programs incorrectly use function ``values''
which have never been returned.
.I Lint
addresses this problem in a number of ways.
.PP
Locally, within a function definition,
the appearance of both
.DS
return( \fIexpr\fR );
.DE
and
.DS
return ;
.DE
statements is cause for alarm;
.I lint
will give the message
.DS
function \fIname\fR contains return(e) and return
.DE
The most serious difficulty with this is detecting when a function return is implied
by flow of control reaching the end of the function.
This can be seen with a simple example:
.DS
.ta .5i 1i 1.5i
\fRf ( a ) {
	if ( a ) return ( 3 );
	g (\|);
	}
.DE
Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return
with no defined return value; this will trigger a complaint from
.I lint .
If \fIg\fR, like \fIexit\fR, never returns,
the message will still be produced when in fact nothing is wrong.
.PP
In practice, some potentially serious bugs have been discovered by this feature;
it also accounts for a substantial fraction of the ``noise'' messages produced
by
.I lint .
.PP
On a global scale,
.I lint
detects cases where a function returns a value, but this value is sometimes,
or always, unused.
When the value is always unused, it may constitute an inefficiency in the function definition.
When the value is sometimes unused, it may represent bad style (e.g., not testing for
error conditions).
.PP
The dual problem, using a function value when the function does not return one,
is also detected.
This is a serious problem.
Amazingly, this bug has been observed on a couple of occasions
in ``working'' programs; the desired function value just happened to have been computed
in the function return register!
.SH
Type Checking
.PP
.I Lint
enforces the type checking rules of C more strictly than the compilers do.
The additional checking is in four major areas:
across certain binary operators and implied assignments,
at the structure selection operators,
between the definition and uses of functions,
and in the use of enumerations.
.PP
There are a number of operators which have an implied balancing between types of the operands.
The assignment, conditional ( ?\|: ), and relational operators
have this property; the argument
of a \fBreturn\fR statement,
and expressions used in initialization also suffer similar conversions.
In these operations,
\fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed.
The types of pointers must agree exactly,
except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's.
.PP
The type checking rules also require that, in structure references, the
left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR
be a structure, and the right operand of these operators be a member
of the structure implied by the left operand.
Similar checking is done for references to unions.
.PP
Strict rules apply to function argument and return value
matching.
The types \fBfloat\fR and \fBdouble\fR may be freely matched,
as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR.
Also, pointers can be matched with the associated arrays.
Aside from this, all actual arguments must agree in type with their declared counterparts.
.PP
With enumerations, checks are made that enumeration variables or members are not mixed
with other types, or other enumerations,
and that the only operations applied are =, initialization, ==, !=, and function arguments and return values.
.SH
Type Casts
.PP
The type cast feature in C was introduced largely as an aid
to producing more portable programs.
Consider the assignment
.DS
p = 1 ;
.DE
where
.I p
is a character pointer.
.I Lint
will quite rightly complain.
Now, consider the assignment
.DS
p = (char \(**)1 ;
.DE
in which a cast has been used to
convert the integer to a character pointer.
The programmer obviously had a strong motivation
for doing this, and has clearly signaled his intentions.
It seems harsh for
.I lint
to continue to complain about this.
On the other hand, if this code is moved to another
machine, such code should be looked at carefully.
The
.B \-c
flag controls the printing of comments about casts.
When
.B \-c
is in effect, casts are treated as though they were assignments
subject to complaint; otherwise, all legal casts are passed without comment,
no matter how strange the type mixing seems to be.
.SH
Nonportable Character Use
.PP
On the PDP-11, characters are signed quantities, with a range
from \-128 to 127.
On most of the other C implementations, characters take on only positive
values.
Thus,
.I lint
will flag certain comparisons and assignments as being
illegal or nonportable.
For example, the fragment
.DS
char c;
	...
if( (c = getchar(\|)) < 0 ) ....
.DE
works on the PDP-11, but
will fail on machines where characters always take
on positive values.
The real solution is to declare
.I c
an integer, since
.I getchar
is actually returning
integer values.
In any case,
.I lint
will say
``nonportable character comparison''.
.PP
A similar issue arises with bitfields; when assignments
of constant values are made to bitfields, the field may
be too small to hold the value.
This is especially true because
on some machines bitfields are considered as signed
quantities.
While it may seem unintuitive to consider
that a two bit field declared of type
.B int
cannot hold the value 3, the problem disappears
if the bitfield is declared to have type
.B unsigned .
.SH
Assignments of longs to ints
.PP
Bugs may arise from the assignment of
.B long
to
an
.B int ,
which loses accuracy.
This may happen in programs
which have been incompletely converted to use
.B typedefs .
When a
.B typedef
variable
is changed from \fBint\fR to \fBlong\fR,
the program can stop working because
some intermediate results may be assigned
to \fBints\fR, losing accuracy.
Since there are a number of legitimate reasons for
assigning \fBlongs\fR to \fBints\fR, the detection
of these assignments is enabled
by the
.B \-a
flag.
.SH
Strange Constructions
.PP
Several perfectly legal, but somewhat strange, constructions
are flagged by
.I lint ;
the messages hopefully encourage better code quality, clearer style, and
may even point out bugs.
The
.B \-h
flag is used to enable these checks.
For example, in the statement
.DS
\(**p++ ;
.DE
the \(** does nothing; this provokes the message ``null effect'' from
.I lint .
The program fragment
.DS
unsigned x ;
if( x < 0 ) ...
.DE
is clearly somewhat strange; the
test will never succeed.
Similarly, the test
.DS
if( x > 0 ) ...
.DE
is equivalent to
.DS
if( x != 0 )
.DE
which may not be the intended action.
.I Lint
will say ``degenerate unsigned comparison'' in these cases.
If one says
.DS
if( 1 != 0 ) ....
.DE
.I lint
will report
``constant in conditional context'', since the comparison
of 1 with 0 gives a constant result.
.PP
Another construction
detected by
.I lint
involves
operator precedence.
Bugs which arise from misunderstandings about the precedence
of operators can be accentuated by spacing and formatting,
making such bugs extremely hard to find.
For example, the statements
.DS
if( x&077 == 0 ) ...
.DE
or
.DS
x<\h'-.3m'<2 + 40
.DE
probably do not do what was intended.
The best solution is to parenthesize such expressions,
and
.I lint
encourages this by an appropriate message.
.PP
Finally, when the
.B \-h
flag is in force
.I lint
complains about variables which are redeclared in inner blocks
in a way that conflicts with their use in outer blocks.
This is legal, but is considered by many (including the author) to
be bad style, usually unnecessary, and frequently a bug.
.SH
Ancient History
.PP
There are several forms of older syntax which are being officially
discouraged.
These fall into two classes, assignment operators and initialization.
.PP
The older forms of assignment operators (e.g., =+, =\-, . . . )
could cause ambiguous expressions, such as
.DS
a =\-1 ;
.DE
which could be taken as either
.DS
a =\- 1 ;
.DE
or
.DS
a = \-1 ;
.DE
The situation is especially perplexing if this
kind of ambiguity arises as the result of a macro substitution.
The newer, and preferred operators (+=, \-=, etc. )
have no such ambiguities.
To spur the abandonment of the older forms,
.I lint
complains about these old fashioned operators.
.PP
A similar issue arises with initialization.
The older language allowed
.DS
int x \fR1 ;
.DE
to initialize
.I x
to 1.
This also caused syntactic difficulties: for example,
.DS
int x ( \-1 ) ;
.DE
looks somewhat like the beginning of a function declaration:
.DS
int x ( y ) { . . .
.DE
and the compiler must read a fair ways past
.I x
in order to be sure what the declaration really is.
Again, the problem is even more perplexing when the
initializer involves a macro.
The current syntax places an equals sign between the
variable and the initializer:
.DS
int x = \-1 ;
.DE
This is free of any possible syntactic ambiguity.
.SH
Pointer Alignment
.PP
Certain pointer assignments may be reasonable on some machines,
and illegal on others, due entirely to
alignment restrictions.
For example, on the PDP-11, it is reasonable
to assign integer pointers to double pointers, since
double precision values may begin on any integer boundary.
On the Honeywell 6000, double precision values must begin
on even word boundaries;
thus, not all such assignments make sense.
.I Lint
tries to detect cases where pointers are assigned to other
pointers, and such alignment problems might arise.
The message ``possible pointer alignment problem''
results from this situation whenever either the
.B \-p
or
.B \-h
flag is in effect.
.SH
Multiple Uses and Side Effects
.PP
In complicated expressions, the best order in which to evaluate
subexpressions may be highly machine dependent.
For example, on machines (like the PDP-11) in which the stack
runs backwards, function arguments will probably be best evaluated
from right-to-left; on machines with a stack running forward,
left-to-right seems most attractive.
Function calls embedded as arguments of other functions
may or may not be treated similarly to ordinary arguments.
Similar issues arise with other operators which have side effects,
such as the assignment operators and the increment and decrement operators.
.PP
In order that the efficiency of C on a particular machine not be
unduly compromised, the C language leaves the order
of evaluation of complicated expressions up to the
local compiler, and, in fact, the various C compilers have considerable
differences in the order in which they will evaluate complicated
expressions.
In particular, if any variable is changed by a side effect, and
also used elsewhere in the same expression, the result is explicitly undefined.
.PP
.I Lint
checks for the important special case where
a simple scalar variable is affected.
For example, the statement
.DS
\fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ;
.DE
will draw the complaint:
.DS
warning: \fIi\fR evaluation order undefined
.DE
.SH
Implementation
.PP
.I Lint
consists of two programs and a driver.
The first program is a version of the
Portable C Compiler
.[
Johnson Ritchie BSTJ Portability Programs System
.]
.[
Johnson portable compiler 1978
.]
which is the basis of the
IBM 370, Honeywell 6000, and Interdata 8/32 C compilers.
This compiler does lexical and syntax analysis on the input text,
constructs and maintains symbol tables, and builds trees for expressions.
Instead of writing an intermediate file which is passed to
a code generator, as the other compilers
do,
.I lint
produces an intermediate file which consists of lines of ascii text.
Each line contains an external variable name,
an encoding of the context in which it was seen (use, definition, declaration, etc.),
a type specifier, and a source file name and line number.
The information about variables local to a function or file
is collected
by accessing the symbol table, and examining the expression trees.
.PP
Comments about local problems are produced as detected.
The information about external names is collected
onto an intermediate file.
After all the source files and library descriptions have
been collected, the intermediate file is sorted
to bring all information collected about a given external
name together.
The second, rather small, program then reads the lines
from the intermediate file and compares all of the
definitions, declarations, and uses for consistency.
.PP
The driver controls this
process, and is also responsible for making the options available
to both passes of
.I lint .
.SH
Portability
.PP
C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system.
This means that the implementation of C tends to follow local conventions rather than
adhere strictly to
.UX
system conventions.
Despite these differences, many C programs have been successfully moved to GCOS and the various IBM
installations with little effort.
This section describes some of the differences between the implementations, and
discusses the
.I lint
features which encourage portability.
.PP
Uninitialized external variables are treated differently in different
implementations of C.
Suppose two files both contain a declaration without initialization, such as
.DS
int a ;
.DE
outside of any function.
The
.UX
loader will resolve these declarations, and cause only a single word of storage
to be set aside for \fIa\fR.
Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!)
so each such declaration causes a word of storage to be set aside and called \fIa\fR.
When loading or library editing takes place, this causes fatal conflicts which prevent
the proper operation of the program.
If
.I lint
is invoked with the \fB\-p\fR flag,
it will detect such multiple definitions.
.PP
A related difficulty comes from the amount of information retained about external names during the
loading process.
On the
.UX
system, externally known names have seven significant characters, with the upper/lower
case distinction kept.
On the IBM systems, there are eight significant characters, but the case distinction
is lost.
On GCOS, there are only six characters, of a single case.
This leads to situations where programs run on the
.UX
system, but encounter loader
problems on the IBM or GCOS systems.
.I Lint
.B \-p
causes all external symbols to be mapped to one case and truncated to six characters,
providing a worst-case analysis.
.PP
A number of differences arise in the area of character handling: characters in the
.UX
system are eight bit ascii, while they are eight bit ebcdic on the IBM, and
nine bit ascii on GCOS.
Moreover, character strings go from high to low bit positions (``left to right'')
on GCOS and IBM, and low to high (``right to left'') on the PDP-11.
This means that code attempting to construct strings
out of character constants, or attempting to use characters as indices
into arrays, must be looked at with great suspicion.
.I Lint
is of little help here, except to flag multi-character character constants.
.PP
Of course, the word sizes are different!
This causes less trouble than might be expected, at least when
moving from the
.UX
system (16 bit words) to the IBM (32 bits) or GCOS (36 bits).
The main problems are likely to arise in shifting or masking.
C now supports a bit-field facility, which can be used to write much of
this code in a reasonably portable way.
Frequently, portability of such code can be enhanced by
slight rearrangements in coding style.
Many of the incompatibilities seem to have the flavor of writing
.DS
x &= 0177700 ;
.DE
to clear the low order six bits of \fIx\fR.
This suffices on the PDP-11, but fails badly on GCOS and IBM.
If the bit field feature cannot be used, the same effect can be obtained by
writing
.DS
x &= \(ap 077 ;
.DE
which will work on all these machines.
.PP
The right shift operator is arithmetic shift on the PDP-11, and logical shift on most
other machines.
To obtain a logical shift on all machines, the left operand can be
typed \fBunsigned\fR.
Characters are considered signed integers on the PDP-11, and unsigned on the other machines.
This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware
which has infiltrated itself into the C language.
If there were a good way to discover the programs which would be affected, C could be changed;
in any case,
.I lint
is no help here.
.PP
The above discussion may have made the problem of portability seem
bigger than it in fact is.
The issues involved here are rarely subtle or mysterious, at least to the
implementor of the program, although they can involve some work to straighten out.
The most serious bar to the portability of
.UX
system utilities has been the inability to mimic
essential
.UX
system functions on the other systems.
The inability to seek to a random character position in a text file, or to establish a pipe
between processes, has involved far more rewriting
and debugging than any of the differences in C compilers.
On the other hand,
.I lint
has been very helpful
in moving the
.UX
operating system and associated
utility programs to other machines.
.SH
Shutting Lint Up
.PP
There are occasions when
the programmer is smarter than
.I lint .
There may be valid reasons for ``illegal'' type casts,
functions with a variable number of arguments, etc.
Moreover, as specified above, the flow of control information
produced by
.I lint
often has blind spots, causing occasional spurious
messages about perfectly reasonable programs.
Thus, some way of communicating with
.I lint ,
typically to shut it up, is desirable.
.PP
The form which this mechanism should take is not at all clear.
New keywords would require current and old compilers to
recognize these keywords, if only to ignore them.
This has both philosophical and practical problems.
New preprocessor syntax suffers from similar problems.
.PP
What was finally done was to cause a number of words
to be recognized by
.I lint
when they were embedded in comments.
This required minimal preprocessor changes;
the preprocessor just had to agree to pass comments
through to its output, instead of deleting them
as had been previously done.
Thus,
.I lint
directives are invisible to the compilers, and
the effect on systems with the older preprocessors
is merely that the
.I lint
directives don't work.
.PP
The first directive is concerned with flow of control information;
if a particular place in the program cannot be reached,
but this is not apparent to
.I lint ,
this can be asserted by the directive
.DS
/* NOTREACHED */
.DE
at the appropriate spot in the program.
Similarly, if it is desired to turn off
strict type checking for
the next expression, the directive
.DS
/* NOSTRICT */
.DE
can be used; the situation reverts to the
previous default after the next expression.
The
.B \-v
flag can be turned on for one function by the directive
.DS
/* ARGSUSED */
.DE
Complaints about variable number of arguments in calls to a function
can be turned off by the directive
.DS
/* VARARGS */
.DE
preceding the function definition.
In some cases, it is desirable to check the
first several arguments, and leave the later arguments unchecked.
This can be done by following the VARARGS keyword immediately
with a digit giving the number of arguments which should be checked; thus,
.DS
/* VARARGS2 */
.DE
will cause the first two arguments to be checked, the others unchecked.
Finally, the directive
.DS
/* LINTLIBRARY */
.DE
at the head of a file identifies this file as
a library declaration file; this topic is worth a
section by itself.
.SH
Library Declaration Files
.PP
.I Lint
accepts certain library directives, such as
.DS
\-ly
.DE
and tests the source files for compatibility with these libraries.
This is done by accessing library description files whose
names are constructed from the library directives.
These files all begin with the directive
.DS
/* LINTLIBRARY */
.DE
which is followed by a series of dummy function
definitions.
The critical parts of these definitions
are the declaration of the function return type,
whether the dummy function returns a value, and
the number and types of arguments to the function.
The VARARGS and ARGSUSED directives can
be used to specify features of the library functions.
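.PP
A library description file of this kind might look roughly like the following sketch.
The library and its three routines are hypothetical, and the definitions are
written with present-day prototypes rather than in the 1978 style.
The bodies are dummies: only the return type, whether a value is returned,
and the arguments matter.

```c
/* LINTLIBRARY */

/* Returns a value; both arguments are to be checked. */
int mylib_open(char *name, int mode)
{
    return 0;
}

/* Returns no value. */
void mylib_close(int fd)
{
}

/* VARARGS1 */
int mylib_log(char *format, ...)
{
    return 0;
}
```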
.PP
.I Lint
library files are processed almost exactly like ordinary
source files.
The only difference is that functions which are defined on a library file,
but are not used on a source file, draw no complaints.
.I Lint
does not simulate a full library search algorithm,
and complains if the source files contain a redefinition of
a library routine (this is a feature!).
.PP
By default,
.I lint
checks the programs it is given against a standard library
file, which contains descriptions of the programs which
are normally loaded when
a C program
is run.
When the
.B \-p
flag is in effect, another file is checked containing
descriptions of the standard I/O library routines
which are expected to be portable across various machines.
The
.B \-n
flag can be used to suppress all library checking.
.SH
Bugs, etc.
.PP
.I Lint
was a difficult program to write, partially
because it is closely connected with matters of programming style,
and partially because users usually don't notice bugs which cause
.I lint
to miss errors which it should have caught.
(By contrast, if
.I lint
incorrectly complains about something that is correct, the
programmer reports that immediately!)
.PP
A number of areas remain to be further developed.
The checking of structures and arrays is rather inadequate;
size
incompatibilities go unchecked,
and no attempt is made to match up structure and union
declarations across files.
Some stricter checking of the use of the
.B typedef
is clearly desirable, but what checking is appropriate, and how
to carry it out, is still to be determined.
.PP
.I Lint
shares the preprocessor with the C compiler.
At some point it may be appropriate for a
special version of the preprocessor to be constructed
which checks for things such as unused macro definitions,
macro arguments which have side effects which are
not expanded at all, or are expanded more than once, etc.
.PP
The central problem with
.I lint
is the packaging of the information which it collects.
There are many options which
serve only to turn off, or slightly modify,
certain features.
There are pressures to add even more of these options.
.PP
In conclusion, it appears that the general notion of having two
programs is a good one.
The compiler concentrates on quickly and accurately turning the
program text into bits which can be run;
.I lint
concentrates on issues
of portability, style, and efficiency.
.I Lint
can afford to be wrong, since incorrectness and over-conservatism
are merely annoying, not fatal.
The compiler can be fast since it knows that
.I lint
will cover its flanks.
Finally, the programmer can
concentrate at one stage
of the programming process solely on the algorithms,
data structures, and correctness of the
program, and then later retrofit,
with the aid of
.I lint ,
the desirable properties of universality and portability.
.SG MH-1273-SCJ-unix
.bp
.[
$LIST$
.]
.bp
.SH
Appendix: Current Lint Options
.PP
The command currently has the form
.DS
lint\fR [\fB\-\fRoptions ] files... library-descriptors...
.DE
The options are
.IP \fBh\fR
Perform heuristic checks
.IP \fBp\fR
Perform portability checks
.IP \fBv\fR
Don't report unused arguments
.IP \fBu\fR
Don't report unused or undefined externals
.IP \fBb\fR
Report unreachable
.B break
statements.
.IP \fBx\fR
Report unused external declarations
.IP \fBa\fR
Report assignments of
.B long
to
.B int
or shorter.
.IP \fBc\fR
Complain about questionable casts
.IP \fBn\fR
No library checking is done
.IP \fBs\fR
Same as
.B h
(for historical reasons)
1061No library checking is done
1062.IP \fBs\fR
1063Same as
1064.B h
1065(for historical reasons)