| 1 | .RP |
| 2 | .ND "July 26, 1978" |
| 3 | .OK |
| 4 | Program Portability |
| 5 | Strong Type Checking |
| 6 | .TL |
| 7 | Lint, a C Program Checker |
| 8 | .AU "MH 2C-559" 3968 |
| 9 | S. C. Johnson |
| 10 | .AI |
| 11 | .MH |
| 12 | .AB |
| 13 | .PP |
| 14 | .I Lint |
| 15 | is a command which examines C source programs, |
| 16 | detecting |
| 17 | a number of bugs and obscurities. |
| 18 | It enforces the type rules of C more strictly than |
| 19 | the C compilers. |
| 20 | It may also be used to enforce a number of portability |
| 21 | restrictions involved in moving |
| 22 | programs between different machines and/or operating systems. |
| 23 | Another option detects a number of wasteful, or error prone, constructions |
| 24 | which nevertheless are, strictly speaking, legal. |
| 25 | .PP |
| 26 | .I Lint |
| 27 | accepts multiple input files and library specifications, and checks them for consistency. |
| 28 | .PP |
| 29 | The separation of function between |
| 30 | .I lint |
| 31 | and the C compilers has both historical and practical |
| 32 | rationale. |
| 33 | The compilers turn C programs into executable files rapidly |
| 34 | and efficiently. |
| 35 | This is possible in part because the |
| 36 | compilers do not do sophisticated |
| 37 | type checking, especially between |
| 38 | separately compiled programs. |
| 39 | .I Lint |
| 40 | takes a more global, leisurely view of the program, |
| 41 | looking much more carefully at the compatibilities. |
| 42 | .PP |
| 43 | This document discusses the use of |
| 44 | .I lint , |
| 45 | gives an overview of the implementation, and gives some hints on the |
| 46 | writing of machine independent C code. |
| 47 | .AE |
| 48 | .CS 10 2 12 0 0 5 |
| 49 | .SH |
| 50 | Introduction and Usage |
| 51 | .PP |
| 52 | Suppose there are two C |
| 53 | .[ |
| 54 | Kernighan Ritchie Programming Prentice 1978 |
| 55 | .] |
| 56 | source files, |
| 57 | .I file1.c |
| 58 | and |
| 59 | .I file2.c , |
| 60 | which are ordinarily compiled and loaded together. |
| 61 | Then the command |
| 62 | .DS |
| 63 | lint file1.c file2.c |
| 64 | .DE |
| 65 | produces messages describing inconsistencies and inefficiencies |
| 66 | in the programs. |
| 67 | For both historical and practical reasons, |
| 68 | the program enforces the typing rules of C |
| 69 | more strictly than the C compilers |
| 70 | do. |
| 71 | The command |
| 72 | .DS |
| 73 | lint \-p file1.c file2.c |
| 74 | .DE |
| 75 | will produce, in addition to the above messages, further messages |
| 76 | which relate to the portability of the programs to other operating |
| 77 | systems and machines. |
| 78 | Replacing the |
| 79 | .B \-p |
| 80 | by |
| 81 | .B \-h |
| 82 | will produce messages about various error-prone or wasteful constructions |
| 83 | which, strictly speaking, are not bugs. |
| 84 | Saying |
| 85 | .B \-hp |
| 86 | gets the whole works. |
| 87 | .PP |
| 88 | The next several sections describe the major messages; |
| 89 | the document closes with sections |
| 90 | discussing the implementation and giving suggestions |
| 91 | for writing portable C. |
| 92 | An appendix gives a summary of the |
| 93 | .I lint |
| 94 | options. |
| 95 | .SH |
| 96 | A Word About Philosophy |
| 97 | .PP |
| 98 | Many of the facts which |
| 99 | .I lint |
| 100 | needs may be impossible to |
| 101 | discover. |
| 102 | For example, whether a given function in a program ever gets called |
| 103 | may depend on the input data. |
| 104 | Deciding whether |
| 105 | .I exit |
| 106 | is ever called is equivalent to solving the famous ``halting problem,'' known to be |
| 107 | recursively undecidable. |
| 108 | .PP |
| 109 | Thus, most of the |
| 110 | .I lint |
| 111 | algorithms are a compromise. |
| 112 | If a function is never mentioned, it can never be called. |
| 113 | If a function is mentioned, |
| 114 | .I lint |
| 115 | assumes it can be called; this is not necessarily so, but in practice is quite reasonable. |
| 116 | .PP |
| 117 | .I Lint |
| 118 | tries to give information with a high degree of relevance. |
| 119 | Messages of the form ``\fIxxx\fR might be a bug'' |
| 120 | are easy to generate, but are acceptable only in proportion |
| 121 | to the fraction of real bugs they uncover. |
| 122 | If this fraction of real bugs is too small, the messages lose their credibility |
| 123 | and serve merely to clutter up the output, |
| 124 | obscuring the more important messages. |
| 125 | .PP |
| 126 | Keeping these issues in mind, we now consider in more detail |
| 127 | the classes of messages which |
| 128 | .I lint |
| 129 | produces. |
| 130 | .SH |
| 131 | Unused Variables and Functions |
| 132 | .PP |
| 133 | As sets of programs evolve and develop, |
| 134 | previously used variables and arguments to |
| 135 | functions may become unused; |
| 136 | it is not uncommon for external variables, or even entire |
| 137 | functions, to become unnecessary, and yet |
| 138 | not be removed from the source. |
| 139 | These ``errors of commission'' rarely cause working programs to fail, but they are a source |
| 140 | of inefficiency, and make programs harder to understand |
| 141 | and change. |
| 142 | Moreover, information about such unused variables and functions can occasionally |
| 143 | serve to discover bugs; if a function does a necessary job, and |
| 144 | is never called, something is wrong! |
| 145 | .PP |
| 146 | .I Lint |
| 147 | complains about variables and functions which are defined but not otherwise |
| 148 | mentioned. |
| 149 | An exception is variables which are declared through explicit |
| 150 | .B extern |
| 151 | statements but are never referenced; thus the statement |
| 152 | .DS |
| 153 | extern float sin(\|); |
| 154 | .DE |
| 155 | will evoke no comment if |
| 156 | .I sin |
| 157 | is never used. |
| 158 | Note that this agrees with the semantics of the C compiler. |
| 159 | In some cases, these unused external declarations might be of some interest; they |
| 160 | can be discovered by adding the |
| 161 | .B \-x |
| 162 | flag to the |
| 163 | .I lint |
| 164 | invocation. |
| 165 | .PP |
| 166 | Certain styles of programming |
| 167 | require many functions to be written with similar interfaces; |
| 168 | frequently, some of the arguments may be unused |
| 169 | in many of the calls. |
| 170 | The |
| 171 | .B \-v |
| 172 | option is available to suppress the printing of |
| 173 | complaints about unused arguments. |
| 174 | When |
| 175 | .B \-v |
| 176 | is in effect, no messages are produced about unused |
| 177 | arguments except for those |
| 178 | arguments which are unused and also declared as |
| 179 | register arguments; this can be considered |
| 180 | an active (and preventable) waste of the register |
| 181 | resources of the machine. |
| 182 | .PP |
| 183 | There is one case where information about unused, or |
| 184 | undefined, variables is more distracting |
| 185 | than helpful. |
| 186 | This is when |
| 187 | .I lint |
| 188 | is applied to some, but not all, files out of a collection |
| 189 | which are to be loaded together. |
| 190 | In this case, many of the functions and variables defined |
| 191 | may not be used, and, conversely, |
| 192 | many functions and variables defined elsewhere may be used. |
| 193 | The |
| 194 | .B \-u |
| 195 | flag may be used to suppress the spurious messages which might otherwise appear. |
| 196 | .SH |
| 197 | Set/Used Information |
| 198 | .PP |
| 199 | .I Lint |
| 200 | attempts to detect cases where a variable is used before it is set. |
| 201 | This is very difficult to do well; |
| 202 | many algorithms take a good deal of time and space, |
| 203 | and still produce messages about perfectly valid programs. |
| 204 | .I Lint |
| 205 | detects local variables (automatic and register storage classes) |
| 206 | whose first use appears physically earlier in the input file than the first assignment to the variable. |
| 207 | It assumes that taking the address of a variable constitutes a ``use,'' since the actual use |
| 208 | may occur at any later time, in a data dependent fashion. |
| 209 | .PP |
| 210 | The restriction to the physical appearance of variables in the file makes the |
| 211 | algorithm very simple and quick to implement, |
| 212 | since the true flow of control need not be discovered. |
| 213 | It does mean that |
| 214 | .I lint |
| 215 | can complain about some programs which are legal, |
| 216 | but these programs would probably be considered bad on stylistic grounds (e.g., they might |
| 217 | contain at least two \fBgoto\fR's). |
| 218 | Because static and external variables are initialized to 0, |
| 219 | no meaningful information can be discovered about their uses. |
| 220 | The algorithm deals correctly, however, with initialized automatic variables, and variables |
| 221 | which are used in the expression which first sets them. |
| 222 | .PP |
| 223 | The set/used information also permits recognition of those local variables which are set |
| 224 | and never used; these form a frequent source of inefficiencies, and may also be symptomatic of bugs. |
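The two set/used situations described above can be pictured with a small sketch. The function names are hypothetical and the flagged version is excluded from compilation; this illustrates the pattern only, not the wording or exact behavior of any particular lint.

```c
/* Sketch only; sum_bad and sum_ok are hypothetical names. */

#if 0   /* not compiled: the shape that a set/used check would flag */
int sum_bad(int *v, int n)
{
    int i, total;        /* total: its first use below precedes any set */
    int last;            /* last: set in the loop but never used */
    for (i = 0; i < n; i++) {
        total += v[i];
        last = v[i];
    }
    return total;
}
#endif

/* Repaired version: total is initialized and the unused local is gone. */
int sum_ok(int *v, int n)
{
    int i;
    int total = 0;
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

Initializing the automatic variable at its declaration places the first set physically ahead of every use, which is the case the text says the algorithm handles correctly.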
| 225 | .SH |
| 226 | Flow of Control |
| 227 | .PP |
| 228 | .I Lint |
| 229 | attempts to detect unreachable portions of the programs which it processes. |
| 230 | It will complain about unlabeled statements immediately following |
| 231 | \fBgoto\fR, \fBbreak\fR, \fBcontinue\fR, or \fBreturn\fR statements. |
| 232 | An attempt is made to detect loops which can never be left at the bottom, detecting the |
| 233 | special cases |
| 234 | \fBwhile\fR( 1 ) and \fBfor\fR(;;) as infinite loops. |
| 235 | .I Lint |
| 236 | also complains about loops which cannot be entered at the top; |
| 237 | some valid programs may have such loops, but at best they are bad style, |
| 238 | at worst bugs. |
| 239 | .PP |
| 240 | .I Lint |
| 241 | has an important area of blindness in the flow of control algorithm: |
| 242 | it has no way of detecting functions which are called and never return. |
| 243 | Thus, a call to |
| 244 | .I exit |
| 245 | may cause unreachable code which |
| 246 | .I lint |
| 247 | does not detect; the most serious effects of this are in the |
| 248 | determination of returned function values (see the next section). |
| 249 | .PP |
| 250 | One form of unreachable statement is not usually complained about by |
| 251 | .I lint ; |
| 252 | a |
| 253 | .B break |
| 254 | statement that cannot be reached causes no message. |
| 255 | Programs generated by |
| 256 | .I yacc , |
| 257 | .[ |
| 258 | Johnson Yacc 1975 |
| 259 | .] |
| 260 | and especially |
| 261 | .I lex , |
| 262 | .[ |
| 263 | Lesk Lex |
| 264 | .] |
| 265 | may have literally hundreds of unreachable |
| 266 | .B break |
| 267 | statements. |
| 268 | The |
| 269 | .B \-O |
| 270 | flag in the C compiler will often eliminate the resulting object code inefficiency. |
| 271 | Thus, these unreached statements are of little importance, |
| 272 | there is typically nothing the user can do about them, and the |
| 273 | resulting messages would clutter up the |
| 274 | .I lint |
| 275 | output. |
| 276 | If these messages are desired, |
| 277 | .I lint |
| 278 | can be invoked with the |
| 279 | .B \-b |
| 280 | option. |
| 281 | .SH |
| 282 | Function Values |
| 283 | .PP |
| 284 | Sometimes functions return values which are never used; |
| 285 | sometimes programs incorrectly use function ``values'' |
| 286 | which have never been returned. |
| 287 | .I Lint |
| 288 | addresses this problem in a number of ways. |
| 289 | .PP |
| 290 | Locally, within a function definition, |
| 291 | the appearance of both |
| 292 | .DS |
| 293 | return( \fIexpr\fR ); |
| 294 | .DE |
| 295 | and |
| 296 | .DS |
| 297 | return ; |
| 298 | .DE |
| 299 | statements is cause for alarm; |
| 300 | .I lint |
| 301 | will give the message |
| 302 | .DS |
| 303 | function \fIname\fR contains return(e) and return |
| 304 | .DE |
| 305 | The most serious difficulty with this is detecting when a function return is implied |
| 306 | by flow of control reaching the end of the function. |
| 307 | This can be seen with a simple example: |
| 308 | .DS |
| 309 | .ta .5i 1i 1.5i |
| 310 | \fRf ( a ) { |
| 311 | if ( a ) return ( 3 ); |
| 312 | g (\|); |
| 313 | } |
| 314 | .DE |
| 315 | Notice that, if \fIa\fR tests false, \fIf\fR will call \fIg\fR and then return |
| 316 | with no defined return value; this will trigger a complaint from |
| 317 | .I lint . |
| 318 | If \fIg\fR, like \fIexit\fR, never returns, |
| 319 | the message will still be produced when in fact nothing is wrong. |
| 320 | .PP |
| 321 | In practice, some potentially serious bugs have been discovered by this feature; |
| 322 | it also accounts for a substantial fraction of the ``noise'' messages produced |
| 323 | by |
| 324 | .I lint . |
| 325 | .PP |
| 326 | On a global scale, |
| 327 | .I lint |
| 328 | detects cases where a function returns a value, but this value is sometimes, |
| 329 | or always, unused. |
| 330 | When the value is always unused, it may constitute an inefficiency in the function definition. |
| 331 | When the value is sometimes unused, it may represent bad style (e.g., not testing for |
| 332 | error conditions). |
| 333 | .PP |
| 334 | The dual problem, using a function value when the function does not return one, |
| 335 | is also detected. |
| 336 | This is a serious problem. |
| 337 | Amazingly, this bug has been observed on a couple of occasions |
| 338 | in ``working'' programs; the desired function value just happened to have been computed |
| 339 | in the function return register! |
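One way to avoid both the local and the global complaint for a function like \fIf\fR above is to return a value on every path. The following is a minimal sketch; the name classify and the choice of 0 for the fall-through case are assumptions made for illustration.

```c
/* Sketch only; classify is a hypothetical name. */

/* Every path returns a value, so a caller that uses the result
   never picks up whatever happened to be in the return register. */
int classify(int a)
{
    if (a)
        return 3;
    return 0;   /* explicit value instead of falling off the end */
}
```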
| 340 | .SH |
| 341 | Type Checking |
| 342 | .PP |
| 343 | .I Lint |
| 344 | enforces the type checking rules of C more strictly than the compilers do. |
| 345 | The additional checking is in four major areas: |
| 346 | across certain binary operators and implied assignments, |
| 347 | at the structure selection operators, |
| 348 | between the definition and uses of functions, |
| 349 | and in the use of enumerations. |
| 350 | .PP |
| 351 | There are a number of operators which have an implied balancing between types of the operands. |
| 352 | The assignment, conditional ( ?\|: ), and relational operators |
| 353 | have this property; the argument |
| 354 | of a \fBreturn\fR statement, |
| 355 | and expressions used in initialization also suffer similar conversions. |
| 356 | In these operations, |
| 357 | \fBchar\fR, \fBshort\fR, \fBint\fR, \fBlong\fR, \fBunsigned\fR, \fBfloat\fR, and \fBdouble\fR types may be freely intermixed. |
| 358 | The types of pointers must agree exactly, |
| 359 | except that arrays of \fIx\fR's can, of course, be intermixed with pointers to \fIx\fR's. |
| 360 | .PP |
| 361 | The type checking rules also require that, in structure references, the |
| 362 | left operand of the \(em> be a pointer to structure, the left operand of the \fB.\fR |
| 363 | be a structure, and the right operand of these operators be a member |
| 364 | of the structure implied by the left operand. |
| 365 | Similar checking is done for references to unions. |
| 366 | .PP |
| 367 | Strict rules apply to function argument and return value |
| 368 | matching. |
| 369 | The types \fBfloat\fR and \fBdouble\fR may be freely matched, |
| 370 | as may the types \fBchar\fR, \fBshort\fR, \fBint\fR, and \fBunsigned\fR. |
| 371 | Also, pointers can be matched with the associated arrays. |
| 372 | Aside from this, all actual arguments must agree in type with their declared counterparts. |
| 373 | .PP |
| 374 | With enumerations, checks are made that enumeration variables or members are not mixed |
| 375 | with other types, or other enumerations, |
| 376 | and that the only operations applied are =, initialization, ==, !=, and function arguments and return values. |
| 377 | .SH |
| 378 | Type Casts |
| 379 | .PP |
| 380 | The type cast feature in C was introduced largely as an aid |
| 381 | to producing more portable programs. |
| 382 | Consider the assignment |
| 383 | .DS |
| 384 | p = 1 ; |
| 385 | .DE |
| 386 | where |
| 387 | .I p |
| 388 | is a character pointer. |
| 389 | .I Lint |
| 390 | will quite rightly complain. |
| 391 | Now, consider the assignment |
| 392 | .DS |
| 393 | p = (char \(**)1 ; |
| 394 | .DE |
| 395 | in which a cast has been used to |
| 396 | convert the integer to a character pointer. |
| 397 | The programmer obviously had a strong motivation |
| 398 | for doing this, and has clearly signaled his intentions. |
| 399 | It seems harsh for |
| 400 | .I lint |
| 401 | to continue to complain about this. |
| 402 | On the other hand, if this code is moved to another |
| 403 | machine, such code should be looked at carefully. |
| 404 | The |
| 405 | .B \-c |
| 406 | flag controls the printing of comments about casts. |
| 407 | When |
| 408 | .B \-c |
| 409 | is in effect, casts are treated as though they were assignments |
| 410 | subject to complaint; otherwise, all legal casts are passed without comment, |
| 411 | no matter how strange the type mixing seems to be. |
| 412 | .SH |
| 413 | Nonportable Character Use |
| 414 | .PP |
| 415 | On the PDP-11, characters are signed quantities, with a range |
| 416 | from \-128 to 127. |
| 417 | On most of the other C implementations, characters take on only positive |
| 418 | values. |
| 419 | Thus, |
| 420 | .I lint |
| 421 | will flag certain comparisons and assignments as being |
| 422 | illegal or nonportable. |
| 423 | For example, the fragment |
| 424 | .DS |
| 425 | char c; |
| 426 | ... |
| 427 | if( (c = getchar(\|)) < 0 ) .... |
| 428 | .DE |
| 429 | works on the PDP-11, but |
| 430 | will fail on machines where characters always take |
| 431 | on positive values. |
| 432 | The real solution is to declare |
| 433 | .I c |
| 434 | an integer, since |
| 435 | .I getchar |
| 436 | is actually returning |
| 437 | integer values. |
| 438 | In any case, |
| 439 | .I lint |
| 440 | will say |
| 441 | ``nonportable character comparison''. |
| 442 | .PP |
| 443 | A similar issue arises with bitfields; when assignments |
| 444 | of constant values are made to bitfields, the field may |
| 445 | be too small to hold the value. |
| 446 | This is especially true because |
| 447 | on some machines bitfields are considered as signed |
| 448 | quantities. |
| 449 | While it may seem unintuitive to consider |
| 450 | that a two bit field declared of type |
| 451 | .B int |
| 452 | cannot hold the value 3, the problem disappears |
| 453 | if the bitfield is declared to have type |
| 454 | .B unsigned . |
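The bitfield remark can be made concrete with a short sketch. The structure and function names are hypothetical; the point is only that a two bit field declared unsigned holds 0 through 3 on every machine, while the range of a two bit int field depends on the implementation.

```c
/* Sketch only; all names here are hypothetical. */

struct plain_field    { int      v : 2; };  /* may be signed: range could be -2..1 */
struct unsigned_field { unsigned v : 2; };  /* always holds 0..3 */

/* Store x in a two bit unsigned field and read it back.
   Values too large for the field are reduced modulo 4. */
unsigned store_unsigned2(unsigned x)
{
    struct unsigned_field f;
    f.v = x;
    return f.v;
}
```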
| 455 | .SH |
| 456 | Assignments of longs to ints |
| 457 | .PP |
| 458 | Bugs may arise from the assignment of |
| 459 | .B long |
| 460 | to |
| 461 | an |
| 462 | .B int , |
| 463 | which loses accuracy. |
| 464 | This may happen in programs |
| 465 | which have been incompletely converted to use |
| 466 | .B typedefs . |
| 467 | When a |
| 468 | .B typedef |
| 469 | variable |
| 470 | is changed from \fBint\fR to \fBlong\fR, |
| 471 | the program can stop working because |
| 472 | some intermediate results may be assigned |
| 473 | to \fBints\fR, losing accuracy. |
| 474 | Since there are a number of legitimate reasons for |
| 475 | assigning \fBlongs\fR to \fBints\fR, the detection |
| 476 | of these assignments is enabled |
| 477 | by the |
| 478 | .B \-a |
| 479 | flag. |
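Where a long really must be stored in an int, the loss of accuracy can be made explicit by checking the range first. This is a sketch under stated assumptions: fits_in_int and narrow_or are hypothetical helpers, and the fallback policy is only one of several reasonable choices.

```c
#include <limits.h>

/* Sketch only; both helpers are hypothetical. */

/* Returns 1 when v can be stored in an int without losing information. */
int fits_in_int(long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Narrow v to int, substituting fallback when it does not fit. */
int narrow_or(long v, int fallback)
{
    return fits_in_int(v) ? (int)v : fallback;
}
```

On machines where int and long have the same width the check always succeeds; on the others it turns a silent truncation into a visible decision.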
| 480 | .SH |
| 481 | Strange Constructions |
| 482 | .PP |
| 483 | Several perfectly legal, but somewhat strange, constructions |
| 484 | are flagged by |
| 485 | .I lint ; |
| 486 | the messages hopefully encourage better code quality and clearer style, and |
| 487 | may even point out bugs. |
| 488 | The |
| 489 | .B \-h |
| 490 | flag is used to enable these checks. |
| 491 | For example, in the statement |
| 492 | .DS |
| 493 | \(**p++ ; |
| 494 | .DE |
| 495 | the \(** does nothing; this provokes the message ``null effect'' from |
| 496 | .I lint . |
| 497 | The program fragment |
| 498 | .DS |
| 499 | unsigned x ; |
| 500 | if( x < 0 ) ... |
| 501 | .DE |
| 502 | is clearly somewhat strange; the |
| 503 | test will never succeed. |
| 504 | Similarly, the test |
| 505 | .DS |
| 506 | if( x > 0 ) ... |
| 507 | .DE |
| 508 | is equivalent to |
| 509 | .DS |
| 510 | if( x != 0 ) |
| 511 | .DE |
| 512 | which may not be the intended action. |
| 513 | .I Lint |
| 514 | will say ``degenerate unsigned comparison'' in these cases. |
| 515 | If one says |
| 516 | .DS |
| 517 | if( 1 != 0 ) .... |
| 518 | .DE |
| 519 | .I lint |
| 520 | will report |
| 521 | ``constant in conditional context'', since the comparison |
| 522 | of 1 with 0 gives a constant result. |
| 523 | .PP |
| 524 | Another construction |
| 525 | detected by |
| 526 | .I lint |
| 527 | involves |
| 528 | operator precedence. |
| 529 | Bugs which arise from misunderstandings about the precedence |
| 530 | of operators can be accentuated by spacing and formatting, |
| 531 | making such bugs extremely hard to find. |
| 532 | For example, the statements |
| 533 | .DS |
| 534 | if( x&077 == 0 ) ... |
| 535 | .DE |
| 536 | or |
| 537 | .DS |
| 538 | x<\h'-.3m'<2 + 40 |
| 539 | .DE |
| 540 | probably do not do what was intended. |
| 541 | The best solution is to parenthesize such expressions, |
| 542 | and |
| 543 | .I lint |
| 544 | encourages this by an appropriate message. |
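The precedence examples above can be spelled out with explicit parentheses. In C, == binds more tightly than &, and + binds more tightly than <<. The function names below are hypothetical, and the first function is written with the grouping the compiler actually applies to the unparenthesized test.

```c
/* Sketch only; all function names are hypothetical. */

/* How a C compiler groups  x&077 == 0 :  the comparison is done first,
   077 == 0 is 0, and x & 0 is 0 for every x. */
int mask_test_as_parsed(int x)
{
    return x & (077 == 0);
}

/* What was almost certainly meant: are the low six bits all zero? */
int mask_test_as_intended(int x)
{
    return (x & 077) == 0;
}

/* Likewise  x<<2 + 40  groups as x << (2 + 40); this is the likely intent. */
int shift_then_add(int x)
{
    return (x << 2) + 40;
}
```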
| 545 | .PP |
| 546 | Finally, when the |
| 547 | .B \-h |
| 548 | flag is in force |
| 549 | .I lint |
| 550 | complains about variables which are redeclared in inner blocks |
| 551 | in a way that conflicts with their use in outer blocks. |
| 552 | This is legal, but is considered by many (including the author) to |
| 553 | be bad style, usually unnecessary, and frequently a bug. |
| 554 | .SH |
| 555 | Ancient History |
| 556 | .PP |
| 557 | There are several forms of older syntax which are being officially |
| 558 | discouraged. |
| 559 | These fall into two classes, assignment operators and initialization. |
| 560 | .PP |
| 561 | The older forms of assignment operators (e.g., =+, =\-, . . . ) |
| 562 | could cause ambiguous expressions, such as |
| 563 | .DS |
| 564 | a =\-1 ; |
| 565 | .DE |
| 566 | which could be taken as either |
| 567 | .DS |
| 568 | a =\- 1 ; |
| 569 | .DE |
| 570 | or |
| 571 | .DS |
| 572 | a = \-1 ; |
| 573 | .DE |
| 574 | The situation is especially perplexing if this |
| 575 | kind of ambiguity arises as the result of a macro substitution. |
| 576 | The newer, and preferred operators (+=, \-=, etc. ) |
| 577 | have no such ambiguities. |
| 578 | To spur the abandonment of the older forms, |
| 579 | .I lint |
| 580 | complains about these old fashioned operators. |
| 581 | .PP |
| 582 | A similar issue arises with initialization. |
| 583 | The older language allowed |
| 584 | .DS |
| 585 | int x \fR1 ; |
| 586 | .DE |
| 587 | to initialize |
| 588 | .I x |
| 589 | to 1. |
| 590 | This also caused syntactic difficulties: for example, |
| 591 | .DS |
| 592 | int x ( \-1 ) ; |
| 593 | .DE |
| 594 | looks somewhat like the beginning of a function declaration: |
| 595 | .DS |
| 596 | int x ( y ) { . . . |
| 597 | .DE |
| 598 | and the compiler must read a fair ways past |
| 599 | .I x |
| 600 | in order to be sure what the declaration really is. |
| 601 | Again, the problem is even more perplexing when the |
| 602 | initializer involves a macro. |
| 603 | The current syntax places an equals sign between the |
| 604 | variable and the initializer: |
| 605 | .DS |
| 606 | int x = \-1 ; |
| 607 | .DE |
| 608 | This is free of any possible syntactic ambiguity. |
| 609 | .SH |
| 610 | Pointer Alignment |
| 611 | .PP |
| 612 | Certain pointer assignments may be reasonable on some machines, |
| 613 | and illegal on others, due entirely to |
| 614 | alignment restrictions. |
| 615 | For example, on the PDP-11, it is reasonable |
| 616 | to assign integer pointers to double pointers, since |
| 617 | double precision values may begin on any integer boundary. |
| 618 | On the Honeywell 6000, double precision values must begin |
| 619 | on even word boundaries; |
| 620 | thus, not all such assignments make sense. |
| 621 | .I Lint |
| 622 | tries to detect cases where pointers are assigned to other |
| 623 | pointers, and such alignment problems might arise. |
| 624 | The message ``possible pointer alignment problem'' |
| 625 | results from this situation whenever either the |
| 626 | .B \-p |
| 627 | or |
| 628 | .B \-h |
| 629 | flag is in effect. |
| 630 | .SH |
| 631 | Multiple Uses and Side Effects |
| 632 | .PP |
| 633 | In complicated expressions, the best order in which to evaluate |
| 634 | subexpressions may be highly machine dependent. |
| 635 | For example, on machines (like the PDP-11) in which the stack |
| 636 | runs backwards, function arguments will probably be best evaluated |
| 637 | from right-to-left; on machines with a stack running forward, |
| 638 | left-to-right seems most attractive. |
| 639 | Function calls embedded as arguments of other functions |
| 640 | may or may not be treated similarly to ordinary arguments. |
| 641 | Similar issues arise with other operators which have side effects, |
| 642 | such as the assignment operators and the increment and decrement operators. |
| 643 | .PP |
| 644 | In order that the efficiency of C on a particular machine not be |
| 645 | unduly compromised, the C language leaves the order |
| 646 | of evaluation of complicated expressions up to the |
| 647 | local compiler, and, in fact, the various C compilers have considerable |
| 648 | differences in the order in which they will evaluate complicated |
| 649 | expressions. |
| 650 | In particular, if any variable is changed by a side effect, and |
| 651 | also used elsewhere in the same expression, the result is explicitly undefined. |
| 652 | .PP |
| 653 | .I Lint |
| 654 | checks for the important special case where |
| 655 | a simple scalar variable is affected. |
| 656 | For example, the statement |
| 657 | .DS |
| 658 | \fIa\fR[\fIi\|\fR] = \fIb\fR[\fIi\fR++] ; |
| 659 | .DE |
| 660 | will draw the complaint: |
| 661 | .DS |
| 662 | warning: \fIi\fR evaluation order undefined |
| 663 | .DE |
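A statement of this kind can be split so that the side effect is sequenced explicitly. The sketch below assumes the intended meaning was to copy the element at the current index and then advance; if the store was meant to use the incremented index, the two statements would be arranged differently. The helper name is hypothetical.

```c
/* Sketch only; copy_and_advance is a hypothetical helper. */

/* Copy b[*i] into a[*i], then advance *i.
   The index is read in one statement and changed in the next,
   so the result does not depend on the compiler's evaluation order. */
void copy_and_advance(int *a, const int *b, int *i)
{
    a[*i] = b[*i];
    (*i)++;
}
```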
| 664 | .SH |
| 665 | Implementation |
| 666 | .PP |
| 667 | .I Lint |
| 668 | consists of two programs and a driver. |
| 669 | The first program is a version of the |
| 670 | Portable C Compiler |
| 671 | .[ |
| 672 | Johnson Ritchie BSTJ Portability Programs System |
| 673 | .] |
| 674 | .[ |
| 675 | Johnson portable compiler 1978 |
| 676 | .] |
| 677 | which is the basis of the |
| 678 | IBM 370, Honeywell 6000, and Interdata 8/32 C compilers. |
| 679 | This compiler does lexical and syntax analysis on the input text, |
| 680 | constructs and maintains symbol tables, and builds trees for expressions. |
| 681 | Instead of writing an intermediate file which is passed to |
| 682 | a code generator, as the other compilers |
| 683 | do, |
| 684 | .I lint |
| 685 | produces an intermediate file which consists of lines of ascii text. |
| 686 | Each line contains an external variable name, |
| 687 | an encoding of the context in which it was seen (use, definition, declaration, etc.), |
| 688 | a type specifier, and a source file name and line number. |
| 689 | The information about variables local to a function or file |
| 690 | is collected |
| 691 | by accessing the symbol table, and examining the expression trees. |
| 692 | .PP |
| 693 | Comments about local problems are produced as detected. |
| 694 | The information about external names is collected |
| 695 | onto an intermediate file. |
| 696 | After all the source files and library descriptions have |
| 697 | been collected, the intermediate file is sorted |
| 698 | to bring all information collected about a given external |
| 699 | name together. |
| 700 | The second, rather small, program then reads the lines |
| 701 | from the intermediate file and compares all of the |
| 702 | definitions, declarations, and uses for consistency. |
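The sort-then-compare idea of the second pass can be sketched in a few lines. This is not lint's actual record layout or comparison logic: the structure, its field names, and the rule of comparing type specifiers as text are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; not the real intermediate file format. */
struct ext_rec {
    const char *name;   /* external name */
    const char *type;   /* type specifier, kept as text */
    const char *file;   /* source file where it was seen */
    int line;           /* line number in that file */
};

/* Order records by external name so related entries become adjacent. */
static int by_name(const void *p, const void *q)
{
    const struct ext_rec *a = p;
    const struct ext_rec *b = q;
    return strcmp(a->name, b->name);
}

/* Sort the records, then count adjacent pairs that share a name
   but disagree in type. */
int count_type_conflicts(struct ext_rec *recs, size_t n)
{
    size_t i;
    int conflicts = 0;

    qsort(recs, n, sizeof recs[0], by_name);
    for (i = 1; i < n; i++)
        if (strcmp(recs[i].name, recs[i - 1].name) == 0 &&
            strcmp(recs[i].type, recs[i - 1].type) != 0)
            conflicts++;
    return conflicts;
}
```

Sorting first means the consistency check needs only one linear scan over neighbouring records, which fits the description of the second program as rather small.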
| 703 | .PP |
| 704 | The driver controls this |
| 705 | process, and is also responsible for making the options available |
| 706 | to both passes of |
| 707 | .I lint . |
| 708 | .SH |
| 709 | Portability |
| 710 | .PP |
| 711 | C on the Honeywell and IBM systems is used, in part, to write system code for the host operating system. |
| 712 | This means that the implementation of C tends to follow local conventions rather than |
| 713 | adhere strictly to |
| 714 | .UX |
| 715 | system conventions. |
| 716 | Despite these differences, many C programs have been successfully moved to GCOS and the various IBM |
| 717 | installations with little effort. |
| 718 | This section describes some of the differences between the implementations, and |
| 719 | discusses the |
| 720 | .I lint |
| 721 | features which encourage portability. |
| 722 | .PP |
| 723 | Uninitialized external variables are treated differently in different |
| 724 | implementations of C. |
| 725 | Suppose two files both contain a declaration without initialization, such as |
| 726 | .DS |
| 727 | int a ; |
| 728 | .DE |
| 729 | outside of any function. |
| 730 | The |
| 731 | .UX |
| 732 | loader will resolve these declarations, and cause only a single word of storage |
| 733 | to be set aside for \fIa\fR. |
| 734 | Under the GCOS and IBM implementations, this is not feasible (for various stupid reasons!) |
| 735 | so each such declaration causes a word of storage to be set aside and called \fIa\fR. |
| 736 | When loading or library editing takes place, this causes fatal conflicts which prevent |
| 737 | the proper operation of the program. |
| 738 | If |
| 739 | .I lint |
| 740 | is invoked with the \fB\-p\fR flag, |
| 741 | it will detect such multiple definitions. |
| 742 | .PP |
| 743 | A related difficulty comes from the amount of information retained about external names during the |
| 744 | loading process. |
| 745 | On the |
| 746 | .UX |
| 747 | system, externally known names have seven significant characters, with the upper/lower |
| 748 | case distinction kept. |
| 749 | On the IBM systems, there are eight significant characters, but the case distinction |
| 750 | is lost. |
| 751 | On GCOS, there are only six characters, of a single case. |
| 752 | This leads to situations where programs run on the |
| 753 | .UX |
| 754 | system, but encounter loader |
| 755 | problems on the IBM or GCOS systems. |
| 756 | .I Lint |
| 757 | .B \-p |
| 758 | causes all external symbols to be mapped to one case and truncated to six characters, |
| 759 | providing a worst-case analysis. |
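The worst-case mapping described above amounts to comparing names after folding case and keeping only the first six characters. The following sketch shows that comparison; names_collide and SIGNIFICANT are hypothetical, and the code makes no claim to match how lint itself performs the check.

```c
#include <ctype.h>

#define SIGNIFICANT 6   /* assumed worst case: six characters, one case */

/* Returns 1 if two external names become identical once case is folded
   and only the first SIGNIFICANT characters are kept. */
int names_collide(const char *a, const char *b)
{
    int i;

    for (i = 0; i < SIGNIFICANT; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return 0;       /* differ within the significant part */
        if (ca == '\0')
            return 1;       /* both names ended together */
    }
    return 1;               /* first six characters agree */
}
```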
| 760 | .PP |
| 761 | A number of differences arise in the area of character handling: characters in the |
| 762 | .UX |
| 763 | system are eight bit ascii, while they are eight bit ebcdic on the IBM, and |
| 764 | nine bit ascii on GCOS. |
| 765 | Moreover, character strings go from high to low bit positions (``left to right'') |
| 766 | on GCOS and IBM, and low to high (``right to left'') on the PDP-11. |
| 767 | This means that code attempting to construct strings |
| 768 | out of character constants, or attempting to use characters as indices |
| 769 | into arrays, must be looked at with great suspicion. |
| 770 | .I Lint |
| 771 | is of little help here, except to flag multi-character character constants. |
| 772 | .PP |
| 773 | Of course, the word sizes are different! |
| 774 | This causes less trouble than might be expected, at least when |
| 775 | moving from the |
| 776 | .UX |
| 777 | system (16 bit words) to the IBM (32 bits) or GCOS (36 bits). |
| 778 | The main problems are likely to arise in shifting or masking. |
| 779 | C now supports a bit-field facility, which can be used to write much of |
| 780 | this code in a reasonably portable way. |
| 781 | Frequently, portability of such code can be enhanced by |
| 782 | slight rearrangements in coding style. |
| 783 | Many of the incompatibilities seem to have the flavor of writing |
| 784 | .DS |
| 785 | x &= 0177700 ; |
| 786 | .DE |
| 787 | to clear the low order six bits of \fIx\fR. |
| 788 | This suffices on the PDP-11, but fails badly on GCOS and IBM. |
| 789 | If the bit field feature cannot be used, the same effect can be obtained by |
| 790 | writing |
| 791 | .DS |
| 792 | x &= \(ap 077 ; |
| 793 | .DE |
| 794 | which will work on all these machines. |
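.PP
The difference between the two masks can be seen in a short sketch, written here in present-day C with \fBunsigned long\fR standing in for a word wider than 16 bits.
The function names are invented for the example.

```c
/*
 * Portable form: the complement of 077 has one bits in every position
 * above the low six, whatever the width of the operand.
 */
unsigned long clear_low6_portable(unsigned long x)
{
    return x & ~077UL;
}

/*
 * PDP-11 form: the constant 0177700 assumes a 16-bit word, so on a
 * wider word every bit above bit 15 is cleared as well.
 */
unsigned long clear_low6_pdp11(unsigned long x)
{
    return x & 0177700UL;
}
```

Applied to the value 0x12345677, the first function yields 0x12345640, while the second yields 0x5640, losing the high order bits.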
| 795 | .PP |
| 796 | The right shift operator is arithmetic shift on the PDP-11, and logical shift on most |
| 797 | other machines. |
| 798 | To obtain a logical shift on all machines, the left operand can be |
| 799 | typed \fBunsigned\fR. |
| 800 | Characters are considered signed integers on the PDP-11, and unsigned on the other machines. |
| 801 | This persistence of the sign bit may be reasonably considered a bug in the PDP-11 hardware |
| 802 | which has infiltrated itself into the C language. |
| 803 | If there were a good way to discover the programs which would be affected, C could be changed; |
| 804 | in any case, |
| 805 | .I lint |
| 806 | is no help here. |
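.PP
A defensive style for both problems might look like the following sketch.
It is written in present-day C and assumes eight bit characters; the \fBunsigned char\fR cast is a later language feature, and masking with 0377 is the equivalent that relies only on the operators discussed here.
The function names are invented for the example.

```c
/*
 * Logical right shift: because the left operand is unsigned,
 * vacated high order bits are filled with zeros on every machine.
 */
unsigned int shift_right_logical(unsigned int x, int n)
{
    return x >> n;
}

/*
 * Convert a character to a non-negative value suitable for use as an
 * array index, whether or not plain char is signed on the host.
 * With eight bit characters, (c & 0377) gives the same result.
 */
int char_index(char c)
{
    return (unsigned char)c;
}
```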
| 807 | .PP |
| 808 | The above discussion may have made the problem of portability seem |
| 809 | bigger than it in fact is. |
| 810 | The issues involved here are rarely subtle or mysterious, at least to the |
| 811 | implementor of the program, although they can involve some work to straighten out. |
| 812 | The most serious bar to the portability of |
| 813 | .UX |
| 814 | system utilities has been the inability to mimic |
| 815 | essential |
| 816 | .UX |
| 817 | system functions on the other systems. |
| 818 | The inability to seek to a random character position in a text file, or to establish a pipe |
| 819 | between processes, has involved far more rewriting |
| 820 | and debugging than any of the differences in C compilers. |
| 821 | On the other hand, |
| 822 | .I lint |
| 823 | has been very helpful |
| 824 | in moving the |
| 825 | .UX |
| 826 | operating system and associated |
| 827 | utility programs to other machines. |
| 828 | .SH |
| 829 | Shutting Lint Up |
| 830 | .PP |
| 831 | There are occasions when |
| 832 | the programmer is smarter than |
| 833 | .I lint . |
| 834 | There may be valid reasons for ``illegal'' type casts, |
| 835 | functions with a variable number of arguments, etc. |
| 836 | Moreover, as specified above, the flow of control information |
| 837 | produced by |
| 838 | .I lint |
| 839 | often has blind spots, causing occasional spurious |
| 840 | messages about perfectly reasonable programs. |
| 841 | Thus, some way of communicating with |
| 842 | .I lint , |
| 843 | typically to shut it up, is desirable. |
| 844 | .PP |
| 845 | The form which this mechanism should take is not at all clear. |
| 846 | New keywords would require current and old compilers to |
| 847 | recognize these keywords, if only to ignore them. |
| 848 | This has both philosophical and practical problems. |
| 849 | New preprocessor syntax suffers from similar problems. |
| 850 | .PP |
| 851 | What was finally done was to cause a number of words |
| 852 | to be recognized by |
| 853 | .I lint |
| 854 | when they were embedded in comments. |
| 855 | This required minimal preprocessor changes; |
| 856 | the preprocessor just had to agree to pass comments |
| 857 | through to its output, instead of deleting them |
| 858 | as had been previously done. |
| 859 | Thus, |
| 860 | .I lint |
| 861 | directives are invisible to the compilers, and |
| 862 | the effect on systems with the older preprocessors |
| 863 | is merely that the |
| 864 | .I lint |
| 865 | directives don't work. |
| 866 | .PP |
| 867 | The first directive is concerned with flow of control information; |
| 868 | if a particular place in the program cannot be reached, |
| 869 | but this is not apparent to |
| 870 | .I lint , |
| 871 | this can be asserted by the directive |
| 872 | .DS |
| 873 | /* NOTREACHED */ |
| 874 | .DE |
| 875 | at the appropriate spot in the program. |
| 876 | Similarly, if it is desired to turn off |
| 877 | strict type checking for |
| 878 | the next expression, the directive |
| 879 | .DS |
| 880 | /* NOSTRICT */ |
| 881 | .DE |
| 882 | can be used; the situation reverts to the |
| 883 | previous default after the next expression. |
| 884 | The |
| 885 | .B \-v |
| 886 | flag can be turned on for one function by the directive |
| 887 | .DS |
| 888 | /* ARGSUSED */ |
| 889 | .DE |
| 890 | Complaints about a variable number of arguments in calls to a function |
| 891 | can be turned off by the directive |
| 892 | .DS |
| 893 | /* VARARGS */ |
| 894 | .DE |
| 895 | preceding the function definition. |
| 896 | In some cases, it is desirable to check the |
| 897 | first several arguments, and leave the later arguments unchecked. |
| 898 | This can be done by following the VARARGS keyword immediately |
| 899 | with a digit giving the number of arguments which should be checked; thus, |
| 900 | .DS |
| 901 | /* VARARGS2 */ |
| 902 | .DE |
| 903 | will cause the first two arguments to be checked, the others unchecked. |
| 904 | Finally, the directive |
| 905 | .DS |
| 906 | /* LINTLIBRARY */ |
| 907 | .DE |
| 908 | at the head of a file identifies this file as |
| 909 | a library declaration file; this topic is worth a |
| 910 | section by itself. |
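.PP
The following sketch shows where three of these directives might be placed in a source file.
It is written in present-day C, so the \fI<stdarg.h>\fR mechanism is an assumption that postdates this paper, and the function names are invented for the example.
Since the directives are comments, the compiler ignores them.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Print a usage message and terminate; control never returns. */
void usage(void)
{
    fprintf(stderr, "usage: prog file...\n");
    exit(2);
    /* NOTREACHED */
}

/* The second argument is deliberately unused in this function. */
/* ARGSUSED */
int on_signal(int signo, int context)
{
    return signo;
}

/* Only the first two arguments are to be checked in calls. */
/* VARARGS2 */
int sum_ints(int base, int count, ...)
{
    va_list ap;
    int total = base;

    va_start(ap, count);
    while (count-- > 0)
        total += va_arg(ap, int);   /* add each trailing int argument */
    va_end(ap);
    return total;
}
```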
| 911 | .SH |
| 912 | Library Declaration Files |
| 913 | .PP |
| 914 | .I Lint |
| 915 | accepts certain library directives, such as |
| 916 | .DS |
| 917 | \-ly |
| 918 | .DE |
| 919 | and tests the source files for compatibility with these libraries. |
| 920 | This is done by accessing library description files whose |
| 921 | names are constructed from the library directives. |
| 922 | These files all begin with the directive |
| 923 | .DS |
| 924 | /* LINTLIBRARY */ |
| 925 | .DE |
| 926 | which is followed by a series of dummy function |
| 927 | definitions. |
| 928 | The critical parts of these definitions |
| 929 | are the declaration of the function return type, |
| 930 | whether the dummy function returns a value, and |
| 931 | the number and types of arguments to the function. |
| 932 | The VARARGS and ARGSUSED directives can |
| 933 | be used to specify features of the library functions. |
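.PP
A library description file along these lines might look like the following sketch.
The routine names are hypothetical and do not describe any real library, and the definitions use present-day prototype syntax rather than the declaration style of the period.
Only the return types, the presence of a returned value, and the argument lists matter; the bodies are dummies.

```c
/* LINTLIBRARY */
#include <stddef.h>

/* Hypothetical routine: two checked arguments, returns an int. */
/* ARGSUSED */
int lib_open(const char *name, int mode)
{
    return 0;
}

/* Hypothetical routine: only the format argument is checked. */
/* VARARGS1 */
/* ARGSUSED */
int lib_printf(const char *fmt, ...)
{
    return 0;
}

/* Hypothetical routine: returns a character pointer. */
/* ARGSUSED */
char *lib_alloc(unsigned int nbytes)
{
    return NULL;
}

/* Hypothetical routine: returns no value. */
/* ARGSUSED */
void lib_close(int fd)
{
}
```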
| 934 | .PP |
| 935 | .I Lint |
| 936 | library files are processed almost exactly like ordinary |
| 937 | source files. |
| 938 | The only difference is that functions which are defined in a library file, |
| 939 | but are not used in a source file, draw no complaints. |
| 940 | .I Lint |
| 941 | does not simulate a full library search algorithm, |
| 942 | and complains if the source files contain a redefinition of |
| 943 | a library routine (this is a feature!). |
| 944 | .PP |
| 945 | By default, |
| 946 | .I lint |
| 947 | checks the programs it is given against a standard library |
| 948 | file, which contains descriptions of the programs which |
| 949 | are normally loaded when |
| 950 | a C program |
| 951 | is run. |
| 952 | When the |
| 953 | .B \-p |
| 954 | flag is in effect, another file is checked containing |
| 955 | descriptions of the standard I/O library routines |
| 956 | which are expected to be portable across various machines. |
| 957 | The |
| 958 | .B \-n |
| 959 | flag can be used to suppress all library checking. |
| 960 | .SH |
| 961 | Bugs, etc. |
| 962 | .PP |
| 963 | .I Lint |
| 964 | was a difficult program to write, partly |
| 965 | because it is closely connected with matters of programming style, |
| 966 | and partly because users usually don't notice bugs which cause |
| 967 | .I lint |
| 968 | to miss errors which it should have caught. |
| 969 | (By contrast, if |
| 970 | .I lint |
| 971 | incorrectly complains about something that is correct, the |
| 972 | programmer reports that immediately!) |
| 973 | .PP |
| 974 | A number of areas remain to be further developed. |
| 975 | The checking of structures and arrays is rather inadequate; |
| 976 | size |
| 977 | incompatibilities go unchecked, |
| 978 | and no attempt is made to match up structure and union |
| 979 | declarations across files. |
| 980 | Some stricter checking of the use of |
| 981 | .B typedef |
| 982 | is clearly desirable, but what checking is appropriate, and how |
| 983 | to carry it out, is still to be determined. |
| 984 | .PP |
| 985 | .I Lint |
| 986 | shares the preprocessor with the C compiler. |
| 987 | At some point it may be appropriate for a |
| 988 | special version of the preprocessor to be constructed |
| 989 | which checks for things such as unused macro definitions, |
| 990 | macro arguments with side effects that are |
| 991 | not expanded at all, or are expanded more than once, etc. |
| 992 | .PP |
| 993 | The central problem with |
| 994 | .I lint |
| 995 | is the packaging of the information which it collects. |
| 996 | There are many options which |
| 997 | serve only to turn off, or slightly modify, |
| 998 | certain features. |
| 999 | There are pressures to add even more of these options. |
| 1000 | .PP |
| 1001 | In conclusion, it appears that the general notion of having two |
| 1002 | programs is a good one. |
| 1003 | The compiler concentrates on quickly and accurately turning the |
| 1004 | program text into bits which can be run; |
| 1005 | .I lint |
| 1006 | concentrates on issues |
| 1007 | of portability, style, and efficiency. |
| 1008 | .I Lint |
| 1009 | can afford to be wrong, since incorrectness and over-conservatism |
| 1010 | are merely annoying, not fatal. |
| 1011 | The compiler can be fast since it knows that |
| 1012 | .I lint |
| 1013 | will cover its flanks. |
| 1014 | Finally, the programmer can |
| 1015 | concentrate at one stage |
| 1016 | of the programming process solely on the algorithms, |
| 1017 | data structures, and correctness of the |
| 1018 | program, and then later retrofit, |
| 1019 | with the aid of |
| 1020 | .I lint , |
| 1021 | the desirable properties of universality and portability. |
| 1022 | .SG MH-1273-SCJ-unix |
| 1023 | .bp |
| 1024 | .[ |
| 1025 | $LIST$ |
| 1026 | .] |
| 1027 | .bp |
| 1028 | .SH |
| 1029 | Appendix: Current Lint Options |
| 1030 | .PP |
| 1031 | The command currently has the form |
| 1032 | .DS |
| 1033 | \fBlint\fR [\fB\-\fRoptions ] files... library-descriptors... |
| 1034 | .DE |
| 1035 | The options are |
| 1036 | .IP \fBh\fR |
| 1037 | Perform heuristic checks |
| 1038 | .IP \fBp\fR |
| 1039 | Perform portability checks |
| 1040 | .IP \fBv\fR |
| 1041 | Don't report unused arguments |
| 1042 | .IP \fBu\fR |
| 1043 | Don't report unused or undefined externals |
| 1044 | .IP \fBb\fR |
| 1045 | Report unreachable |
| 1046 | .B break |
| 1047 | statements. |
| 1048 | .IP \fBx\fR |
| 1049 | Report unused external declarations |
| 1050 | .IP \fBa\fR |
| 1051 | Report assignments of |
| 1052 | .B long |
| 1053 | to |
| 1054 | .B int |
| 1055 | or shorter. |
| 1056 | .IP \fBc\fR |
| 1057 | Complain about questionable casts |
| 1058 | .IP \fBn\fR |
| 1059 | No library checking is done |
| 1060 | .IP \fBs\fR |
| 1061 | Same as |
| 1062 | .B h |
| 1063 | (for historical reasons) |