| 1 | .if n .ls 2 |
| 2 | .tr _\(em |
| 3 | .tr *\(** |
| 4 | .de UC |
| 5 | \&\\$3\s-1\\$1\\s0\&\\$2 |
| 6 | .. |
| 7 | .de IT |
| 8 | .if n .ul |
| 9 | \&\\$3\f2\\$1\fP\&\\$2 |
| 10 | .. |
| 11 | .de UL |
| 12 | .if n .ul |
| 13 | \&\\$3\f3\\$1\fP\&\\$2 |
| 14 | .. |
| 15 | .de P1 |
| 16 | .DS I 3n |
| 17 | .if n .ls 2 |
| 18 | .nf |
| 19 | .if n .ta 5 10 15 20 25 30 35 40 45 50 55 60 |
| 20 | .if t .ta .4i .8i 1.2i 1.6i 2i 2.4i 2.8i 3.2i 3.6i 4i 4.4i 4.8i 5.2i 5.6i |
| 21 | .if t .tr -\(mi|\(bv'\(fm^\(no*\(** |
| 22 | .tr `\(ga'\(aa |
| 23 | .if t .tr _\(ul |
| 24 | .ft 3 |
| 25 | .lg 0 |
| 26 | .. |
| 27 | .de P2 |
| 28 | .ps \\n(PS |
| 29 | .vs \\n(VSp |
| 30 | .ft R |
| 31 | .if n .ls 2 |
| 32 | .tr --||''^^!! |
| 33 | .if t .tr _\(em |
| 34 | .fi |
| 35 | .lg |
| 36 | .DE |
| 37 | .if t .tr _\(em |
| 38 | .. |
| 39 | .hw semi-colon |
| 40 | .hw estab-lished |
| 41 | .hy 14 |
| 42 | . \"2=not last lines; 4= no -xx; 8=no xx- |
| 43 | . \"special chars in programs |
| 44 | . \" start of text |
| 45 | .RP |
| 46 | .....TR 59 |
| 47 | .....TM 77-1273-6 39199 39199-11 |
| 48 | .ND "July 1, 1977" |
| 49 | .TL |
| 50 | The M4 Macro Processor |
| 51 | .AU "MH 2C-518" 6021 |
| 52 | Brian W. Kernighan |
| 53 | .AU "MH 2C-517" 3770 |
| 54 | Dennis M. Ritchie |
| 55 | .AI |
| 56 | .MH |
| 57 | .AB |
| 58 | .PP |
| 59 | M4 is a macro processor available on |
| 60 | .UX |
| 61 | and |
| 62 | .UC GCOS . |
| 63 | Its primary use has been as a |
| 64 | front end for Ratfor for those |
| 65 | cases where parameterless macros |
| 66 | are not adequately powerful. |
| 67 | It has also been used for languages as disparate as C and Cobol. |
| 68 | M4 is particularly suited for functional languages like Fortran, PL/I and C |
| 69 | since macros are specified in a functional notation. |
| 70 | .PP |
| 71 | M4 provides features seldom found even in much larger |
| 72 | macro processors, |
| 73 | including |
| 74 | .IP " \(bu" |
| 75 | arguments |
| 76 | .IP " \(bu" |
| 77 | condition testing |
| 78 | .IP " \(bu" |
| 79 | arithmetic capabilities |
| 80 | .IP " \(bu" |
| 81 | string and substring functions |
| 82 | .IP " \(bu" |
| 83 | file manipulation |
| 84 | .LP |
| 85 | .PP |
| 86 | This paper is a user's manual for M4. |
| 87 | .AE |
| 88 | .CS 6 0 6 0 0 1 |
| 89 | .if t .2C |
| 90 | .SH |
| 91 | Introduction |
| 92 | .PP |
| 93 | A macro processor is a useful way to enhance a programming language, |
| 94 | to make it more palatable |
| 95 | or more readable, |
| 96 | or to tailor it to a particular application. |
| 97 | The |
| 98 | .UL #define |
| 99 | statement in C |
| 100 | and the analogous |
| 101 | .UL define |
| 102 | in Ratfor |
| 103 | are examples of the basic facility provided by |
| 104 | any macro processor _ |
| 105 | replacement of text by other text. |
| 106 | .PP |
| 107 | The M4 macro processor is an extension of a macro processor called M3 |
| 108 | which was written by D. M. Ritchie |
| 109 | for the AP-3 minicomputer; |
| 110 | M3 was in turn based on a macro processor implemented for [1]. |
| 111 | Readers unfamiliar with the basic ideas of macro processing |
| 112 | may wish to read some of the discussion there. |
| 113 | .PP |
| 114 | M4 is a suitable front end for Ratfor and C, |
| 115 | and has also been used successfully with Cobol. |
| 116 | Besides the straightforward replacement of one string of text by another, |
| 117 | it provides |
| 118 | macros with arguments, |
| 119 | conditional macro expansion, |
| 120 | arithmetic, |
| 121 | file manipulation, |
| 122 | and some specialized string processing functions. |
| 123 | .PP |
| 124 | The basic operation of M4 |
| 125 | is to copy its input to its output. |
| 126 | As the input is read, however, each alphanumeric ``token'' |
| 127 | (that is, string of letters and digits) is checked. |
| 128 | If it is the name of a macro, |
| 129 | then the name of the macro is replaced by its defining text, |
| 130 | and the resulting string is pushed back onto the |
| 131 | input to be rescanned. |
| 132 | Macros may be called with arguments, in which case the arguments are collected |
| 133 | and substituted into the right places in the defining text |
| 134 | before it is rescanned. |
| 135 | .PP |
| 136 | M4 provides a collection of about twenty built-in |
| 137 | macros |
| 138 | which perform various useful operations; |
| 139 | in addition, the user can define new macros. |
| 140 | Built-ins and user-defined macros work exactly the same way, except that |
| 141 | some of the built-in macros have side effects |
| 142 | on the state of the process. |
| 143 | .SH |
| 144 | Usage |
| 145 | .PP |
| 146 | On |
| 147 | .UC UNIX , |
| 148 | use |
| 149 | .P1 |
| 150 | m4 [files] |
| 151 | .P2 |
| 152 | Each argument file is processed in order; |
| 153 | if there are no arguments, or if an argument |
| 154 | is `\-', |
| 155 | the standard input is read at that point. |
| 156 | The processed text is written on the standard output, |
| 157 | which may be captured for subsequent processing with |
| 158 | .P1 |
| 159 | m4 [files] >outputfile |
| 160 | .P2 |
| 161 | On |
| 162 | .UC GCOS , |
| 163 | usage is identical, but the program is called |
| 164 | .UL \&./m4 . |
| 165 | .SH |
| 166 | Defining Macros |
| 167 | .PP |
| 168 | The primary built-in function of M4 |
| 169 | is |
| 170 | .UL define , |
| 171 | which is used to define new macros. |
| 172 | The input |
| 173 | .P1 |
| 174 | define(name, stuff) |
| 175 | .P2 |
| 176 | causes the string |
| 177 | .UL name |
| 178 | to be defined as |
| 179 | .UL stuff . |
| 180 | All subsequent occurrences of |
| 181 | .UL name |
| 182 | will be replaced by |
| 183 | .UL stuff . |
| 184 | .UL name |
| 185 | must be alphanumeric and must begin with a letter |
| 186 | (the underscore \(ul counts as a letter). |
| 187 | .UL stuff |
| 188 | is any text that contains balanced parentheses; |
| 189 | it may stretch over multiple lines. |
| 190 | .PP |
| 191 | Thus, as a typical example, |
| 192 | .P1 |
| 193 | define(N, 100) |
| 194 | ... |
| 195 | if (i > N) |
| 196 | .P2 |
| 197 | defines |
| 198 | .UL N |
| 199 | to be 100, and uses this ``symbolic constant'' in a later |
| 200 | .UL if |
| 201 | statement. |
| 202 | .PP |
| 203 | The left parenthesis must immediately follow the word |
| 204 | .UL define , |
| 205 | to signal that |
| 206 | .UL define |
| 207 | has arguments. |
| 208 | If a macro or built-in name is not followed immediately by `(', |
| 209 | it is assumed to have no arguments. |
| 210 | This is the situation for |
| 211 | .UL N |
| 212 | above; |
| 213 | it is actually a macro with no arguments, |
| 214 | and thus when it is used there need be no (...) following it. |
| 215 | .PP |
| 216 | You should also notice that a macro name is only recognized as such |
| 217 | if it appears surrounded by non-alphanumerics. |
| 218 | For example, in |
| 219 | .P1 |
| 220 | define(N, 100) |
| 221 | ... |
| 222 | if (NNN > 100) |
| 223 | .P2 |
| 224 | the variable |
| 225 | .UL NNN |
| 226 | is absolutely unrelated to the defined macro |
| 227 | .UL N , |
| 228 | even though it contains a lot of |
| 229 | .UL N 's. |
| 230 | .PP |
| 231 | Things may be defined in terms of other things. |
| 232 | For example, |
| 233 | .P1 |
| 234 | define(N, 100) |
| 235 | define(M, N) |
| 236 | .P2 |
| 237 | defines both M and N to be 100. |
| 238 | .PP |
| 239 | What happens if |
| 240 | .UL N |
| 241 | is redefined? |
| 242 | Or, to say it another way, is |
| 243 | .UL M |
| 244 | defined as |
| 245 | .UL N |
| 246 | or as 100? |
| 247 | In M4, |
| 248 | the latter is true _ |
| 249 | .UL M |
| 250 | is 100, so even if |
| 251 | .UL N |
| 252 | subsequently changes, |
| 253 | .UL M |
| 254 | does not. |
| 255 | .PP |
| 256 | This behavior arises because |
| 257 | M4 expands macro names into their defining text as soon as it possibly can. |
| 258 | Here, that means that when the string |
| 259 | .UL N |
| 260 | is seen as the arguments of |
| 261 | .UL define |
| 262 | are being collected, it is immediately replaced by 100; |
| 263 | it's just as if you had said |
| 264 | .P1 |
| 265 | define(M, 100) |
| 266 | .P2 |
| 267 | in the first place. |
| 268 | .PP |
| 269 | If this isn't what you really want, there are two ways out of it. |
| 270 | The first, which is specific to this situation, |
| 271 | is to interchange the order of the definitions: |
| 272 | .P1 |
| 273 | define(M, N) |
| 274 | define(N, 100) |
| 275 | .P2 |
| 276 | Now |
| 277 | .UL M |
| 278 | is defined to be the string |
| 279 | .UL N , |
| 280 | so when you ask for |
| 281 | .UL M |
| 282 | later, you'll always get the value of |
| 283 | .UL N |
| 284 | at that time |
| 285 | (because the |
| 286 | .UL M |
| 287 | will be replaced by |
| 288 | .UL N |
| 289 | which will be replaced by 100). |
| 290 | .SH |
| 291 | Quoting |
| 292 | .PP |
| 293 | The more general solution is to delay the expansion of |
| 294 | the arguments of |
| 295 | .UL define |
| 296 | by |
| 297 | .ul |
| 298 | quoting |
| 299 | them. |
| 300 | Any text surrounded by the single quotes \(ga and \(aa |
| 301 | is not expanded immediately, but has the quotes stripped off. |
| 302 | If you say |
| 303 | .P1 |
| 304 | define(N, 100) |
| 305 | define(M, `N') |
| 306 | .P2 |
| 307 | the quotes around the |
| 308 | .UL N |
| 309 | are stripped off as the argument is being collected, |
| 310 | but they have served their purpose, and |
| 311 | .UL M |
| 312 | is defined as |
| 313 | the string |
| 314 | .UL N , |
| 315 | not 100. |
| 316 | The general rule is that M4 always strips off |
| 317 | one level of single quotes whenever it evaluates |
| 318 | something. |
| 319 | This is true even outside of |
| 320 | macros. |
| 321 | If you want the word |
| 322 | .UL define |
| 323 | to appear in the output, |
| 324 | you have to quote it in the input, |
| 325 | as in |
| 326 | .P1 |
| 327 | `define' = 1; |
| 328 | .P2 |
| 329 | .PP |
| 330 | As another instance of the same thing, which is a bit more surprising, |
| 331 | consider redefining |
| 332 | .UL N : |
| 333 | .P1 |
| 334 | define(N, 100) |
| 335 | ... |
| 336 | define(N, 200) |
| 337 | .P2 |
| 338 | Perhaps regrettably, the |
| 339 | .UL N |
| 340 | in the second definition is |
| 341 | evaluated as soon as it's seen; |
| 342 | that is, it is |
| 343 | replaced by |
| 344 | 100, so it's as if you had written |
| 345 | .P1 |
| 346 | define(100, 200) |
| 347 | .P2 |
| 348 | This statement is ignored by M4, since you can only define things that look |
| 349 | like names, but it obviously doesn't have the effect you wanted. |
| 350 | To really redefine |
| 351 | .UL N , |
| 352 | you must delay the evaluation by quoting: |
| 353 | .P1 |
| 354 | define(N, 100) |
| 355 | ... |
| 356 | define(`N', 200) |
| 357 | .P2 |
| 358 | In M4, |
| 359 | it is often wise to quote the first argument of a macro. |
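|  | .PP |
|  | As a small sketch of the difference quoting makes |
|  | (only the names already used above are assumed), |
|  | consider |
|  | .P1 |
|  | define(N, 100) |
|  | define(M, `N') |
|  | M |
|  | define(`N', 200) |
|  | M |
|  | .P2 |
|  | The first |
|  | .UL M |
|  | should produce 100 and the second 200, |
|  | since |
|  | .UL M |
|  | holds the string |
|  | .UL N |
|  | rather than the value it had at the time of definition. |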
| 360 | .PP |
| 361 | If \` and \' are not convenient for some reason, |
| 362 | the quote characters can be changed with the built-in |
| 363 | .UL changequote : |
| 364 | .P1 |
| 365 | changequote([, ]) |
| 366 | .P2 |
| 367 | makes the new quote characters the left and right brackets. |
| 368 | You can restore the original characters with just |
| 369 | .P1 |
| 370 | changequote |
| 371 | .P2 |
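|  | .PP |
|  | For instance, a fragment that would rather keep \(ga and \(aa for its own use |
|  | might be written as follows; |
|  | this is only a sketch, and the macro names are illustrative: |
|  | .P1 |
|  | changequote([, ]) |
|  | define([N], 100) |
|  | define([M], [N]) |
|  | changequote |
|  | .P2 |
|  | Between the two calls the brackets do the job of the quotes, so |
|  | .UL M |
|  | is defined as the string |
|  | .UL N , |
|  | not 100. |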
| 372 | .PP |
| 373 | There are two additional built-ins related to |
| 374 | .UL define . |
| 375 | .UL undefine |
| 376 | removes the definition of some macro or built-in: |
| 377 | .P1 |
| 378 | undefine(`N') |
| 379 | .P2 |
| 380 | removes the definition of |
| 381 | .UL N . |
| 382 | (Why are the quotes absolutely necessary?) |
| 383 | Built-ins can be removed with |
| 384 | .UL undefine , |
| 385 | as in |
| 386 | .P1 |
| 387 | undefine(`define') |
| 388 | .P2 |
| 389 | but once you remove one, you can never get it back. |
| 390 | .PP |
| 391 | The built-in |
| 392 | .UL ifdef |
| 393 | provides a way to determine if a macro is currently defined. |
| 394 | In particular, M4 has pre-defined the names |
| 395 | .UL unix |
| 396 | and |
| 397 | .UL gcos |
| 398 | on the corresponding systems, so you can |
| 399 | tell which one you're using: |
| 400 | .P1 |
| 401 | ifdef(`unix', `define(wordsize,16)' ) |
| 402 | ifdef(`gcos', `define(wordsize,36)' ) |
| 403 | .P2 |
| 404 | makes a definition appropriate for the particular machine. |
| 405 | Don't forget the quotes! |
| 406 | .PP |
| 407 | .UL ifdef |
| 408 | actually permits three arguments; |
| 409 | if the name is undefined, the value of |
| 410 | .UL ifdef |
| 411 | is then the third argument, as in |
| 412 | .P1 |
| 413 | ifdef(`unix', on UNIX, not on UNIX) |
| 414 | .P2 |
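|  | .PP |
|  | One plausible use of the third argument is to supply a default |
|  | only when a name has not been defined already. |
|  | In this sketch the empty second argument means ``do nothing'' if |
|  | .UL wordsize |
|  | is already defined: |
|  | .P1 |
|  | ifdef(`wordsize', , `define(wordsize, 16)') |
|  | .P2 |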
| 415 | .SH |
| 416 | Arguments |
| 417 | .PP |
| 418 | So far we have discussed the simplest form of macro processing _ |
| 419 | replacing one string by another (fixed) string. |
| 420 | User-defined macros may also have arguments, so different invocations |
| 421 | can have different results. |
| 422 | Within the replacement text for a macro |
| 423 | (the second argument of its |
| 424 | .UL define ) |
| 425 | any occurrence of |
| 426 | .UL $n |
| 427 | will be replaced by the |
| 428 | .UL n th |
| 429 | argument when the macro |
| 430 | is actually used. |
| 431 | Thus, the macro |
| 432 | .UL bump , |
| 433 | defined as |
| 434 | .P1 |
| 435 | define(bump, $1 = $1 + 1) |
| 436 | .P2 |
| 437 | generates code to increment its argument by 1: |
| 438 | .P1 |
| 439 | bump(x) |
| 440 | .P2 |
| 441 | is |
| 442 | .P1 |
| 443 | x = x + 1 |
| 444 | .P2 |
| 445 | .PP |
| 446 | A macro can have as many arguments as you want, |
| 447 | but only the first nine are accessible, |
| 448 | through |
| 449 | .UL $1 |
| 450 | to |
| 451 | .UL $9 . |
| 452 | (The macro name itself is |
| 453 | .UL $0 , |
| 454 | although that is less commonly used.) |
| 455 | Arguments that are not supplied are replaced by null strings, |
| 456 | so |
| 457 | we can define a macro |
| 458 | .UL cat |
| 459 | which simply concatenates its arguments, like this: |
| 460 | .P1 |
| 461 | define(cat, $1$2$3$4$5$6$7$8$9) |
| 462 | .P2 |
| 463 | Thus |
| 464 | .P1 |
| 465 | cat(x, y, z) |
| 466 | .P2 |
| 467 | is equivalent to |
| 468 | .P1 |
| 469 | xyz |
| 470 | .P2 |
| 471 | .UL $4 |
| 472 | through |
| 473 | .UL $9 |
| 474 | are null, since no corresponding arguments were provided. |
| 475 | .PP |
| 477 | Leading unquoted blanks, tabs, or newlines that occur during argument collection |
| 478 | are discarded. |
| 479 | All other white space is retained. |
| 480 | Thus |
| 481 | .P1 |
| 482 | define(a,   b   c) |
| 483 | .P2 |
| 484 | defines |
| 485 | .UL a |
| 486 | to be |
| 487 | .UL b\ \ \ c . |
| 488 | .PP |
| 489 | Arguments are separated by commas, but parentheses are counted properly, |
| 490 | so a comma ``protected'' by parentheses does not terminate an argument. |
| 491 | That is, in |
| 492 | .P1 |
| 493 | define(a, (b,c)) |
| 494 | .P2 |
| 495 | there are only two arguments; |
| 496 | the second is literally |
| 497 | .UL (b,c) . |
| 498 | And of course a bare comma or parenthesis can be inserted by quoting it. |
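|  | .PP |
|  | To see these rules side by side, take a hypothetical one-argument macro |
|  | .UL first |
|  | (the name is ours, purely for illustration): |
|  | .P1 |
|  | define(first, $1) |
|  | first(x, y) |
|  | first((x,y)) |
|  | first(`x,y') |
|  | .P2 |
|  | The first call should yield |
|  | .UL x , |
|  | since the comma separates two arguments; |
|  | the second yields |
|  | .UL (x,y) , |
|  | because the parentheses protect the comma; |
|  | and the third yields |
|  | .UL x,y , |
|  | because the quotes do. |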
| 499 | .SH |
| 500 | Arithmetic Built-ins |
| 501 | .PP |
| 502 | M4 provides two built-in functions for doing arithmetic |
| 503 | on integers (only). |
| 504 | The simplest is |
| 505 | .UL incr , |
| 506 | which increments its numeric argument by 1. |
| 507 | Thus to handle the common programming situation |
| 508 | where you want a variable to be defined as ``one more than N'', |
| 509 | write |
| 510 | .P1 |
| 511 | define(N, 100) |
| 512 | define(N1, `incr(N)') |
| 513 | .P2 |
| 514 | Then |
| 515 | .UL N1 |
| 516 | is defined as one more than the current value of |
| 517 | .UL N . |
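|  | .PP |
|  | Because of the quotes, |
|  | .UL N1 |
|  | is recomputed each time it is used. |
|  | As a sketch: |
|  | .P1 |
|  | define(N, 100) |
|  | define(N1, `incr(N)') |
|  | N1 |
|  | define(`N', 7) |
|  | N1 |
|  | .P2 |
|  | The first |
|  | .UL N1 |
|  | should come out as 101 and the second as 8. |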
| 518 | .PP |
| 519 | The more general mechanism for arithmetic is a built-in |
| 520 | called |
| 521 | .UL eval , |
| 522 | which is capable of arbitrary arithmetic on integers. |
| 523 | It provides the operators |
| 524 | (in decreasing order of precedence) |
| 525 | .DS |
| 526 | unary + and \(mi |
| 527 | ** or ^ (exponentiation) |
| 528 | * / % (modulus) |
| 529 | + \(mi |
| 530 | == != < <= > >= |
| 531 | ! (not) |
| 532 | & or && (logical and) |
| 533 | \(or or \(or\(or (logical or) |
| 534 | .DE |
| 535 | Parentheses may be used to group operations where needed. |
| 536 | All the operands of |
| 537 | an expression given to |
| 538 | .UL eval |
| 539 | must ultimately be numeric. |
| 540 | The numeric value of a true relation |
| 541 | (like 1>0) |
| 542 | is 1, and false is 0. |
| 543 | The precision in |
| 544 | .UL eval |
| 545 | is |
| 546 | 32 bits on |
| 547 | .UC UNIX |
| 548 | and 36 bits on |
| 549 | .UC GCOS . |
| 550 | .PP |
| 551 | As a simple example, suppose we want |
| 552 | .UL M |
| 553 | to be |
| 554 | .UL 2**N+1 . |
| 555 | Then |
| 556 | .P1 |
| 557 | define(N, 3) |
| 558 | define(M, `eval(2**N+1)') |
| 559 | .P2 |
| 560 | As a matter of principle, it is advisable |
| 561 | to quote the defining text for a macro |
| 562 | unless it is very simple indeed |
| 563 | (say just a number); |
| 564 | it usually gives the result you want, |
| 565 | and is a good habit to get into. |
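|  | .PP |
|  | A few more expressions, given only as an illustration of the rules above: |
|  | .P1 |
|  | define(N, 3) |
|  | define(M, `eval(2**N+1)') |
|  | M |
|  | eval(N>2) |
|  | eval((N+1)%2) |
|  | .P2 |
|  | With |
|  | .UL N |
|  | equal to 3, these should produce 9, 1 and 0 respectively: |
|  | exponentiation binds more tightly than addition, |
|  | a true relation has the value 1, |
|  | and 4 % 2 is 0. |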
| 566 | .SH |
| 567 | File Manipulation |
| 568 | .PP |
| 569 | You can include a new file in the input at any time by |
| 570 | the built-in function |
| 571 | .UL include : |
| 572 | .P1 |
| 573 | include(filename) |
| 574 | .P2 |
| 575 | inserts the contents of |
| 576 | .UL filename |
| 577 | in place of the |
| 578 | .UL include |
| 579 | command. |
| 580 | The contents of the file are often a set of definitions. |
| 581 | The value |
| 582 | of |
| 583 | .UL include |
| 584 | (that is, its replacement text) |
| 585 | is the contents of the file; |
| 586 | this can be captured in definitions, etc. |
| 587 | .PP |
| 588 | It is a fatal error if the file named in |
| 589 | .UL include |
| 590 | cannot be accessed. |
| 591 | To get some control over this situation, the alternate form |
| 592 | .UL sinclude |
| 593 | can be used; |
| 594 | .UL sinclude |
| 595 | (``silent include'') |
| 596 | says nothing and continues if it can't access the file. |
| 597 | .PP |
| 598 | It is also possible to divert the output of M4 to temporary files during processing, |
| 599 | and output the collected material upon command. |
| 600 | M4 maintains nine of these diversions, numbered 1 through 9. |
| 601 | If you say |
| 602 | .P1 |
| 603 | divert(n) |
| 604 | .P2 |
| 605 | all subsequent output is put onto the end of a temporary file |
| 606 | referred to as |
| 607 | .UL n . |
| 608 | Diverting to this file is stopped by another |
| 609 | .UL divert |
| 610 | command; |
| 611 | in particular, |
| 612 | .UL divert |
| 613 | or |
| 614 | .UL divert(0) |
| 615 | resumes the normal output process. |
| 616 | .PP |
| 617 | Diverted text is normally output all at once |
| 618 | at the end of processing, |
| 619 | with the diversions output in numeric order. |
| 620 | It is possible, however, to bring back diversions |
| 621 | at any time, |
| 622 | that is, to append them to the current diversion. |
| 623 | .P1 |
| 624 | undivert |
| 625 | .P2 |
| 626 | brings back all diversions in numeric order, and |
| 627 | .UL undivert |
| 628 | with arguments brings back the selected diversions |
| 629 | in the order given. |
| 630 | The act of undiverting discards the diverted stuff, |
| 631 | as does diverting into a diversion |
| 632 | whose number is not between 0 and 9 inclusive. |
| 633 | .PP |
| 634 | The value of |
| 635 | .UL undivert |
| 636 | is |
| 637 | .ul |
| 638 | not |
| 639 | the diverted stuff. |
| 640 | Furthermore, the diverted material is |
| 641 | .ul |
| 642 | not |
| 643 | rescanned for macros. |
| 644 | .PP |
| 645 | The built-in |
| 646 | .UL divnum |
| 647 | returns the number of the currently active diversion. |
| 648 | This is zero during normal processing. |
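|  | .PP |
|  | As a sketch of how these built-ins fit together |
|  | (the text lines are placeholders), |
|  | material can be set aside and brought back later: |
|  | .P1 |
|  | divert(1) |
|  | this line is held in diversion 1 |
|  | divert |
|  | this line goes straight to the output |
|  | undivert(1) |
|  | .P2 |
|  | The second line should appear in the output ahead of the first, |
|  | which is appended only when |
|  | .UL undivert |
|  | is reached. |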
| 649 | .SH |
| 650 | System Command |
| 651 | .PP |
| 652 | You can run any program in the local operating system |
| 653 | with the |
| 654 | .UL syscmd |
| 655 | built-in. |
| 656 | For example, |
| 657 | .P1 |
| 658 | syscmd(date) |
| 659 | .P2 |
| 660 | on |
| 661 | .UC UNIX |
| 662 | runs the |
| 663 | .UL date |
| 664 | command. |
| 665 | Normally |
| 666 | .UL syscmd |
| 667 | would be used to create a file |
| 668 | for a subsequent |
| 669 | .UL include . |
| 670 | .PP |
| 671 | To facilitate making unique file names, the built-in |
| 672 | .UL maketemp |
| 673 | is provided, with specifications identical to the system function |
| 674 | .ul |
| 675 | mktemp: |
| 676 | a string of XXXXX in the argument is replaced |
| 677 | by the process id of the current process. |
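|  | .PP |
|  | Putting the pieces together, one might capture the output of a command like this. |
|  | It is a sketch only: it assumes that the command string is handed to the shell, |
|  | so that the redirection works, and the macro name |
|  | .UL scratch |
|  | and the file name pattern are made up for the example. |
|  | .P1 |
|  | define(scratch, maketemp(/tmp/m4XXXXX)) |
|  | syscmd(date >scratch) |
|  | include(scratch) |
|  | .P2 |
|  | Since |
|  | .UL maketemp |
|  | is not quoted, the file name is fixed once, when |
|  | .UL scratch |
|  | is defined. |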
| 678 | .SH |
| 679 | Conditionals |
| 680 | .PP |
| 681 | There is a built-in called |
| 682 | .UL ifelse |
| 683 | which enables you to perform arbitrary conditional testing. |
| 684 | In the simplest form, |
| 685 | .P1 |
| 686 | ifelse(a, b, c, d) |
| 687 | .P2 |
| 688 | compares the two strings |
| 689 | .UL a |
| 690 | and |
| 691 | .UL b . |
| 692 | If these are identical, |
| 693 | .UL ifelse |
| 694 | returns |
| 695 | the string |
| 696 | .UL c ; |
| 697 | otherwise it returns |
| 698 | .UL d . |
| 699 | Thus we might define a macro called |
| 700 | .UL compare |
| 701 | which compares two strings and returns ``yes'' or ``no'' |
| 702 | if they are the same or different, respectively. |
| 703 | .P1 |
| 704 | define(compare, `ifelse($1, $2, yes, no)') |
| 705 | .P2 |
| 706 | Note the quotes, |
| 707 | which prevent too-early evaluation of |
| 708 | .UL ifelse . |
| 709 | .PP |
| 710 | If the fourth argument is missing, it is treated as empty. |
| 711 | .PP |
| 712 | .UL ifelse |
| 713 | can actually have any number of arguments, |
| 714 | and thus provides a limited form of multi-way decision capability. |
| 715 | In the input |
| 716 | .P1 |
| 717 | ifelse(a, b, c, d, e, f, g) |
| 718 | .P2 |
| 719 | if the string |
| 720 | .UL a |
| 721 | matches the string |
| 722 | .UL b , |
| 723 | the result is |
| 724 | .UL c . |
| 725 | Otherwise, if |
| 726 | .UL d |
| 727 | is the same as |
| 728 | .UL e , |
| 729 | the result is |
| 730 | .UL f . |
| 731 | Otherwise the result is |
| 732 | .UL g . |
| 733 | If the final argument |
| 734 | is omitted, the result is null, |
| 735 | so |
| 736 | .P1 |
| 737 | ifelse(a, b, c) |
| 738 | .P2 |
| 739 | is |
| 740 | .UL c |
| 741 | if |
| 742 | .UL a |
| 743 | matches |
| 744 | .UL b , |
| 745 | and null otherwise. |
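|  | .PP |
|  | As an illustration of the multi-way form, a hypothetical macro |
|  | .UL sign |
|  | (not part of M4) might classify a number by combining |
|  | .UL ifelse |
|  | with |
|  | .UL eval : |
|  | .P1 |
|  | define(sign, `ifelse(eval($1<0), 1, negative, eval($1>0), 1, positive, zero)') |
|  | sign(-3) |
|  | sign(0) |
|  | .P2 |
|  | These calls should give |
|  | .UL negative |
|  | and |
|  | .UL zero . |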
| 746 | .SH |
| 747 | String Manipulation |
| 748 | .PP |
| 749 | The built-in |
| 750 | .UL len |
| 751 | returns the length of the string that makes up its argument. |
| 752 | Thus |
| 753 | .P1 |
| 754 | len(abcdef) |
| 755 | .P2 |
| 756 | is 6, and |
| 757 | .UL len((a,b)) |
| 758 | is 5. |
| 759 | .PP |
| 760 | The built-in |
| 761 | .UL substr |
| 762 | can be used to produce substrings of strings. |
| 763 | .UL substr(s,\ i,\ n) |
| 764 | returns the substring of |
| 765 | .UL s |
| 766 | that starts at the |
| 767 | .UL i th |
| 768 | position |
| 769 | (origin zero), |
| 770 | and is |
| 771 | .UL n |
| 772 | characters long. |
| 773 | If |
| 774 | .UL n |
| 775 | is omitted, the rest of the string is returned, |
| 776 | so |
| 777 | .P1 |
| 778 | substr(`now is the time', 1) |
| 779 | .P2 |
| 780 | is |
| 781 | .P1 |
| 782 | ow is the time |
| 783 | .P2 |
| 784 | If |
| 785 | .UL i |
| 786 | or |
| 787 | .UL n |
| 788 | are out of range, various sensible things happen. |
| 789 | .PP |
| 790 | .UL index(s1,\ s2) |
| 791 | returns the index (position) in |
| 792 | .UL s1 |
| 793 | where the string |
| 794 | .UL s2 |
| 795 | occurs, or \-1 |
| 796 | if it doesn't occur. |
| 797 | As with |
| 798 | .UL substr , |
| 799 | the origin for strings is 0. |
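|  | .PP |
|  | For example (a sketch using the same sample string as above), |
|  | .P1 |
|  | len(`now is the time') |
|  | substr(`now is the time', 4, 2) |
|  | index(`now is the time', `is') |
|  | .P2 |
|  | should produce 15, |
|  | .UL is |
|  | and 4: the word |
|  | .UL is |
|  | begins at position 4, counting from 0. |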
| 800 | .PP |
| 801 | The built-in |
| 802 | .UL translit |
| 803 | performs character transliteration. |
| 804 | .P1 |
| 805 | translit(s, f, t) |
| 806 | .P2 |
| 807 | modifies |
| 808 | .UL s |
| 809 | by replacing any character found in |
| 810 | .UL f |
| 811 | by the corresponding character of |
| 812 | .UL t . |
| 813 | That is, |
| 814 | .P1 |
| 815 | translit(s, aeiou, 12345) |
| 816 | .P2 |
| 817 | replaces the vowels by the corresponding digits. |
| 818 | If |
| 819 | .UL t |
| 820 | is shorter than |
| 821 | .UL f , |
| 822 | characters which don't have an entry in |
| 823 | .UL t |
| 824 | are deleted; as a limiting case, |
| 825 | if |
| 826 | .UL t |
| 827 | is not present at all, |
| 828 | characters from |
| 829 | .UL f |
| 830 | are deleted from |
| 831 | .UL s . |
| 832 | So |
| 833 | .P1 |
| 834 | translit(s, aeiou) |
| 835 | .P2 |
| 836 | deletes vowels from |
| 837 | .UL s . |
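|  | .PP |
|  | With an actual string in place of |
|  | .UL s , |
|  | say |
|  | .P1 |
|  | translit(`macro processor', aeiou, 12345) |
|  | translit(`macro processor', aeiou) |
|  | .P2 |
|  | the results should be |
|  | .P1 |
|  | m1cr4 pr4c2ss4r |
|  | mcr prcssr |
|  | .P2 |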
| 838 | .PP |
| 839 | There is also a built-in called |
| 840 | .UL dnl |
| 841 | which deletes all characters that follow it up to |
| 842 | and including the next newline; |
| 843 | it is useful mainly for throwing away |
| 844 | empty lines that otherwise tend to clutter up M4 output. |
| 845 | For example, if you say |
| 846 | .P1 |
| 847 | define(N, 100) |
| 848 | define(M, 200) |
| 849 | define(L, 300) |
| 850 | .P2 |
| 851 | the newline at the end of each line is not part of the definition, |
| 852 | so it is copied into the output, where it may not be wanted. |
| 853 | If you add |
| 854 | .UL dnl |
| 855 | to each of these lines, the newlines will disappear. |
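|  | .PP |
|  | Written out as a sketch, the three definitions become |
|  | .P1 |
|  | define(N, 100)dnl |
|  | define(M, 200)dnl |
|  | define(L, 300)dnl |
|  | .P2 |
|  | which should leave nothing at all in the output. |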
| 856 | .PP |
| 857 | Another way to achieve this, due to J. E. Weythman, |
| 858 | is |
| 859 | .P1 |
| 860 | divert(-1) |
| 861 | define(...) |
| 862 | ... |
| 863 | divert |
| 864 | .P2 |
| 865 | .SH |
| 866 | Printing |
| 867 | .PP |
| 868 | The built-in |
| 869 | .UL errprint |
| 870 | writes its arguments out on the standard error file. |
| 871 | Thus you can say |
| 872 | .P1 |
| 873 | errprint(`fatal error') |
| 874 | .P2 |
| 875 | .PP |
| 876 | .UL dumpdef |
| 877 | is a debugging aid which |
| 878 | dumps the current definitions of defined terms. |
| 879 | If there are no arguments, you get everything; |
| 880 | otherwise you get the ones you name as arguments. |
| 881 | Don't forget to quote the names! |
| 882 | .SH |
| 883 | Summary of Built-ins |
| 884 | .PP |
| 885 | Each entry is preceded by the |
| 886 | page number where it is described. |
| 887 | .DS |
| 888 | .tr '\'`\` |
| 889 | .ta .25i |
| 890 | 3 changequote(L, R) |
| 891 | 1 define(name, replacement) |
| 892 | 4 divert(number) |
| 893 | 4 divnum |
| 894 | 5 dnl |
| 895 | 5 dumpdef(`name', `name', ...) |
| 896 | 5 errprint(s, s, ...) |
| 897 | 4 eval(numeric expression) |
| 898 | 3 ifdef(`name', this if true, this if false) |
| 899 | 5 ifelse(a, b, c, d) |
| 900 | 4 include(file) |
| 901 | 3 incr(number) |
| 902 | 5 index(s1, s2) |
| 903 | 5 len(string) |
| 904 | 4 maketemp(...XXXXX...) |
| 905 | 4 sinclude(file) |
| 906 | 5 substr(string, position, number) |
| 907 | 4 syscmd(s) |
| 908 | 5 translit(str, from, to) |
| 909 | 3 undefine(`name') |
| 910 | 4 undivert(number,number,...) |
| 911 | .DE |
| 912 | .SH |
| 913 | Acknowledgements |
| 914 | .PP |
| 915 | We are indebted to Rick Becker, John Chambers, |
| 916 | Doug McIlroy, |
| 917 | and especially Jim Weythman, |
| 918 | whose pioneering use of M4 has led to several valuable improvements. |
| 919 | We are also deeply grateful to Weythman for several substantial contributions |
| 920 | to the code. |
| 921 | .SG |
| 922 | .SH |
| 923 | References |
| 924 | .LP |
| 925 | .IP [1] |
| 926 | B. W. Kernighan and P. J. Plauger, |
| 927 | .ul |
| 928 | Software Tools, |
| 929 | Addison-Wesley, Inc., 1976. |