AWK(1)                     Utility Commands                     AWK(1)

NAME
       awk - pattern scanning and processing language

SYNOPSIS
       awk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
       awk [ POSIX or GNU style options ] [ -- ] program-text file ...

DESCRIPTION
       Gawk is the GNU Project's implementation of the AWK programming
       language.  In the 4.4BSD distribution, it is installed as awk.
       It conforms to the definition of the language in the POSIX
       1003.2 Command Language And Utilities Standard.  This version
       in turn is based on the description in The AWK Programming
       Language, by Aho, Kernighan, and Weinberger, with the
       additional features defined in the System V Release 4 version
       of UNIX awk.  Gawk also provides some GNU-specific extensions.

       The command line consists of options to gawk itself, the AWK
       program text (if not supplied via the -f or --file options),
       and values to be made available in the ARGC and ARGV
       pre-defined AWK variables.

OPTIONS
       Gawk options may be either the traditional POSIX one letter
       options, or the GNU style long options.  POSIX style options
       start with a single ``-'', while GNU long options start with
       ``--''.  GNU style long options are provided for both
       GNU-specific features and for POSIX mandated features.  Other
       implementations of the AWK language are likely to only accept
       the traditional one letter options.

       Following the POSIX standard, gawk-specific options are
       supplied via arguments to the -W option.  Multiple -W options
       may be supplied, or multiple arguments may be supplied
       together if they are separated by commas, or enclosed in
       quotes and separated by white space.  Case is ignored in
       arguments to the -W option.  Each -W option has a
       corresponding GNU style long option, as detailed below.

       Gawk accepts the following options.

       -F fs
       --field-separator=fs
              Use fs for the input field separator (the value of the
              FS predefined variable).

       -v var=val

Free Software Foundation                              April 15 1993

       --assign=var=val
              Assign the value val to the variable var before
              execution of the program begins.  Such variable values
              are available to the BEGIN block of an AWK program.

       -f program-file
       --file=program-file
              Read the AWK program source from the file program-file,
              instead of from the first command line argument.
              Multiple -f (or --file) options may be used.

       -W compat
       --compat
              Run in compatibility mode.  In compatibility mode, gawk
              behaves identically to UNIX awk; none of the
              GNU-specific extensions are recognized.  See GNU
              EXTENSIONS, below, for more information.

       -W copyleft
       -W copyright
       --copyleft
       --copyright
              Print the short version of the GNU copyright information
              message on the standard error output.

       -W help
       -W usage
       --help
       --usage
              Print a relatively short summary of the available
              options on the standard error output.

       -W lint
       --lint Provide warnings about constructs that are dubious or
              non-portable to other AWK implementations.

       -W posix
       --posix
              This turns on compatibility mode, with the following
              additional restrictions:

              o \x escape sequences are not recognized.

              o The synonym func for the keyword function is not
                recognized.

              o The operators ** and **= cannot be used in place of ^
                and ^=.

       -W source=program-text
       --source=program-text
              Use program-text as AWK program source code.
              This option allows the easy intermixing of library
              functions (used via the -f and --file options) with
              source code entered on the command line.  It is
              intended primarily for medium to large size AWK
              programs used in shell scripts.
              The -W source= form of this option uses the rest of the
              command line argument for program-text; no other
              options to -W will be recognized in the same argument.

       -W version
       --version
              Print version information for this particular copy of
              gawk on the standard error output.  This is useful
              mainly for knowing if the current copy of gawk on your
              system is up to date with respect to whatever the Free
              Software Foundation is distributing.

       --     Signal the end of options.  This is useful to allow
              further arguments to the AWK program itself to start
              with a ``-''.  This is mainly for consistency with the
              argument parsing convention used by most other POSIX
              programs.

       Any other options are flagged as illegal, but are otherwise
       ignored.

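       As a minimal sketch of the -F, -v, and -- options, the
       commands below may help.  The sample input and the variable
       name prefix are invented for illustration, and the commands
       assume that awk is on the search path.

```shell
# -F sets FS to ":"; -v assigns prefix before the program starts.
out=$(printf 'root:x:0\ndaemon:x:1\n' |
      awk -F: -v prefix='user=' '{ print prefix $1 }')
printf '%s\n' "$out"

# -- ends option processing, so the program text may begin with "-".
neg=$(echo 5 | awk -- '-$1 < 0 { print "negative of", $1, "is", -$1 }')
printf '%s\n' "$neg"
```

       The first command prints the first colon-separated field of
       each line with the prefix attached.  In the second, the
       program text itself starts with a ``-'', which is the case
       that -- exists to handle.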
AWK PROGRAM EXECUTION
       An AWK program consists of a sequence of pattern-action
       statements and optional function definitions.

              pattern   { action statements }
              function name(parameter list) { statements }

       Gawk first reads the program source from the program-file(s)
       if specified, or from the first non-option argument on the
       command line.  The -f option may be used multiple times on the
       command line.  Gawk will read the program text as if all the
       program-files had been concatenated together.  This is useful
       for building libraries of AWK functions, without having to
       include them in each new AWK program that uses them.  To use a
       library function in a file from a program typed in on the
       command line, specify /dev/tty as one of the program-files,
       type your program, and end it with a ^D (control-d).

       The environment variable AWKPATH specifies a search path to use
       when finding source files named with the -f option.  If this
       variable does not exist, the default path is
       ".:/usr/lib/awk:/usr/local/lib/awk".  If a file name given
       to the -f option contains a ``/'' character, no path search is
       performed.

       Gawk executes AWK programs in the following order.  First, gawk
       compiles the program into an internal form.  Next, all
       variable assignments specified via the -v option are
       performed.  Then, gawk executes the code in the BEGIN block(s)
       (if any), and then proceeds to read each file named in the
       ARGV array.  If there are no files named on the command line,
       gawk reads the standard input.

       If a filename on the command line has the form var=val it is
       treated as a variable assignment.  The variable var will be
       assigned the value val.  (This happens after any BEGIN
       block(s) have been run.)  Command line variable assignment is
       most useful for dynamically assigning values to the variables
       AWK uses to control how input is broken into fields and
       records.  It is also useful for controlling state if multiple
       passes are needed over a single data file.

       If the value of a particular element of ARGV is empty (""),
       gawk skips over it.

       For each line in the input, gawk tests to see if it matches
       any pattern in the AWK program.  For each pattern that the
       line matches, the associated action is executed.  The patterns
       are tested in the order they occur in the program.

       Finally, after all the input is exhausted, gawk executes the
       code in the END block(s) (if any).

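       The order of execution described above can be observed with a
       small experiment.  This is only a sketch: the variable names a
       and b and the scratch file are hypothetical, chosen to show
       that a -v assignment is visible in BEGIN while a var=val
       operand takes effect only after BEGIN has run.

```shell
tmp=$(mktemp)                      # scratch input file
printf 'one\ntwo\n' > "$tmp"

out=$(awk -v a=early '
BEGIN { print "BEGIN:", a, "[" b "]" }   # b is still unset here
      { print "main:", a, b, $0 }
END   { print "END:", NR }' b=late "$tmp")

rm -f "$tmp"
printf '%s\n' "$out"
```

       The BEGIN line shows an empty pair of brackets for b, the two
       main lines show both values, and the END block runs last with
       NR equal to 2.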
VARIABLES AND FIELDS
       AWK variables are dynamic; they come into existence when they
       are first used.  Their values are either floating-point
       numbers or strings, or both, depending upon how they are used.
       AWK also has one dimensional arrays; multiply dimensioned
       arrays may be simulated.  Several pre-defined variables are
       set as a program runs; these will be described as needed and
       summarized below.

   Fields
       As each input line is read, gawk splits the line into fields,
       using the value of the FS variable as the field separator.  If
       FS is a single character, fields are separated by that
       character.  Otherwise, FS is expected to be a full regular
       expression.  In the special case that FS is a single blank,
       fields are separated by runs of blanks and/or tabs.  Note that
       the value of IGNORECASE (see below) will also affect how
       fields are split when FS is a
       regular expression.

       If the FIELDWIDTHS variable is set to a space separated list of
       numbers, each field is expected to have fixed width, and gawk
       will split up the record using the specified widths.  The
       value of FS is ignored.  Assigning a new value to FS overrides
       the use of FIELDWIDTHS, and restores the default behavior.

       Each field in the input line may be referenced by its
       position, $1, $2, and so on.  $0 is the whole line.  The value
       of a field may be assigned to as well.  Fields need not be
       referenced by constants:

              n = 5
              print $n

       prints the fifth field in the input line.  The variable NF is
       set to the total number of fields in the input line.

       References to non-existent fields (i.e., fields after $NF)
       produce the null-string.  However, assigning to a non-existent
       field (e.g., $(NF+2) = 5) will increase the value of NF,
       create any intervening fields with the null string as their
       value, and cause the value of $0 to be recomputed, with the
       fields being separated by the value of OFS.

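       The field rules above can be tried on a single three-field
       record.  The input text and the choice of ``-'' for OFS are
       assumptions made only so that the rebuilt record is easy to
       read.

```shell
out=$(echo 'a b c' | awk -v OFS=- '{
    n = 2
    print $n                 # field selected through a variable
    print "[" $(NF+1) "]"    # non-existent field: the null string
    $(NF+2) = "e"            # assignment extends the record
    print NF
    print $0                 # rebuilt with OFS between the fields
}')
printf '%s\n' "$out"
```

       Reading $(NF+1) leaves NF at 3, whereas the assignment to
       $(NF+2) raises it to 5 and leaves an empty fourth field in the
       rebuilt record.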
   Built-in Variables
       AWK's built-in variables are:

       ARGC        The number of command line arguments (does not
                   include options to gawk, or the program source).

       ARGIND      The index in ARGV of the current file being
                   processed.

       ARGV        Array of command line arguments.  The array is
                   indexed from 0 to ARGC - 1.  Dynamically changing
                   the contents of ARGV can control the files used
                   for data.

       CONVFMT     The conversion format for numbers, "%.6g", by
                   default.

       ENVIRON     An array containing the values of the current
                   environment.  The array is indexed by the
                   environment variables, each element being the
                   value of that variable (e.g., ENVIRON["HOME"]
                   might be /u/arnold).  Changing this array does not
                   affect the environment seen by programs which gawk
                   spawns via redirection or the system() function.
                   (This may change in a future version of gawk.)

       ERRNO       If a system error occurs either doing a
                   redirection for getline, during a read for
                   getline, or during a close, then ERRNO will
                   contain a string describing the error.

       FIELDWIDTHS A white-space separated list of fieldwidths.  When
                   set, gawk parses the input into fields of fixed
                   width, instead of using the value of the FS
                   variable as the field separator.  The fixed field
                   width facility is still experimental; expect the
                   semantics to change as gawk evolves over time.

       FILENAME    The name of the current input file.  If no files
                   are specified on the command line, the value of
                   FILENAME is ``-''.

       FNR         The input record number in the current input file.

       FS          The input field separator, a blank by default.

       IGNORECASE  Controls the case-sensitivity of all regular
                   expression operations.  If IGNORECASE has a
                   non-zero value, then pattern matching in rules,
                   field splitting with FS, regular expression
                   matching with ~ and !~, and the gsub(), index(),
                   match(), split(), and sub() pre-defined functions
                   will all ignore case when doing regular expression
                   operations.  Thus, if IGNORECASE is not equal to
                   zero, /aB/ matches all of the strings "ab", "aB",
                   "Ab", and "AB".  As with all AWK variables, the
                   initial value of IGNORECASE is zero, so all
                   regular expression operations are normally
                   case-sensitive.

       NF          The number of fields in the current input record.

       NR          The total number of input records seen so far.

       OFMT        The output format for numbers, "%.6g", by default.

       OFS         The output field separator, a blank by default.

       ORS         The output record separator, a newline by default.

       RS          The input record separator, a newline by default.
                   RS is exceptional in that only the first character
                   of its string value is used for separating
                   records.  (This will probably change in a future
                   release of gawk.)  If RS is set to the null
                   string, then records are separated by blank lines.
                   When RS is set to the null string, then the
                   newline character always acts as a field
                   separator, in addition to whatever value FS may
                   have.

       RSTART      The index of the first character matched by
                   match(); 0 if no match.

       RLENGTH     The length of the string matched by match(); -1 if
                   no match.

       SUBSEP      The character used to separate multiple subscripts
                   in array elements, "\034" by default.

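       A few of these variables can be seen at work in the sketch
       below.  It uses only variables that are not gawk-specific (RS,
       NR, NF, RSTART, RLENGTH); the input text and the string
       "foobar" are made up for the purpose.

```shell
out=$(printf 'a b\nc\n\nd e\n' | awk '
BEGIN { RS = "" }                  # null RS: blank lines end a record
{ print NR ": NF=" NF ", last=" $NF }
END {
    where = match("foobar", /o+/)  # sets RSTART and RLENGTH
    print where, RSTART, RLENGTH
    match("foobar", /z/)           # no match
    print RSTART, RLENGTH
}')
printf '%s\n' "$out"
```

       The first record spans two input lines, and because RS is the
       null string the newline inside it separates fields, giving
       three fields in all.  The failed match leaves RSTART at 0 and
       RLENGTH at -1.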
   Arrays
       Arrays are subscripted with an expression between square
       brackets ([ and ]).  If the expression is an expression list
       (expr, expr ...) then the array subscript is a string
       consisting of the concatenation of the (string) value of each
       expression, separated by the value of the SUBSEP variable.
       This facility is used to simulate multiply dimensioned
       arrays.  For example:

              i = "A" ; j = "B" ; k = "C"
              x[i, j, k] = "hello, world\n"

       assigns the string "hello, world\n" to the element of the
       array x which is indexed by the string "A\034B\034C".  All
       arrays in AWK are associative, i.e., indexed by string values.

       The special operator in may be used in an if or while
       statement to see if an array has an index consisting of a
       particular value.

              if (val in array)
                   print array[val]

       If the array has multiple subscripts, use (i, j) in array.

       The in construct may also be used in a for loop to iterate
       over all the elements of an array.

       An element may be deleted from an array using the delete
       statement.

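       The array operations described in this section can be
       combined in one short program.  The array name x and the
       helper array part are illustrative only.

```shell
out=$(awk 'BEGIN {
    x["A", "B"] = "hello"              # stored under "A" SUBSEP "B"
    if (("A", "B") in x)               # membership test, two subscripts
        print "found"
    for (key in x) {                   # iterate over every index
        split(key, part, SUBSEP)       # recover the two subscripts
        print part[1], part[2]
    }
    delete x["A", "B"]
    n = 0
    for (key in x) n++                 # count what is left
    print n
}')
printf '%s\n' "$out"
```

       Splitting the index on SUBSEP recovers the original
       subscripts, and after the delete statement the array holds no
       elements.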
   Variable Typing And Conversion
       Variables and fields may be (floating point) numbers, or
       strings, or both.  How the value of a variable is interpreted
       depends upon its context.  If used in a numeric expression, it
       will be treated as a number; if used as a string, it will be
       treated as a string.

       To force a variable to be treated as a number, add 0 to it; to
       force it to be treated as a string, concatenate it with the
       null string.

       When a string must be converted to a number, the conversion is
       accomplished using atof(3).  A number is converted to a string
       by using the value of CONVFMT as a format string for
       sprintf(3), with the numeric value of the variable as the
       argument.  However, even though all numbers in AWK are
       floating-point, integral values are always converted as
       integers.  Thus, given

              CONVFMT = "%2.2f"
              a = 12
              b = a ""

       the variable b has a value of "12" and not "12.00".

       Gawk performs comparisons as follows: If two variables are
       numeric, they are compared numerically.  If one value is
       numeric and the other has a string value that is a ``numeric
       string,'' then comparisons are also done numerically.
       Otherwise, the numeric value is converted to a string and a
       string comparison is performed.  Two strings are compared, of
       course, as strings.  According to the POSIX standard, even if
       two strings are numeric strings, a numeric comparison is
       performed.  However, this is clearly incorrect, and gawk does
       not do this.

       Uninitialized variables have the numeric value 0 and the
       string value "" (the null, or empty, string).

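       The conversion rules can be checked directly.  The sketch
       below repeats the CONVFMT example from the text and adds a
       non-integral value for contrast; the variable names h, c, s,
       and n are not part of the original example.

```shell
out=$(awk 'BEGIN {
    CONVFMT = "%2.2f"
    a = 12;   b = a ""       # integral value: converted as an integer
    h = 12.5; c = h ""       # non-integral value: converted via CONVFMT
    print b
    print c
    s = "3.0"; print s + 0   # adding 0 forces a number
    n = 3;     print (n "") "x"   # concatenation forces a string
}')
printf '%s\n' "$out"
```

       Only the non-integral value picks up the two decimal places
       requested by CONVFMT.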
PATTERNS AND ACTIONS
       AWK is a line oriented language.  The pattern comes first, and
       then the action.  Action statements are enclosed in { and }.
       Either the pattern may be missing, or the action may be
       missing, but, of course, not both.  If the pattern is missing,
       the action will be executed for every single
       line of input.  A missing action is equivalent to

              { print }

       which prints the entire line.

       Comments begin with the ``#'' character, and continue until
       the end of the line.  Blank lines may be used to separate
       statements.  Normally, a statement ends with a newline;
       however, this is not the case for lines ending in a ``,'',
       ``{'', ``?'', ``:'', ``&&'', or ``||''.  Lines ending in do or
       else also have their statements automatically continued on the
       following line.  In other cases, a line can be continued by
       ending it with a ``\'', in which case the newline will be
       ignored.

       Multiple statements may be put on one line by separating them
       with a ``;''.  This applies to both the statements within the
       action part of a pattern-action pair (the usual case), and to
       the pattern-action statements themselves.

   Patterns
       AWK patterns may be one of the following:

              BEGIN
              END
              /regular expression/
              relational expression
              pattern && pattern
              pattern || pattern
              pattern ? pattern : pattern
              (pattern)
              ! pattern
              pattern1, pattern2

       BEGIN and END are two special kinds of patterns which are not
       tested against the input.  The action parts of all BEGIN
       patterns are merged as if all the statements had been written
       in a single BEGIN block.  They are executed before any of the
       input is read.  Similarly, all the END blocks are merged, and
       executed when all the input is exhausted (or when an exit
       statement is executed).  BEGIN and END patterns cannot be
       combined with other patterns in pattern expressions.  BEGIN
       and END patterns cannot have missing action parts.

       For /regular expression/ patterns, the associated statement is
       executed for each input line that matches the regular
       expression.  Regular expressions are the same as those in
       egrep(1), and are summarized below.

       A relational expression may use any of the operators defined
       below in the section on actions.  These generally test whether
       certain fields match certain regular expressions.

       The &&, ||, and ! operators are logical AND, logical OR, and
       logical NOT, respectively, as in C.  They do short-circuit
       evaluation, also as in C, and are used for combining more
       primitive pattern expressions.  As in most languages,
       parentheses may be used to change the order of evaluation.

       The ?: operator is like the same operator in C.  If the first
       pattern is true then the pattern used for testing is the
       second pattern, otherwise it is the third.  Only one of the
       second and third patterns is evaluated.

       The pattern1, pattern2 form of an expression is called a range
       pattern.  It matches all input records starting with a line
       that matches pattern1, and continuing until a record that
       matches pattern2, inclusive.  It does not combine with any
       other sort of pattern expression.

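       Two of the pattern forms are sketched below on a five-line
       input invented for the purpose: a range pattern with its
       action omitted, and a relational expression combined with a
       regular expression by ||.

```shell
input='a
start
b
stop
c'

# Range pattern; the missing action defaults to { print }.
range=$(printf '%s\n' "$input" | awk '/start/, /stop/')

# Relational expression OR regular expression, with an action.
combo=$(printf '%s\n' "$input" |
        awk 'NR == 1 || /^c$/ { print NR ": " $0 }')

printf '%s\n' "$range" "$combo"
```

       The range selects the lines from start through stop inclusive,
       and the combined pattern selects the first and fifth lines.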
   Regular Expressions
       Regular expressions are the extended kind found in egrep.
       They are composed of characters as follows:

       c          matches the non-metacharacter c.

       \c         matches the literal character c.

       .          matches any character except newline.

       ^          matches the beginning of a line or a string.

       $          matches the end of a line or a string.

       [abc...]   character class, matches any of the characters
                  abc....

       [^abc...]  negated character class, matches any character
                  except abc... and newline.

       r1|r2      alternation: matches either r1 or r2.

       r1r2       concatenation: matches r1, and then r2.

       r+         matches one or more r's.

       r*         matches zero or more r's.

       r?         matches zero or one r's.

       (r)        grouping: matches r.

       The escape sequences that are valid in string constants (see
       below) are also legal in regular expressions.

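       A short illustration of anchors, grouping, alternation, +, and
       an escaped metacharacter follows.  The word list is arbitrary
       sample data.

```shell
# ^ and $ anchor the match; (a|o)+ is one or more of "a" or "o".
out=$(printf 'cat\ncot\ncoat\nct\ndog\n' | awk '/^c(a|o)+t$/')

# \. matches a literal dot rather than any character.
dots=$(printf 'a.b\naxb\n' | awk '/a\.b/')

printf '%s\n' "$out" "$dots"
```

       The first expression rejects ct because + requires at least
       one vowel between the c and the t.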
   Actions
       Action statements are enclosed in braces, { and }.  Action
       statements consist of the usual assignment, conditional, and
       looping statements found in most languages.  The operators,
       control statements, and input/output statements available are
       patterned after those in C.

   Operators
       The operators in AWK, in order of increasing precedence, are

       = += -=
       *= /= %= ^= Assignment.  Both absolute assignment (var =
                   value) and operator-assignment (the other forms)
                   are supported.

       ?:          The C conditional expression.  This has the form
                   expr1 ? expr2 : expr3.  If expr1 is true, the
                   value of the expression is expr2, otherwise it is
                   expr3.  Only one of expr2 and expr3 is evaluated.

       ||          Logical OR.

       &&          Logical AND.

       ~ !~        Regular expression match, negated match.  NOTE: Do
                   not use a constant regular expression (/foo/) on
                   the left-hand side of a ~ or !~.  Only use one on
                   the right-hand side.  The expression /foo/ ~ exp
                   has the same meaning as (($0 ~ /foo/) ~ exp).
                   This is usually not what was intended.

       < >
       <= >=
       != ==       The regular relational operators.

       blank       String concatenation.

       + -         Addition and subtraction.

       * / %       Multiplication, division, and modulus.

       + - !       Unary plus, unary minus, and logical negation.

       ^           Exponentiation (** may also be used, and **= for
                   the assignment operator).

       ++ --       Increment and decrement, both prefix and postfix.

       $           Field reference.

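       Some consequences of the precedence order can be checked with
       the sketch below.  The expressions are chosen for illustration
       only, and the portable ^ operator is used in place of **.

```shell
out=$(awk 'BEGIN {
    print 2 ^ 3 ^ 2          # ^ groups right to left: 2 ^ 9
    print -2 ^ 2             # ^ binds tighter than unary minus
    print 1 " " 2 + 3        # + binds tighter than concatenation
    print ("abc" ~ /b/), ("abc" !~ /b/)   # regexp on the right side
    x = 5; x ^= 2; x %= 7    # operator-assignment forms
    print x
}')
printf '%s\n' "$out"
```

       In the third line the addition is performed first, so the
       string that is printed ends in 5 and not in 2 followed by 3.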
   Control Statements
       The control statements are as follows:

              if (condition) statement [ else statement ]
              while (condition) statement
              do statement while (condition)
              for (expr1; expr2; expr3) statement
              for (var in array) statement
              break
              continue
              delete array[index]
              exit [ expression ]
              { statements }

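       Several of these statements appear together in the sketch
       below.  The loop bounds and the exit value 3 are arbitrary;
       the shell variable status is introduced only to capture the
       value passed to exit.

```shell
status=0
out=$(awk 'BEGIN {
    for (i = 1; i <= 10; i++) {
        if (i % 2) continue      # skip odd values
        if (i > 6) break         # leave the loop at 8
        sum += i                 # adds 2, 4, and 6
    }
    do { n++ } while (n < 3)     # body runs before the first test
    print sum, n
    exit 3                       # becomes the exit status of awk
}') || status=$?
printf '%s (exit status %s)\n' "$out" "$status"
```

       The || after the command substitution keeps a shell running
       under set -e from stopping on the deliberately non-zero exit
       status.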
   I/O Statements
       The input/output statements are as follows:

       close(filename)       Close file (or pipe, see below).

       getline               Set $0 from next input record; set NF,
                             NR, FNR.

       getline <file         Set $0 from next record of file; set NF.

       getline var           Set var from next input record; set NR,
                             FNR.

       getline var <file     Set var from next record of file.

       next                  Stop processing the current input
                             record.  The next input record is read
                             and processing starts over with the
                             first pattern in the AWK program.  If
                             the end of the input data is reached,
                             the END block(s), if any, are executed.

       next file             Stop processing the current input file.
                             The next input record read
799 comes from the next input file.
800 F\bFI\bIL\bLE\bEN\bNA\bAM\bME\bE is updated, F\bFN\bNR\bR is reset to
801 1, and processing starts over with
802 the first pattern in the AWK pro-
803 gram. If the end of the input data
804 is reached, the E\bEN\bND\bD block(s), if
805 any, are executed.
806
       print                 Prints the current record.

       print expr-list       Prints expressions.

       print expr-list >file Prints expressions on file.

       printf fmt, expr-list Format and print.

       printf fmt, expr-list >file
                             Format and print on file.

       system(cmd-line)      Execute the command cmd-line, and return the
                             exit status.  (This may not be available on
                             non-POSIX systems.)

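       As an illustration of next, the sketch below skips comment and
       blank lines and numbers the records that remain.  The helper name
       number_lines and the sample input are assumptions made for this
       example; they are not part of awk.

```sh
# number_lines: hypothetical helper that filters standard input.
number_lines() {
    awk '
        /^#/    { next }            # comment line: abandon this record
        NF == 0 { next }            # blank line: likewise
        { n++; print n ": " $0 }    # reached only by the remaining records
    '
}

printf '# header\nalpha\n\nbeta\n' | number_lines
```

       Because next restarts with the first pattern, the final rule never
       needs to test for the lines already rejected above it.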
       Other input/output redirections are also allowed.  For print and
       printf, >>file appends output to the file, while | command writes
       on a pipe.  In a similar fashion, command | getline pipes into
       getline.  getline returns 1 when a record is read, 0 on end of
       file, and -1 on an error.

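       The pipe forms and the getline return values described above can be
       sketched as follows.  The helper names getline_demo and
       sorted_demo, the echo commands, and the path /nonexistent/file
       (assumed not to exist) are all chosen for illustration only.

```sh
# getline_demo: hypothetical helper; counts lines read from a command and
# shows the -1 status for a file that cannot be opened.
getline_demo() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        while ((cmd | getline line) > 0)    # 1 per record, 0 at end of file
            n++
        close(cmd)
        status = (getline line < "/nonexistent/file")   # assumed missing
        print n, status
    }'
}

# sorted_demo: hypothetical helper; every record is written into a pipe.
sorted_demo() {
    printf 'pear\napple\n' | awk '{ print | "sort" }'
}

getline_demo
sorted_demo
```

       Testing the result with > 0 matters: a loop written as
       while (cmd | getline) would spin forever on an error, since -1 is
       treated as true.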
   The printf Statement
       The AWK versions of the printf statement and sprintf() function
       (see below) accept the following conversion specification formats:

       %c     An ASCII character.  If the argument used for %c is numeric,
              it is treated as a character and printed.  Otherwise, the
              argument is assumed to be a string, and only the first
              character of that string is printed.

       %d     A decimal number (the integer part).

       %i     Just like %d.

       %e     A floating point number of the form [-]d.ddddddE[+-]dd.

       %f     A floating point number of the form [-]ddd.dddddd.

       %g     Use e or f conversion, whichever is shorter, with
              nonsignificant zeros suppressed.

       %o     An unsigned octal number (again, an integer).

       %s     A character string.

       %x     An unsigned hexadecimal number (an integer).

       %X     Like %x, but using ABCDEF instead of abcdef.

       %%     A single % character; no argument is converted.

       There are optional, additional parameters that may lie between the
       % and the control letter:

       -      The expression should be left-justified within its field.

       width  The field should be padded to this width.  If the number has
              a leading zero, then the field will be padded with zeros.
              Otherwise it is padded with blanks.

       .prec  A number indicating the maximum width of strings, or the
              number of digits to the right of the decimal point.

       The dynamic width and prec capabilities of the ANSI C printf()
       routines are supported.  A * in place of either the width or prec
       specifications will cause their values to be taken from the
       argument list to printf or sprintf().

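       A short sketch of the conversions and modifiers described above
       follows.  The helper name printf_demo and the sample values are
       assumptions for this example; the brackets only make the padding
       visible.

```sh
# printf_demo: hypothetical helper exercising width, precision and flags.
printf_demo() {
    awk 'BEGIN {
        printf "[%5d]\n", 42            # padded with blanks to width 5
        printf "[%-5d]\n", 42           # left-justified within the field
        printf "[%05d]\n", 42           # leading zero: padded with zeros
        printf "[%.2f]\n", 3.14159      # two digits after the decimal point
        printf "[%.3s]\n", "abcdef"     # at most three characters of a string
        printf "[%*d]\n", 6, 42         # width taken from the argument list
        printf "[%c%c]\n", 65, "xyz"    # numeric argument, then string argument
        printf "[%x %X %o]\n", 255, 255, 8
    }'
}

printf_demo
```

       With the * form, the width argument comes before the value it
       formats in the argument list.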
   Special File Names
       When doing I/O redirection from either print or printf into a file,
       or via getline from a file, gawk recognizes certain special
       filenames internally.  These filenames allow access to open file
       descriptors inherited from gawk's parent process (usually the
       shell).  Other special filenames provide access to information
       about the running gawk process.  The filenames are:

       /dev/pid     Reading this file returns the process ID of the
                    current process, in decimal, terminated with a
                    newline.

       /dev/ppid    Reading this file returns the parent process ID of the
                    current process, in decimal, terminated with a
                    newline.

       /dev/pgrpid  Reading this file returns the process group ID of the
                    current process, in decimal, terminated with a
                    newline.

       /dev/user    Reading this file returns a single record terminated
                    with a newline.  The fields are separated with blanks.
                    $1 is the value of the getuid(2) system call, $2 is
                    the value of the geteuid(2) system call, $3 is the
                    value of the getgid(2) system call, and $4 is the
                    value of the getegid(2) system call.  If there are any
                    additional fields, they are the group IDs returned by
                    getgroups(2).  (Multiple groups may not be supported
                    on all systems.)

       /dev/stdin   The standard input.

       /dev/stdout  The standard output.

       /dev/stderr  The standard error output.

       /dev/fd/n    The file associated with the open file descriptor n.

       These are particularly useful for error messages.  For example:

              print "You blew it!" > "/dev/stderr"

       whereas you would otherwise have to use

              print "You blew it!" | "cat 1>&2"

       These file names may also be used on the command line to name data
       files.

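       To see that output sent to /dev/stderr really is kept apart from
       the standard output, the sketch below discards one stream at a
       time.  The helper name warn_demo and the message text are
       assumptions for this example.

```sh
# warn_demo: hypothetical helper writing one line to each output stream.
warn_demo() {
    awk 'BEGIN {
        print "normal output"
        print "warning: something looks wrong" > "/dev/stderr"
    }'
}

# Send standard error to the terminal and discard standard output,
# so that only the warning remains visible.
warn_demo 2>&1 >/dev/null
```

       The order of the two shell redirections matters: 2>&1 must come
       first so that standard error is attached to the original standard
       output before that is pointed at /dev/null.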
   Numeric Functions
       AWK has the following pre-defined arithmetic functions:

       atan2(y, x)   returns the arctangent of y/x in radians.

       cos(expr)     returns the cosine of expr, which is in radians.

       exp(expr)     the exponential function.

       int(expr)     truncates to integer.

       log(expr)     the natural logarithm function.

       rand()        returns a random number between 0 and 1.

       sin(expr)     returns the sine of expr, which is in radians.

       sqrt(expr)    the square root function.

       srand(expr)   use expr as a new seed for the random number
                     generator.  If no expr is provided, the time of day
                     will be used.  The return value is the previous seed
                     for the random number generator.

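       The following sketch exercises several of the arithmetic functions
       and shows how the value returned by srand() can be used to repeat a
       random sequence.  The helper name numeric_demo and the seed 42 are
       assumptions for this example.

```sh
# numeric_demo: hypothetical helper for the arithmetic functions.
numeric_demo() {
    awk 'BEGIN {
        print int(3.9), int(-3.9)           # truncation, not rounding
        print sqrt(16), exp(0), log(1)
        printf "%.4f\n", atan2(0, -1)       # pi, to four decimal places
        srand(42); a = rand()
        prev = srand(42); b = rand()        # prev is the seed set just above
        print prev, (a == b), (a >= 0 && a < 1)
    }'
}

numeric_demo
```

       Reseeding with the same value reproduces the same sequence, which
       is why a and b compare equal here.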
   String Functions
       AWK has the following pre-defined string functions:

       gsub(r, s, t)            for each substring matching the regular
                                expression r in the string t, substitute
                                the string s, and return the number of
                                substitutions.  If t is not supplied, use
                                $0.

       index(s, t)              returns the index of the string t in the
                                string s, or 0 if t is not present.

       length(s)                returns the length of the string s, or the
                                length of $0 if s is not supplied.

       match(s, r)              returns the position in s where the
                                regular expression r occurs, or 0 if r is
                                not present, and sets the values of RSTART
                                and RLENGTH.

       split(s, a, r)           splits the string s into the array a on
                                the regular expression r, and returns the
                                number of fields.  If r is omitted, FS is
                                used.

       sprintf(fmt, expr-list)  prints expr-list according to fmt, and
                                returns the resulting string.

       sub(r, s, t)             just like gsub(), but only the first
                                matching substring is replaced.

       substr(s, i, n)          returns the n-character substring of s
                                starting at i.  If n is omitted, the rest
                                of s is used.

       tolower(str)             returns a copy of the string str, with all
                                the upper-case characters in str
                                translated to their corresponding
                                lower-case counterparts.  Non-alphabetic
                                characters are left unchanged.

       toupper(str)             returns a copy of the string str, with all
                                the lower-case characters in str
                                translated to their corresponding
                                upper-case counterparts.  Non-alphabetic
                                characters are left unchanged.

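       The string functions listed above can be tried out with a sketch
       such as the following.  The helper name string_demo, the variable
       names, and the sample strings are assumptions for this example.

```sh
# string_demo: hypothetical helper for the pre-defined string functions.
string_demo() {
    awk 'BEGIN {
        s = "hello world"
        print length(s), index(s, "world"), substr(s, 1, 5)
        n = gsub(/o/, "0", s)               # every match; s is changed in place
        print n, s
        sub(/l+/, "L", s)                   # first match only
        print s
        if (match("foo123bar", /[0-9]+/))   # sets RSTART and RLENGTH
            print RSTART, RLENGTH
        n = split("a:b:c", parts, ":")
        print n, parts[1], parts[3]
        print toupper("abc") tolower("XYZ"), sprintf("%03d", 7)
    }'
}

string_demo
```

       Note that sub() and gsub() modify their third argument and return a
       count, while substr(), tolower() and toupper() return new strings
       and leave their arguments untouched.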
   Time Functions
       Since one of the primary uses of AWK programs is processing log
       files that contain time stamp information, gawk provides the
       following two functions for obtaining time stamps and formatting
       them.

       systime() returns the current time of day as the number of seconds
                 since the Epoch (Midnight UTC, January 1, 1970 on POSIX
                 systems).

       strftime(format, timestamp)
                 formats timestamp according to the specification in
                 format.  The timestamp should be of the same form as
                 returned by systime().  If timestamp is missing, the
                 current time of day is used.  See the specification for
                 the strftime() function in ANSI C for the format
                 conversions that are guaranteed to be available.  A
                 public-domain version of strftime(3) and a man page for
                 it are shipped with gawk; if that version was used to
                 build gawk, then all of the conversions described in that
                 man page are available to gawk.

   String Constants
       String constants in AWK are sequences of characters enclosed
       between double quotes (").  Within strings, certain escape
       sequences are recognized, as in C.  These are:

       \\   A literal backslash.

       \a   The ``alert'' character; usually the ASCII BEL character.

       \b   backspace.

       \f   form-feed.

       \n   newline.

       \r   carriage return.

       \t   horizontal tab.

       \v   vertical tab.

       \xhex digits
            The character represented by the string of hexadecimal digits
            following the \x.  As in ANSI C, all following hexadecimal
            digits are considered part of the escape sequence.  (This
            feature should tell us something about language design by
            committee.)  E.g., "\x1B" is the ASCII ESC (escape) character.

       \ddd The character represented by the 1-, 2-, or 3-digit sequence
            of octal digits.  E.g. "\033" is the ASCII ESC (escape)
            character.

       \c   The literal character c.

       The escape sequences may also be used inside constant regular
       expressions (e.g., /[ \t\f\n\r\v]/ matches whitespace characters).

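       A brief sketch of the portable escape sequences, both in string
       constants and inside a constant regular expression, is given
       below.  The helper name escape_demo and the sample strings are
       assumptions for this example; the \x form is left out because it
       is a gawk extension.

```sh
# escape_demo: hypothetical helper for escape sequences.
escape_demo() {
    awk 'BEGIN {
        printf "a\tb\\c\"d\n"                 # tab, backslash, double quote
        printf "%s\n", "\101\102"             # octal escapes for A and B
        n = split("x \t y\tz", f, /[ \t]+/)   # escape inside a regexp
        print n
    }'
}

escape_demo
```

       The awk program is enclosed in single quotes so that the shell
       passes the backslashes through to awk unchanged.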
FUNCTIONS
       Functions in AWK are defined as follows:

              function name(parameter list) { statements }

       Functions are executed when called from within the action parts of
       regular pattern-action statements.  Actual parameters supplied in
       the function call are used to instantiate the formal parameters
       declared in the function.  Arrays are passed by reference; other
       variables are passed by value.

       Since functions were not originally part of the AWK language, the
       provision for local variables is rather clumsy: they are declared
       as extra parameters in the parameter list.  The convention is to
       separate local variables from real parameters by extra spaces in
       the parameter list.  For example:

              function  f(p, q,     a, b) {   # a & b are local
                             ..... }

              /abc/     { ... ; f(1, 2) ; ... }

       The left parenthesis in a function call is required to immediately
       follow the function name, without any intervening white space.
       This is to avoid a syntactic ambiguity with the concatenation
       operator.  This restriction does not apply to the built-in
       functions listed above.

       Functions may call each other and may be recursive.

       Function parameters used as local variables are initialized to the
       null string and the number zero upon function invocation.

       The word func may be used in place of function.

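       The rules above (recursion, arrays passed by reference, and local
       variables declared as extra parameters) can be sketched as
       follows.  The names function_demo, fact, fill and squares are
       hypothetical and exist only for this example.

```sh
# function_demo: hypothetical helper showing user-defined functions.
function_demo() {
    awk '
        # fact: recursive function with one real parameter.
        function fact(n) {
            return (n <= 1) ? 1 : n * fact(n - 1)
        }
        # fill: arr is an array, so it is passed by reference; i is a
        # local variable, set off from the real parameters by extra spaces.
        function fill(arr, count,    i) {
            for (i = 1; i <= count; i++)
                arr[i] = i * i
        }
        BEGIN {
            i = "untouched"             # global i, distinct from the local i
            fill(squares, 3)
            print fact(5), squares[3], i
        }
    '
}

function_demo
```

       The global i keeps its value because the i inside fill() is a
       separate, local variable; had it been omitted from the parameter
       list, the loop would have overwritten the global.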
EXAMPLES
       Print and sort the login names of all users:

              BEGIN { FS = ":" }
                    { print $1 | "sort" }

       Count lines in a file:

                    { nlines++ }
              END   { print nlines }

       Precede each line by its number in the file:

              { print FNR, $0 }

       Concatenate and line number (a variation on a theme):

              { print NR, $0 }

SEE ALSO
       egrep(1)

       The AWK Programming Language, Alfred V. Aho, Brian W. Kernighan,
       Peter J. Weinberger, Addison-Wesley, 1988.  ISBN 0-201-07981-X.

       The GAWK Manual, Edition 0.15, published by the Free Software
       Foundation, 1993.

POSIX COMPATIBILITY
       A primary goal for gawk is compatibility with the POSIX standard,
       as well as with the latest version of UNIX awk.  To this end, gawk
       incorporates the following user visible features which are not
       described in the AWK book, but are part of awk in System V Release
       4, and are in the POSIX standard.

       The -v option for assigning variables before program execution
       starts is new.  The book indicates that command line variable
       assignment happens when awk would otherwise open the argument as a
       file, which is after the BEGIN block is executed.  However, in
       earlier implementations, when such an assignment appeared before
       any file names, the assignment would happen before the BEGIN block
       was run.  Applications came to depend on this ``feature.''  When
       awk was changed to match its documentation, this option was added
       to accommodate applications that depended upon the old behavior.
       (This feature was agreed upon by both the AT&T and GNU developers.)

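       The difference in timing between -v and a command line variable
       assignment can be observed with a sketch like the one below.  The
       helper name assign_demo, the variable x, and its values are
       assumptions for this example; /dev/null stands in for a data file.

```sh
# assign_demo: hypothetical helper contrasting the two kinds of assignment.
assign_demo() {
    # -v: the assignment is made before the BEGIN block runs.
    awk -v x=early 'BEGIN { print "BEGIN sees [" x "]" }'

    # Operand assignment: made when awk would otherwise open the argument
    # as a file, that is, after the BEGIN block has run.
    awk 'BEGIN { print "BEGIN sees [" x "]" }
         END   { print "END sees [" x "]" }' x=late /dev/null
}

assign_demo
```

       In the second command the BEGIN block sees x as the null string,
       while the END block sees the assigned value.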
       The -W option for implementation specific features is from the
       POSIX standard.

       When processing arguments, gawk uses the special option ``--'' to
       signal the end of arguments, and warns about, but otherwise
       ignores, undefined options.

       The AWK book does not define the return value of srand().  The
       System V Release 4 version of UNIX awk (and the POSIX standard) has
       it return the seed it was using, to allow keeping track of random
       number sequences.  Therefore srand() in gawk also returns its
       current seed.

       Other new features are: the use of multiple -f options (from MKS
       awk); the ENVIRON array; the \a and \v escape sequences (done
       originally in gawk and fed back into AT&T's version); the tolower()
       and toupper() built-in functions (from AT&T); and the ANSI C
       conversion specifications in printf (done first in AT&T's version).

GNU EXTENSIONS
       Gawk has some extensions to POSIX awk.  They are described in this
       section.  All the extensions described here can be disabled by
       invoking gawk with the -W compat option.

       The following features of gawk are not available in POSIX awk.

              o The \x escape sequence.

              o The systime() and strftime() functions.

              o The special file names available for I/O redirection.

              o The special ARGIND and ERRNO variables.

              o The IGNORECASE variable and its side-effects.

              o The FIELDWIDTHS variable and fixed width field splitting.

              o The path search for files named via the -f option, and
                with it the special meaning of the AWKPATH environment
                variable.

              o The use of next file to abandon processing of the current
                input file.

       The AWK book does not define the return value of the close()
       function.  Gawk's close() returns the value from fclose(3), or
       pclose(3), when closing a file or pipe, respectively.

       When gawk is invoked with the -W compat option, if the fs argument
       to the -F option is ``t'', then FS will be set to the tab
       character.  Since this is a rather ugly special case, it is not the
       default behavior.  This behavior also does not occur if -W posix
       has been specified.

HISTORICAL FEATURES
       There are two features of historical AWK implementations that gawk
       supports.  First, it is possible to call the length() built-in
       function not only with no argument, but even without parentheses!
       Thus,

              a = length

       is the same as either of

              a = length()
              a = length($0)

       This feature is marked as ``deprecated'' in the POSIX standard, and
       gawk will issue a warning about its use if -W lint is specified on
       the command line.

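       The three spellings of length can be compared side by side with a
       one-line sketch.  The helper name length_demo and the sample input
       are assumptions for this example.

```sh
# length_demo: hypothetical helper; prints the record length three ways.
length_demo() {
    printf 'ab\nabcd\n' | awk '{ print length, length(), length($0) }'
}

length_demo
```

       All three columns are identical for every record; new programs
       would normally use the explicit length($0) form.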
       The other feature is the use of the continue statement outside the
       body of a while, for, or do loop.  Traditional AWK implementations
       have treated such usage as equivalent to the next statement.  Gawk
       will support this usage if -W posix has not been specified.

BUGS
       The -F option is not necessary given the command line variable
       assignment feature; it remains only for backwards compatibility.

       If your system actually has support for /dev/fd and the associated
       /dev/stdin, /dev/stdout, and /dev/stderr files, you may get
       different output from gawk than you would get on a system without
       those files.  When gawk interprets these files internally, it
       synchronizes output to the standard output with output to
       /dev/stdout, while on a system with those files, the output is
       actually to different open files.  Caveat Emptor.

VERSION INFORMATION
       This man page documents gawk, version 2.15.

       Starting with the 2.15 version of gawk, the -c, -V, -C, -a, and -e
       options of the 2.11 version are no longer recognized.

AUTHORS
       The original version of UNIX awk was designed and implemented by
       Alfred Aho, Peter Weinberger, and Brian Kernighan of AT&T Bell
       Labs.  Brian Kernighan continues to maintain and enhance it.

       Paul Rubin and Jay Fenlason, of the Free Software Foundation, wrote
       gawk, to be compatible with the original version of awk distributed
       in Seventh Edition UNIX.  John Woods contributed a number of bug
       fixes.  David Trueman, with contributions from Arnold Robbins, made
       gawk compatible with the new version of UNIX awk.

       The initial DOS port was done by Conrad Kwok and Scott Garfinkle.
       Scott Deifik is the current DOS maintainer.  Pat Rankin did the
       port to VMS, and Michal Jaegermann did the port to the Atari ST.

ACKNOWLEDGEMENTS
       Brian Kernighan of Bell Labs provided valuable assistance during
       testing and debugging.  We thank him.

Free Software Foundation          April 15 1993