BSD 4_3_Tahoe development
AWK(1)              UNIX Programmer's Manual               AWK(1)

NAME
     awk - pattern scanning and processing language

SYNOPSIS
     awk [ -Fc ] [ prog ] [ file ] ...

DESCRIPTION
     Awk scans each input file for lines that match any of a set
     of patterns specified in prog.  With each pattern in prog
     there can be an associated action that will be performed
     when a line of a file matches the pattern.  The set of
     patterns may appear literally as prog, or in a file
     specified as -f file.

     Files are read in order; if there are no files, the standard
     input is read.  The file name `-' means the standard input.
     Each line is matched against the pattern portion of every
     pattern-action statement; the associated action is performed
     for each matched pattern.

     An input line is made up of fields separated by white space.
     (This default can be changed by using FS, vide infra.)  The
     fields are denoted $1, $2, ... ; $0 refers to the entire
     line.

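     As a minimal sketch of the field rules above, assuming a
     POSIX shell with an awk on the search path, the following
     prints the second field and then the whole line.  The
     function name show_fields and the sample input are
     illustrative assumptions only.

```shell
# show_fields is a hypothetical helper name used only for this sketch.
show_fields() {
    # $2 is the second white-space-separated field; $0 is the entire line.
    awk '{ print $2; print $0 }'
}

printf 'alpha beta gamma\n' | show_fields
```

     With the sample line shown, this should print "beta" followed
     by the unchanged input line.
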
     A pattern-action statement has the form

          pattern { action }

     A missing { action } means print the line; a missing pattern
     always matches.

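     The two defaults just described can be sketched as follows.
     The helper names big_rows and first_column and the sample
     data are assumptions made for illustration.

```shell
# Hypothetical helper names; the sample data is made up for illustration.
big_rows() {
    # Pattern with no action: matching lines are printed unchanged.
    awk '$2 > 1'
}

first_column() {
    # Action with no pattern: the action runs for every input line.
    awk '{ print $1 }'
}

printf 'ab 1\ncd 2\n' | big_rows
printf 'ab 1\ncd 2\n' | first_column
```

     Here big_rows should print only "cd 2", while first_column
     should print "ab" and "cd" on separate lines.
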
     An action is a sequence of statements.  A statement can be
     one of the following:

          if ( conditional ) statement [ else statement ]
          while ( conditional ) statement
          for ( expression ; conditional ; expression ) statement
          break
          continue
          { [ statement ] ... }
          variable = expression
          print [ expression-list ] [ >expression ]
          printf format [ , expression-list ] [ >expression ]
          next    # skip remaining patterns on this input line
          exit    # skip the rest of the input

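     A short sketch combining several of the statements listed
     above (if/else, next and exit) may help.  The helper name
     classify and the words "skip" and "stop" are arbitrary
     choices for this example, not awk keywords.

```shell
# classify is a hypothetical helper name; the input values are made up.
classify() {
    awk '
    $1 == "skip" { next }    # go on to the next input line
    $1 == "stop" { exit }    # stop reading input altogether
    {
        if ($1 % 2 == 0)
            print $1, "even"
        else
            print $1, "odd"
    }'
}

printf '1\nskip\n4\nstop\n5\n' | classify
```

     For this input the sketch should print "1 odd" and "4 even";
     the final 5 is never reached because exit ends the run.
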
     Statements are terminated by semicolons, newlines or right
     braces.  An empty expression-list stands for the whole line.
     Expressions take on string or numeric values as appropriate,
     and are built using the operators +, -, *, /, %, and
     concatenation (indicated by a blank).  The C operators ++,
     --, +=, -=, *=, /=, and %= are also available in expressions.
     Variables may be scalars, array elements (denoted x[i]) or
     fields.  Variables are initialized to the null string.
     Array subscripts may be any string, not necessarily numeric;
     this allows for a form of associative memory.  String
     constants are quoted "...".

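     The associative arrays and blank-as-concatenation rule
     described above might be used as in the following sketch.
     The helper name tally, the array name count and the keys
     "a" and "b" are assumptions for illustration.

```shell
# tally is a hypothetical helper name; "a" and "b" are sample keys.
tally() {
    awk '
    { count[$1] += 1 }    # string subscript; element starts out null, used as 0
    END { print "a=" count["a"], "b=" count["b"] }   # blank means concatenation
    '
}

printf 'a\nb\na\n' | tally
```

     With three input lines "a", "b", "a" this should print
     "a=2 b=1".
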
     The print statement prints its arguments on the standard
     output (or on a file if >file is present), separated by the
     current output field separator, and terminated by the output
     record separator.  The printf statement formats its
     expression list according to the format (see printf(3S)).

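     The difference between print and printf can be sketched as
     below.  The helper name report and the format string are
     illustrative assumptions.

```shell
# report is a hypothetical helper name used only for this sketch.
report() {
    awk '{
        print $1, $2                   # joined by the output field separator
        printf "%-5s|%3d\n", $1, $2    # laid out by the format; newline is explicit
    }'
}

printf 'ab 7\n' | report
```

     Note that printf, unlike print, does not append the output
     record separator, so the format supplies its own newline.
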
     The built-in function length returns the length of its
     argument taken as a string, or of the whole line if no
     argument is given.  There are also built-in functions exp,
     log, sqrt, and int.  The last truncates its argument to an
     integer.  substr(s, m, n) returns the n-character substring
     of s that begins at position m, counting from 1.  The
     function sprintf(fmt, expr, expr, ...) formats the
     expressions according to the printf(3S) format given by fmt
     and returns the resulting string.

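     A brief sketch of the built-in functions above follows.  The
     helper name string_funcs and the sample line are assumptions
     chosen for illustration.

```shell
# string_funcs is a hypothetical helper name used only for this sketch.
string_funcs() {
    awk '{
        print length                 # length of the whole line
        print length($1)             # length of the first field
        print substr($1, 2, 3)       # 3 characters starting at position 2
        print int($2)                # truncate to an integer
        print sprintf("%.2f", $2)    # format into a string, then print it
    }'
}

printf 'hello 3.7\n' | string_funcs
```

     For the line "hello 3.7" this should print 9, 5, ell, 3 and
     3.70, one per line.
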
     Patterns are arbitrary Boolean combinations (!, ||, &&, and
     parentheses) of regular expressions and relational
     expressions.  Regular expressions must be surrounded by
     slashes and are as in egrep.  Isolated regular expressions
     in a pattern apply to the entire line.  Regular expressions
     may also occur in relational expressions.

     A pattern may consist of two patterns separated by a comma;
     in this case, the action is performed for all lines from an
     occurrence of the first pattern through the next occurrence
     of the second, inclusive.

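     The comma form can be sketched as follows.  The helper name
     between and the marker words are illustrative assumptions.

```shell
# between is a hypothetical helper name; the marker words are arbitrary.
between() {
    # Two patterns separated by a comma select a range of lines.
    awk '/start/, /stop/'
}

printf 'a\nstart\nb\nstop\nc\n' | between
```

     This should print "start", "b" and "stop"; the lines that
     match the two patterns are themselves part of the range.
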
     A relational expression is one of the following:

          expression matchop regular-expression
          expression relop expression

     where a relop is any of the six relational operators in C,
     and a matchop is either ~ (for contains) or !~ (for does not
     contain).  A conditional is an arithmetic expression, a
     relational expression, or a Boolean combination of these.

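     The match and relational operators above might be combined
     as in this sketch.  The helper names select_rows and
     reject_rows and the sample data are assumptions.

```shell
# Hypothetical helper names; the sample data is made up for illustration.
select_rows() {
    # First field contains a match for ^a AND second field is at least 10.
    awk '$1 ~ /^a/ && $2 >= 10'
}

reject_rows() {
    # First field does not contain a match for ^a.
    awk '$1 !~ /^a/'
}

printf 'apple 12\navocado 3\nbanana 40\n' | select_rows
printf 'apple 12\navocado 3\nbanana 40\n' | reject_rows
```

     Here select_rows should print "apple 12" and reject_rows
     should print "banana 40".
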
     The special patterns BEGIN and END may be used to capture
     control before the first input line is read and after the
     last.  BEGIN must be the first pattern, END the last.

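     A small sketch of BEGIN and END, keeping BEGIN first and END
     last as required above.  The helper name count_lines and the
     variable n are illustrative assumptions.

```shell
# count_lines is a hypothetical helper name used only for this sketch.
count_lines() {
    awk '
    BEGIN { print "counting" }    # runs before any input is read
          { n++ }                 # runs once per input line
    END   { print n, "lines" }    # runs after the last line
    '
}

printf 'x\ny\n' | count_lines
```

     For two input lines this should print "counting" and then
     "2 lines".
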
     A single character c may be used to separate the fields by
     starting the program with

          BEGIN { FS = "c" }

     or by using the -Fc option.

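     Both ways of setting the field separator can be sketched
     side by side, here with a colon as the character c.  The
     helper names and the sample record are assumptions.

```shell
# Hypothetical helper names; the colon-separated record is sample data.
login_by_option() {
    # Separator given on the command line with -Fc.
    awk -F: '{ print $1 }'
}

login_by_begin() {
    # Separator assigned to FS before any input is read.
    awk '
    BEGIN { FS = ":" }
    { print $1 }'
}

printf 'root:x:0\n' | login_by_option
printf 'root:x:0\n' | login_by_begin
```

     Both functions should print "root" for the sample record.
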
     Other variable names with special meanings include NF, the
     number of fields in the current record; NR, the ordinal
     number of the current record; FILENAME, the name of the
     current input file; OFS, the output field separator (default
     blank); ORS, the output record separator (default newline);
     and OFMT, the output format for numbers (default "%.6g").

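     As a sketch of NR, NF and OFS working together (the helper
     name describe_records and the choice of ":" as separator
     are assumptions for illustration):

```shell
# describe_records is a hypothetical helper name used only for this sketch.
describe_records() {
    awk '
    BEGIN { OFS = ":" }       # print joins its arguments with OFS
    { print NR, NF, $1 }      # record number, field count, first field
    '
}

printf 'a b c\nd e\n' | describe_records
```

     For the two sample lines this should print "1:3:a" and
     "2:2:d".
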
EXAMPLES
     Print lines longer than 72 characters:

          length > 72

     Print first two fields in opposite order:

          { print $2, $1 }

     Add up first column, print sum and average:

               { s += $1 }
          END  { print "sum is", s, " average is", s/NR }

     Print fields in reverse order:

          { for (i = NF; i > 0; --i) print $i }

     Print all lines between start/stop pairs:

          /start/, /stop/

     Print all lines whose first field is different from the
     previous one:

          $1 != prev { print; prev = $1 }

SEE ALSO
     lex(1), sed(1)
     A. V. Aho, B. W. Kernighan, P. J. Weinberger, Awk - a
     pattern scanning and processing language

BUGS
     There are no explicit conversions between numbers and
     strings.  To force an expression to be treated as a number
     add 0 to it; to force it to be treated as a string
     concatenate "" to it.

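     The two conversion idioms above can be sketched as follows.
     The helper name compare_both and the sample values are
     assumptions; the point is only that the same two fields can
     compare differently as numbers and as strings.

```shell
# compare_both is a hypothetical helper name used only for this sketch.
compare_both() {
    awk '{
        if ($1 + 0 < $2 + 0)          # adding 0 forces a numeric comparison
            print "as numbers: less"
        else
            print "as numbers: not less"
        if (($1 "") < ($2 ""))        # concatenating "" forces a string comparison
            print "as strings: less"
        else
            print "as strings: not less"
    }'
}

printf '10 9\n' | compare_both
```

     For the fields 10 and 9 this should report "not less" as
     numbers but "less" as strings, since "1" sorts before "9".
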
Printed 7/9/88                April 29, 1985