Commit | Line | Data |
---|---|---|
a40f7d8d | 1 | .\" @(#)awk.1 6.1 (Berkeley) %G% |
046bdc0f | 2 | .\" |
a40f7d8d | 3 | .TH AWK 1 "" |
046bdc0f KM | 4 | .AT 3 |
5 | .SH NAME | |
6 | awk \- pattern scanning and processing language | |
7 | .SH SYNOPSIS | |
8 | .B awk | |
9 | [ | |
10 | .BI \-F c | |
11 | ] | |
12 | [ prog ] [ file ] ... | |
13 | .SH DESCRIPTION | |
14 | .I Awk | |
15 | scans each input | |
16 | .I file | |
17 | for lines that match any of a set of patterns specified in | |
18 | .IR prog . | |
19 | With each pattern in | |
20 | .I prog | |
21 | there can be an associated action that will be performed | |
22 | when a line of a | |
23 | .I file | |
24 | matches the pattern. | |
25 | The set of patterns may appear literally as | |
26 | .IR prog , | |
27 | or in a file | |
28 | specified as | |
29 | .B \-f | |
30 | .IR file . | |
31 | .PP | |
32 | Files are read in order; | |
33 | if there are no files, the standard input is read. | |
34 | The file name `\-' | |
35 | means the standard input. | |
36 | Each line is matched against the | |
37 | pattern portion of every pattern-action statement; | |
38 | the associated action is performed for each matched pattern. | |
39 | .PP | |
40 | An input line is made up of fields separated by white space. | |
41 | (This default can be changed by using FS, | |
42 | .IR "vide infra" ".)" | |
43 | The fields are denoted $1, $2, ... ; | |
44 | $0 refers to the entire line. | |
45 | .PP | |
46 | .PP | |
47 | A pattern-action statement has the form | |
48 | .PP | |
49 | pattern { action } | |
50 | .PP | |
51 | A missing { action } means print the line; | |
52 | a missing pattern always matches. | |
53 | .PP | |
54 | An action is a sequence of statements. | |
55 | A statement can be one of the following: | |
56 | .PP | |
57 | .nf | |
58 | if ( conditional ) statement [ else statement ] | |
59 | while ( conditional ) statement | |
60 | for ( expression ; conditional ; expression ) statement | |
61 | break | |
62 | continue | |
63 | { [ statement ] ... } | |
64 | variable = expression | |
65 | print [ expression-list ] [ >expression ] | |
66 | printf format [ , expression-list ] [ >expression ] | |
67 | next # skip remaining patterns on this input line | |
68 | exit # skip the rest of the input | |
69 | .fi | |
70 | .PP | |
71 | Statements are terminated by | |
72 | semicolons, newlines or right braces. | |
73 | An empty expression-list stands for the whole line. | |
74 | Expressions take on string or numeric values as appropriate, | |
75 | and are built using the operators | |
76 | +, \-, *, /, %, and concatenation (indicated by a blank). | |
77 | The C operators ++, \-\-, +=, \-=, *=, /=, and %= | |
78 | are also available in expressions. | |
79 | Variables may be scalars, array elements | |
80 | (denoted | |
81 | x[i]) | |
82 | or fields. | |
83 | Variables are initialized to the null string. | |
84 | Array subscripts may be any string, | |
85 | not necessarily numeric; | |
86 | this allows for a form of associative memory. | |
87 | String constants are quoted "...". | |
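The string-subscripted arrays described above can be sketched with a small counting program. This is a minimal illustration, not part of the original page: the sample input and the array name count are assumptions chosen for the example, and it uses only features named in this manual.

```shell
# Count how often each value of the first field occurs,
# then look up two subscripts by name in the END action.
# The input lines and the name "count" are illustrative only.
out=$(printf 'apple 1\npear 2\napple 3\n' |
  awk '{ count[$1] += 1 }
       END { print "apple", count["apple"]; print "pear", count["pear"] }')
printf '%s\n' "$out"
```

Because variables start out as the null string, count[$1] needs no explicit initialization before the first increment.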
88 | .PP | |
89 | The | |
90 | .I print | |
91 | statement prints its arguments on the standard output | |
92 | (or on a file if | |
93 | .I >file | |
94 | is present), separated by the current output field separator, | |
95 | and terminated by the output record separator. | |
96 | The | |
97 | .I printf | |
98 | statement formats its expression list according to the format | |
99 | (see | |
b7044523 | 100 | .IR printf (3S)). |
046bdc0f KM | 101 | .PP |
102 | The built-in function | |
103 | .I length | |
104 | returns the length of its argument | |
105 | taken as a string, | |
106 | or of the whole line if no argument is given. | |
107 | There are also built-in functions | |
108 | .IR exp , | |
109 | .IR log , | |
110 | .IR sqrt , | |
111 | and | |
112 | .IR int . | |
113 | The last truncates its argument to an integer. | |
114 | .IR substr(s,\ m,\ n) | |
115 | returns the | |
116 | .IR n -character | |
117 | substring of | |
118 | .I s | |
119 | that begins at position | |
120 | .IR m . | |
121 | The function | |
122 | .IR sprintf(fmt,\ expr,\ expr,\ ...) | |
123 | formats the expressions | |
124 | according to the | |
b7044523 | 125 | .IR printf (3S) |
046bdc0f KM | 126 | format given by |
127 | .I fmt | |
128 | and returns the resulting string. | |
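As a hedged sketch of the built-in functions just described, the following one-line program applies substr, length, int and sprintf to a single input line. The input text and the format string are assumptions made for the example.

```shell
# substr($0, 7, 5) takes 5 characters starting at position 7;
# length($0) is the length of the whole line;
# int() truncates; sprintf() returns the formatted string.
out=$(printf 'hello world\n' |
  awk '{ print substr($0, 7, 5), length($0), int(7.9), sprintf("%05.1f", 3.14159) }')
printf '%s\n' "$out"
```

Positions given to substr count from 1, so position 7 is the "w" of "world".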
129 | .PP | |
130 | Patterns are arbitrary Boolean combinations | |
131 | (!, \(or\(or, &&, and parentheses) of | |
132 | regular expressions and | |
133 | relational expressions. | |
134 | Regular expressions must be surrounded | |
135 | by slashes and are as in | |
136 | .IR egrep . | |
137 | Isolated regular expressions | |
138 | in a pattern apply to the entire line. | |
139 | Regular expressions may also occur in | |
140 | relational expressions. | |
141 | .PP | |
142 | A pattern may consist of two patterns separated by a comma; | |
143 | in this case, the action is performed for all lines | |
144 | between an occurrence of the first pattern | |
145 | and the next occurrence of the second. | |
146 | .PP | |
147 | .nf | |
148 | A relational expression is one of the following: | |
149 | .PP | |
150 | .nf | |
151 | expression matchop regular-expression | |
152 | expression relop expression | |
153 | .PP | |
154 | .fi | |
155 | where a relop is any of the six relational operators in C, | |
156 | and a matchop is either ~ (for contains) | |
157 | or !~ (for does not contain). | |
158 | A conditional is an arithmetic expression, | |
159 | a relational expression, | |
160 | or a Boolean combination | |
161 | of these. | |
162 | .PP | |
163 | The special patterns | |
164 | BEGIN | |
165 | and | |
166 | END | |
167 | may be used to capture control before the first input line is read | |
168 | and after the last. | |
169 | BEGIN must be the first pattern, END the last. | |
170 | .PP | |
171 | A single character | |
172 | .I c | |
173 | may be used to separate the fields by starting | |
174 | the program with | |
175 | .PP | |
176 | BEGIN { FS = "c" } | |
177 | .PP | |
178 | or by using the | |
179 | .BI \-F c | |
180 | option. | |
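The two ways of setting the field separator can be compared side by side. This sketch assumes colon-separated sample input invented for the example; both commands are expected to select the same first field.

```shell
# Sample colon-separated records (illustrative only).
input='root:x:0
daemon:x:1'

# Separator given on the command line with -F.
a=$(printf '%s\n' "$input" | awk -F: '{ print $1 }')

# Separator assigned to FS in a BEGIN action.
b=$(printf '%s\n' "$input" | awk 'BEGIN { FS = ":" } { print $1 }')

printf '%s\n' "$a" "$b"
```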
181 | .PP | |
182 | Other variable names with special meanings | |
183 | include NF, the number of fields in the current record; | |
184 | NR, the ordinal number of the current record; | |
185 | FILENAME, the name of the current input file; | |
186 | OFS, the output field separator (default blank); | |
187 | ORS, the output record separator (default newline); | |
188 | and | |
189 | OFMT, the output format for numbers (default "%.6g"). | |
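A minimal sketch of some of these variables in use, assuming a two-line sample input invented for the example: OFS is set to a hyphen, and each record prints its ordinal number, its field count and its first field.

```shell
# NR is the record number, NF the number of fields in that record.
# Items separated by commas in print are joined with OFS.
out=$(printf 'a b c\nd e\n' |
  awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$out"
```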
190 | .PP | |
191 | .SH EXAMPLES | |
192 | .PP | |
193 | Print lines longer than 72 characters: | |
194 | .PP | |
195 | .nf | |
196 | length > 72 | |
197 | .fi | |
198 | .PP | |
199 | Print first two fields in opposite order: | |
200 | .PP | |
201 | .nf | |
202 | { print $2, $1 } | |
203 | .fi | |
204 | .PP | |
205 | Add up first column, print sum and average: | |
206 | .PP | |
207 | .nf | |
208 | { s += $1 } | |
209 | END { print "sum is", s, " average is", s/NR } | |
210 | .fi | |
211 | .PP | |
212 | Print fields in reverse order: | |
213 | .PP | |
214 | .nf | |
215 | { for (i = NF; i > 0; \-\-i) print $i } | |
216 | .fi | |
217 | .PP | |
218 | Print all lines between start/stop pairs: | |
219 | .PP | |
220 | .nf | |
221 | /start/, /stop/ | |
222 | .fi | |
223 | .PP | |
224 | Print all lines whose first field differs from that of the previous line: | |
225 | .PP | |
226 | .nf | |
227 | $1 != prev { print; prev = $1 } | |
228 | .fi | |
229 | .SH SEE ALSO | |
230 | .PP | |
231 | lex(1), sed(1) | |
232 | .br | |
233 | A. V. Aho, B. W. Kernighan, P. J. Weinberger, | |
234 | .I | |
235 | Awk \- a pattern scanning and processing language | |
236 | .SH BUGS | |
237 | There are no explicit conversions between numbers and strings. | |
238 | To force an expression to be treated as a number add 0 to it; | |
239 | to force it to be treated as a string concatenate "" | |
240 | to it. |
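The two idioms above can be sketched as follows. The input pair 10 and 10.0 is an assumption chosen because the two fields are equal as numbers but differ as strings; the variable names num and str are illustrative. The result shown by the comments reflects current POSIX-style awk implementations.

```shell
# Adding 0 forces a numeric comparison; concatenating "" forces a string one.
# num is 1 when the fields are numerically equal,
# str is 1 when they are identical as strings.
out=$(printf '10 10.0\n' |
  awk '{ num = ($1 + 0 == $2 + 0); str = (($1 "") == ($2 "")); print num, str }')
printf '%s\n' "$out"
```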