Commit | Line | Data |
---|---|---|
c629fd77 C |
1 | .TH AWK 1 "18 January 1983" |
2 | .SH NAME | |
3 | awk \- pattern scanning and processing language | |
4 | .SH SYNOPSIS | |
5 | .B awk | |
6 | [ | |
7 | .BI \-F c | |
8 | ] | |
9 | [ prog ] [ file ] ... | |
10 | .SH DESCRIPTION | |
11 | .I Awk | |
12 | scans each input | |
13 | .I file | |
14 | for lines that match any of a set of patterns specified in | |
15 | .IR prog . | |
16 | With each pattern in | |
17 | .I prog | |
18 | there can be an associated action that will be performed | |
19 | when a line of a | |
20 | .I file | |
21 | matches the pattern. | |
22 | The set of patterns may appear literally as | |
23 | .IR prog , | |
24 | or in a file | |
25 | specified as | |
26 | .B \-f | |
27 | .IR file . | |
28 | .PP | |
29 | Files are read in order; | |
30 | if there are no files, the standard input is read. | |
31 | The file name `\-' | |
32 | means the standard input. | |
33 | Each line is matched against the | |
34 | pattern portion of every pattern-action statement; | |
35 | the associated action is performed for each matched pattern. | |
36 | .PP | |
37 | An input line is made up of fields separated by white space. | |
38 | (This default can be changed by using FS, | |
39 | .IR "vide infra" ".)" | |
40 | The fields are denoted $1, $2, ... ; | |
41 | $0 refers to the entire line. | |
42 | .PP | |
43 | .PP | |
44 | A pattern-action statement has the form | |
45 | .PP | |
46 | pattern { action } | |
47 | .PP | |
48 | A missing { action } means print the line; | |
49 | a missing pattern always matches. | |
50 | .PP | |
51 | An action is a sequence of statements. | |
52 | A statement can be one of the following: | |
53 | .PP | |
54 | .nf | |
55 | if ( conditional ) statement [ else statement ] | |
56 | while ( conditional ) statement | |
57 | for ( expression ; conditional ; expression ) statement | |
58 | break | |
59 | continue | |
60 | { [ statement ] ... } | |
61 | variable = expression | |
62 | print [ expression-list ] [ >expression ] | |
63 | printf format [ , expression-list ] [ >expression ] | |
64 | next # skip remaining patterns on this input line | |
65 | exit # skip the rest of the input | |
66 | .fi | |
67 | .PP | |
68 | Statements are terminated by | |
69 | semicolons, newlines or right braces. | |
70 | An empty expression-list stands for the whole line. | |
71 | Expressions take on string or numeric values as appropriate, | |
72 | and are built using the operators | |
73 | +, \-, *, /, %, and concatenation (indicated by a blank). | |
74 | The C operators ++, \-\-, +=, \-=, *=, /=, and %= | |
75 | are also available in expressions. | |
76 | Variables may be scalars, array elements | |
77 | (denoted | |
78 | x[i]) | |
79 | or fields. | |
80 | Variables are initialized to the null string. | |
81 | Array subscripts may be any string, | |
82 | not necessarily numeric; | |
83 | this allows for a form of associative memory. | |
84 | String constants are quoted "...". | |
85 | .PP | |
86 | The | |
87 | .I print | |
88 | statement prints its arguments on the standard output | |
89 | (or on a file if | |
90 | .I >file | |
91 | is present), separated by the current output field separator, | |
92 | and terminated by the output record separator. | |
93 | The | |
94 | .I printf | |
95 | statement formats its expression list according to the format | |
96 | (see | |
97 | .IR printf (3S)). | |
98 | .PP | |
99 | The built-in function | |
100 | .I length | |
101 | returns the length of its argument | |
102 | taken as a string, | |
103 | or of the whole line if there is no argument. | |
104 | There are also built-in functions | |
105 | .IR exp , | |
106 | .IR log , | |
107 | .IR sqrt , | |
108 | and | |
109 | .IR int . | |
110 | The last truncates its argument to an integer. | |
111 | .IR substr(s,\ m,\ n) | |
112 | returns the | |
113 | .IR n -character | |
114 | substring of | |
115 | .I s | |
116 | that begins at position | |
117 | .IR m . | |
118 | The function | |
119 | .IR sprintf(fmt,\ expr,\ expr,\ ...) | |
120 | formats the expressions | |
121 | according to the | |
122 | .IR printf (3S) | |
123 | format given by | |
124 | .I fmt | |
125 | and returns the resulting string. | |
126 | .PP | |
127 | Patterns are arbitrary Boolean combinations | |
128 | (!, \(or\(or, &&, and parentheses) of | |
129 | regular expressions and | |
130 | relational expressions. | |
131 | Regular expressions must be surrounded | |
132 | by slashes and are as in | |
133 | .IR egrep . | |
134 | Isolated regular expressions | |
135 | in a pattern apply to the entire line. | |
136 | Regular expressions may also occur in | |
137 | relational expressions. | |
138 | .PP | |
139 | A pattern may consist of two patterns separated by a comma; | |
140 | in this case, the action is performed for all lines | |
141 | from an occurrence of the first pattern | |
142 | through the next occurrence of the second, inclusive. | |
143 | .PP | |
144 | .nf | |
145 | A relational expression is one of the following: | |
146 | .PP | |
147 | .nf | |
148 | expression matchop regular-expression | |
149 | expression relop expression | |
150 | .PP | |
151 | .fi | |
152 | where a relop is any of the six relational operators in C, | |
153 | and a matchop is either ~ (for contains) | |
154 | or !~ (for does not contain). | |
155 | A conditional is an arithmetic expression, | |
156 | a relational expression, | |
157 | or a Boolean combination | |
158 | of these. | |
159 | .PP | |
160 | The special patterns | |
161 | BEGIN | |
162 | and | |
163 | END | |
164 | may be used to capture control before the first input line is read | |
165 | and after the last. | |
166 | BEGIN must be the first pattern, END the last. | |
167 | .PP | |
168 | A single character | |
169 | .I c | |
170 | may be used to separate the fields by starting | |
171 | the program with | |
172 | .PP | |
173 | BEGIN { FS = "c" } | |
174 | .PP | |
175 | or by using the | |
176 | .BI \-F c | |
177 | option. | |
178 | .PP | |
179 | Other variable names with special meanings | |
180 | include NF, the number of fields in the current record; | |
181 | NR, the ordinal number of the current record; | |
182 | FILENAME, the name of the current input file; | |
183 | OFS, the output field separator (default blank); | |
184 | ORS, the output record separator (default newline); | |
185 | and | |
186 | OFMT, the output format for numbers (default "%.6g"). | |
187 | .PP | |
188 | .SH EXAMPLES | |
189 | .PP | |
190 | Print lines longer than 72 characters: | |
191 | .PP | |
192 | .nf | |
193 | length > 72 | |
194 | .fi | |
195 | .PP | |
196 | Print first two fields in opposite order: | |
197 | .PP | |
198 | .nf | |
199 | { print $2, $1 } | |
200 | .fi | |
201 | .PP | |
202 | Add up first column, print sum and average: | |
203 | .PP | |
204 | .nf | |
205 | { s += $1 } | |
206 | END { print "sum is", s, " average is", s/NR } | |
207 | .fi | |
208 | .PP | |
209 | Print fields in reverse order: | |
210 | .PP | |
211 | .nf | |
212 | { for (i = NF; i > 0; \-\-i) print $i } | |
213 | .fi | |
214 | .PP | |
215 | Print all lines between start/stop pairs: | |
216 | .PP | |
217 | .nf | |
218 | /start/, /stop/ | |
219 | .fi | |
220 | .PP | |
221 | Print all lines whose first field is different from previous one: | |
222 | .PP | |
223 | .nf | |
224 | $1 != prev { print; prev = $1 } | |
225 | .fi | |
226 | .SH SEE ALSO | |
227 | .PP | |
228 | lex(1), sed(1) | |
229 | .br | |
230 | A. V. Aho, B. W. Kernighan, P. J. Weinberger, | |
231 | .I | |
232 | Awk \- a pattern scanning and processing language | |
233 | .SH BUGS | |
234 | There are no explicit conversions between numbers and strings. | |
235 | To force an expression to be treated as a number add 0 to it; | |
236 | to force it to be treated as a string concatenate "" | |
237 | to it. |
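
Several of the forms described in the page can be tried from a shell. The following is a minimal sketch, not part of the manual page itself: it assumes a POSIX-compatible `awk` on the `PATH`, and the sample input and the shell variable names (`swap`, `count`, `range`, `num`, `str`) are made up for illustration. It exercises the `-F` option, a string-subscripted array, a two-pattern range, and the number/string forcing idiom from the BUGS section.

```shell
#!/bin/sh
# Hypothetical sample data throughout; assumes a POSIX-compatible awk.

# -Fc: split fields on ':' and print the first two in opposite order.
swap=$(printf 'a:b\nc:d\n' | awk -F: '{ print $2, $1 }')

# Array subscripts may be any string: count lines by first field.
count=$(printf 'x 1\ny 2\nx 3\n' | awk '{ n[$1]++ } END { print n["x"] }')

# Two patterns separated by a comma select a range of lines.
range=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')

# Adding 0 forces a numeric comparison: 10 > 9.
num=$(echo '10 9' | awk '{ if (($1 + 0) > ($2 + 0)) print "greater"; else print "not greater" }')

# Concatenating "" forces a string comparison: "10" sorts before "9".
str=$(echo '10 9' | awk '{ if (($1 "") > ($2 "")) print "greater"; else print "not greater" }')

printf '%s\n' "$swap" "$count" "$range" "$num" "$str"
```

The last two commands differ only in how the operands are forced, which is why the same input yields opposite answers.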