Commit | Line | Data |
---|---|---|
362b077f C |
1 | |
2 | ||
3 | ||
4 | AWK(1) UNIX Programmer's Manual AWK(1) | |
5 | ||
6 | ||
7 | ||
8 | NAME | |
9 | awk - pattern scanning and processing language | |
10 | ||
11 | SYNOPSIS | |
12 | awk [ -Fc ] [ prog ] [ file ] ... | |
13 | ||
14 | DESCRIPTION | |
15 | Awk scans each input file for lines that match any of a set | |
16 | of patterns specified in prog. With each pattern in prog | |
17 | there can be an associated action that will be performed | |
18 | when a line of a file matches the pattern. The set of | |
19 | patterns may appear literally as prog, or in a file specified | |
20 | as -f file. | |
21 | ||
22 | Files are read in order; if there are no files, the standard | |
23 | input is read. The file name `-' means the standard input. | |
24 | Each line is matched against the pattern portion of every | |
25 | pattern-action statement; the associated action is performed | |
26 | for each matched pattern. | |
27 | ||
28 | An input line is made up of fields separated by white space. | |
29 | (This default can be changed by using FS, vide infra.) The | |
30 | fields are denoted $1, $2, ... ; $0 refers to the entire | |
31 | line. | |
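
As a small illustration of field splitting, the sketch below feeds one made-up line to awk and prints $2, $1 and $0. The sample text and the shell variable name `out` are assumptions chosen for the example, not part of the manual.

```shell
# Split one line on white space and show $2, $1 and the whole line ($0).
out=$(printf 'alpha beta gamma\n' | awk '{ print $2; print $1; print $0 }')
printf '%s\n' "$out"
```
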
32 | ||
33 | A pattern-action statement has the form | |
34 | ||
35 | pattern { action } | |
36 | ||
37 | A missing { action } means print the line; a missing pattern | |
38 | always matches. | |
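
The two defaults above can be sketched as follows. The input data and the shell variable names (`input`, `big`, `names`) are invented for illustration.

```shell
input='apple 3
pear 12
fig 7'

# Pattern only: the missing action means "print the line".
big=$(printf '%s\n' "$input" | awk '$2 > 5')

# Action only: the missing pattern matches every line.
names=$(printf '%s\n' "$input" | awk '{ print $1 }')

printf '%s\n' "$big" "$names"
```
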
39 | ||
40 | An action is a sequence of statements. A statement can be | |
41 | one of the following: | |
42 | ||
43 | if ( conditional ) statement [ else statement ] | |
44 | while ( conditional ) statement | |
45 | for ( expression ; conditional ; expression ) statement | |
46 | break | |
47 | continue | |
48 | { [ statement ] ... } | |
49 | variable = expression | |
50 | print [ expression-list ] [ >expression ] | |
51 | printf format [ , expression-list ] [ >expression ] | |
52 | next # skip remaining patterns on this input line | |
53 | exit # skip the rest of the input | |
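
A minimal sketch of a few of these statements working together, assuming a one-line input that holds a count. The input value and the variable name `out` are hypothetical.

```shell
# for, if/else and a braced statement group inside a single action.
out=$(printf '3\n' | awk '{
    for (i = 1; i <= $1; i++) {
        if (i % 2)
            print i, "odd"
        else
            print i, "even"
    }
}')
printf '%s\n' "$out"
```
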
54 | ||
55 | Statements are terminated by semicolons, newlines or right | |
56 | braces. An empty expression-list stands for the whole line. | |
57 | Expressions take on string or numeric values as appropriate, | |
58 | and are built using the operators +, -, *, /, %, and | |
59 | concatenation (indicated by a blank). The C operators ++, --, | |
60 | ||
61 | ||
62 | ||
63 | Printed 7/9/88 April 29, 1985 1 | |
74 | +=, -=, *=, /=, and %= are also available in expressions. | |
75 | Variables may be scalars, array elements (denoted x[i]) or | |
76 | fields. Variables are initialized to the null string. | |
77 | Array subscripts may be any string, not necessarily numeric; | |
78 | this allows for a form of associative memory. String | |
79 | constants are quoted "...". | |
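
One way to picture string subscripts as associative memory is a tally keyed by word, sketched below. The input words, the array name `n` and the variable `counts` are assumptions for the example. The never-assigned element n["green"] starts as the null string, so adding 0 makes it print as 0.

```shell
# Count occurrences of each first field, then look up three keys by name.
counts=$(printf 'red\nblue\nred\nred\n' | awk '
    { n[$1]++ }
    END {
        print "red", n["red"]
        print "blue", n["blue"]
        print "green", n["green"] + 0
    }')
printf '%s\n' "$counts"
```
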
80 | ||
81 | The print statement prints its arguments on the standard | |
82 | output (or on a file if >file is present), separated by the | |
83 | current output field separator, and terminated by the output | |
84 | record separator. The printf statement formats its | |
85 | expression list according to the format (see printf(3S)). | |
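
The difference between print and printf can be sketched like this. Setting OFS to "-" and the sample input are choices made only for the illustration.

```shell
# print joins its arguments with OFS; printf follows its format string.
out=$(printf 'a b\n' | awk '
    BEGIN { OFS = "-" }
    { print $1, $2; printf "%s:%03d\n", $2, 7 }')
printf '%s\n' "$out"
```
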
86 | ||
87 | The built-in function length returns the length of its | |
88 | argument taken as a string, or of the whole line if no argument. | |
89 | There are also built-in functions exp, log, sqrt, and int. | |
90 | The last truncates its argument to an integer. | |
91 | substr(s, m, n) returns the n-character substring of s that | |
92 | begins at position m. The function | |
93 | sprintf(fmt, expr, expr, ...) formats the expressions | |
94 | according to the printf(3S) format given by fmt and returns | |
95 | the resulting string. | |
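
A short sketch exercising several of the built-in functions on one made-up input line. The input text and the variable name `out` are assumptions.

```shell
# length with and without an argument, substr, int, and sprintf with sqrt.
out=$(printf 'hello world\n' | awk '{
    print length
    print length($1)
    print substr($0, 7, 5)
    print int(7.9)
    print sprintf("%05.1f", sqrt(16))
}')
printf '%s\n' "$out"
```
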
96 | ||
97 | Patterns are arbitrary Boolean combinations (!, ||, &&, and | |
98 | parentheses) of regular expressions and relational expressions. | |
99 | Regular expressions must be surrounded by slashes | |
100 | and are as in egrep. Isolated regular expressions in a pattern | |
101 | apply to the entire line. Regular expressions may also | |
102 | occur in relational expressions. | |
103 | ||
104 | A pattern may consist of two patterns separated by a comma; | |
105 | in this case, the action is performed for all lines between | |
106 | an occurrence of the first pattern and the next occurrence | |
107 | of the second. | |
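
A range pattern can be sketched with a five-line input. The marker words and the variable `out` are invented for the example. Both the line that opens the range and the line that closes it are printed.

```shell
# Print every line from the first /start/ match through the next /stop/ match.
out=$(printf 'a\nstart\nb\nstop\nc\n' | awk '/start/, /stop/')
printf '%s\n' "$out"
```
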
108 | ||
109 | A relational expression is one of the following: | |
110 | ||
111 | expression matchop regular-expression | |
112 | expression relop expression | |
113 | ||
114 | where a relop is any of the six relational operators in C, | |
115 | and a matchop is either ~ (for contains) or !~ (for does not | |
116 | contain). A conditional is an arithmetic expression, a | |
117 | relational expression, or a Boolean combination of these. | |
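
The sketch below combines a matchop and a relop in patterns. The names and ages in `input` and the shell variables `has_o` and `adults` are hypothetical sample data.

```shell
input='alice 30
bob 17
carol 45'

# matchop: the first field contains the letter "o".
has_o=$(printf '%s\n' "$input" | awk '$1 ~ /o/ { print $1 }')

# relop combined with a negated match through &&.
adults=$(printf '%s\n' "$input" | awk '$2 >= 18 && $1 !~ /^c/ { print $1 }')

printf '%s\n' "$has_o" "$adults"
```
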
118 | ||
119 | The special patterns BEGIN and END may be used to capture | |
120 | control before the first input line is read and after the | |
121 | last. BEGIN must be the first pattern, END the last. | |
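
As an illustration, BEGIN and END can bracket a running total. The input numbers and the names `total` and `out` are assumptions.

```shell
# BEGIN runs before any input is read, END after the last line.
out=$(printf '4\n6\n' | awk '
    BEGIN { print "start" }
    { total += $1 }
    END { print "total", total }')
printf '%s\n' "$out"
```
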
122 | ||
123 | A single character c may be used to separate the fields by | |
124 | starting the program with | |
125 | ||
140 | BEGIN { FS = "c" } | |
141 | ||
142 | or by using the -Fc option. | |
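
Both ways of choosing the field separator are sketched here on a colon-separated line. The sample record and the variable names `a` and `b` are made up for the example.

```shell
# Command-line form: -F followed by the separator character.
a=$(printf 'root:x:0\n' | awk -F: '{ print $3 }')

# Program form: assign FS in a BEGIN action.
b=$(printf 'root:x:0\n' | awk 'BEGIN { FS = ":" } { print $1 }')

printf '%s\n' "$a" "$b"
```
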
143 | ||
144 | Other variable names with special meanings include NF, the | |
145 | number of fields in the current record; NR, the ordinal | |
146 | number of the current record; FILENAME, the name of the | |
147 | current input file; OFS, the output field separator (default | |
148 | blank); ORS, the output record separator (default newline); | |
149 | and OFMT, the output format for numbers (default "%.6g"). | |
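
A brief sketch of NR, NF and OFMT in use. The input lines and the "%.2f" format are arbitrary choices for the illustration. The second command reads from /dev/null because it needs no input.

```shell
# Record number and field count for each line.
out=$(printf 'a b c\nd e\n' | awk '{ print NR, NF }')

# OFMT controls how print renders a non-integer number.
fmt=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159 }' </dev/null)

printf '%s\n' "$out" "$fmt"
```
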
150 | ||
151 | EXAMPLES | |
152 | Print lines longer than 72 characters: | |
153 | ||
154 | length > 72 | |
155 | ||
156 | Print first two fields in opposite order: | |
157 | ||
158 | { print $2, $1 } | |
159 | ||
160 | Add up first column, print sum and average: | |
161 | ||
162 | { s += $1 } | |
163 | END { print "sum is", s, " average is", s/NR } | |
164 | ||
165 | Print fields in reverse order: | |
166 | ||
167 | { for (i = NF; i > 0; --i) print $i } | |
168 | ||
169 | Print all lines between start/stop pairs: | |
170 | ||
171 | /start/, /stop/ | |
172 | ||
173 | Print all lines whose first field is different from previous | |
174 | one: | |
175 | ||
176 | $1 != prev { print; prev = $1 } | |
177 | ||
178 | SEE ALSO | |
179 | lex(1), sed(1) | |
180 | A. V. Aho, B. W. Kernighan, P. J. Weinberger, Awk - a pattern | |
181 | scanning and processing language | |
182 | ||
183 | BUGS | |
184 | There are no explicit conversions between numbers and | |
185 | strings. To force an expression to be treated as a number | |
186 | add 0 to it; to force it to be treated as a string | |
187 | concatenate "" to it. | |
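
The two idioms can be sketched by comparing the same pair of fields both ways. The values 10 and 9 are chosen because their numeric and string orderings differ. The messages and the variable `out` are hypothetical.

```shell
# Adding 0 forces a numeric comparison; concatenating "" forces a string one.
out=$(printf '10 9\n' | awk '{
    if ($1 + 0 > $2 + 0) print "numeric: 10 > 9"
    if (($1 "") < ($2 "")) print "string: 10 < 9"
}')
printf '%s\n' "$out"
```
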