.TH AWK 1
.SH NAME
awk \- pattern scanning and processing language
.SH SYNOPSIS
.B awk
[
.BI \-F c
]
[ prog ] [ file ] ...
.SH DESCRIPTION
.I Awk
scans each input
.I file
for lines that match any of a set of patterns specified in
.IR prog .
With each pattern in
.I prog
there can be an associated action that will be performed
when a line of a
.I file
matches the pattern.
The set of patterns may appear literally as
.IR prog ,
or in a file
specified as
.B \-f
.IR file .
.PP
Files are read in order;
if there are no files, the standard input is read.
The file name `\-'
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.PP
An input line is made up of fields separated by white space.
(This default can be changed by using FS,
.IR "vide infra" ".)"
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.PP
A pattern-action statement has the form
.PP
	pattern { action }
.PP
A missing { action } means print the line;
a missing pattern always matches.
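.PP
As a minimal sketch of these two defaults
(the pattern text and the choice of field are illustrative assumptions,
not taken from any particular program):
.PP
.nf
	/error/		# no action: print each line containing "error"
	{ print $1 }	# no pattern: print the first field of every line
.fi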
.PP
An action is a sequence of statements.
A statement can be one of the following:
.PP
.nf
	if ( conditional ) statement [ else statement ]
	while ( conditional ) statement
	for ( expression ; conditional ; expression ) statement
	break
	continue
	{ [ statement ] ... }
	variable = expression
	print [ expression-list ] [ >expression ]
	printf format [ , expression-list ] [ >expression ]
	next	# skip remaining patterns on this input line
	exit	# skip the rest of the input
.fi
.PP
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
x[i])
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
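.PP
A sketch of associative memory,
assuming input whose first field is a name and whose second field is a number
(the array name total and the subscript "root" are hypothetical):
.PP
.nf
	{ total[$1] += $2 }	# accumulate field 2 under the string in field 1
	END { print "root", total["root"] }
.fi
.PP
Because variables start out as the null string, which acts as zero in arithmetic,
the array elements need no explicit initialization.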
.PP
The
.I print
statement prints its arguments on the standard output
(or on a file if
.I >file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.I printf
statement formats its expression list according to the format
(see
.IR printf (3)).
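.PP
For illustration, a sketch that writes the first two fields of each line,
joined by a colon, to a file
(the file name "pairs" is an assumption chosen for the example):
.PP
.nf
	BEGIN { OFS = ":" }		# output field separator used by print
	{ print $1, $2 >"pairs" }	# send output to the file "pairs"
.fi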
.PP
The built-in function
.I length
returns the length of its argument
taken as a string,
or of the whole line if there is no argument.
There are also built-in functions
.IR exp ,
.IR log ,
.IR sqrt ,
and
.IR int .
The last truncates its argument to an integer.
.IR substr(s,\ m,\ n)
returns the
.IR n -character
substring of
.I s
that begins at position
.IR m .
The function
.IR sprintf(fmt,\ expr,\ expr,\ ...)
formats the expressions
according to the
.IR printf (3)
format given by
.I fmt
and returns the resulting string.
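.PP
A sketch combining these functions,
assuming that the first field of each line is a non-negative number
(the variable name s is hypothetical):
.PP
.nf
	# line length, first ten characters, integer part of field 1
	{ print length($0), substr($0, 1, 10), int($1) }
	# format the square root of field 1 into a string, then print it
	{ s = sprintf("%8.3f", sqrt($1)); print s }
.fi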
.PP
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.IR egrep .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.PP
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.PP
A relational expression is one of the following:
.PP
.nf
	expression matchop regular-expression
	expression relop expression
.fi
.PP
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
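.PP
Two patterns built from relational expressions, as a sketch
(the field numbers, the threshold 100 and the string "tmp" are illustrative assumptions):
.PP
.nf
	# field 1 is all digits and field 2 is at least 100: print the line
	$1 ~ /^[0-9]+$/ && $2 >= 100
	# field 3 does not contain "tmp": print field 3
	$3 !~ /tmp/ { print $3 }
.fi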
.PP
The special patterns
BEGIN
and
END
may be used to capture control before the first input line is read
and after the last.
BEGIN must be the first pattern, END the last.
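.PP
A sketch of a program that uses both special patterns
(the variable name n and the message strings are hypothetical):
.PP
.nf
	BEGIN { print "counting lines" }	# before any input is read
	{ n = n + 1 }				# once for every input line
	END { print n, "lines read" }		# after the last input line
.fi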
.PP
A single character
.I c
may be used to separate the fields by starting
the program with
.PP
	BEGIN { FS = "c" }
.PP
or by using the
.BI \-F c
option.
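.PP
For example, assuming colon-separated input,
the following sketch prints the first field of each line:
.PP
.nf
	BEGIN { FS = ":" }
	{ print $1 }
.fi
.PP
With the option form, the same program might be given on the command line
(the shell quoting and the name file are assumptions):
.PP
.nf
	awk \-F: '{ print $1 }' file
.fi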
.PP
Other variable names with special meanings
include NF, the number of fields in the current record;
NR, the ordinal number of the current record;
FILENAME, the name of the current input file;
OFS, the output field separator (default blank);
ORS, the output record separator (default newline);
and
OFMT, the output format for numbers (default "%.6g").
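.PP
A sketch that reports these variables for each record
and the record count at the end (the message text is an assumption):
.PP
.nf
	# file name, record number and field count of every record
	{ print FILENAME, NR, NF }
	# in END, NR holds the number of records read
	END { print NR, "records in all" }
.fi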
.SH EXAMPLES
.PP
Print lines longer than 72 characters:
.PP
.nf
	length > 72
.fi
.PP
Print first two fields in opposite order:
.PP
.nf
	{ print $2, $1 }
.fi
.PP
Add up first column, print sum and average:
.PP
.nf
		{ s += $1 }
	END	{ print "sum is", s, " average is", s/NR }
.fi
.PP
Print fields in reverse order:
.PP
.nf
	{ for (i = NF; i > 0; \-\-i) print $i }
.fi
.PP
Print all lines between start/stop pairs:
.PP
.nf
	/start/, /stop/
.fi
.PP
Print all lines whose first field is different from previous one:
.PP
.nf
	$1 != prev { print; prev = $1 }
.fi
.SH SEE ALSO
.PP
lex(1), sed(1)
.br
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.I
Awk \- a pattern scanning and processing language
.SH BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate ""
to it.
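.PP
A sketch of both idioms, assuming the first two fields are to be compared
(the message strings are hypothetical):
.PP
.nf
	# adding 0 forces a numeric comparison
	{ if ($1 + 0 == $2 + 0) print "same number" }
	# concatenating "" forces a string comparison
	{ if (($1 "") == ($2 "")) print "same string" }
.fi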