date and time created 88/12/12 20:55:11 by kfall
.\" @(#)awk.1 6.1 (Berkeley) %G%
.\"
.TH AWK 1 ""
.AT 3
.SH NAME
awk \- pattern scanning and processing language
.SH SYNOPSIS
.B awk
[
.BI \-F c
]
[ prog ] [ file ] ...
.SH DESCRIPTION
.I Awk
scans each input
.I file
for lines that match any of a set of patterns specified in
.IR prog .
With each pattern in
.I prog
there can be an associated action that will be performed
when a line of a
.I file
matches the pattern.
The set of patterns may appear literally as
.IR prog ,
or in a file
specified as
.B \-f
.IR file .
.PP
Files are read in order;
if there are no files, the standard input is read.
The file name `\-'
means the standard input.
Each line is matched against the
pattern portion of every pattern-action statement;
the associated action is performed for each matched pattern.
.PP
An input line is made up of fields separated by white space.
(This default can be changed by using FS; see below.)
The fields are denoted $1, $2, ... ;
$0 refers to the entire line.
.PP
A pattern-action statement has the form
.PP
 pattern { action }
.PP
A missing { action } means print the line;
a missing pattern always matches.
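.PP
As a minimal sketch of these two defaults, the shell fragment below runs
one program that has only a pattern and one that has only an action.
The three-line input and the shell variable names are hypothetical,
and the result assumes that the
.I awk
found on the search path behaves as described here.
.nf
```shell
# Hypothetical three-line input, used only for this illustration.
input='alpha 1
beta 22
gamma 333'

# Pattern with no action: each matching line is printed unchanged.
no_action=$(echo "$input" | awk '/^b/')

# Action with no pattern: the action runs on every input line.
no_pattern=$(echo "$input" | awk '{ print $2 }')

echo "$no_action"
echo "$no_pattern"
```
.fi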
.PP
An action is a sequence of statements.
A statement can be one of the following:
.PP
.nf
 if ( conditional ) statement [ else statement ]
 while ( conditional ) statement
 for ( expression ; conditional ; expression ) statement
 break
 continue
 { [ statement ] ... }
 variable = expression
 print [ expression-list ] [ >expression ]
 printf format [ , expression-list ] [ >expression ]
 next # skip remaining patterns on this input line
 exit # skip the rest of the input
.fi
.PP
Statements are terminated by
semicolons, newlines or right braces.
An empty expression-list stands for the whole line.
Expressions take on string or numeric values as appropriate,
and are built using the operators
+, \-, *, /, %, and concatenation (indicated by a blank).
The C operators ++, \-\-, +=, \-=, *=, /=, and %=
are also available in expressions.
Variables may be scalars, array elements
(denoted
x[i])
or fields.
Variables are initialized to the null string.
Array subscripts may be any string,
not necessarily numeric;
this allows for a form of associative memory.
String constants are quoted "...".
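.PP
The string subscripts and null initialization described above can be
sketched as a small word count.
The input words and the array name
.I seen
are hypothetical; the subscript "fig" never occurs in the input,
and adding 0 to its null value forces it to print as a number.
.nf
```shell
# Hypothetical input: one word per line.
input='pear
apple
pear'

# seen[] is never declared; each element starts out null and counts as 0.
counts=$(echo "$input" | awk '{ seen[$1]++ }
    END { print "pear", seen["pear"]; print "apple", seen["apple"]; print "fig", seen["fig"] + 0 }')

echo "$counts"
```
.fi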
.PP
The
.I print
statement prints its arguments on the standard output
(or on a file if
.I >file
is present), separated by the current output field separator,
and terminated by the output record separator.
The
.I printf
statement formats its expression list according to the format
(see
.IR printf (3S)).
.PP
The built-in function
.I length
returns the length of its argument
taken as a string,
or of the whole line if no argument.
There are also built-in functions
.IR exp ,
.IR log ,
.IR sqrt ,
and
.IR int .
The last truncates its argument to an integer.
.IR substr(s,\ m,\ n)
returns the
.IR n -character
substring of
.I s
that begins at position
.IR m .
The function
.IR sprintf(fmt,\ expr,\ expr,\ ...)
formats the expressions
according to the
.IR printf (3S)
format given by
.I fmt
and returns the resulting string.
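.PP
A short sketch of several of these built-in functions applied to one
hypothetical input line follows.
The input text and the format string are chosen only for illustration.
.nf
```shell
# Hypothetical one-line input: "hello world" is 11 characters long.
out=$(echo 'hello world' | awk '{ print length, length($1), substr($0, 7, 5), int(7.9), sprintf("%03d", 42) }')

echo "$out"
```
.fi
.PP
Here the bare length gives the length of the whole line,
and substr counts positions starting from 1.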
.PP
Patterns are arbitrary Boolean combinations
(!, \(or\(or, &&, and parentheses) of
regular expressions and
relational expressions.
Regular expressions must be surrounded
by slashes and are as in
.IR egrep .
Isolated regular expressions
in a pattern apply to the entire line.
Regular expressions may also occur in
relational expressions.
.PP
A pattern may consist of two patterns separated by a comma;
in this case, the action is performed for all lines
between an occurrence of the first pattern
and the next occurrence of the second.
.PP
A relational expression is one of the following:
.PP
.nf
 expression matchop regular-expression
 expression relop expression
.PP
.fi
where a relop is any of the six relational operators in C,
and a matchop is either ~ (for contains)
or !~ (for does not contain).
A conditional is an arithmetic expression,
a relational expression,
or a Boolean combination
of these.
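.PP
The two kinds of relational expression can be combined in one pattern,
as in the following sketch.
The input lines are hypothetical; the first rule pairs a matchop with a
numeric relop, and the second pairs the negated matchop with another relop.
.nf
```shell
# Hypothetical input: a name and a number on each line.
input='apple 3
avocado 12
banana 7'

# Rule 1: first field begins with "a" AND second field exceeds 5.
# Rule 2: first field does not contain "an" AND second field is at most 3.
out=$(echo "$input" | awk '$1 ~ /^a/ && $2 > 5 { print $1 }
    $1 !~ /an/ && $2 <= 3 { print $1, "small" }')

echo "$out"
```
.fi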
.PP
The special patterns
BEGIN
and
END
may be used to capture control before the first input line is read
and after the last.
BEGIN must be the first pattern, END the last.
.PP
A single character
.I c
may be used to separate the fields by starting
the program with
.PP
 BEGIN { FS = "c" }
.PP
or by using the
.BI \-F c
option.
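.PP
Both ways of choosing the separator can be sketched side by side.
The colon-separated record below is hypothetical,
and the two commands are expected to split it identically.
.nf
```shell
# Hypothetical colon-separated record.
record='alice:42:staff'

# Separator set inside the program.
via_begin=$(echo "$record" | awk 'BEGIN { FS = ":" } { print $1, $3 }')

# Separator set on the command line.
via_flag=$(echo "$record" | awk -F: '{ print $1, $3 }')

echo "$via_begin"
echo "$via_flag"
```
.fi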
.PP
Other variable names with special meanings
include NF, the number of fields in the current record;
NR, the ordinal number of the current record;
FILENAME, the name of the current input file;
OFS, the output field separator (default blank);
ORS, the output record separator (default newline);
and
OFMT, the output format for numbers (default "%.6g").
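.PP
A minimal sketch of NR, NF, OFS and OFMT acting together is given below.
The two input lines and the chosen separator and format are hypothetical,
and the result assumes an
.I awk
that applies OFMT to non-integer values in
.I print.
.nf
```shell
# Hypothetical input with three fields, then two fields.
input='a b c
d e'

# OFS replaces the blank between printed fields;
# OFMT controls how the non-integer value 1/3 is printed.
out=$(echo "$input" | awk 'BEGIN { OFS = "-"; OFMT = "%.2f" }
    { print NR, NF, $1 }
    END { print 1 / 3 }')

echo "$out"
```
.fi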
.SH EXAMPLES
.PP
Print lines longer than 72 characters:
.PP
.nf
 length > 72
.fi
.PP
Print first two fields in opposite order:
.PP
.nf
 { print $2, $1 }
.fi
.PP
Add up first column, print sum and average:
.PP
.nf
 { s += $1 }
 END { print "sum is", s, " average is", s/NR }
.fi
.PP
Print fields in reverse order:
.PP
.nf
 { for (i = NF; i > 0; \-\-i) print $i }
.fi
.PP
Print all lines between start/stop pairs:
.PP
.nf
 /start/, /stop/
.fi
.PP
Print all lines whose first field is different from previous one:
.PP
.nf
 $1 != prev { print; prev = $1 }
.fi
.SH SEE ALSO
.PP
lex(1), sed(1)
.br
A. V. Aho, B. W. Kernighan, P. J. Weinberger,
.I
Awk \- a pattern scanning and processing language
.SH BUGS
There are no explicit conversions between numbers and strings.
To force an expression to be treated as a number add 0 to it;
to force it to be treated as a string concatenate ""
to it.
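.PP
The two idioms can be sketched on a hypothetical input line holding
10 and 9.
Concatenating "" is expected to give a character-by-character comparison,
while adding 0 gives a numeric one;
the exact rules for unforced comparisons may differ between
.I awk
implementations.
.nf
```shell
# Hypothetical input: two fields that compare differently
# as strings and as numbers.
out=$(echo '10 9' | awk '($1 "") < ($2 "") { print "as strings: 10 sorts before 9" }
    ($1 + 0) > ($2 + 0) { print "as numbers: 10 is greater than 9" }')

echo "$out"
```
.fi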