.\" Copyright (c) 1991 Regents of the University of California.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" %sccs.include.redist.man%
.\"
.\" @(#)tr.1 6.9 (Berkeley) %G%
.\"
.Dd
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm tr
.Op Fl cs
.Ar string1 string2
.Nm tr
.Op Fl c
.Fl d
.Ar string1
.Nm tr
.Op Fl c
.Fl s
.Ar string1
.Nm tr
.Op Fl c
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm tr
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ,
that is, ``-c ab'' includes every character except for ``a'' and ``b''.
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation is completed.
.El
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
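.Pp
As a rough illustration of the four synopsis forms, the following shell sketch
runs each form on the same sample input.
The input text and the variable names are assumptions made for this example
and are not part of the utility itself.

```shell
# Sample input shared by all four forms (chosen only for illustration).
input='hello world'

# Form 1: translate "l" to "x" and "o" to "y".
translated=$(printf '%s\n' "$input" | tr 'lo' 'xy')   # hexxy wyrxd

# Form 2: delete every "l" and "o".
deleted=$(printf '%s\n' "$input" | tr -d 'lo')        # he wrd

# Form 3: squeeze runs of "l" into a single "l".
squeezed=$(printf '%s\n' "$input" | tr -s 'l')        # helo world

# Form 4: delete every "o", then squeeze runs of "l".
both=$(printf '%s\n' "$input" | tr -ds 'o' 'l')       # hel wrld

printf '%s\n' "$translated" "$deleted" "$squeezed" "$both"
```

Each string1 and string2 is single-quoted so that the shell passes it to tr unchanged.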
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.sp
.Bl -column
.It \ea <alert character>
.It \eb <backspace>
.It \ef <form-feed>
.It \en <newline>
.It \er <carriage return>
.It \et <tab>
.It \ev <vertical tab>
.El
.sp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusively.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.sp
.Bl -column
.It alnum <alphanumeric characters>
.It alpha <alphabetic characters>
.It cntrl <control characters>
.It digit <numeric characters>
.It graph <graphic characters>
.It lower <lower-case alphabetic characters>
.It print <printable characters>
.It punct <punctuation characters>
.It space <space characters>
.It upper <upper-case characters>
.It xdigit <hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the ``upper'' and ``lower'' classes, characters
in the classes are in unspecified order.
In the ``upper'' and ``lower'' classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
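.Pp
The conventions listed above can be tried out with a short shell sketch such
as the following.
The sample inputs and variable names are hypothetical, and the results in the
comments assume an ASCII locale.

```shell
# Octal escape: \011 is the tab character; replace tabs with spaces.
octal=$(printf 'a\tb\n' | tr '\011' ' ')               # a b

# Range: map the lower-case letters a through f onto A through F.
range=$(printf 'deadbeef\n' | tr 'a-f' 'A-F')          # DEADBEEF

# Class: delete every digit.
class=$(printf 'a1b22c333\n' | tr -d '[:digit:]')      # abc

# Repetition with a count: string2 becomes "xxyy".
counted=$(printf 'abcd\n' | tr 'abcd' '[x*2][y*2]')    # xxyy

printf '%s\n' "$octal" "$range" "$class" "$counted"
```

The repetition form appears only in string2, as required by the description of that convention.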
.Pp
The
.Nm tr
utility exits 0 on success, and >0 if an error occurs.
.Sh EXAMPLES
The following examples are shown as given to the shell:
.sp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.sp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.sp
Translate the contents of file1 to upper-case.
.sp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.sp
Strip out non-printable characters from file1.
.sp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Sh COMPATIBILITY
System V has historically implemented character ranges using the syntax
``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map to another range, i.e. the command
``tr [a-z] [A-Z]'' will work as it will map the ``['' character in
.Ar string1
to the ``['' character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
included in the deletion or compression list, which would not have happened
under a historic System V implementation.
Additionally, any scripts that depended on the sequence ``a-z'' to
represent the three characters ``a'', ``-'' and ``z'' will have to be
rewritten as ``a\e-z''.
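.Pp
The bracket and hyphen behavior described above can be checked with a sketch
like the one below.
The sample inputs are made up for illustration, and the strings are quoted
here so that the shell does not treat the brackets as a file name pattern.

```shell
# With POSIX-style ranges the brackets are ordinary characters, so they are
# deleted along with the lower-case letters.
brackets=$(printf '[abc] XYZ\n' | tr -d '[a-z]')

# When both strings carry brackets, "[" maps to "[" and "]" maps to "]".
mapped=$(printf '[abc]\n' | tr '[a-z]' '[A-Z]')

# An escaped hyphen stands for itself: only "a", "-" and "z" are deleted.
literal=$(printf 'a-m-z\n' | tr -d 'a\-z')

printf '%s\n' "$brackets" "$mapped" "$literal"
```

A script written for System V that relied on ``tr -d [a-z]'' leaving brackets alone would therefore need its brackets removed.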
.Pp
The
.Nm tr
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
.Pp
The
.Nm tr
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm tr
utility is expected to be
.St -p1003.2
compatible.
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the ``[#*]'' convention instead of relying on this behavior.
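.Pp
A minimal sketch of the portable form, assuming a made-up input word and a
hypothetical variable name, might look like this:

```shell
# Map every vowel to "_" by stating the padding explicitly with [_*],
# rather than relying on the last character of string2 being duplicated.
portable=$(printf 'portable\n' | tr 'aeiou' '[_*]')   # p_rt_bl_

printf '%s\n' "$portable"
```

Because the repetition count is omitted, string2 is extended to the length of string1 whatever that length is.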