Commit | Line | Data |
---|---|---|
22fbd3aa | 1 | .\" Copyright (c) 1991 Regents of the University of California. |
94bed826 KB |
2 | .\" All rights reserved. |
3 | .\" | |
ae122740 KB |
4 | .\" This code is derived from software contributed to Berkeley by |
5 | .\" the Institute of Electrical and Electronics Engineers, Inc. | |
6 | .\" | |
22fbd3aa | 7 | .\" %sccs.include.redist.man% |
94bed826 | 8 | .\" |
66f46587 | 9 | .\" @(#)tr.1 6.9 (Berkeley) %G% |
5fc0f5dd | 10 | .\" |
22fbd3aa CL |
11 | .Dd |
12 | .Dt TR 1 | |
13 | .Os | |
14 | .Sh NAME | |
15 | .Nm tr | |
16 | .Nd translate characters | |
17 | .Sh SYNOPSIS | |
18 | .Nm tr | |
304f87e5 KB |
19 | .Op Fl cs |
20 | .Ar string1 string2 | |
21 | .Nm tr | |
22 | .Op Fl c | |
23 | .Fl d | |
24 | .Ar string1 | |
25 | .Nm tr | |
26 | .Op Fl c | |
27 | .Fl s | |
28 | .Ar string1 | |
29 | .Nm tr | |
22fbd3aa | 30 | .Op Fl c |
304f87e5 | 31 | .Fl ds |
22fbd3aa CL |
32 | .Ar string1 string2 |
33 | .Sh DESCRIPTION | |
304f87e5 KB |
34 | The |
35 | .Nm tr | |
36 | utility copies the standard input to the standard output with substitution | |
37 | or deletion of selected characters. | |
22fbd3aa CL |
38 | .Pp |
39 | The following options are available: | |
40 | .Bl -tag -width Ds | |
41 | .It Fl c | |
304f87e5 KB |
42 | Complements the set of characters in |
43 | .Ar string1 , | |
44 | that is, ``-c ab'' includes every character except ``a'' and ``b''. | |
22fbd3aa | 45 | .It Fl d |
304f87e5 KB |
46 | The |
47 | .Fl d | |
48 | option causes characters to be deleted from the input. | |
22fbd3aa | 49 | .It Fl s |
304f87e5 KB |
50 | The |
51 | .Fl s | |
52 | option squeezes multiple occurrences of the characters listed in the last | |
53 | operand (either | |
22fbd3aa | 54 | .Ar string1 |
304f87e5 KB |
55 | or |
56 | .Ar string2 ) | |
57 | in the input into a single instance of the character. | |
58 | This occurs after all deletion and translation is completed. | |
59 | .El | |
60 | .Pp | |
61 | In the first synopsis form, the characters in | |
62 | .Ar string1 | |
63 | are translated into the characters in | |
22fbd3aa | 64 | .Ar string2 |
304f87e5 | 65 | where the first character in |
22fbd3aa | 66 | .Ar string1 |
304f87e5 | 67 | is translated into the first character in |
22fbd3aa | 68 | .Ar string2 |
304f87e5 KB |
69 | and so on. |
70 | If | |
71 | .Ar string1 | |
72 | is longer than | |
73 | .Ar string2 , | |
74 | the last character found in | |
22fbd3aa | 75 | .Ar string2 |
304f87e5 KB |
76 | is duplicated until |
77 | .Ar string1 | |
78 | is exhausted. | |
22fbd3aa | 79 | .Pp |
304f87e5 KB |
80 | In the second synopsis form, the characters in |
81 | .Ar string1 | |
82 | are deleted from the input. | |
22fbd3aa | 83 | .Pp |
304f87e5 KB |
84 | In the third synopsis form, the characters in |
85 | .Ar string1 | |
86 | are compressed as described for the | |
87 | .Fl s | |
88 | option. | |
22fbd3aa | 89 | .Pp |
304f87e5 | 90 | In the fourth synopsis form, the characters in |
22fbd3aa | 91 | .Ar string1 |
304f87e5 KB |
92 | are deleted from the input, and the characters in |
93 | .Ar string2 | |
94 | are compressed as described for the | |
95 | .Fl s | |
96 | option. | |
22fbd3aa CL |
97 | .Pp |
98 | The following conventions can be used in | |
99 | .Ar string1 | |
304f87e5 | 100 | and |
22fbd3aa | 101 | .Ar string2 |
304f87e5 | 102 | to specify sets of characters: |
6d5d747e | 103 | .Bl -tag -width [:equiv:] |
22fbd3aa | 104 | .It character |
304f87e5 KB |
105 | Any character not described by one of the following conventions |
106 | represents itself. | |
22fbd3aa | 107 | .It \eoctal |
304f87e5 KB |
108 | A backslash followed by 1, 2 or 3 octal digits represents a character |
109 | with that encoded value. | |
110 | To follow an octal sequence with a digit as a character, left zero-pad | |
111 | the octal sequence to the full 3 octal digits. | |
22fbd3aa | 112 | .It \echaracter |
304f87e5 KB |
113 | A backslash followed by certain special characters maps to special |
114 | values. | |
115 | .sp | |
116 | .Bl -column | |
117 | .It \ea <alert character> | |
118 | .It \eb <backspace> | |
119 | .It \ef <form-feed> | |
120 | .It \en <newline> | |
121 | .It \er <carriage return> | |
122 | .It \et <tab> | |
123 | .It \ev <vertical tab> | |
124 | .El | |
125 | .sp | |
126 | A backslash followed by any other character maps to that character. | |
127 | .It c-c | |
128 | Represents the range of characters between the range endpoints, inclusively. | |
6d5d747e | 129 | .It [:class:] |
304f87e5 KB |
130 | Represents all characters belonging to the defined character class. |
131 | Class names are: | |
132 | .sp | |
133 | .Bl -column | |
134 | .It alnum <alphanumeric characters> | |
135 | .It alpha <alphabetic characters> | |
136 | .It cntrl <control characters> | |
137 | .It digit <numeric characters> | |
138 | .It graph <graphic characters> | |
139 | .It lower <lower-case alphabetic characters> | |
140 | .It print <printable characters> | |
141 | .It punct <punctuation characters> | |
142 | .It space <space characters> | |
143 | .It upper <upper-case characters> | |
144 | .It xdigit <hexadecimal characters> | |
22fbd3aa CL |
145 | .El |
146 | .Pp | |
6d5d747e KB |
147 | .\" All classes may be used in |
148 | .\" .Ar string1 , | |
149 | .\" and in | |
150 | .\" .Ar string2 | |
151 | .\" when both the | |
152 | .\" .Fl d | |
153 | .\" and | |
154 | .\" .Fl s | |
155 | .\" options are specified. | |
156 | .\" Otherwise, only the classes ``upper'' and ``lower'' may be used in | |
157 | .\" .Ar string2 | |
158 | .\" and then only when the corresponding class (``upper'' for ``lower'' | |
159 | .\" and vice-versa) is specified in the same relative position in | |
160 | .\" .Ar string1 . | |
161 | .\" .Pp | |
304f87e5 KB |
162 | With the exception of the ``upper'' and ``lower'' classes, characters |
163 | in the classes are in unspecified order. | |
164 | In the ``upper'' and ``lower'' classes, characters are entered in | |
165 | ascending order. | |
166 | .Pp | |
167 | For specific information as to which ASCII characters are included | |
168 | in these classes, see | |
169 | .Xr ctype 3 | |
170 | and related manual pages. | |
6d5d747e | 171 | .It [=equiv=] |
304f87e5 KB |
172 | Represents all characters or collating (sorting) elements belonging to |
173 | the same equivalence class as | |
174 | .Ar equiv . | |
22fbd3aa | 175 | If |
304f87e5 KB |
176 | there is a secondary ordering within the equivalence class, the characters |
177 | are ordered in ascending sequence. | |
178 | Otherwise, they are ordered by their encoded values. | |
6d5d747e KB |
179 | An example of an equivalence class might be ``c'' and ``ch'' in Spanish; |
180 | English has no equivalence classes. | |
304f87e5 KB |
181 | .It [#*n] |
182 | Represents | |
183 | .Ar n | |
184 | repeated occurrences of the character represented by | |
185 | .Ar # . | |
22fbd3aa CL |
186 | This |
187 | expression is only valid when it occurs in | |
188 | .Ar string2 . | |
304f87e5 KB |
189 | If |
190 | .Ar n | |
191 | is omitted or is zero, it is interpreted as large enough to extend the | |
192 | .Ar string2 | |
193 | sequence to the length of | |
194 | .Ar string1 . | |
195 | If | |
196 | .Ar n | |
197 | has a leading zero, it is interpreted as an octal value; otherwise, | |
198 | it is interpreted as a decimal value. | |
22fbd3aa CL |
199 | .El |
200 | .Pp | |
304f87e5 KB |
201 | The |
202 | .Nm tr | |
203 | utility exits 0 on success, and >0 if an error occurs. | |
204 | .Sh EXAMPLES | |
205 | The following examples are shown as given to the shell: | |
206 | .sp | |
207 | Create a list of the words in file1, one per line, where a word is taken to | |
208 | be a maximal string of letters. | |
209 | .sp | |
210 | .D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1" | |
211 | .sp | |
212 | Translate the contents of file1 to upper-case. | |
213 | .sp | |
214 | .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1" | |
215 | .sp | |
216 | Strip out non-printable characters from file1. | |
217 | .sp | |
66f46587 | 218 | .D1 Li "tr -cd \*q[:print:]\*q < file1" |
304f87e5 KB |
219 | .Sh COMPATIBILITY |
220 | System V has historically implemented character ranges using the syntax | |
221 | ``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and | |
222 | standardized by POSIX. | |
223 | System V shell scripts should work under this implementation as long as | |
224 | the range is intended to map in another range, i.e. the command | |
225 | ``tr [a-z] [A-Z]'' will work as it will map the ``['' character in | |
226 | .Ar string1 | |
227 | to the ``['' character in | |
228 | .Ar string2 . | |
229 | However, if the shell script is deleting or squeezing characters as in | |
230 | the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be | |
231 | included in the deletion or compression list which would not have happened | |
232 | under an historic System V implementation. | |
233 | Additionally, any scripts that depended on the sequence ``a-z'' to | |
234 | represent the three characters ``a'', ``-'' and ``z'' will have to be | |
235 | rewritten as ``a\e-z''. | |
22fbd3aa CL |
236 | .Pp |
237 | The | |
238 | .Nm tr | |
304f87e5 KB |
239 | utility has historically not permitted the manipulation of NUL bytes in |
240 | its input and, additionally, stripped NULs from its input stream. | |
241 | This implementation has removed this behavior as a bug. | |
242 | .Pp | |
243 | The | |
244 | .Nm tr | |
6d5d747e KB |
245 | utility has historically been extremely forgiving of syntax errors; |
246 | for example, the | |
304f87e5 KB |
247 | .Fl c |
248 | and | |
249 | .Fl s | |
250 | options were ignored unless two strings were specified. | |
251 | This implementation will not permit illegal syntax. | |
22fbd3aa CL |
252 | .Sh STANDARDS |
253 | The | |
254 | .Nm tr | |
255 | utility is expected to be | |
f91a266c CL |
256 | .St -p1003.2 |
257 | compatible. | |
304f87e5 KB |
258 | It should be noted that the feature wherein the last character of |
259 | .Ar string2 | |
260 | is duplicated if | |
261 | .Ar string2 | |
262 | has fewer characters than | |
263 | .Ar string1 | |
264 | is permitted by POSIX but is not required. | |
265 | Shell scripts attempting to be portable to other POSIX systems should use | |
266 | the ``[#*]'' convention instead of relying on this behavior. |
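
The portability advice in the STANDARDS section can be illustrated with a short shell sketch. The input string `abcd` and the variable names `implicit` and `explicit` are made up for illustration and do not appear in the manual page. The first command relies on the last character of string2 being duplicated, which the page says POSIX permits but does not require; the second spells out the padding with the ``[#*]'' convention.

```shell
#!/bin/sh
# Sketch only: the sample input and variable names are hypothetical.

# string2 ('xy') is shorter than string1 ('abcd'). The result depends on
# the implementation repeating the last character of string2, so no
# particular output is assumed here.
implicit=$(printf 'abcd\n' | tr 'abcd' 'xy')

# Portable spelling: [y*] explicitly repeats 'y' until string2 is as long
# as string1, so a maps to x and b, c, d all map to y.
explicit=$(printf 'abcd\n' | tr 'abcd' 'x[y*]')

printf 'implicit padding: %s\n' "$implicit"
printf 'explicit padding: %s\n' "$explicit"
```

With the explicit form, the second line of output should read `explicit padding: xyyy`. Quoting the operand also keeps the shell from treating the brackets as a filename pattern.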