Annotation of src/usr.bin/tr/tr.1, Revision 1.6
1.6 ! aaron 1: .\" $OpenBSD: tr.1,v 1.5 2000/03/05 00:28:55 aaron Exp $
1.1 deraadt 2: .\" $NetBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
3: .\"
4: .\" Copyright (c) 1991, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" the Institute of Electrical and Electronics Engineers, Inc.
9: .\"
10: .\" Redistribution and use in source and binary forms, with or without
11: .\" modification, are permitted provided that the following conditions
12: .\" are met:
13: .\" 1. Redistributions of source code must retain the above copyright
14: .\" notice, this list of conditions and the following disclaimer.
15: .\" 2. Redistributions in binary form must reproduce the above copyright
16: .\" notice, this list of conditions and the following disclaimer in the
17: .\" documentation and/or other materials provided with the distribution.
18: .\" 3. All advertising materials mentioning features or use of this software
19: .\" must display the following acknowledgement:
20: .\" This product includes software developed by the University of
21: .\" California, Berkeley and its contributors.
22: .\" 4. Neither the name of the University nor the names of its contributors
23: .\" may be used to endorse or promote products derived from this software
24: .\" without specific prior written permission.
25: .\"
26: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
27: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
28: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
29: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
30: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
31: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
32: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
33: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
34: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
35: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
36: .\" SUCH DAMAGE.
37: .\"
38: .\" @(#)tr.1 8.1 (Berkeley) 6/6/93
39: .\"
40: .Dd June 6, 1993
41: .Dt TR 1
42: .Os
43: .Sh NAME
44: .Nm tr
45: .Nd translate characters
46: .Sh SYNOPSIS
47: .Nm tr
48: .Op Fl cs
49: .Ar string1 string2
50: .Nm tr
51: .Op Fl c
52: .Fl d
53: .Ar string1
54: .Nm tr
55: .Op Fl c
56: .Fl s
57: .Ar string1
58: .Nm tr
59: .Op Fl c
60: .Fl ds
61: .Ar string1 string2
62: .Sh DESCRIPTION
63: The
1.6 ! aaron 64: .Nm
1.1 deraadt 65: utility copies the standard input to the standard output with substitution
66: or deletion of selected characters.
67: .Pp
1.5 aaron 68: The options are as follows:
1.1 deraadt 69: .Bl -tag -width Ds
70: .It Fl c
71: Complements the set of characters in
1.4 pjanzen 72: .Ar string1 ;
73: for instance,
74: .Dq -c\ ab
75: includes every character except for
76: .Dq a
77: and
78: .Dq b .
1.1 deraadt 79: .It Fl d
80: The
81: .Fl d
82: option causes characters to be deleted from the input.
83: .It Fl s
84: The
85: .Fl s
86: option squeezes multiple occurrences of the characters listed in the last
87: operand (either
88: .Ar string1
89: or
90: .Ar string2 )
91: in the input into a single instance of the character.
92: This occurs after all deletion and translation is completed.
93: .El
94: .Pp
95: In the first synopsis form, the characters in
96: .Ar string1
97: are translated into the characters in
98: .Ar string2
99: where the first character in
100: .Ar string1
101: is translated into the first character in
102: .Ar string2
103: and so on.
104: If
105: .Ar string1
106: is longer than
107: .Ar string2 ,
108: the last character found in
109: .Ar string2
110: is duplicated until
111: .Ar string1
112: is exhausted.
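.Pp
The positional mapping described above can be tried with a short sketch.
The sample input strings are illustrative only, and the padded result in the
second command relies on the duplication behavior this page describes, which
other implementations may not share.

```shell
# Positional translation: a->x, b->y, c->z.
mapped=$(printf 'aabbcc\n' | tr 'abc' 'xyz')
echo "$mapped"    # xxyyzz

# string1 is longer than string2: per the text above, the last character
# of string2 ('y') stands in for the unmatched characters c, d and e.
# This result is implementation-dependent.
padded=$(printf 'abcde\n' | tr 'abcde' 'xy')
echo "$padded"
```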
113: .Pp
114: In the second synopsis form, the characters in
115: .Ar string1
116: are deleted from the input.
117: .Pp
118: In the third synopsis form, the characters in
119: .Ar string1
120: are compressed as described for the
121: .Fl s
122: option.
123: .Pp
124: In the fourth synopsis form, the characters in
125: .Ar string1
126: are deleted from the input, and the characters in
127: .Ar string2
128: are compressed as described for the
129: .Fl s
130: option.
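.Pp
As a minimal sketch of the second, third, and fourth synopsis forms, the
commands below run each form on a made-up input string.
The variable names are only for illustration.

```shell
# Second form: delete every 'a' from the input.
deleted=$(printf 'banana\n' | tr -d 'a')
echo "$deleted"     # bnn

# Third form: squeeze runs of 'a' and 'b'; runs of 'c' are left alone.
squeezed=$(printf 'aaabbbccc\n' | tr -s 'ab')
echo "$squeezed"    # abccc

# Fourth form: delete 'a' (string1), then squeeze runs of 'c' (string2).
both=$(printf 'aabbccdd\n' | tr -ds 'a' 'c')
echo "$both"        # bbcdd
```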
131: .Pp
132: The following conventions can be used in
133: .Ar string1
134: and
135: .Ar string2
136: to specify sets of characters:
137: .Bl -tag -width [:equiv:]
138: .It character
139: Any character not described by one of the following conventions
140: represents itself.
141: .It \eoctal
1.4 pjanzen 142: A backslash followed by 1, 2, or 3 octal digits represents a character
1.1 deraadt 143: with that encoded value.
144: To follow an octal sequence with a digit as a character, left zero-pad
145: the octal sequence to the full 3 octal digits.
146: .It \echaracter
147: A backslash followed by certain special characters maps to special
148: values.
1.6 ! aaron 149: .Pp
1.1 deraadt 150: .Bl -column
151: .It \ea <alert character>
152: .It \eb <backspace>
153: .It \ef <form-feed>
154: .It \en <newline>
155: .It \er <carriage return>
156: .It \et <tab>
157: .It \ev <vertical tab>
158: .El
1.6 ! aaron 159: .Pp
1.1 deraadt 160: A backslash followed by any other character maps to that character.
161: .It c-c
162: Represents the range of characters between the range endpoints, inclusive.
163: .It [:class:]
164: Represents all characters belonging to the defined character class.
165: Class names are:
1.6 ! aaron 166: .Pp
1.1 deraadt 167: .Bl -column
168: .It alnum <alphanumeric characters>
169: .It alpha <alphabetic characters>
170: .It blank <blank characters>
171: .It cntrl <control characters>
172: .It digit <numeric characters>
173: .It graph <graphic characters>
174: .It lower <lower-case alphabetic characters>
175: .It print <printable characters>
176: .It punct <punctuation characters>
177: .It space <space characters>
178: .It upper <upper-case characters>
179: .It xdigit <hexadecimal characters>
180: .El
181: .Pp
182: .\" All classes may be used in
183: .\" .Ar string1 ,
184: .\" and in
185: .\" .Ar string2
186: .\" when both the
187: .\" .Fl d
188: .\" and
189: .\" .Fl s
190: .\" options are specified.
191: .\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
192: .\" .Ar string2
193: .\" and then only when the corresponding class (``upper'' for ``lower''
194: .\" and vice-versa) is specified in the same relative position in
195: .\" .Ar string1 .
196: .\" .Pp
1.4 pjanzen 197: With the exception of the
198: .Dq upper
199: and
200: .Dq lower
201: classes, characters
1.1 deraadt 202: in the classes are in unspecified order.
1.4 pjanzen 203: In the
204: .Dq upper
205: and
206: .Dq lower
207: classes, characters are entered in
1.1 deraadt 208: ascending order.
209: .Pp
210: For specific information as to which ASCII characters are included
211: in these classes, see
212: .Xr ctype 3
213: and related manual pages.
214: .It [=equiv=]
215: Represents all characters or collating (sorting) elements belonging to
216: the same equivalence class as
217: .Ar equiv .
218: If
219: there is a secondary ordering within the equivalence class, the characters
220: are ordered in ascending sequence.
1.4 pjanzen 221: Otherwise, they are ordered by their encoded values.
222: An example of an equivalence class might be
223: .Dq c
224: and
225: .Dq ch
226: in Spanish;
1.1 deraadt 227: English has no equivalence classes.
228: .It [#*n]
229: Represents
230: .Ar n
231: repeated occurrences of the character represented by
232: .Ar # .
233: This
234: expression is only valid when it occurs in
235: .Ar string2 .
236: If
237: .Ar n
238: is omitted or is zero, it is interpreted as large enough to extend the
239: .Ar string2
240: sequence to the length of
241: .Ar string1 .
242: If
243: .Ar n
1.4 pjanzen 244: has a leading zero, it is interpreted as an octal value; otherwise,
1.1 deraadt 245: it is interpreted as a decimal value.
246: .El
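.Pp
The conventions listed above can be sketched with a few one-line commands.
The inputs are invented for illustration, and the octal example assumes an
ASCII-compatible encoding in which octal 101 is the letter A.

```shell
# \octal: \101 is 'A' in ASCII, so 'a' is translated to 'A'.
octal=$(printf 'a\n' | tr 'a' '\101')
echo "$octal"     # A

# c-c: a range in each string, mapped position by position.
range=$(printf 'hello\n' | tr 'a-z' 'A-Z')
echo "$range"     # HELLO

# [:class:] with -c and -d: delete everything that is not a digit
# (this also removes the trailing newline).
digits=$(printf 'tel: 555-0100\n' | tr -cd '[:digit:]')
echo "$digits"    # 5550100

# [#*]: with n omitted, 'x' is repeated to match the length of string1.
masked=$(printf 'secret\n' | tr 'a-z' '[x*]')
echo "$masked"    # xxxxxx
```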
247: .Pp
248: The
1.6 ! aaron 249: .Nm
1.3 aaron 250: utility exits 0 on success or >0 if an error occurred.
1.1 deraadt 251: .Sh EXAMPLES
252: The following examples are shown as given to the shell:
1.6 ! aaron 253: .Pp
1.1 deraadt 254: Create a list of the words in file1, one per line, where a word is taken to
255: be a maximal string of letters.
1.6 ! aaron 256: .Pp
1.1 deraadt 257: .D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
1.6 ! aaron 258: .Pp
1.1 deraadt 259: Translate the contents of file1 to upper-case.
1.6 ! aaron 260: .Pp
1.1 deraadt 261: .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
1.6 ! aaron 262: .Pp
1.1 deraadt 263: Strip out non-printable characters from file1.
1.6 ! aaron 264: .Pp
1.1 deraadt 265: .D1 Li "tr -cd \*q[:print:]\*q < file1"
1.6 ! aaron 266: .Sh SEE ALSO
! 267: .Xr sed 1
1.1 deraadt 268: .Sh COMPATIBILITY
269: System V has historically implemented character ranges using the syntax
1.4 pjanzen 270: .Dq [c-c]
271: instead of the
272: .Dq c-c
273: used by historic BSD implementations and
1.1 deraadt 274: standardized by POSIX.
275: System V shell scripts should work under this implementation as long as
1.6 ! aaron 276: the range is intended to map onto another range; e.g., the command
1.4 pjanzen 277: .Dq tr\ [a-z]\ [A-Z]
278: will work as it will map the
279: .Dq [
280: character in
281: .Ar string1
282: to the
283: .Dq [
284: character in
1.3 aaron 285: .Ar string2 .
1.1 deraadt 286: However, if the shell script is deleting or squeezing characters as in
1.4 pjanzen 287: the command
288: .Dq tr\ -d\ [a-z] ,
289: the characters
290: .Dq [
291: and
292: .Dq \]
293: will be
294: included in the deletion or compression list, which would not have happened
1.1 deraadt 295: under an historic System V implementation.
1.4 pjanzen 296: Additionally, any scripts that depended on the sequence
297: .Dq a-z
298: to represent the three characters
299: .Dq a ,
300: .Dq - ,
301: and
302: .Dq z
303: will have to be rewritten as
304: .Dq a\e-z .
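.Pp
The difference in range syntax can be seen with a small sketch.
The input string is made up, and the results shown are those expected from
an implementation that follows the BSD and POSIX syntax described here; a
historic System V implementation would behave differently.

```shell
# System V style: the brackets are ordinary characters here, so they are
# deleted along with the lower-case letters.
sysv_style=$(printf '[abc]XYZ\n' | tr -d '[a-z]')
echo "$sysv_style"     # XYZ

# BSD/POSIX style: only the lower-case letters are deleted.
posix_style=$(printf '[abc]XYZ\n' | tr -d 'a-z')
echo "$posix_style"    # []XYZ
```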
1.1 deraadt 305: .Pp
306: The
1.6 ! aaron 307: .Nm
1.1 deraadt 308: utility has historically not permitted the manipulation of NUL bytes in
1.4 pjanzen 309: its input and, additionally, has stripped NULs from its input stream.
1.1 deraadt 310: This implementation treats that behavior as a bug and has removed it.
311: .Pp
312: The
1.6 ! aaron 313: .Nm
1.4 pjanzen 314: utility has historically been extremely forgiving of syntax errors:
1.1 deraadt 315: for example, the
316: .Fl c
317: and
318: .Fl s
319: options were ignored unless two strings were specified.
320: This implementation will not permit invalid syntax.
321: .Sh STANDARDS
322: The
1.6 ! aaron 323: .Nm
1.1 deraadt 324: utility is expected to be
325: .St -p1003.2
326: compatible.
327: It should be noted that the feature wherein the last character of
328: .Ar string2
329: is duplicated if
330: .Ar string2
331: has fewer characters than
332: .Ar string1
333: is permitted by POSIX but is not required.
334: Shell scripts attempting to be portable to other POSIX systems should use
1.4 pjanzen 335: the
336: .Dq [#*]
337: convention instead of relying on this behavior.