Annotation of src/usr.bin/tr/tr.1, Revision 1.2
1.2 ! deraadt 1: .\" $OpenBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
1.1 deraadt 2: .\" $NetBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
3: .\"
4: .\" Copyright (c) 1991, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" the Institute of Electrical and Electronics Engineers, Inc.
9: .\"
10: .\" Redistribution and use in source and binary forms, with or without
11: .\" modification, are permitted provided that the following conditions
12: .\" are met:
13: .\" 1. Redistributions of source code must retain the above copyright
14: .\" notice, this list of conditions and the following disclaimer.
15: .\" 2. Redistributions in binary form must reproduce the above copyright
16: .\" notice, this list of conditions and the following disclaimer in the
17: .\" documentation and/or other materials provided with the distribution.
18: .\" 3. All advertising materials mentioning features or use of this software
19: .\" must display the following acknowledgement:
20: .\" This product includes software developed by the University of
21: .\" California, Berkeley and its contributors.
22: .\" 4. Neither the name of the University nor the names of its contributors
23: .\" may be used to endorse or promote products derived from this software
24: .\" without specific prior written permission.
25: .\"
26: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
27: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
28: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
29: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
30: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
31: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
32: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
33: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
34: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
35: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
36: .\" SUCH DAMAGE.
37: .\"
38: .\" @(#)tr.1 8.1 (Berkeley) 6/6/93
39: .\"
40: .Dd June 6, 1993
41: .Dt TR 1
42: .Os
43: .Sh NAME
44: .Nm tr
45: .Nd translate characters
46: .Sh SYNOPSIS
47: .Nm tr
48: .Op Fl cs
49: .Ar string1 string2
50: .Nm tr
51: .Op Fl c
52: .Fl d
53: .Ar string1
54: .Nm tr
55: .Op Fl c
56: .Fl s
57: .Ar string1
58: .Nm tr
59: .Op Fl c
60: .Fl ds
61: .Ar string1 string2
62: .Sh DESCRIPTION
63: The
64: .Nm tr
65: utility copies the standard input to the standard output with substitution
66: or deletion of selected characters.
67: .Pp
68: The following options are available:
69: .Bl -tag -width Ds
70: .It Fl c
71: Complements the set of characters in
72: .Ar string1 ,
73: that is, ``-c ab'' includes every character except ``a'' and ``b''.
74: .It Fl d
75: The
76: .Fl d
77: option causes characters to be deleted from the input.
78: .It Fl s
79: The
80: .Fl s
81: option squeezes multiple occurrences of the characters listed in the last
82: operand (either
83: .Ar string1
84: or
85: .Ar string2 )
86: in the input into a single instance of the character.
87: This occurs after all deletion and translation is completed.
88: .El
89: .Pp
90: In the first synopsis form, the characters in
91: .Ar string1
92: are translated into the characters in
93: .Ar string2
94: where the first character in
95: .Ar string1
96: is translated into the first character in
97: .Ar string2
98: and so on.
99: If
100: .Ar string1
101: is longer than
102: .Ar string2 ,
103: the last character found in
104: .Ar string2
105: is duplicated until
106: .Ar string1
107: is exhausted.
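.Pp
A minimal sketch of the positional mapping described above, assuming a
POSIX shell with printf available; the variable name is illustrative only.
Both strings have the same length here, so the result does not depend on
how a shorter string2 is padded.

```shell
# Map e -> i and l -> p; every other character passes through unchanged.
translated=$(printf 'hello\n' | tr 'el' 'ip')
printf '%s\n' "$translated"
```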
108: .Pp
109: In the second synopsis form, the characters in
110: .Ar string1
111: are deleted from the input.
112: .Pp
113: In the third synopsis form, the characters in
114: .Ar string1
115: are compressed as described for the
116: .Fl s
117: option.
118: .Pp
119: In the fourth synopsis form, the characters in
120: .Ar string1
121: are deleted from the input, and the characters in
122: .Ar string2
123: are compressed as described for the
124: .Fl s
125: option.
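.Pp
The second, third and fourth synopsis forms can be compared side by side
with a small sketch like the following.
The input string and variable names are illustrative assumptions, not part
of the utility.

```shell
input='aabbccdd'

# Second form: delete every "a" and "b".
deleted=$(printf '%s\n' "$input" | tr -d 'ab')

# Third form: squeeze runs of "a" and of "b" down to one character each.
squeezed=$(printf '%s\n' "$input" | tr -s 'ab')

# Fourth form: delete every "a", then squeeze runs of "c".
both=$(printf '%s\n' "$input" | tr -ds 'a' 'c')

printf '%s\n' "$deleted" "$squeezed" "$both"
```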
126: .Pp
127: The following conventions can be used in
128: .Ar string1
129: and
130: .Ar string2
131: to specify sets of characters:
132: .Bl -tag -width [:equiv:]
133: .It character
134: Any character not described by one of the following conventions
135: represents itself.
136: .It \eoctal
137: A backslash followed by 1, 2 or 3 octal digits represents a character
138: with that encoded value.
139: To follow an octal sequence with a digit as a character, left zero-pad
140: the octal sequence to the full 3 octal digits.
141: .It \echaracter
142: A backslash followed by certain special characters maps to special
143: values.
144: .sp
145: .Bl -column
146: .It \ea <alert character>
147: .It \eb <backspace>
148: .It \ef <form-feed>
149: .It \en <newline>
150: .It \er <carriage return>
151: .It \et <tab>
152: .It \ev <vertical tab>
153: .El
154: .sp
155: A backslash followed by any other character maps to that character.
156: .It c-c
157: Represents the range of characters between the range endpoints, inclusively.
158: .It [:class:]
159: Represents all characters belonging to the defined character class.
160: Class names are:
161: .sp
162: .Bl -column
163: .It alnum <alphanumeric characters>
164: .It alpha <alphabetic characters>
165: .It blank <blank characters>
166: .It cntrl <control characters>
167: .It digit <numeric characters>
168: .It graph <graphic characters>
169: .It lower <lower-case alphabetic characters>
170: .It print <printable characters>
171: .It punct <punctuation characters>
172: .It space <space characters>
173: .It upper <upper-case characters>
174: .It xdigit <hexadecimal characters>
175: .El
176: .Pp
177: .\" All classes may be used in
178: .\" .Ar string1 ,
179: .\" and in
180: .\" .Ar string2
181: .\" when both the
182: .\" .Fl d
183: .\" and
184: .\" .Fl s
185: .\" options are specified.
186: .\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
187: .\" .Ar string2
188: .\" and then only when the corresponding class (``upper'' for ``lower''
189: .\" and vice-versa) is specified in the same relative position in
190: .\" .Ar string1 .
191: .\" .Pp
192: With the exception of the ``upper'' and ``lower'' classes, characters
193: in the classes are in unspecified order.
194: In the ``upper'' and ``lower'' classes, characters are entered in
195: ascending order.
196: .Pp
197: For specific information as to which ASCII characters are included
198: in these classes, see
199: .Xr ctype 3
200: and related manual pages.
201: .It [=equiv=]
202: Represents all characters or collating (sorting) elements belonging to
203: the same equivalence class as
204: .Ar equiv .
205: If
206: there is a secondary ordering within the equivalence class, the characters
207: are ordered in ascending sequence.
208: Otherwise, they are ordered by their encoded values.
209: An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
210: English has no equivalence classes.
211: .It [#*n]
212: Represents
213: .Ar n
214: repeated occurrences of the character represented by
215: .Ar # .
216: This
217: expression is only valid when it occurs in
218: .Ar string2 .
219: If
220: .Ar n
221: is omitted or is zero, it is interpreted as large enough to extend the
222: .Ar string2
223: sequence to the length of
224: .Ar string1 .
225: If
226: .Ar n
227: has a leading zero, it is interpreted as an octal value; otherwise,
228: it is interpreted as a decimal value.
229: .El
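.Pp
The conventions listed above can be tried out with a short sketch such as
the one below.
It assumes a POSIX shell and an ASCII-compatible locale; the variable names
and sample inputs are illustrative.

```shell
# \octal: \040 is the space character in ASCII.
octal=$(printf 'a b\n' | tr '\040' '_')

# c-c: a range on each side, mapped position by position.
range=$(printf 'hello\n' | tr 'a-z' 'A-Z')

# [:class:]: delete every digit.
class=$(printf 'abc123\n' | tr -d '[:digit:]')

# [#*n] with n omitted: repeat "x" to the length of string1.
repeat=$(printf 'abcabc\n' | tr 'abc' '[x*]')

printf '%s\n' "$octal" "$range" "$class" "$repeat"
```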
230: .Pp
231: The
232: .Nm tr
233: utility exits 0 on success, and >0 if an error occurs.
234: .Sh EXAMPLES
235: The following examples are shown as given to the shell:
236: .sp
237: Create a list of the words in file1, one per line, where a word is taken to
238: be a maximal string of letters.
239: .sp
240: .D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
241: .sp
242: Translate the contents of file1 to upper-case.
243: .sp
244: .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
245: .sp
246: Strip out non-printable characters from file1.
247: .sp
248: .D1 Li "tr -cd \*q[:print:]\*q < file1"
249: .Sh COMPATIBILITY
250: System V has historically implemented character ranges using the syntax
251: ``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and
252: standardized by POSIX.
253: System V shell scripts should work under this implementation as long as
254: the range is intended to map onto another range; for example, the command
255: ``tr [a-z] [A-Z]'' will work, as it maps the ``['' character in
256: .Ar string1
257: to the ``['' character in
258: .Ar string2 .
259: However, if the shell script is deleting or squeezing characters as in
260: the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
261: included in the deletion or compression list which would not have happened
262: under an historic System V implementation.
263: Additionally, any scripts that depended on the sequence ``a-z'' to
264: represent the three characters ``a'', ``-'' and ``z'' will have to be
265: rewritten as ``a\e-z''.
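.Pp
The difference between the System V and POSIX range syntax can be seen in
a sketch like the following, which assumes a tr that follows the POSIX
``c-c'' convention described in this page.
The sample inputs and variable names are illustrative.

```shell
# System V style brackets are ordinary characters here, so they are deleted too.
sysv_style=$(printf '[abc]XYZ\n' | tr -d '[a-z]')

# POSIX style range: only the lower-case letters are deleted.
posix_style=$(printf '[abc]XYZ\n' | tr -d 'a-z')

# Escaping the hyphen yields the three characters "a", "-" and "z".
literal=$(printf 'a-z-m\n' | tr -d 'a\-z')

printf '%s\n' "$sysv_style" "$posix_style" "$literal"
```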
266: .Pp
267: The
268: .Nm tr
269: utility has historically not permitted the manipulation of NUL bytes in
270: its input and, additionally, stripped NULs from its input stream.
271: This implementation has removed this behavior as a bug.
272: .Pp
273: The
274: .Nm tr
275: utility has historically been extremely forgiving of syntax errors;
276: for example, the
277: .Fl c
278: and
279: .Fl s
280: options were ignored unless two strings were specified.
281: This implementation will not permit illegal syntax.
282: .Sh STANDARDS
283: The
284: .Nm tr
285: utility is expected to be
286: .St -p1003.2
287: compatible.
288: It should be noted that the feature wherein the last character of
289: .Ar string2
290: is duplicated if
291: .Ar string2
292: has fewer characters than
293: .Ar string1
294: is permitted by POSIX but is not required.
295: Shell scripts attempting to be portable to other POSIX systems should use
296: the ``[#*]'' convention instead of relying on this behavior.
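.Pp
A minimal sketch of the portable form, assuming a POSIX shell; the variable
name and sample input are illustrative.
Instead of relying on the last character of string2 being duplicated, the
repetition is written out explicitly with the ``[#*]'' convention.

```shell
# Map "a" to "x" and each of "b", "c", "d", "e" to "y".
portable=$(printf 'abcde\n' | tr 'abcde' 'x[y*]')
printf '%s\n' "$portable"
```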