Annotation of src/usr.bin/tr/tr.1, Revision 1.4
1.4 ! pjanzen 1: .\" $OpenBSD: tr.1,v 1.3 1998/10/30 00:24:40 aaron Exp $
1.1 deraadt 2: .\" $NetBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
3: .\"
4: .\" Copyright (c) 1991, 1993
5: .\" The Regents of the University of California. All rights reserved.
6: .\"
7: .\" This code is derived from software contributed to Berkeley by
8: .\" the Institute of Electrical and Electronics Engineers, Inc.
9: .\"
10: .\" Redistribution and use in source and binary forms, with or without
11: .\" modification, are permitted provided that the following conditions
12: .\" are met:
13: .\" 1. Redistributions of source code must retain the above copyright
14: .\" notice, this list of conditions and the following disclaimer.
15: .\" 2. Redistributions in binary form must reproduce the above copyright
16: .\" notice, this list of conditions and the following disclaimer in the
17: .\" documentation and/or other materials provided with the distribution.
18: .\" 3. All advertising materials mentioning features or use of this software
19: .\" must display the following acknowledgement:
20: .\" This product includes software developed by the University of
21: .\" California, Berkeley and its contributors.
22: .\" 4. Neither the name of the University nor the names of its contributors
23: .\" may be used to endorse or promote products derived from this software
24: .\" without specific prior written permission.
25: .\"
26: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
27: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
28: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
29: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
30: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
31: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
32: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
33: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
34: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
35: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
36: .\" SUCH DAMAGE.
37: .\"
38: .\" @(#)tr.1 8.1 (Berkeley) 6/6/93
39: .\"
40: .Dd June 6, 1993
41: .Dt TR 1
42: .Os
43: .Sh NAME
44: .Nm tr
45: .Nd translate characters
46: .Sh SYNOPSIS
47: .Nm tr
48: .Op Fl cs
49: .Ar string1 string2
50: .Nm tr
51: .Op Fl c
52: .Fl d
53: .Ar string1
54: .Nm tr
55: .Op Fl c
56: .Fl s
57: .Ar string1
58: .Nm tr
59: .Op Fl c
60: .Fl ds
61: .Ar string1 string2
62: .Sh DESCRIPTION
63: The
64: .Nm tr
65: utility copies the standard input to the standard output with substitution
66: or deletion of selected characters.
67: .Pp
68: The following options are available:
69: .Bl -tag -width Ds
70: .It Fl c
71: Complements the set of characters in
1.4 ! pjanzen 72: .Ar string1 ;
! 73: for instance,
! 74: .Dq -c\ ab
! 75: includes every character except for
! 76: .Dq a
! 77: and
! 78: .Dq b .
1.1 deraadt 79: .It Fl d
80: The
81: .Fl d
82: option causes characters to be deleted from the input.
83: .It Fl s
84: The
85: .Fl s
86: option squeezes multiple occurrences of the characters listed in the last
87: operand (either
88: .Ar string1
89: or
90: .Ar string2 )
91: in the input into a single instance of the character.
92: This occurs after all deletion and translation is completed.
93: .El
94: .Pp
95: In the first synopsis form, the characters in
96: .Ar string1
97: are translated into the characters in
98: .Ar string2
99: where the first character in
100: .Ar string1
101: is translated into the first character in
102: .Ar string2
103: and so on.
104: If
105: .Ar string1
106: is longer than
107: .Ar string2 ,
108: the last character found in
109: .Ar string2
110: is duplicated until
111: .Ar string1
112: is exhausted.
113: .Pp
114: In the second synopsis form, the characters in
115: .Ar string1
116: are deleted from the input.
117: .Pp
118: In the third synopsis form, the characters in
119: .Ar string1
120: are compressed as described for the
121: .Fl s
122: option.
123: .Pp
124: In the fourth synopsis form, the characters in
125: .Ar string1
126: are deleted from the input, and the characters in
127: .Ar string2
128: are compressed as described for the
129: .Fl s
130: option.
131: .Pp
132: The following conventions can be used in
133: .Ar string1
134: and
135: .Ar string2
136: to specify sets of characters:
137: .Bl -tag -width [:equiv:]
138: .It character
139: Any character not described by one of the following conventions
140: represents itself.
141: .It \eoctal
1.4 ! pjanzen 142: A backslash followed by 1, 2, or 3 octal digits represents a character
1.1 deraadt 143: with that encoded value.
144: To follow an octal sequence with a digit as a character, left zero-pad
145: the octal sequence to the full 3 octal digits.
146: .It \echaracter
147: A backslash followed by certain special characters maps to special
148: values.
149: .sp
150: .Bl -column
151: .It \ea <alert character>
152: .It \eb <backspace>
153: .It \ef <form-feed>
154: .It \en <newline>
155: .It \er <carriage return>
156: .It \et <tab>
157: .It \ev <vertical tab>
158: .El
159: .sp
160: A backslash followed by any other character maps to that character.
161: .It c-c
162: Represents the range of characters between the range endpoints, inclusive.
163: .It [:class:]
164: Represents all characters belonging to the defined character class.
165: Class names are:
166: .sp
167: .Bl -column
168: .It alnum <alphanumeric characters>
169: .It alpha <alphabetic characters>
170: .It blank <blank characters>
171: .It cntrl <control characters>
172: .It digit <numeric characters>
173: .It graph <graphic characters>
174: .It lower <lower-case alphabetic characters>
175: .It print <printable characters>
176: .It punct <punctuation characters>
177: .It space <space characters>
178: .It upper <upper-case alphabetic characters>
179: .It xdigit <hexadecimal characters>
180: .El
181: .Pp
182: .\" All classes may be used in
183: .\" .Ar string1 ,
184: .\" and in
185: .\" .Ar string2
186: .\" when both the
187: .\" .Fl d
188: .\" and
189: .\" .Fl s
190: .\" options are specified.
191: .\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
192: .\" .Ar string2
193: .\" and then only when the corresponding class (``upper'' for ``lower''
194: .\" and vice-versa) is specified in the same relative position in
195: .\" .Ar string1 .
196: .\" .Pp
1.4 ! pjanzen 197: With the exception of the
! 198: .Dq upper
! 199: and
! 200: .Dq lower
! 201: classes, characters
1.1 deraadt 202: in the classes are in unspecified order.
1.4 ! pjanzen 203: In the
! 204: .Dq upper
! 205: and
! 206: .Dq lower
! 207: classes, characters are entered in
1.1 deraadt 208: ascending order.
209: .Pp
210: For specific information as to which ASCII characters are included
211: in these classes, see
212: .Xr ctype 3
213: and related manual pages.
214: .It [=equiv=]
215: Represents all characters or collating (sorting) elements belonging to
216: the same equivalence class as
217: .Ar equiv .
218: If
219: there is a secondary ordering within the equivalence class, the characters
220: are ordered in ascending sequence.
1.4 ! pjanzen 221: Otherwise, they are ordered by their encoded values.
! 222: An example of an equivalence class might be
! 223: .Dq c
! 224: and
! 225: .Dq ch
! 226: in Spanish;
1.1 deraadt 227: English has no equivalence classes.
228: .It [#*n]
229: Represents
230: .Ar n
231: repeated occurrences of the character represented by
232: .Ar # .
233: This
234: expression is only valid when it occurs in
235: .Ar string2 .
236: If
237: .Ar n
238: is omitted or is zero, it is interpreted as large enough to extend the
239: .Ar string2
240: sequence to the length of
241: .Ar string1 .
242: If
243: .Ar n
1.4 ! pjanzen 244: has a leading zero, it is interpreted as an octal value; otherwise,
1.1 deraadt 245: it is interpreted as a decimal value.
246: .El
247: .Pp
248: The
249: .Nm tr
1.3 aaron 250: utility exits 0 on success or >0 if an error occurred.
1.1 deraadt 251: .Sh EXAMPLES
252: The following examples are shown as given to the shell:
253: .sp
254: Create a list of the words in file1, one per line, where a word is taken to
255: be a maximal string of letters.
256: .sp
257: .D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
258: .sp
259: Translate the contents of file1 to upper-case.
260: .sp
261: .D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
262: .sp
263: Strip out non-printable characters from file1.
264: .sp
265: .D1 Li "tr -cd \*q[:print:]\*q < file1"
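The three examples above can be tried without a file1 by feeding short strings through a pipe. This is a minimal sketch, assuming a POSIX shell and a tr that accepts the character classes described earlier; the function names are hypothetical helpers introduced only for illustration. The third helper adds a newline to the kept set, a variation on the example above that preserves line breaks.

```shell
# Hypothetical helper names; each wraps one of the documented invocations.

# One word per line: -c complements [:alpha:], -s squeezes the
# resulting run of newlines into a single newline.
words() {
    tr -cs '[:alpha:]' '\n'
}

# Translate lower-case letters to upper-case.
upcase() {
    tr '[:lower:]' '[:upper:]'
}

# Delete everything that is neither printable nor a newline
# (the newline is an addition to the documented example).
printable_only() {
    tr -cd '[:print:]\n'
}

printf 'one, two;three\n' | words
printf 'hello\n' | upcase
printf 'a\001b\n' | printable_only
```

Single quotes keep the shell from treating the brackets as a filename pattern, which serves the same purpose as the double quotes in the examples above.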
266: .Sh COMPATIBILITY
267: System V has historically implemented character ranges using the syntax
1.4 ! pjanzen 268: .Dq [c-c]
! 269: instead of the
! 270: .Dq c-c
! 271: used by historic BSD implementations and
1.1 deraadt 272: standardized by POSIX.
273: System V shell scripts should work under this implementation as long as
274: the range is intended to map onto another range; for example, the command
1.4 ! pjanzen 275: .Dq tr\ [a-z]\ [A-Z]
! 276: will work as it will map the
! 277: .Dq [
! 278: character in
! 279: .Ar string1
! 280: to the
! 281: .Dq [
! 282: character in
1.3 aaron 283: .Ar string2 .
1.1 deraadt 284: However, if the shell script is deleting or squeezing characters as in
1.4 ! pjanzen 285: the command
! 286: .Dq tr\ -d\ [a-z] ,
! 287: the characters
! 288: .Dq [
! 289: and
! 290: .Dq \]
! 291: will be
! 292: included in the deletion or compression list, which would not have happened
1.1 deraadt 293: under an historic System V implementation.
1.4 ! pjanzen 294: Additionally, any scripts that depended on the sequence
! 295: .Dq a-z
! 296: to represent the three characters
! 297: .Dq a ,
! 298: .Dq - ,
! 299: and
! 300: .Dq z
! 301: will have to be rewritten as
! 302: .Dq a\e-z .
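The bracket and hyphen cases described here can be checked with a few one-line pipelines. This is a sketch under the assumption of a POSIX shell; the sample input strings are made up for illustration, and the arguments are quoted so that the shell does not expand the brackets as a filename pattern.

```shell
# Translating: "[" and "]" map to themselves, so the System V
# spelling of the range still upper-cases the letters.
printf '[abc]\n' | tr '[a-z]' '[A-Z]'

# Deleting: the brackets are ordinary members of string1 here,
# so they are removed along with the lower-case letters.
printf '[abc] 123\n' | tr -d '[a-z]'

# A backslash before the hyphen makes it a literal character,
# so only "a", "-", and "z" are deleted.
printf 'a-m-z\n' | tr -d 'a\-z'
```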
1.1 deraadt 303: .Pp
304: The
305: .Nm tr
306: utility has historically not permitted the manipulation of NUL bytes in
1.4 ! pjanzen 307: its input and, additionally, has stripped NULs from its input stream.
1.1 deraadt 308: This implementation treats that behavior as a bug and has removed it.
309: .Pp
310: The
311: .Nm tr
1.4 ! pjanzen 312: utility has historically been extremely forgiving of syntax errors:
1.1 deraadt 313: for example, the
314: .Fl c
315: and
316: .Fl s
317: options were ignored unless two strings were specified.
318: This implementation does not permit invalid syntax.
319: .Sh STANDARDS
320: The
321: .Nm tr
322: utility is expected to be
323: .St -p1003.2
324: compatible.
325: It should be noted that the feature wherein the last character of
326: .Ar string2
327: is duplicated if
328: .Ar string2
329: has fewer characters than
330: .Ar string1
331: is permitted by POSIX but is not required.
332: Shell scripts attempting to be portable to other POSIX systems should use
1.4 ! pjanzen 333: the
! 334: .Dq [#*]
! 335: convention instead of relying on this behavior.