Annotation of src/usr.bin/sed/sed.1, Revision 1.43
1.43 ! jmc 1: .\" $OpenBSD: sed.1,v 1.42 2014/05/27 07:00:44 jmc Exp $
1.12 aaron 2: .\"
1.1 deraadt 3: .\" Copyright (c) 1992, 1993
4: .\" The Regents of the University of California. All rights reserved.
5: .\"
6: .\" This code is derived from software contributed to Berkeley by
7: .\" the Institute of Electrical and Electronics Engineers, Inc.
8: .\"
9: .\" Redistribution and use in source and binary forms, with or without
10: .\" modification, are permitted provided that the following conditions
11: .\" are met:
12: .\" 1. Redistributions of source code must retain the above copyright
13: .\" notice, this list of conditions and the following disclaimer.
14: .\" 2. Redistributions in binary form must reproduce the above copyright
15: .\" notice, this list of conditions and the following disclaimer in the
16: .\" documentation and/or other materials provided with the distribution.
1.17 millert 17: .\" 3. Neither the name of the University nor the names of its contributors
1.1 deraadt 18: .\" may be used to endorse or promote products derived from this software
19: .\" without specific prior written permission.
20: .\"
21: .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
22: .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
23: .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
24: .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
25: .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
26: .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
27: .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
28: .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
29: .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
30: .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
31: .\" SUCH DAMAGE.
32: .\"
33: .\" from: @(#)sed.1 8.2 (Berkeley) 12/30/93
34: .\"
1.43 ! jmc 35: .Dd $Mdocdate: May 27 2014 $
1.1 deraadt 36: .Dt SED 1
37: .Os
38: .Sh NAME
39: .Nm sed
40: .Nd stream editor
41: .Sh SYNOPSIS
42: .Nm sed
1.34 jmc 43: .Op Fl aEnru
1.1 deraadt 44: .Ar command
1.29 sobrado 45: .Op Ar
1.1 deraadt 46: .Nm sed
1.33 djm 47: .Op Fl aEnru
1.1 deraadt 48: .Op Fl e Ar command
49: .Op Fl f Ar command_file
1.29 sobrado 50: .Op Ar
1.1 deraadt 51: .Sh DESCRIPTION
52: The
1.8 aaron 53: .Nm
1.1 deraadt 54: utility reads the specified files, or the standard input if no files
55: are specified, modifying the input as specified by a list of commands.
56: The input is then written to the standard output.
57: .Pp
58: A single command may be specified as the first argument to
59: .Nm sed .
1.25 jmc 60: Multiple commands may be specified
61: separated by newlines or semicolons,
62: or by using the
1.1 deraadt 63: .Fl e
64: or
65: .Fl f
66: options.
67: All commands are applied to the input in the order they are specified
68: regardless of their origin.
69: .Pp
1.10 aaron 70: The options are as follows:
1.16 aaron 71: .Bl -tag -width Ds
1.1 deraadt 72: .It Fl a
73: The files listed as parameters for the
1.41 jmc 74: .Ic w
75: function or flag are created (or truncated) before any processing begins,
1.1 deraadt 76: by default.
77: The
78: .Fl a
79: option causes
1.8 aaron 80: .Nm
1.1 deraadt 81: to delay opening each file until a command containing the related
1.41 jmc 82: .Ic w
83: function or flag is applied to a line of input.
1.34 jmc 84: .It Fl E
85: Interpret regular expressions using POSIX extended regular expression syntax.
86: The default behaviour is to use POSIX basic regular expression syntax.
1.1 deraadt 87: .It Fl e Ar command
88: Append the editing commands specified by the
89: .Ar command
90: argument
91: to the list of commands.
92: .It Fl f Ar command_file
93: Append the editing commands found in the file
94: .Ar command_file
95: to the list of commands.
96: The editing commands should each be listed on a separate line.
1.33 djm 97: .It Fl r
1.34 jmc 98: An alias for
99: .Fl E ,
100: for compatibility with GNU sed.
1.1 deraadt 101: .It Fl n
102: By default, each line of input is echoed to the standard output after
103: all of the commands have been applied to it.
104: The
105: .Fl n
106: option suppresses this behavior.
1.26 ray 107: .It Fl u
108: Force output to be line buffered,
109: printing each line as it becomes available.
110: By default, output is line buffered when standard output is a terminal
111: and block buffered otherwise.
112: See
113: .Xr setbuf 3
114: for a more detailed explanation.
1.1 deraadt 115: .El
116: .Pp
117: The form of a
1.8 aaron 118: .Nm
1.1 deraadt 119: command is as follows:
1.21 jmc 120: .Pp
1.1 deraadt 121: .Dl [address[,address]]function[arguments]
1.21 jmc 122: .Pp
1.1 deraadt 123: Whitespace may be inserted before the first address and the function
124: portions of the command.
125: .Pp
126: Normally,
1.8 aaron 127: .Nm
1.1 deraadt 128: cyclically copies a line of input, not including its terminating newline
129: character, into a
1.21 jmc 130: .Em pattern space ,
1.1 deraadt 131: (unless there is something left after a
1.37 jmc 132: .Ic D
1.1 deraadt 133: function),
134: applies all of the commands with addresses that select that pattern space,
135: copies the pattern space to the standard output, appending a newline, and
136: deletes the pattern space.
137: .Pp
138: Some of the functions use a
1.21 jmc 139: .Em hold space
1.1 deraadt 140: to save all or part of the pattern space for subsequent retrieval.
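.Pp
As a rough illustration of the options and command ordering described above
(a sketch only; the sample input and the shell variable names are invented
for this example and are not part of the utility):

```shell
# Illustrative input only; the text and variable names are assumptions.
input='hello world'

# Two -e commands are applied to each line in the order given.
ordered=$(printf '%s\n' "$input" | sed -e 's/hello/goodbye/' -e 's/world/moon/')

# With -n nothing is echoed by default; the p function writes the line once.
printed=$(printf '%s\n' "$input" | sed -n 'p')

# -E selects extended regular expressions, so + needs no backslash.
extended=$(printf '%s\n' "$input" | sed -E 's/l+/L/')

printf '%s\n' "$ordered" "$printed" "$extended"
```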
1.21 jmc 141: .Sh SED ADDRESSES
1.1 deraadt 142: An address is not required, but if specified must be a number (that counts
143: input lines
1.21 jmc 144: cumulatively across input files), a dollar character
1.8 aaron 145: .Pq Ql $
1.21 jmc 146: that addresses the last line of input, or a context address
1.1 deraadt 147: (which consists of a regular expression preceded and followed by a
148: delimiter).
149: .Pp
150: A command line with no addresses selects every pattern space.
151: .Pp
152: A command line with one address selects all of the pattern spaces
153: that match the address.
154: .Pp
155: A command line with two addresses selects the inclusive range from
156: the first pattern space that matches the first address through the next
157: pattern space that matches the second.
158: (If the second address is a number less than or equal to the line number
159: first selected, only that line is selected.)
160: Starting at the first line following the selected range,
1.8 aaron 161: .Nm
1.1 deraadt 162: starts looking again for the first address.
163: .Pp
164: Editing commands can be applied to non-selected pattern spaces by use
165: of the exclamation character
1.18 jmc 166: .Pq Ql \&!
1.1 deraadt 167: function.
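.Pp
The addressing rules above can be sketched as follows.
The five-line sample input and the variable names are assumptions made
for illustration only.

```shell
# Hypothetical sample input: five short lines.
sample() { printf '%s\n' one two three four five; }

# Numeric range: select lines 2 through 4 inclusive.
range_out=$(sample | sed -n '2,4p')

# Context address through the last line of input ($).
tail_out=$(sample | sed -n '/four/,$p')

# Second address not greater than the first: only line 3 is selected.
single_out=$(sample | sed -n '3,2p')

# The ! function applies p to the lines NOT selected by 2,4.
neg_out=$(sample | sed -n '2,4!p')

printf '%s\n' "$range_out" "$tail_out" "$single_out" "$neg_out"
```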
1.21 jmc 168: .Sh SED REGULAR EXPRESSIONS
1.34 jmc 169: By default,
1.8 aaron 170: .Nm
1.23 jmc 171: regular expressions are basic regular expressions
172: .Pq BREs .
1.34 jmc 173: Extended regular expressions are supported using the
174: .Fl E
175: and
176: .Fl r
177: options.
1.23 jmc 178: See
1.19 jmc 179: .Xr re_format 7
1.23 jmc 180: for more information on regular expressions.
1.1 deraadt 181: In addition,
1.8 aaron 182: .Nm
1.23 jmc 183: has the following two additions to BREs:
1.21 jmc 184: .Pp
1.1 deraadt 185: .Bl -enum -compact
186: .It
187: In a context address, any character other than a backslash
1.8 aaron 188: .Pq Ql \e
1.1 deraadt 189: or newline character may be used to delimit the regular expression.
1.30 jmc 190: The opening delimiter should be preceded by a backslash
191: unless it is a slash.
192: Putting a backslash character before the delimiting character
1.1 deraadt 193: causes the character to be treated literally.
194: For example, in the context address \exabc\exdefx, the RE delimiter
195: is an
1.8 aaron 196: .Sq x
1.1 deraadt 197: and the second
1.8 aaron 198: .Sq x
1.1 deraadt 199: stands for itself, so that the regular expression is
200: .Dq abcxdef .
1.21 jmc 201: .Pp
1.1 deraadt 202: .It
203: The escape sequence \en matches a newline character embedded in the
204: pattern space.
205: You can't, however, use a literal newline character in an address or
206: in the substitute command.
207: .El
208: .Pp
209: One special feature of
1.8 aaron 210: .Nm
1.1 deraadt 211: regular expressions is that they can default to the last regular
212: expression used.
1.13 aaron 213: If a regular expression is empty, i.e., just the delimiter characters
1.1 deraadt 214: are specified, the last regular expression encountered is used instead.
215: The last regular expression is defined as the last regular expression
216: used as part of an address or substitute command, and at run-time, not
217: compile-time.
218: For example, the command
219: .Dq /abc/s//XXX/
220: will substitute
221: .Dq XXX
222: for the pattern
223: .Dq abc .
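.Pp
A small sketch of the points above: an alternative delimiter in a context
address (introduced by a backslash), the empty regular expression reusing
the last one used at run time, and the newline escape matching an embedded
newline.
The input strings and variable names are invented for illustration.

```shell
# A comma delimits the context address; the opening comma needs a backslash.
alt_delim=$(printf '%s\n' /usr/bin /etc | sed -n '\,^/usr,p')

# The empty RE in s//XXX/ reuses abc, the last RE used at run time.
empty_re=$(printf '%s\n' 'abc abc' 'def' | sed '/abc/s//XXX/')

# N appends the next line; \n then matches the embedded newline.
joined=$(printf '%s\n' a b | sed 'N;s/\n/+/')

printf '%s\n' "$alt_delim" "$empty_re" "$joined"
```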
1.21 jmc 224: .Sh SED FUNCTIONS
1.1 deraadt 225: In the following list of commands, the maximum number of permissible
226: addresses for each command is indicated by [0addr], [1addr], or [2addr],
227: representing zero, one, or two addresses.
228: .Pp
229: The argument
1.37 jmc 230: .Ar text
1.1 deraadt 231: consists of one or more lines.
232: To embed a newline in the text, precede it with a backslash.
233: Other backslashes in text are deleted and the following character
234: taken literally.
235: .Pp
236: The
1.40 jmc 237: .Ic r
1.1 deraadt 238: and
1.40 jmc 239: .Ic w
240: functions,
241: as well as the
242: .Cm w
243: flag to the
244: .Ic s
245: function,
246: take an optional
247: .Ar file
248: parameter,
249: which should be separated from the function or flag by whitespace.
250: Files are created
251: (or their contents truncated)
252: before any input processing begins.
1.1 deraadt 253: .Pp
254: The
1.40 jmc 255: .Ic b ,
256: .Ic r ,
257: .Ic s ,
258: .Ic t ,
259: .Ic w ,
260: .Ic y ,
1.1 deraadt 261: and
1.40 jmc 262: .Ic \&:
1.1 deraadt 263: functions all accept additional arguments.
1.40 jmc 264: The synopses below indicate which arguments have to be separated from
1.9 aaron 265: the function letters by whitespace characters.
1.1 deraadt 266: .Pp
1.41 jmc 267: Functions can be combined to form a
268: .Em function list ,
269: a list of
1.8 aaron 270: .Nm
1.1 deraadt 271: functions separated by newlines, as follows:
272: .Bd -literal -offset indent
273: { function
274: function
275: ...
276: function
277: }
278: .Ed
279: .Pp
280: The
1.8 aaron 281: .Ql {
1.9 aaron 282: can be preceded or followed by whitespace.
283: The function can be preceded by whitespace as well.
1.1 deraadt 284: The terminating
1.8 aaron 285: .Ql }
1.9 aaron 286: must be preceded by a newline or optional whitespace.
1.38 jmc 287: .Pp
1.40 jmc 288: Functions and function lists may be preceded by an exclamation mark,
1.38 jmc 289: in which case they are applied only to lines that are
290: .Em not
291: selected by the addresses.
1.37 jmc 292: .Bl -tag -width Ds
293: .It [2addr] Ar function-list
1.15 aaron 294: Execute
1.37 jmc 295: .Ar function-list
1.15 aaron 296: only when the pattern space is selected.
1.37 jmc 297: .It Xo [1addr] Ic a Ns \e
298: .br
299: .Ar text
300: .Xc
1.21 jmc 301: .Pp
1.1 deraadt 302: Write
1.37 jmc 303: .Ar text
1.1 deraadt 304: to standard output immediately before each attempt to read a line of input,
305: whether by executing the
1.37 jmc 306: .Ic N
1.1 deraadt 307: function or by beginning a new cycle.
1.37 jmc 308: .It [2addr] Ns Ic b Bq Ar label
1.1 deraadt 309: Branch to the
1.37 jmc 310: .Ic \&:
311: function with the specified
312: .Ar label .
1.1 deraadt 313: If the label is not specified, branch to the end of the script.
1.37 jmc 314: .It Xo [2addr] Ic c Ns \e
315: .br
316: .Ar text
317: .Xc
1.21 jmc 318: .Pp
1.1 deraadt 319: Delete the pattern space.
320: With 0 or 1 address or at the end of a 2-address range,
1.37 jmc 321: .Ar text
1.1 deraadt 322: is written to the standard output.
1.37 jmc 323: .It [2addr] Ns Ic d
1.1 deraadt 324: Delete the pattern space and start the next cycle.
1.37 jmc 325: .It [2addr] Ns Ic D
1.1 deraadt 326: Delete the initial segment of the pattern space through the first
327: newline character and start the next cycle.
1.37 jmc 328: .It [2addr] Ns Ic g
1.1 deraadt 329: Replace the contents of the pattern space with the contents of the
330: hold space.
1.37 jmc 331: .It [2addr] Ns Ic G
1.1 deraadt 332: Append a newline character followed by the contents of the hold space
333: to the pattern space.
1.37 jmc 334: .It [2addr] Ns Ic h
1.1 deraadt 335: Replace the contents of the hold space with the contents of the
336: pattern space.
1.37 jmc 337: .It [2addr] Ns Ic H
1.1 deraadt 338: Append a newline character followed by the contents of the pattern space
339: to the hold space.
1.37 jmc 340: .It Xo [1addr] Ic i Ns \e
341: .br
342: .Ar text
343: .Xc
1.21 jmc 344: .Pp
1.1 deraadt 345: Write
1.37 jmc 346: .Ar text
1.1 deraadt 347: to the standard output.
1.37 jmc 348: .It [2addr] Ns Ic l
1.1 deraadt 349: (The letter ell.)
350: Write the pattern space to the standard output in a visually unambiguous
351: form.
352: This form is as follows:
1.21 jmc 353: .Pp
1.1 deraadt 354: .Bl -tag -width "carriage-returnXX" -offset indent -compact
355: .It backslash
1.3 deraadt 356: \e\e
1.1 deraadt 357: .It alert
358: \ea
1.31 millert 359: .It backspace
360: \eb
1.1 deraadt 361: .It form-feed
362: \ef
363: .It carriage-return
364: \er
365: .It tab
366: \et
367: .It vertical tab
368: \ev
369: .El
370: .Pp
1.15 aaron 371: Non-printable characters are written as three-digit octal numbers (with a
1.1 deraadt 372: preceding backslash) for each byte in the character (most significant byte
373: first).
374: Long lines are folded, with the point of folding indicated by displaying
375: a backslash followed by a newline.
376: The end of each line is marked with a
1.8 aaron 377: .Ql $ .
1.37 jmc 378: .It [2addr] Ns Ic n
1.1 deraadt 379: Write the pattern space to the standard output if the default output has
380: not been suppressed, and replace the pattern space with the next line of
381: input.
1.37 jmc 382: .It [2addr] Ns Ic N
1.1 deraadt 383: Append the next line of input to the pattern space, using an embedded
384: newline character to separate the appended material from the original
385: contents.
386: Note that the current line number changes.
1.37 jmc 387: .It [2addr] Ns Ic p
1.1 deraadt 388: Write the pattern space to standard output.
1.37 jmc 389: .It [2addr] Ns Ic P
1.39 jmc 390: Write the pattern space, up to the first newline character,
391: to the standard output.
1.37 jmc 392: .It [1addr] Ns Ic q
1.1 deraadt 393: Branch to the end of the script and quit without starting a new cycle.
1.37 jmc 394: .It [1addr] Ns Ic r Ar file
1.1 deraadt 395: Copy the contents of
1.37 jmc 396: .Ar file
1.1 deraadt 397: to the standard output immediately before the next attempt to read a
398: line of input.
399: If
1.37 jmc 400: .Ar file
1.1 deraadt 401: cannot be read for any reason, it is silently ignored and no error
402: condition is set.
1.37 jmc 403: .It [2addr] Ns Ic s Ns / Ns Ar RE Ns / Ns Ar replacement Ns / Ns Ar flags
404: Substitute the
405: .Ar replacement
406: string for the first instance of the regular expression
407: .Ar RE
408: in the pattern space.
1.1 deraadt 409: Any character other than backslash or newline can be used instead of
1.37 jmc 410: a slash to delimit the regular expression and the replacement.
411: Within the regular expression and the replacement,
412: the regular expression delimiter itself can be used as
1.1 deraadt 413: a literal character if it is preceded by a backslash.
414: .Pp
415: An ampersand
1.8 aaron 416: .Pq Ql &
1.37 jmc 417: appearing in the replacement is replaced by the string matching the
418: regular expression.
1.1 deraadt 419: The special meaning of
1.8 aaron 420: .Ql &
1.1 deraadt 421: in this context can be suppressed by preceding it by a backslash.
422: The string
1.8 aaron 423: .Ql \e# ,
1.1 deraadt 424: where
1.8 aaron 425: .Ql #
1.1 deraadt 426: is a digit, is replaced by the text matched
427: by the corresponding backreference expression (see
1.14 aaron 428: .Xr re_format 7 ) .
1.1 deraadt 429: .Pp
430: A line can be split by substituting a newline character into it.
431: To specify a newline character in the replacement string, precede it with
432: a backslash.
433: .Pp
434: The value of
1.37 jmc 435: .Ar flags
1.1 deraadt 436: in the substitute function is zero or more of the following:
437: .Bl -tag -width "XXXXXX" -offset indent
1.37 jmc 438: .It Cm 1 No ... Cm 9
1.1 deraadt 439: Make the substitution only for the N'th occurrence of the regular
440: expression in the pattern space.
1.37 jmc 441: .It Cm g
1.1 deraadt 442: Make the substitution for all non-overlapping matches of the
443: regular expression, not just the first one.
1.37 jmc 444: .It Cm p
1.1 deraadt 445: Write the pattern space to standard output if a replacement was made.
446: If the replacement string is identical to that which it replaces, it
447: is still considered to have been a replacement.
1.37 jmc 448: .It Cm w Ar file
1.1 deraadt 449: Append the pattern space to
1.37 jmc 450: .Ar file
1.1 deraadt 451: if a replacement was made.
452: If the replacement string is identical to that which it replaces, it
453: is still considered to have been a replacement.
454: .El
1.37 jmc 455: .It [2addr] Ns Ic t Bq Ar label
1.1 deraadt 456: Branch to the
1.37 jmc 457: .Ic \&:
458: function bearing the
459: .Ar label
460: if any substitutions have been made since the
1.1 deraadt 461: most recent reading of an input line or execution of a
1.37 jmc 462: .Ic t
1.1 deraadt 463: function.
464: If no label is specified, branch to the end of the script.
1.37 jmc 465: .It [2addr] Ns Ic w Ar file
1.1 deraadt 466: Append the pattern space to the
1.37 jmc 467: .Ar file .
468: .It [2addr] Ns Ic x
1.1 deraadt 469: Swap the contents of the pattern and hold spaces.
1.37 jmc 470: .It [2addr] Ns Ic y Ns / Ns Ar string1 Ns / Ns Ar string2 Ns /
1.1 deraadt 471: Replace all occurrences of characters in
1.37 jmc 472: .Ar string1
1.1 deraadt 473: in the pattern space with the corresponding characters from
1.37 jmc 474: .Ar string2 .
1.1 deraadt 475: Any character other than a backslash or newline can be used instead of
476: a slash to delimit the strings.
477: Within
1.37 jmc 478: .Ar string1
1.1 deraadt 479: and
1.37 jmc 480: .Ar string2 ,
1.1 deraadt 481: a backslash followed by any character other than a newline is that literal
1.8 aaron 482: character, and a backslash followed by an
483: .Sq n
484: is replaced by a newline character.
1.37 jmc 485: .It [0addr] Ns Ic \&: Ns Ar label
486: This function does nothing; it bears a
487: .Ar label
488: to which the
489: .Ic b
1.1 deraadt 490: and
1.37 jmc 491: .Ic t
1.1 deraadt 492: commands may branch.
1.37 jmc 493: .It [1addr] Ns Ic =
1.15 aaron 494: Write the line number to the standard output followed by a newline character.
1.1 deraadt 495: .It [0addr]
496: Empty lines are ignored.
1.37 jmc 497: .It [0addr] Ns Ic #
1.1 deraadt 498: The
1.8 aaron 499: .Ql #
1.1 deraadt 500: and the remainder of the line are ignored (treated as a comment), with
501: the single exception that if the first two characters in the file are
1.8 aaron 502: .Ql #n ,
1.1 deraadt 503: the default output is suppressed.
504: This is the same as specifying the
505: .Fl n
506: option on the command line.
507: .El
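.Pp
The replacement rules and flags of the s function described above can be
sketched as follows.
The input strings and variable names are assumptions chosen for
illustration.

```shell
# Numeric flag: substitute only the 2nd occurrence.
nth=$(printf '%s\n' 'a-b-c' | sed 's/-/+/2')

# g flag: substitute every non-overlapping match.
all=$(printf '%s\n' 'a-b-c' | sed 's/-/+/g')

# & in the replacement stands for the whole matched string.
amp=$(printf '%s\n' 'hello' | sed 's/l*o/[&]/')

# \1 and \2 refer to the first and second backreference expressions.
swap=$(printf '%s\n' 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# p flag with -n: write only the lines on which a replacement was made.
changed=$(printf '%s\n' foo bar | sed -n 's/o/0/p')

printf '%s\n' "$nth" "$all" "$amp" "$swap" "$changed"
```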
1.36 jmc 508: .Sh EXIT STATUS
1.24 jmc 509: .Ex -std sed
1.43 ! jmc 510: .Sh EXAMPLES
! 511: The following simulates the
! 512: .Xr cat 1
! 513: .Fl s
! 514: command,
! 515: squeezing excess empty lines from standard input:
! 516: .Bd -literal -offset indent
! 517: $ sed -n '
! 518: # Write non-empty lines.
! 519: /./ {
! 520: p
! 521: d
! 522: }
! 523: # Write a single empty line, then look for more empty lines.
! 524: /^$/ p
! 525: # Get the next line, discard the held <newline> (empty line),
! 526: # and look for more empty lines.
! 527: :Empty
! 528: /^$/ {
! 529: N
! 530: s/.//
! 531: b Empty
! 532: }
! 533: # Write the non-empty line before going back to search
! 534: # for the first in a set of empty lines.
! 535: p
! 536: \&'
! 537: .Ed
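.Pp
As a further sketch, the hold space can be used to write the input lines
in reverse order.
The three-line input and the variable name are invented for illustration.

```shell
# On every line but the first, G appends the hold space to the pattern space;
# h then saves the accumulated text; on the last line, p writes it out.
reversed=$(printf '%s\n' first second third | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```

Each cycle puts the newest line in front of the lines saved so far, so the
text written on the last line is the whole input in reverse order.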
1.1 deraadt 538: .Sh SEE ALSO
539: .Xr awk 1 ,
540: .Xr ed 1 ,
541: .Xr grep 1 ,
542: .Xr re_format 7
543: .Sh STANDARDS
544: The
1.8 aaron 545: .Nm
1.25 jmc 546: utility is compliant with the
1.32 jmc 547: .St -p1003.1-2008
1.1 deraadt 548: specification.
1.25 jmc 549: .Pp
1.26 ray 550: The flags
1.33 djm 551: .Op Fl aEru
1.27 jmc 552: are extensions to that specification.
1.25 jmc 553: .Pp
554: The use of newlines to separate multiple commands on the command line
555: is non-portable;
556: the use of newlines to separate multiple commands within a command file
557: .Pq Fl f Ar command_file
558: is portable.
1.11 aaron 559: .Sh HISTORY
560: A
561: .Nm
562: command appeared in
563: .At v7 .
1.25 jmc 564: .Sh CAVEATS
565: The use of semicolons to separate multiple commands
566: is not permitted for the following commands:
1.37 jmc 567: .Ic a , b , c ,
568: .Ic i , r , t ,
569: .Ic w , \&: ,
1.25 jmc 570: and
1.37 jmc 571: .Ic # .