Annotation of src/usr.bin/sed/POSIX, Revision 1.2

1.2     ! deraadt     1: #      $OpenBSD: POSIX,v 1.1.1.1 1995/10/18 08:46:05 deraadt Exp $
1.1       deraadt     2: #      from: @(#)POSIX 8.1 (Berkeley) 6/6/93
                      3:
                      4: Comments on the IEEE P1003.2 Draft 12
                      5:      Part 2: Shell and Utilities
                      6:   Section 4.55: sed - Stream editor
                      7:
                      8: Diomidis Spinellis <dds@doc.ic.ac.uk>
                      9: Keith Bostic <bostic@cs.berkeley.edu>
                     10:
                     11: In the following paragraphs, "wrong" usually means "inconsistent with
                     12: historic practice", as most of the following comments refer to
                     13: undocumented inconsistencies between the historical versions of sed and
                     14: the POSIX 1003.2 standard.  All the comments are notes taken while
                     15: implementing a POSIX-compatible version of sed, and should not be
                     16: interpreted as official opinions or criticism towards the POSIX committee.
                     17: All uses of "POSIX" refer to section 4.55, Draft 12 of POSIX 1003.2.
                     18:
                     19:  1.    32V and BSD-derived implementations of sed strip the text
                     20:        arguments of the a, c and i commands of their initial blanks,
                     21:        i.e.
                     22:
                     23:        #!/bin/sed -f
                     24:        a\
                     25:                foo\
                     26:                \  indent\
                     27:                bar
                     28:
                     29:        produces:
                     30:
                     31:        foo
                     32:          indent
                     33:        bar
                     34:
                     35:        POSIX does not specify this behavior as the System V versions of
                     36:        sed do not do this stripping.  The argument against stripping is
                     37:        that it is difficult to write sed scripts that have leading blanks
                     38:        if they are stripped.  The argument for stripping is that it is
                     39:        difficult to write readable sed scripts unless indentation is allowed
                     40:        and ignored, and leading whitespace is obtainable by entering a
                     41:        backslash in front of it.  This implementation follows the BSD
                     42:        historic practice.
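
Which camp a particular sed falls into can be probed directly. The
following is only a sketch, assuming a POSIX shell and whatever sed is
first in PATH; the variable names (out, appended, verdict) are made up
for illustration and the verdict will differ between implementations.

```shell
# Probe: does this sed strip leading blanks from the text of the a command?
# The variable names are illustrative only.
out=$(printf 'x\n' | sed 'a\
    foo')

# Line 2 of the output is the appended text, with or without its indent.
appended=$(printf '%s\n' "$out" | sed -n '2p')

case $appended in
    foo)  verdict="strips leading blanks" ;;
    *foo) verdict="keeps leading blanks" ;;
    *)    verdict="unexpected output" ;;
esac
printf '%s\n' "$verdict"
```

A sed following the BSD practice described above would be expected to
report that it strips the blanks; a System V style sed would not.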
                     43:
                     44:  2.    Historical versions of sed required that the w flag be the last
                     45:        flag to an s command as it takes an additional argument.  This
                     46:        is obvious, but not specified in POSIX.
                     47:
                     48:  3.    Historical versions of sed required that whitespace follow a w
                     49:        flag to an s command.  This is not specified in POSIX.  This
                     50:        implementation permits whitespace but does not require it.
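
As a hedged illustration of why the w flag has to come last: everything
after it, up to the end of the line, is taken as the file name.  The
sketch assumes a POSIX shell with mktemp available; "tmpfile" and
"written" are illustrative names.

```shell
# The g flag precedes w; the rest of the line after w is the file name.
# tmpfile is an illustrative name; mktemp is assumed to be available.
tmpfile=$(mktemp)
printf 'foo foo\nbar\n' | sed -n "s/foo/baz/gw $tmpfile"

# Only lines on which the substitution succeeded are written to the file.
written=$(cat "$tmpfile")
rm -f "$tmpfile"
printf '%s\n' "$written"
```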
                     51:
                     52:  4.    Historical versions of sed permitted any number of whitespace
                     53:        characters to follow the w command.  This is not specified in
                     54:        POSIX.  This implementation permits whitespace but does not
                     55:        require it.
                     56:
                     57:  5.    The rule for the l command differs from historic practice.  Table
                     58:        2-15 includes the various ANSI C escape sequences, including \\
                     59:        for backslash.  Some historical versions of sed displayed two
                     60:        digit octal numbers, too, not three as specified by POSIX.  POSIX
                     61:        is a cleanup, and is followed by this implementation.
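
The POSIX form of the l output can be seen with a short probe.  This is
a sketch, assuming a POSIX shell and a sed that follows the POSIX rule;
"listing" is an illustrative name.

```shell
# Feed a tab, a backslash and the byte 001 through the l command.
# POSIX-style output uses C escapes, \\ for backslash, three-digit octal
# for other non-printable bytes, and a trailing $ to mark end of line.
listing=$(printf 'a\tb\\c\001\n' | sed -n 'l')
printf '%s\n' "$listing"
```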
                     62:
                     63:  6.    The POSIX specification for ! does not state that a single
                     64:        command following ! must not carry an address of its own,
                     65:        whereas the commands in a command list may.  As written, the
                     66:        specification for ! implies that "3!/hello/p" works, and it
                     67:        never has, historically.  Note,
                     68:
                     69:                3!{
                     70:                        /hello/p
                     71:                }
                     72:
                     73:        does work.
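
The working form can be exercised as follows.  This is a minimal
sketch, assuming a POSIX shell; the input lines and the name "matches"
are invented for the example.

```shell
# Print every line matching /hello/ except when it is line 3.
# The address on p is legal because it sits inside a command list.
matches=$(printf 'hello 1\nhello 2\nhello 3\nhello 4\n' | sed -n '3!{
/hello/p
}')
printf '%s\n' "$matches"
```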
                     74:
                     75:  7.    POSIX does not specify what happens with consecutive ! commands
                     76:        (e.g. /foo/!!!p).  Historic implementations allow any number of
                     77:        !'s without changing the behaviour.  (It seems logical that each
                     78:        one might reverse the behaviour.)  This implementation follows
                     79:        historic practice.
                     80:
                     81:  8.    Historic versions of sed permitted commands to be separated
                     82:        by semicolons, e.g. "sed -ne '1p;2p;3q'" printed the first two
                     83:        lines of a file and quit.  This is not specified by POSIX.
                     84:        Note, the ; command separator is not allowed for the commands
                     85:        a, c, i, w, r, :, b, t, # and at the end of a w flag in the s
                     86:        command.  This implementation follows historic practice and
                     87:        implements the ; separator.
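
For commands that accept it, the ; separator is interchangeable with a
newline.  The sketch below assumes a POSIX shell and a sed that
implements the separator; the variable names are illustrative.

```shell
input='one
two
three
four'

# The same two commands, separated first by ; and then by a newline.
with_semicolons=$(printf '%s\n' "$input" | sed -n '2p;4p')
with_newlines=$(printf '%s\n' "$input" | sed -n '2p
4p')

printf '%s\n' "$with_semicolons"
```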
                     88:
                     89:  9.    Historic versions of sed terminated the script if EOF was reached
                     90:        during the execution of the 'n' command, i.e.:
                     91:
                     92:        sed -e '
                     93:        n
                     94:        i\
                     95:        hello
                     96:        ' </dev/null
                     97:
                     98:        did not produce any output.  POSIX does not specify this behavior.
                     99:        This implementation follows historic practice.
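
The effect is easier to observe with non-empty input.  The following
sketch is not from the original notes: it assumes a POSIX shell and
adds -n so that automatic printing does not obscure whether the i
command was reached; "at_eof" and "with_next" are illustrative names.

```shell
# One input line: n hits EOF, the script terminates, i is never reached.
at_eof=$(printf 'only line\n' | sed -n 'n
i\
hello')

# Two input lines: n succeeds, so i runs once and prints its text.
with_next=$(printf 'first\nsecond\n' | sed -n 'n
i\
hello')

printf 'at EOF: [%s]\nwith next line: [%s]\n' "$at_eof" "$with_next"
```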
                    100:
                    101: 10.    Deleted.
                    102:
                    103: 11.    Historical implementations do not output the change text of a c
                    104:        command in the case of an address range whose first line number
                    105:        is greater than the second (e.g. 3,1).  POSIX requires that the
                    106:        text be output.  Since the historic behavior doesn't seem to have
                    107:        any particular purpose, this implementation follows the POSIX
                    108:        behavior.
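
A small probe for the POSIX behavior, given as a sketch only: it
assumes a POSIX shell and a sed that outputs the change text for a
backwards range, as this implementation is described to do.  "changed"
is an illustrative name.

```shell
# The range 3,1 selects only line 3; under the POSIX rule the change
# text replaces that line.
changed=$(printf '1\n2\n3\n4\n' | sed '3,1c\
text')
printf '%s\n' "$changed"
```

A historical sed, per the description above, would be expected to
delete line 3 without printing the text.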
                    109:
                    110: 12.    POSIX does not specify whether address ranges are checked and
                    111:        reset if a command is not executed due to a jump.  The following
                    112:        program behaves differently depending on whether the 'c' command
                    113:        is triggered at the third line, i.e. whether the text is output
                    114:        even though line 3 of the input never logically encounters that
                    115:        command.
                    116:
                    117:        2,4b
                    118:        1,3c\
                    119:                text
                    120:
                    121:        Historic implementations, and this implementation, do not output
                    122:        the text in the above example.  The general rule, therefore,
                    123:        is that a range whose second address is never matched extends to
                    124:        the end of the input.
                    125:
                    126: 13.    Historical implementations allow an output suppressing #n at the
                    127:        beginning of -e arguments as well as in a script file.  POSIX
                    128:        does not specify this.  This implementation follows historical
                    129:        practice.
                    130:
                    131: 14.    POSIX does not explicitly specify how sed behaves if no script is
                    132:        specified.  Since the sed Synopsis permits this form of the command,
                    133:        and the language in the Description section states that the input
                    134:        is output, it seems reasonable that it behave like the cat(1)
                    135:        command.  Historic sed implementations behave differently for "ls |
                    136:        sed", where they produce no output, and "ls | sed -e#", where they
                    137:        behave like cat.  This implementation behaves like cat in both cases.
                    138:
                    139: 15.    The POSIX requirement to open all w files at the beginning makes
                    140:        sed behave nonintuitively when the w commands are preceded by
                    141:        addresses or are within conditional blocks.  This implementation
                    142:        follows historic practice and POSIX, by default, and provides the
                    143:        -a option which opens the files only when they are needed.
                    144:
                    145: 16.    POSIX does not specify how escape sequences other than \n and \D
                    146:        (where D is the delimiter character) are to be treated.  This is
                    147:        reasonable; however, it also does not state that the backslash is
                    148:        to be discarded from the output regardless.  A strict reading of
                    149:        POSIX would be that "echo xyz | sed 's/./\a/'" would display "\ayz".
                    150:        As historic sed implementations always discarded the backslash,
                    151:        this implementation does as well.
                    152:
                    153: 17.    POSIX specifies that an address can be "empty".  This implies
                    154:        that constructs like ",d" or "1,d" and ",5d" are allowed.  This
                    155:        is not true for historic implementations or this implementation
                    156:        of sed.
                    157:
                    158: 18.    The b, t and : commands are documented in POSIX to ignore leading
                    159:        white space, but no mention is made of trailing white space.
                    160:        Historic implementations of sed assigned different locations to
                    161:        the labels "x" and "x ".  This is not useful, and leads to subtle
                    162:        programming errors, but it is historic practice and changing it
                    163:        could theoretically break working scripts.  This implementation
                    164:        follows historic practice.
                    165:
                    166: 19.    Although POSIX specifies that reading from files that do not exist
                    167:        from within the script must not terminate the script, it does not
                    168:        specify what happens if a write command fails.  Historic practice
                    169:        is to fail immediately if the file cannot be opened or written.
                    170:        This implementation follows historic practice.
                    171:
                    172: 20.    Historic practice is that the \n construct can be used for either
                    173:        string1 or string2 of the y command.  This is not specified by
                    174:        POSIX.  This implementation follows historic practice.
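
Both directions can be sketched as follows, assuming a POSIX shell and
a sed that follows the historic practice; "split" and "joined" are
illustrative names.

```shell
# \n in string2: turn each blank into a newline.
split=$(printf 'a b c\n' | sed 'y/ /\n/')

# \n in string1: N joins two input lines with an embedded newline,
# which y then turns into a blank.
joined=$(printf 'a\nb\n' | sed 'N
y/\n/ /')

printf '%s\n---\n%s\n' "$split" "$joined"
```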
                    175:
                    176: 21.    Deleted.
                    177:
                    178: 22.    Historic implementations of sed ignore the RE delimiter characters
                    179:        within character classes.  This is not specified in POSIX.  This
                    180:        implementation follows historic practice.
                    181:
                    182: 23.    Historic implementations handle empty RE's in a special way: the
                    183:        empty RE is interpreted as if it were the last RE encountered,
                    184:        whether in an address or elsewhere.  POSIX does not document this
                    185:        behavior.  For example the command:
                    186:
                    187:                sed -e /abc/s//XXX/
                    188:
                    189:        substitutes XXX for the pattern abc.  The semantics of "the last
                    190:        RE" can be defined in two different ways:
                    191:
                    192:        1. The last RE encountered when compiling (lexical/static scope).
                    193:        2. The last RE encountered while running (dynamic scope).
                    194:
                    195:        While many historical implementations fail on programs depending
                    196:        on scope differences, the SunOS version exhibited dynamic scope
                    197:        behaviour.  This implementation does dynamic scoping, as this seems
                    198:        the most useful and in order to remain consistent with historical
                    199:        practice.
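
The difference between the two definitions can be made visible with a
script in which the last RE compiled is not the last RE executed.  This
is a sketch under the assumption of a POSIX shell and a sed with
dynamic scoping, as described above; the input and the variable names
are invented for the example.

```shell
# The basic case: the empty RE in s reuses /abc/ from the address.
reused=$(printf 'abc\nxyz\n' | sed -e '/abc/s//XXX/')

# Scope probe on the line "ab": the address /b/ matches, so the negated
# s/a/1/ is skipped and /a/ is compiled but never executed.  With dynamic
# scope the empty RE means /b/ (result "a2"); with static scope it would
# mean /a/ (result "2b").
scoped=$(printf 'ab\n' | sed -n '/b/!s/a/1/
s//2/p')

printf '%s\n%s\n' "$reused" "$scoped"
```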