Annotation of src/usr.bin/awk/awk.1, Revision 1.64

1.64    ! jsg         1: .\"    $OpenBSD: awk.1,v 1.63 2021/11/08 06:46:22 jmc Exp $
1.11      jmc         2: .\"
                      3: .\" Copyright (C) Lucent Technologies 1997
                      4: .\" All Rights Reserved
1.12      jmc         5: .\"
1.11      jmc         6: .\" Permission to use, copy, modify, and distribute this software and
                      7: .\" its documentation for any purpose and without fee is hereby
                      8: .\" granted, provided that the above copyright notice appear in all
                      9: .\" copies and that both that the copyright notice and this
                     10: .\" permission notice and warranty disclaimer appear in supporting
                     11: .\" documentation, and that the name Lucent Technologies or any of
                     12: .\" its entities not be used in advertising or publicity pertaining
                     13: .\" to distribution of the software without specific, written prior
                     14: .\" permission.
1.12      jmc        15: .\"
1.11      jmc        16: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
                     17: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
                     18: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
                     19: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
                     20: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
                     21: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
                     22: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
                     23: .\" THIS SOFTWARE.
                     24: .\"
1.64    ! jsg        25: .Dd $Mdocdate: November 8 2021 $
1.7       aaron      26: .Dt AWK 1
                     27: .Os
                     28: .Sh NAME
                     29: .Nm awk
                     30: .Nd pattern-directed scanning and processing language
                     31: .Sh SYNOPSIS
                     32: .Nm awk
1.16      jmc        33: .Op Fl safe
                     34: .Op Fl V
                     35: .Op Fl d Ns Op Ar n
1.7       aaron      36: .Op Fl F Ar fs
1.38      schwarze   37: .Op Fl v Ar var Ns = Ns Ar value
1.18      jmc        38: .Op Ar prog | Fl f Ar progfile
1.7       aaron      39: .Ar
                     40: .Sh DESCRIPTION
                     41: .Nm
1.1       tholo      42: scans each input
1.7       aaron      43: .Ar file
1.1       tholo      44: for lines that match any of a set of patterns specified literally in
1.7       aaron      45: .Ar prog
1.16      jmc        46: or in one or more files specified as
1.7       aaron      47: .Fl f Ar progfile .
1.16      jmc        48: With each pattern there can be an associated action that will be performed
1.1       tholo      49: when a line of a
1.7       aaron      50: .Ar file
1.1       tholo      51: matches the pattern.
                     52: Each line is matched against the
                     53: pattern portion of every pattern-action statement;
                     54: the associated action is performed for each matched pattern.
1.6       aaron      55: The file name
1.16      jmc        56: .Sq -
1.1       tholo      57: means the standard input.
                     58: Any
1.7       aaron      59: .Ar file
1.1       tholo      60: of the form
1.16      jmc        61: .Ar var Ns = Ns Ar value
1.1       tholo      62: is treated as an assignment, not a filename,
                     63: and is executed at the time it would have been opened if it were a filename.
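As an illustration of when operand assignments take effect, here is a minimal sketch. The variable names `a` and `b` and the sample input are invented for the example. The `-v` assignment is visible in `BEGIN`, while the `b=2` operand is only applied when awk reaches it in the file list, just before reading standard input (`-`).

```shell
# a is set with -v (before the program runs); b is set as a file operand.
out=$(printf 'line\n' |
    awk -v a=1 'BEGIN { print a "[" b "]" } { print b }' b=2 -)
printf '%s\n' "$out"
```

The `BEGIN` action should see an empty `b`, and the main action should see `b` set to 2.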
1.16      jmc        64: .Pp
                     65: The options are as follows:
1.20      jmc        66: .Bl -tag -width "-safe "
1.16      jmc        67: .It Fl d Ns Op Ar n
                     68: Debug mode.
                     69: Set debug level to
                     70: .Ar n ,
                     71: or 1 if
                     72: .Ar n
                     73: is not specified.
                     74: A value greater than 1 causes
                     75: .Nm
                     76: to dump core on fatal errors.
                     77: .It Fl F Ar fs
                     78: Define the input field separator to be the regular expression
1.7       aaron      79: .Ar fs .
1.25      jmc        80: .It Fl f Ar progfile
1.16      jmc        81: Read program code from the specified file
1.25      jmc        82: .Ar progfile
1.16      jmc        83: instead of from the command line.
                     84: .It Fl safe
                     85: Disable file output
1.17      jmc        86: .Pf ( Ic print No > ,
                     87: .Ic print No >> ) ,
1.7       aaron      88: process creation
                     89: .Po
1.17      jmc        90: .Ar cmd | Ic getline ,
1.40      jmc        91: .Ic print | ,
1.17      jmc        92: .Ic system
1.7       aaron      93: .Pc
                     94: and access to the environment
1.17      jmc        95: .Pf ( Va ENVIRON ;
1.18      jmc        96: see the section on variables below).
1.17      jmc        97: This is a first
1.16      jmc        98: .Pq and not very reliable
                     99: approximation to a
1.7       aaron     100: .Dq safe
                    101: version of
1.16      jmc       102: .Nm .
                    103: .It Fl V
                    104: Print the version number of
                    105: .Nm
                    106: to standard output and exit.
                    107: .It Fl v Ar var Ns = Ns Ar value
                    108: Assign
                    109: .Ar value
                    110: to variable
                    111: .Ar var
                    112: before
                    113: .Ar prog
                    114: is executed;
                    115: any number of
                    116: .Fl v
                    117: options may be present.
                    118: .El
1.7       aaron     119: .Pp
1.18      jmc       120: The input is normally made up of input lines
                    121: .Pq records
                    122: separated by newlines, or by the value of
                    123: .Va RS .
                    124: If
                    125: .Va RS
                    126: is null, then any number of blank lines are used as the record separator,
                    127: and newlines are used as field separators
                    128: (in addition to the value of
                    129: .Va FS ) .
                    130: This is convenient when working with multi-line records.
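The multi-line record behaviour can be sketched as follows. The sample data is made up for the example. With `RS` set to the empty string, blank lines separate records and each newline inside a record also separates fields.

```shell
# Two records separated by a blank line; each record spans two lines.
out=$(printf 'alice\n30\n\nbob\n25\n' |
    awk 'BEGIN { RS = "" } { print NR ": " $1 " is " $2 }')
printf '%s\n' "$out"
```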
                    131: .Pp
1.7       aaron     132: An input line is normally made up of fields separated by whitespace,
1.55      millert   133: or by the value of the field separator
                    134: .Va FS
                    135: at the time the line is read.
1.1       tholo     136: The fields are denoted
1.7       aaron     137: .Va $1 , $2 , ... ,
                    138: while
                    139: .Va $0
1.1       tholo     140: refers to the entire line.
1.55      millert   141: .Va FS
                    142: may be set to either a single character or a regular expression.
1.58      jmc       143: As a special case, if
1.55      millert   144: .Va FS
                    145: is a single space
                    146: .Pq the default ,
                    147: fields will be split by one or more whitespace characters.
1.1       tholo     148: If
1.7       aaron     149: .Va FS
1.1       tholo     150: is null, the input line is split into one field per character.
1.7       aaron     151: .Pp
1.18      jmc       152: Normally, any number of blanks separate fields.
                    153: In order to set the field separator to a single blank, use the
                    154: .Fl F
                    155: option with a value of
                    156: .Sq [\ \&] .
                    157: If a field separator of
                    158: .Sq t
                    159: is specified,
                    160: .Nm
                    161: treats it as if
                    162: .Sq \et
                    163: had been specified and uses
                    164: .Aq TAB
                    165: as the field separator.
                    166: In order to use a literal
                    167: .Sq t
                    168: as the field separator, use the
                    169: .Fl F
                    170: option with a value of
                    171: .Sq [t] .
1.55      millert   172: The field separator is usually set via the
                    173: .Fl F
                    174: option or from inside a
                    175: .Ic BEGIN
                    176: block so that it takes effect before the input is read.
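A short sketch of the field separator rules above, using invented sample input. It sets the same separator once with `-F` and once in a `BEGIN` block, then contrasts the default whitespace splitting with a bracket expression that matches exactly one blank.

```shell
line='root:x:0:0'
# Same separator set two ways: on the command line and in a BEGIN block.
a=$(printf '%s\n' "$line" | awk -F: '{ print $1, $3 }')
b=$(printf '%s\n' "$line" | awk 'BEGIN { FS = ":" } { print $1, $3 }')
# Default FS: a run of blanks counts as one separator.
c=$(printf 'a  b\n' | awk '{ print NF }')
# Bracket expression: every single blank separates, so an empty field appears.
d=$(printf 'a  b\n' | awk -F'[ ]' '{ print NF }')
printf '%s\n' "$a" "$b" "$c" "$d"
```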
1.18      jmc       177: .Pp
1.47      millert   178: A pattern-action statement has the form:
1.7       aaron     179: .Pp
                    180: .D1 Ar pattern Ic \&{ Ar action Ic \&}
                    181: .Pp
1.6       aaron     182: A missing
1.7       aaron     183: .Ic \&{ Ar action Ic \&}
1.1       tholo     184: means print the line;
                    185: a missing pattern always matches.
                    186: Pattern-action statements are separated by newlines or semicolons.
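The defaults described above can be tried with a small sketch. The input lines and the `A:`/`B:` labels are invented for the example.

```shell
input=$(printf 'apple 3\nbanana 12\ncherry 7\n')
# Pattern without an action: matching lines are printed.
no_action=$(printf '%s\n' "$input" | awk '$2 > 5')
# Action without a pattern: runs for every line.
no_pattern=$(printf '%s\n' "$input" | awk '{ print $1 }')
# Two pattern-action statements separated by a semicolon.
both=$(printf '%s\n' "$input" |
    awk '/an/ { print "A:" $1 }; $2 < 5 { print "B:" $1 }')
printf '%s\n' "$no_action" "$no_pattern" "$both"
```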
1.7       aaron     187: .Pp
1.18      jmc       188: Newlines are permitted after a terminating statement or following a comma
                    189: .Pq Sq ,\& ,
                    190: an open brace
                    191: .Pq Sq { ,
                    192: a logical AND
                    193: .Pq Sq && ,
                    194: a logical OR
                    195: .Pq Sq || ,
                    196: after the
                    197: .Sq do
                    198: or
                    199: .Sq else
                    200: keywords,
                    201: or after the closing parenthesis of an
                    202: .Sq if ,
                    203: .Sq for ,
                    204: or
                    205: .Sq while
                    206: statement.
                    207: Additionally, a backslash
                    208: .Pq Sq \e
                    209: can be used to escape a newline between tokens.
                    210: .Pp
1.1       tholo     211: An action is a sequence of statements.
                    212: A statement can be one of the following:
1.35      jmc       213: .Pp
                    214: .Bl -tag -width Ds -offset indent -compact
1.43      schwarze  215: .It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
                    216: .It Ic while Ar ( expression ) Ar statement
                    217: .It Ic for Ar ( expression ; expression ; expression ) statement
                    218: .It Ic for Ar ( var Ic in Ar array ) statement
                    219: .It Ic do Ar statement Ic while Ar ( expression )
1.35      jmc       220: .It Ic break
                    221: .It Ic continue
                    222: .It Xo Ic {
                    223: .Op Ar statement ...
                    224: .Ic }
                    225: .Xc
                    226: .It Xo Ar expression
                    227: .No # commonly
                    228: .Ar var No = Ar expression
1.7       aaron     229: .Xc
1.35      jmc       230: .It Xo Ic print
1.7       aaron     231: .Op Ar expression-list
1.17      jmc       232: .Op > Ns Ar expression
1.7       aaron     233: .Xc
1.35      jmc       234: .It Xo Ic printf Ar format
1.7       aaron     235: .Op Ar ... , expression-list
1.17      jmc       236: .Op > Ns Ar expression
1.7       aaron     237: .Xc
1.35      jmc       238: .It Ic return Op Ar expression
                    239: .It Xo Ic next
                    240: .No # skip remaining patterns on this input line
                    241: .Xc
                    242: .It Xo Ic nextfile
                    243: .No # skip rest of this file, open next, start at top
                    244: .Xc
                    245: .It Xo Ic delete
                    246: .Sm off
                    247: .Ar array Ic \&[ Ar expression Ic \&]
                    248: .Sm on
                    249: .No # delete an array element
1.7       aaron     250: .Xc
1.35      jmc       251: .It Xo Ic delete Ar array
                    252: .No # delete all elements of array
1.7       aaron     253: .Xc
1.35      jmc       254: .It Xo Ic exit
1.7       aaron     255: .Op Ar expression
1.46      deraadt   256: .No # exit processing, and perform
                    257: .Ic END
                    258: processing; status is
                    259: .Ar expression
1.7       aaron     260: .Xc
1.35      jmc       261: .El
1.7       aaron     262: .Pp
1.1       tholo     263: Statements are terminated by
                    264: semicolons, newlines or right braces.
                    265: An empty
1.7       aaron     266: .Ar expression-list
1.1       tholo     267: stands for
1.7       aaron     268: .Ar $0 .
                    269: String constants are quoted
                    270: .Li \&"" ,
1.20      jmc       271: with the usual C escapes recognized within
                    272: (see
                    273: .Xr printf 1
                    274: for a complete list of these).
1.1       tholo     275: Expressions take on string or numeric values as appropriate,
                    276: and are built using the operators
1.7       aaron     277: .Ic + \- * / % ^
1.20      jmc       278: .Pq exponentiation ,
                    279: and concatenation
                    280: .Pq indicated by whitespace .
1.1       tholo     281: The operators
1.16      jmc       282: .Ic \&! ++ \-\- += \-= *= /= %= ^=
1.59      millert   283: .Ic > >= < <= == != ?\&:
1.1       tholo     284: are also available in expressions.
                    285: Variables may be scalars, array elements
                    286: (denoted
1.7       aaron     287: .Li x[i] )
1.1       tholo     288: or fields.
                    289: Variables are initialized to the null string.
                    290: Array subscripts may be any string,
                    291: not necessarily numeric;
                    292: this allows for a form of associative memory.
                    293: Multiple subscripts such as
1.7       aaron     294: .Li [i,j,k]
1.1       tholo     295: are permitted; the constituents are concatenated,
                    296: separated by the value of
1.17      jmc       297: .Va SUBSEP
1.31      deraadt   298: .Pq see the section on variables below .
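A minimal sketch of associative arrays with multiple subscripts follows. The array name `count` and the helper names `k`, `part` and `unset` are hypothetical. It stores one element under a two-part key, tests membership, splits the key back apart on `SUBSEP`, and shows that an unset variable is the null string.

```shell
out=$(awk 'BEGIN {
    count["x", "y"] = 5                 # key is "x" SUBSEP "y"
    if (("x", "y") in count) print "found"
    for (k in count) {                  # only one element, so order is moot
        n = split(k, part, SUBSEP)
        print n, part[1], part[2], count[k]
    }
    print "[" unset "]"                 # uninitialized variable: null string
}')
printf '%s\n' "$out"
```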
1.7       aaron     299: .Pp
1.1       tholo     300: The
1.7       aaron     301: .Ic print
1.1       tholo     302: statement prints its arguments on the standard output
                    303: (or on a file if
1.47      millert   304: .Pf >\ \& Ar file
1.1       tholo     305: or
1.47      millert   306: .Pf >>\ \& Ar file
1.1       tholo     307: is present or on a pipe if
1.17      jmc       308: .Pf |\ \& Ar cmd
1.1       tholo     309: is present), separated by the current output field separator,
                    310: and terminated by the output record separator.
1.7       aaron     311: .Ar file
1.1       tholo     312: and
1.7       aaron     313: .Ar cmd
1.1       tholo     314: may be literal names or parenthesized expressions;
                    315: identical string values in different statements denote
                    316: the same open file.
                    317: The
1.7       aaron     318: .Ic printf
1.47      millert   319: statement formats its expression list according to the
                    320: .Ar format
1.1       tholo     321: (see
1.28      jmc       322: .Xr printf 1 ) .
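The output redirections can be sketched as below. The temporary file, the variable `f` and the sample input are assumptions made for the example. One `print` writes to a file named by an expression, another feeds a pipe to `sort`. The pipe is closed in `END` with `close()` so that the sorted output appears before the final line.

```shell
tmp=$(mktemp)
piped=$(printf 'pear 1\napple 2\nfig 3\n' |
    awk -v f="$tmp" '
        { print $2 > f          # same string value, same open file
          print $1 | "sort" }   # output goes through the sort command
    END { close("sort"); print "end" }')
filed=$(cat "$tmp")
rm -f "$tmp"
printf '%s\n' "$piped" "$filed"
```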
1.18      jmc       323: .Pp
                    324: Patterns are arbitrary Boolean combinations
                    325: (with
                    326: .Ic "\&! || &&" )
                    327: of regular expressions and
                    328: relational expressions.
1.22      jmc       329: .Nm
                    330: supports extended regular expressions
                    331: .Pq EREs .
                    332: See
                    333: .Xr re_format 7
                    334: for more information on regular expressions.
1.18      jmc       335: Isolated regular expressions
                    336: in a pattern apply to the entire line.
                    337: Regular expressions may also occur in
                    338: relational expressions, using the operators
                    339: .Ic ~
                    340: and
                    341: .Ic !~ .
1.44      schwarze  342: .Pf / Ar re Ns /
1.18      jmc       343: is a constant regular expression;
                    344: any string (constant or variable) may be used
                    345: as a regular expression, except in the position of an isolated regular expression
                    346: in a pattern.
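A brief sketch of the three ways a regular expression can appear in a pattern, on invented input: isolated (applied to the whole line), with a match operator against a field, and as a string held in a variable (here the hypothetical `re`).

```shell
input=$(printf 'alice admin\nbob user\ncarol admin\n')
# Constant regular expression matched against one field.
m=$(printf '%s\n' "$input" | awk '$2 ~ /^adm/ { print $1 }')
# A string variable used as a regular expression with a match operator.
n=$(printf '%s\n' "$input" | awk -v re='^b' '$1 ~ re { print $1 }')
# Negated isolated regular expression: applies to the entire line.
k=$(printf '%s\n' "$input" | awk '!/admin/')
printf '%s\n' "$m" "$n" "$k"
```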
                    347: .Pp
                    348: A pattern may consist of two patterns separated by a comma;
                    349: in this case, the action is performed for all lines
                    350: from an occurrence of the first pattern
                    351: through an occurrence of the second.
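A range pattern might be used like this. The marker lines `START` and `STOP` are invented for the example. Every line from the first marker through the second is selected, markers included.

```shell
out=$(printf 'a\nSTART\nb\nSTOP\nc\n' | awk '/^START$/,/^STOP$/')
printf '%s\n' "$out"
```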
                    352: .Pp
                    353: A relational expression is one of the following:
1.35      jmc       354: .Pp
                    355: .Bl -tag -width Ds -offset indent -compact
                    356: .It Ar expression matchop regular-expression
                    357: .It Ar expression relop expression
                    358: .It Ar expression Ic in Ar array-name
                    359: .It Xo Ic \&( Ns
1.18      jmc       360: .Ar expr , expr , \&... Ns Ic \&) in
1.35      jmc       361: .Ar array-name
1.18      jmc       362: .Xc
1.35      jmc       363: .El
1.18      jmc       364: .Pp
                    365: where a
                    366: .Ar relop
                    367: is any of the six relational operators in C, and a
                    368: .Ar matchop
                    369: is either
                    370: .Ic ~
                    371: (matches)
                    372: or
                    373: .Ic !~
                    374: (does not match).
                    375: A conditional is an arithmetic expression,
                    376: a relational expression,
                    377: or a Boolean combination
                    378: of these.
                    379: .Pp
1.46      deraadt   380: The special pattern
1.18      jmc       381: .Ic BEGIN
1.46      deraadt   382: may be used to capture control before the first input line is read.
                    383: The special pattern
1.18      jmc       384: .Ic END
1.46      deraadt   385: may be used to capture control after processing is finished.
1.18      jmc       386: .Ic BEGIN
                    387: and
                    388: .Ic END
                    389: do not combine with other patterns.
1.47      millert   390: They may appear multiple times in a program and execute
                    391: in the order they are read by
                    392: .Nm .
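A small sketch of `BEGIN` and `END`, with made-up input: one `BEGIN` action, a main action that accumulates a sum, and two `END` actions that run in the order they appear in the program.

```shell
out=$(printf '3\n4\n5\n' | awk '
BEGIN { print "start" }
      { sum += $1 }
END   { print "lines=" NR, "sum=" sum }
END   { print "done" }')
printf '%s\n' "$out"
```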
1.18      jmc       393: .Pp
                    394: Variable names with special meanings:
                    395: .Pp
1.20      jmc       396: .Bl -tag -width "FILENAME " -compact
1.18      jmc       397: .It Va ARGC
                    398: Argument count, assignable.
                    399: .It Va ARGV
                    400: Argument array, assignable;
                    401: non-null members are taken as filenames.
                    402: .It Va CONVFMT
                    403: Conversion format when converting numbers
                    404: (default
                    405: .Qq Li %.6g ) .
                    406: .It Va ENVIRON
                    407: Array of environment variables; subscripts are names.
                    408: .It Va FILENAME
                    409: The name of the current input file.
                    410: .It Va FNR
                    411: Ordinal number of the current record in the current file.
                    412: .It Va FS
1.55      millert   413: Regular expression used to separate fields (default whitespace);
                    414: also settable by option
1.63      jmc       415: .Fl F Ar fs .
1.18      jmc       416: .It Va NF
                    417: Number of fields in the current record.
                    418: .Va $NF
                    419: can be used to obtain the value of the last field in the current record.
                    420: .It Va NR
                    421: Ordinal number of the current record.
                    422: .It Va OFMT
                    423: Output format for numbers (default
                    424: .Qq Li %.6g ) .
                    425: .It Va OFS
                    426: Output field separator (default blank).
                    427: .It Va ORS
                    428: Output record separator (default newline).
                    429: .It Va RLENGTH
                    430: The length of the string matched by the
                    431: .Fn match
                    432: function.
                    433: .It Va RS
                    434: Input record separator (default newline).
1.49      millert   435: If empty, blank lines separate records.
                    436: If more than one character long,
                    437: .Va RS
                    438: is treated as a regular expression, and records are
                    439: separated by text matching the expression.
1.18      jmc       440: .It Va RSTART
                    441: The starting position of the string matched by the
                    442: .Fn match
                    443: function.
                    444: .It Va SUBSEP
                    445: Separates multiple subscripts (default 034).
                    446: .El
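Several of these variables can be seen together in a short sketch on invented input. It sets `OFS` in `BEGIN` and prints the record number, the field count and the last field of each record.

```shell
out=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }')
printf '%s\n' "$out"
```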
1.17      jmc       447: .Sh FUNCTIONS
                    448: The awk language has a variety of built-in functions:
1.30      jmc       449: arithmetic, string, input/output, general, and bit-operation.
                    450: .Pp
                    451: Functions may be defined (at the position of a pattern-action statement)
                    452: as follows:
                    453: .Pp
                    454: .Dl function foo(a, b, c) { ...; return x }
                    455: .Pp
                    456: Parameters are passed by value if scalar, and by reference if array name;
                    457: functions may be called recursively.
                    458: Parameters are local to the function; all other variables are global.
                    459: Thus local variables may be created by providing excess parameters in
                    460: the function definition.
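The parameter rules above can be sketched as follows. All function and variable names (`sum_to`, `fill`, `bump`, `fact` and so on) are hypothetical. Extra parameters serve as local variables, an array argument is modified in place, a scalar argument is not, and a function may call itself.

```shell
out=$(awk '
# "i" and "total" are extra parameters used only as local variables.
function sum_to(n,    i, total) {
    for (i = 1; i <= n; i++) total += i
    return total
}
function fill(arr) { arr["k"] = "set" }   # arrays are passed by reference
function bump(x)   { x = x + 1 }          # scalars are passed by value
function fact(n)   { return n <= 1 ? 1 : n * fact(n - 1) }   # recursion
BEGIN {
    i = 99; v = 1
    s = sum_to(4); fill(a); bump(v)
    print s, i, a["k"], v, fact(5)
}')
printf '%s\n' "$out"
```

The global `i` keeps its value because the function's `i` is a separate local parameter.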
1.17      jmc       461: .Ss Arithmetic Functions
                    462: .Bl -tag -width "atan2(y, x)"
                    463: .It Fn atan2 y x
                    464: Return the arctangent of
                    465: .Fa y Ns / Ns Fa x
                    466: in radians.
                    467: .It Fn cos x
                    468: Return the cosine of
                    469: .Fa x ,
                    470: where
                    471: .Fa x
                    472: is in radians.
                    473: .It Fn exp x
                    474: Return the exponential of
                    475: .Fa x .
                    476: .It Fn int x
                    477: Return
                    478: .Fa x
                    479: truncated to an integer value.
                    480: .It Fn log x
                    481: Return the natural logarithm of
                    482: .Fa x .
1.7       aaron     483: .It Fn rand
1.17      jmc       484: Return a random number,
                    485: .Fa n ,
                    486: such that
                    487: .Sm off
                    488: .Pf 0 \*(Le Fa n No \*(Lt 1 .
                    489: .Sm on
1.53      tim       490: Random numbers are non-deterministic unless a seed is explicitly set with
                    491: .Fn srand .
1.17      jmc       492: .It Fn sin x
                    493: Return the sine of
                    494: .Fa x ,
                    495: where
                    496: .Fa x
                    497: is in radians.
                    498: .It Fn sqrt x
                    499: Return the square root of
                    500: .Fa x .
                    501: .It Fn srand expr
1.16      jmc       502: Sets seed for
1.7       aaron     503: .Fn rand
1.17      jmc       504: to
                    505: .Fa expr
1.1       tholo     506: and returns the previous seed.
1.17      jmc       507: If
                    508: .Fa expr
1.53      tim       509: is omitted,
                    510: .Fn rand
                    511: will return non-deterministic random numbers.
1.17      jmc       512: .El
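A few of the arithmetic functions in use, as a rough sketch. The seed value 42 and the variable names are arbitrary. Seeding twice with the same value is expected to give the same first result from `rand`.

```shell
out=$(awk 'BEGIN {
    print int(3.9), int(-3.9), sqrt(16), exp(0), log(1)
    printf "%.4f\n", atan2(0, -1)         # pi
    srand(42); a = rand()
    srand(42); b = rand()
    same = (a == b)                       # same seed, same first number
    inrange = (a >= 0 && a < 1)
    print same, inrange
}')
printf '%s\n' "$out"
```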
                    513: .Ss String Functions
                    514: .Bl -tag -width "split(s, a, fs)"
1.52      millert   515: .It Fn gensub r s h [t]
                    516: Search the target string
                    517: .Ar t
                    518: for matches of the regular expression
                    519: .Ar r .
                    520: If
                    521: .Ar h
                    522: is a string beginning with
                    523: .Ic g
                    524: or
                    525: .Ic G ,
                    526: then replace all matches of
                    527: .Ar r
                    528: with
                    529: .Ar s .
                    530: Otherwise,
                    531: .Ar h
                    532: is a number indicating which match of
                    533: .Ar r
                    534: to replace.
                    535: If no
                    536: .Ar t
                    537: is supplied,
                    538: .Va $0
                    539: is used instead.
                    540: .\"Within the replacement text
                    541: .\".Ar s ,
                    542: .\"the sequence
                    543: .\".Ar \en ,
                    544: .\"where
                    545: .\".Ar n
                    546: .\"is a digit from 1 to 9, may be used to indicate just the text that
                    547: .\"matched the
                    548: .\".Ar n Ap th
                    549: .\"parenthesized subexpression.
                    550: .\"The sequence
                    551: .\".Ic \e0
                    552: .\"represents the entire text, as does the character
                    553: .\".Ic & .
                    554: Unlike
                    555: .Fn sub
                    556: and
                    557: .Fn gsub ,
                    558: the modified string is returned as the result of the function,
                    559: and the original target is
                    560: .Em not
                    561: changed.
                    562: Note that
                    563: .Ar \en
                    564: sequences within the replacement string
                    565: .Ar s ,
                    566: as supported by GNU
                    567: .Nm ,
                    568: are
                    569: .Em not
                    570: supported at this time.
1.17      jmc       571: .It Fn gsub r t s
                    572: The same as
                    573: .Fn sub
                    574: except that all occurrences of the regular expression are replaced.
                    575: .Fn gsub
                    576: returns the number of replacements.
1.7       aaron     577: .It Fn index s t
1.16      jmc       578: The position in
1.7       aaron     579: .Fa s
1.1       tholo     580: where the string
1.7       aaron     581: .Fa t
1.1       tholo     582: occurs, or 0 if it does not.
1.17      jmc       583: .It Fn length s
                    584: The length of
                    585: .Fa s
                    586: taken as a string,
1.47      millert   587: the number of elements in the array if the argument is an array,
                    588: or the length of
1.17      jmc       589: .Va $0
                    590: if no argument is given.
1.7       aaron     591: .It Fn match s r
1.16      jmc       592: The position in
1.7       aaron     593: .Fa s
1.1       tholo     594: where the regular expression
1.7       aaron     595: .Fa r
1.1       tholo     596: occurs, or 0 if it does not.
1.17      jmc       597: The variable
1.7       aaron     598: .Va RSTART
1.17      jmc       599: is set to the starting position of the matched string
                    600: .Pq which is the same as the returned value
                    601: or zero if no match is found.
                    602: The variable
1.7       aaron     603: .Va RLENGTH
1.17      jmc       604: is set to the length of the matched string,
                    605: or \-1 if no match is found.
1.7       aaron     606: .It Fn split s a fs
1.16      jmc       607: Splits the string
1.7       aaron     608: .Fa s
1.1       tholo     609: into array elements
1.7       aaron     610: .Va a[1] , a[2] , ... , a[n]
1.1       tholo     611: and returns
1.7       aaron     612: .Va n .
1.1       tholo     613: The separation is done with the regular expression
1.7       aaron     614: .Ar fs
1.1       tholo     615: or with the field separator
1.7       aaron     616: .Va FS
1.1       tholo     617: if
1.7       aaron     618: .Ar fs
1.1       tholo     619: is not given.
                    620: An empty string as field separator splits the string
                    621: into one array element per character.
1.17      jmc       622: .It Fn sprintf fmt expr ...
                    623: The string resulting from formatting
                    624: .Fa expr , ...
                    625: according to the
1.28      jmc       626: .Xr printf 1
1.17      jmc       627: format
                    628: .Fa fmt .
1.7       aaron     629: .It Fn sub r t s
1.16      jmc       630: Substitutes
1.7       aaron     631: .Fa t
1.1       tholo     632: for the first occurrence of the regular expression
1.7       aaron     633: .Fa r
1.1       tholo     634: in the string
1.7       aaron     635: .Fa s .
1.1       tholo     636: If
1.7       aaron     637: .Fa s
1.1       tholo     638: is not given,
1.7       aaron     639: .Va $0
1.1       tholo     640: is used.
1.17      jmc       641: An ampersand
                    642: .Pq Sq &
                    643: in
                    644: .Fa t
                    645: is replaced with the text in string
                    646: .Fa s
                    647: matched by regular expression
                    648: .Fa r .
                    649: A literal ampersand can be specified by preceding it with two backslashes
                    650: .Pq Sq \e\e .
                    651: A literal backslash can be specified by preceding it with another backslash
                    652: .Pq Sq \e\e .
1.7       aaron     653: .Fn sub
1.17      jmc       654: returns the number of replacements.
                    655: .It Fn substr s m n
                    656: Return at most the
                    657: .Fa n Ns -character
                    658: substring of
                    659: .Fa s
                    660: that begins at position
                    661: .Fa m
                    662: counted from 1.
                    663: If
                    664: .Fa n
                    665: is omitted, or if
                    666: .Fa n
                    667: specifies more characters than are left in the string,
                    668: the length of the substring is limited by the length of
                    669: .Fa s .
1.7       aaron     670: .It Fn tolower str
1.16      jmc       671: Returns a copy of
1.7       aaron     672: .Fa str
1.1       tholo     673: with all upper-case characters translated to their
                    674: corresponding lower-case equivalents.
1.7       aaron     675: .It Fn toupper str
1.16      jmc       676: Returns a copy of
1.7       aaron     677: .Fa str
1.1       tholo     678: with all lower-case characters translated to their
                    679: corresponding upper-case equivalents.
1.7       aaron     680: .El
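.Pp
As a brief illustration of some of the string functions above, the following
.Xr sh 1
sketch exercises
.Fn sub ,
.Fn substr ,
.Fn toupper ,
and
.Fn tolower .
The shell variable names are arbitrary and exist only for this example;
the output shown in the comments assumes an
.Nm
that behaves as described in this section.
.Bd -literal -offset indent
```shell
# sub() replaces only the first match; "&" stands for the matched text.
sub_demo=$(echo "aaa bbb aaa" | awk '{ n = sub(/a+/, "<&>"); print n, $0 }')
echo "$sub_demo"       # 1 <aaa> bbb aaa

# substr() counts positions from 1 and is limited by the length of the string.
substr_demo=$(awk 'BEGIN {
        s = "hello"
        print substr(s, 2, 3), substr(s, 4), substr(s, 4, 100)
}')
echo "$substr_demo"    # ell lo lo

# toupper() and tolower() return copies; the original string is unchanged.
case_demo=$(awk 'BEGIN { s = "MiXed"; print toupper(s), tolower(s), s }')
echo "$case_demo"      # MIXED mixed MiXed
```
.Ed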
1.52      millert   681: .Ss Time Functions
                    682: This version of
                    683: .Nm
                    684: provides the following functions for obtaining and formatting time
                    685: stamps.
                    686: .Bl -tag -width indent
1.57      millert   687: .It Fn mktime datespec
                    688: Converts
                    689: .Fa datespec
                    690: into a timestamp in the same form as a value returned by
                    691: .Fn systime .
                    692: The
                    693: .Fa datespec
                    694: is a string composed of six or seven numbers separated by whitespace:
                    695: .Bd -literal -offset indent
                    696: YYYY MM DD HH MM SS [DST]
                    697: .Ed
                    698: .Pp
                    699: The fields in
                    700: .Fa datespec
                    701: are as follows:
                    702: .Bl -tag -width "YYYY"
1.60      millert   703: .It YYYY
1.57      millert   704: Year: a four-digit year, including the century.
                    705: .It MM
                    706: Month: a number from 1 to 12.
                    707: .It DD
                    708: Day: a number from 1 to 31.
                    709: .It HH
                    710: Hour: a number from 0 to 23.
                    711: .It MM
                    712: Minute: a number from 0 to 59.
                    713: .It SS
                    714: Second: a number from 0 to 60 (permitting a leap second).
                    715: .It DST
                    716: Daylight Saving Time: a positive or zero value indicates that
                    717: DST is or is not in effect, respectively.
                    718: If DST is not specified, or is negative,
                    719: .Fn mktime
                    720: will attempt to determine the correct value.
                    721: .El
1.52      millert   722: .It Fn strftime "[format [, timestamp]]"
                    723: Formats
                    724: .Ar timestamp
                    725: according to the string
                    726: .Ar format .
                    727: The format string may contain any of the conversion specifications described
                    728: in the
                    729: .Xr strftime 3
                    730: manual page, as well as any arbitrary text.
                    731: The
                    732: .Ar timestamp
                    733: must be in the same form as a value returned by
1.57      millert   734: .Fn mktime
                    735: or
1.52      millert   736: .Fn systime .
                    737: If
                    738: .Ar timestamp
                    739: is not specified, the current time is used.
                    740: If
                    741: .Ar format
                    742: is not specified, a default format equivalent to the output of
                    743: .Xr date 1
                    744: is used.
                    745: .It Fn systime
                    746: Returns the value of time in seconds since 0 hours, 0 minutes,
                    747: 0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
                    748: .El
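.Pp
The relationship between the three time functions can be sketched from
.Xr sh 1
as follows.
This is an illustration only: it assumes that setting
.Ev TZ
to UTC is acceptable, so that the results do not depend on the local time zone,
and the shell variable names are invented for the example.
Because these functions are extensions, other implementations of
.Nm
may not provide them.
.Bd -literal -offset indent
```shell
# TZ=UTC keeps mktime() and strftime() independent of the local time zone.
stamp=$(TZ=UTC awk 'BEGIN { print mktime("1970 01 02 00 00 00") }')
echo "$stamp"        # 86400

day=$(TZ=UTC awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 86400) }')
echo "$day"          # 1970-01-02 00:00:00

# Formatting systime() as a datespec and converting it back with
# mktime() gives the original timestamp again.
roundtrip=$(TZ=UTC awk 'BEGIN {
        now = systime()
        same = (mktime(strftime("%Y %m %d %H %M %S", now)) == now)
        print same
}')
echo "$roundtrip"    # 1
```
.Ed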
1.17      jmc       749: .Ss Input/Output and General Functions
                    750: .Bl -tag -width "getline [var] < file"
                    751: .It Fn close expr
                    752: Closes the file or pipe
                    753: .Fa expr .
                    754: .Fa expr
                    755: should match the string that was used to open the file or pipe.
                    756: .It Ar cmd | Ic getline Op Va var
                    757: Reads a record of input from a stream piped from the output of
                    758: .Ar cmd .
                    759: If
                    760: .Va var
                    761: is omitted, the variables
                    762: .Va $0
                    763: and
                    764: .Va NF
                    765: are set.
                    766: Otherwise
                    767: .Va var
                    768: is set.
                    769: If the stream is not open, it is opened.
                    770: As long as the stream remains open, subsequent calls
                    771: will read subsequent records from the stream.
                    772: The stream remains open until explicitly closed with a call to
                    773: .Fn close .
1.24      jmc       774: .Ic getline
                    775: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
                    776: .It Fn fflush [expr]
1.39      jmc       777: Flushes any buffered output for the file or pipe
1.24      jmc       778: .Fa expr ,
                    779: or all open files or pipes if
                    780: .Fa expr
                    781: is omitted.
1.17      jmc       782: .Fa expr
                    783: should match the string that was used to open the file or pipe.
                    784: .It Ic getline
                    785: Sets
                    786: .Va $0
                    787: to the next input record from the current input file.
                    788: This form of
                    789: .Ic getline
                    790: sets the variables
                    791: .Va NF ,
                    792: .Va NR ,
                    793: and
                    794: .Va FNR .
1.7       aaron     795: .Ic getline
1.17      jmc       796: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
                    797: .It Ic getline Va var
                    798: Sets
1.7       aaron     799: .Va var
1.17      jmc       800: to the next input record from the current input file,
                    801: leaving the current record unchanged.
                    802: This form of
                    803: .Ic getline
                    804: sets the variables
                    805: .Va NR
                    806: and
                    807: .Va FNR .
                    808: .Ic getline
                    809: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
                    810: .It Xo
                    811: .Ic getline Op Va var
1.47      millert   812: .Pf <\ \& Ar file
1.17      jmc       813: .Xc
                    814: Reads
1.7       aaron     815: the next record
1.1       tholo     816: of input from
1.7       aaron     817: .Ar file .
1.17      jmc       818: If
                    819: .Va var
                    820: is omitted, the variables
                    821: .Va $0
                    822: and
                    823: .Va NF
                    824: are set.
                    825: Otherwise
                    826: .Va var
                    827: is set.
                    828: If
                    829: .Ar file
                    830: is not open, it is opened.
                    831: As long as the stream remains open, subsequent calls will read subsequent
                    832: records from
                    833: .Ar file .
                    834: .Ar file
                    835: remains open until explicitly closed with a call to
                    836: .Fn close .
                    837: .It Fn system cmd
                    838: Executes
                    839: .Fa cmd
                    840: and returns its exit status.
1.47      millert   841: This will be \-1 upon error,
                    842: .Fa cmd Ns 's
                    843: exit status upon a normal exit,
                    844: 256 +
                    845: .Em sig
                    846: if
                    847: .Fa cmd
                    848: was terminated by a signal, where
                    849: .Em sig
                    850: is the number of the signal,
                    851: or 512 +
                    852: .Em sig
                    853: if there was a core dump.
1.17      jmc       854: .El
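.Pp
A minimal
.Xr sh 1
sketch of the pipe form of
.Ic getline ,
.Fn close ,
.Fn system ,
and the
.Ic getline Va var
form follows.
The commands run and the variable names are hypothetical and chosen only
to keep the output predictable; only the normal-exit case of
.Fn system
is shown.
.Bd -literal -offset indent
```shell
io_demo=$(awk 'BEGIN {
        cmd = "echo one; echo two"
        # Each call reads the next record from the same open stream.
        while ((cmd | getline line) > 0)
                n++
        # close() takes the same string that was used to open the pipe.
        close(cmd)
        # Once closed, the next getline runs the command again.
        cmd | getline first
        close(cmd)
        # On a normal exit, system() returns the exit status of the command.
        status = system("exit 3")
        print n, line, first, status
}')
echo "$io_demo"         # 2 two one 3

# "getline var" reads the next record into var and updates NR,
# but leaves $0 as it was.
getline_demo=$( { echo a; echo b; } |
    awk 'NR == 1 { getline nxt; print $0, nxt, NR }')
echo "$getline_demo"    # a b 2
```
.Ed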
1.30      jmc       855: .Ss Bit-Operation Functions
1.29      pyr       856: .Bl -tag -width "lshift(a, b)"
                    857: .It Fn compl x
                    858: Returns the bitwise complement of integer argument x.
                    859: .It Fn and x y
1.30      jmc       860: Returns the bitwise AND of integer arguments x and y.
1.29      pyr       861: .It Fn or x y
1.30      jmc       862: Returns the bitwise OR of integer arguments x and y.
1.29      pyr       863: .It Fn xor x y
1.30      jmc       864: Returns the bitwise exclusive OR of integer arguments x and y.
1.29      pyr       865: .It Fn lshift x n
1.39      jmc       866: Returns integer argument x shifted by n bits to the left.
1.29      pyr       867: .It Fn rshift x n
1.39      jmc       868: Returns integer argument x shifted by n bits to the right.
1.29      pyr       869: .El
1.50      millert   870: .Sh ENVIRONMENT
                    871: The following environment variables affect the execution of
                    872: .Nm :
                    873: .Bl -tag -width POSIXLY_CORRECT
                    874: .It Ev POSIXLY_CORRECT
                    875: When set, behave in accordance with the standard, even when it conflicts
                    876: with historical behavior.
                    877: .El
1.37      jmc       878: .Sh EXIT STATUS
                    879: .Ex -std awk
                    880: .Pp
                    881: But note that the
                    882: .Ic exit
                    883: statement can modify the exit status.
1.7       aaron     884: .Sh EXAMPLES
1.16      jmc       885: Print lines longer than 72 characters:
                    886: .Pp
1.7       aaron     887: .Dl length($0) > 72
1.16      jmc       888: .Pp
                    889: Print first two fields in opposite order:
1.7       aaron     890: .Pp
                    891: .Dl { print $2, $1 }
1.16      jmc       892: .Pp
1.47      millert   893: Same, with input fields separated by comma and/or spaces and tabs:
1.7       aaron     894: .Bd -literal -offset indent
1.1       tholo     895: BEGIN { FS = ",[ \et]*|[ \et]+" }
                    896:       { print $2, $1 }
1.7       aaron     897: .Ed
1.16      jmc       898: .Pp
                    899: Add up first column, print sum and average:
1.7       aaron     900: .Bd -literal -offset indent
                    901: { s += $1 }
                    902: END { print "sum is", s, " average is", s/NR }
                    903: .Ed
1.16      jmc       904: .Pp
                    905: Print all lines between start/stop pairs:
1.7       aaron     906: .Pp
                    907: .Dl /start/, /stop/
1.16      jmc       908: .Pp
1.45      naddy     909: Simulate
                    910: .Xr echo 1 :
1.7       aaron     911: .Bd -literal -offset indent
                    912: BEGIN { # Simulate echo(1)
                    913:         for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
                    914:         printf "\en"
                    915:         exit }
1.19      jmc       916: .Ed
                    917: .Pp
                    918: Print an error message to standard error:
                    919: .Bd -literal -offset indent
                    920: { print "error!" > "/dev/stderr" }
1.7       aaron     921: .Ed
1.59      millert   922: .Sh UNUSUAL FLOATING-POINT VALUES
                    923: .Nm
                    924: was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
                    925: and Infinity values, which are supported by all modern floating-point
                    926: hardware.
                    927: .Pp
                    929: .Nm
                    930: uses
                    931: .Xr strtod 3
                    932: and
                    933: .Xr atof 3
                    934: to convert string values to double-precision floating-point values.
                    935: Modern C libraries also convert strings starting with
                    936: .Dv inf
                    937: and
                    938: .Dv nan
                    939: into infinity and NaN values respectively.
                    940: This led to strange results,
                    941: with something like this:
                    942: .Pp
                    943: .Li echo nancy | awk '{ print $1 + 0 }'
                    944: .Pp
                    945: printing
                    946: .Dv nan
                    947: instead of zero.
                    948: .Pp
                    949: .Nm
                    950: now follows GNU
                    951: .Nm ,
                    952: and prefilters string values before attempting
                    953: to convert them to numbers, as follows:
                    954: .Bl -tag -width Ds
                    955: .It Hexadecimal values
                    956: Hexadecimal values (allowed since C99) convert to zero, as they did
                    957: prior to C99.
                    958: .It NaN values
                    959: The two strings
                    960: .Dq +NAN
                    961: and
                    962: .Dq -NAN
                    963: (case independent) convert to NaN.
                    964: No others do.
                    965: (NaNs can have signs.)
                    966: .It Infinity values
                    967: The two strings
                    968: .Dq +INF
                    969: and
                    970: .Dq -INF
                    971: (case independent) convert to positive and negative infinity, respectively.
                    972: No others do.
                    973: .El
1.7       aaron     974: .Sh SEE ALSO
1.42      tedu      975: .Xr cut 1 ,
1.52      millert   976: .Xr date 1 ,
1.47      millert   977: .Xr grep 1 ,
1.7       aaron     978: .Xr lex 1 ,
1.20      jmc       979: .Xr printf 1 ,
1.16      jmc       980: .Xr sed 1 ,
1.52      millert   981: .Xr strftime 3 ,
1.23      jmc       982: .Xr re_format 7 ,
                    983: .Xr script 7
1.61      jsg       984: .Rs
                    985: .\" 4.4BSD USD:16
1.62      jsg       986: .\".%R Computing Science Technical Report
                    987: .\".%N 68
                    988: .\".%D July 1978
1.61      jsg       989: .%A A. V. Aho
                    990: .%A P. J. Weinberger
                    991: .%A B. W. Kernighan
                    992: .%T AWK \(em A Pattern Scanning and Processing Language
1.62      jsg       993: .%J Software \(em Practice and Experience
                    994: .%V 9:4
                    995: .%P pp. 267-279
                    996: .%D April 1979
1.61      jsg       997: .Re
1.7       aaron     998: .Rs
                    999: .%A A. V. Aho
                   1000: .%A B. W. Kernighan
                   1001: .%A P. J. Weinberger
                   1002: .%T The AWK Programming Language
                   1003: .%I Addison-Wesley
1.64    ! jsg      1004: .%D 2024
        !          1005: .%O ISBN 0-13-826972-6
1.7       aaron    1006: .Re
1.26      jmc      1007: .Sh STANDARDS
                   1008: The
                   1009: .Nm
                   1010: utility is compliant with the
1.33      jmc      1011: .St -p1003.1-2008
1.50      millert  1012: specification except that consecutive backslashes in the replacement
                   1013: string argument for
                   1014: .Fn sub
                   1015: and
                   1016: .Fn gsub
1.51      millert  1017: are not collapsed and a slash
                   1018: .Pq Ql /
                   1019: does not need to be escaped in a bracket expression.
1.53      tim      1020: Also, the behaviour of
                   1021: .Fn rand
                   1022: and
                   1023: .Fn srand
                   1024: has been changed to support non-deterministic random numbers.
1.26      jmc      1025: .Pp
                   1026: The flags
                   1027: .Op Fl \&dV
                   1028: and
                   1029: .Op Fl safe ,
1.56      millert  1030: support for regular expressions in
                   1031: .Va RS ,
1.52      millert  1032: as well as the functions
                   1033: .Fn fflush ,
                   1034: .Fn gensub ,
                   1035: .Fn compl ,
                   1036: .Fn and ,
                   1037: .Fn or ,
                   1038: .Fn xor ,
                   1039: .Fn lshift ,
                   1040: .Fn rshift ,
1.57      millert  1041: .Fn mktime ,
1.52      millert  1042: .Fn strftime
                   1043: and
                   1044: .Fn systime
1.26      jmc      1045: are extensions to that specification.
1.8       aaron    1046: .Sh HISTORY
1.13      millert  1047: An
1.8       aaron    1048: .Nm
1.13      millert  1049: utility appeared in
                   1050: .At v7 .
1.7       aaron    1051: .Sh BUGS
1.1       tholo    1052: There are no explicit conversions between numbers and strings.
                   1053: To force an expression to be treated as a number add 0 to it;
                   1054: to force it to be treated as a string concatenate
1.7       aaron    1055: .Li \&""
                   1056: to it.
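.Pp
The two idioms can be seen in a short
.Xr sh 1
sketch; the variable names are arbitrary and serve only as an illustration.
The values are compared once as they were assigned and once after
the forced conversion:
.Bd -literal -offset indent
```shell
conv_demo=$(awk 'BEGIN {
        x = "10"; y = "9"
        # As strings, "10" sorts before "9"; adding 0 compares them as numbers.
        as_string = (x < y)
        as_number = (x + 0 < y + 0)
        print as_string, as_number
        a = 10; b = 9
        # Concatenating "" makes the numbers compare as strings.
        as_number = (a < b)
        as_string = ((a "") < (b ""))
        print as_number, as_string
}')
echo "$conv_demo"
# 1 0
# 0 1
```
.Ed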
                   1057: .Pp
1.1       tholo    1058: The scope rules for variables in functions are a botch;
                   1059: the syntax is worse.
1.47      millert  1060: .Pp
                   1061: Only eight-bit character sets are handled correctly.