Annotation of src/usr.bin/awk/awk.1, Revision 1.13

1.13    ! millert     1: .\"    $OpenBSD: awk.1,v 1.12 2003/06/10 09:12:09 jmc Exp $
1.7       aaron       2: .\" EX/EE is a Bd
1.11      jmc         3: .\"
                      4: .\" Copyright (C) Lucent Technologies 1997
                      5: .\" All Rights Reserved
1.12      jmc         6: .\"
1.11      jmc         7: .\" Permission to use, copy, modify, and distribute this software and
                      8: .\" its documentation for any purpose and without fee is hereby
                      9: .\" granted, provided that the above copyright notice appear in all
                     10: .\" copies and that both that the copyright notice and this
                     11: .\" permission notice and warranty disclaimer appear in supporting
                     12: .\" documentation, and that the name Lucent Technologies or any of
                     13: .\" its entities not be used in advertising or publicity pertaining
                     14: .\" to distribution of the software without specific, written prior
                     15: .\" permission.
1.12      jmc        16: .\"
1.11      jmc        17: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
                     18: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
                     19: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
                     20: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
                     21: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
                     22: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
                     23: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
                     24: .\" THIS SOFTWARE.
                     25: .\"
1.7       aaron      26: .Dd June 29, 1996
                     27: .Dt AWK 1
                     28: .Os
                     29: .Sh NAME
                     30: .Nm awk
                     31: .Nd pattern-directed scanning and processing language
                     32: .Sh SYNOPSIS
                     33: .Nm awk
                     34: .Op Fl F Ar fs
                     35: .Op Fl v Ar var=value
                     36: .Op Fl safe
                     37: .Op Fl mr Ar n
                     38: .Op Fl mf Ar n
                     39: .Op Ar prog | Fl f Ar progfile
                     40: .Ar
                     41: .Nm nawk
                     42: .Ar ...
                     43: .Sh DESCRIPTION
                     44: .Nm
1.1       tholo      45: scans each input
1.7       aaron      46: .Ar file
1.1       tholo      47: for lines that match any of a set of patterns specified literally in
1.7       aaron      48: .Ar prog
1.1       tholo      49: or in one or more files
                     50: specified as
1.7       aaron      51: .Fl f Ar progfile .
1.1       tholo      52: With each pattern
                     53: there can be an associated action that will be performed
                     54: when a line of a
1.7       aaron      55: .Ar file
1.1       tholo      56: matches the pattern.
                     57: Each line is matched against the
                     58: pattern portion of every pattern-action statement;
                     59: the associated action is performed for each matched pattern.
1.6       aaron      60: The file name
1.7       aaron      61: .Sq Pa \-
1.1       tholo      62: means the standard input.
                     63: Any
1.7       aaron      64: .Ar file
1.1       tholo      65: of the form
1.7       aaron      66: .Ar var=value
1.1       tholo      67: is treated as an assignment, not a filename,
                     68: and is executed at the time it would have been opened if it were a filename.
                     69: The option
1.7       aaron      70: .Fl v
1.1       tholo      71: followed by
1.7       aaron      72: .Ar var=value
1.1       tholo      73: is an assignment to be done before
1.7       aaron      74: .Ar prog
1.1       tholo      75: is executed;
                     76: any number of
1.7       aaron      77: .Fl v
1.1       tholo      78: options may be present.
                     79: The
1.7       aaron      80: .Fl F Ar fs
1.1       tholo      81: option defines the input field separator to be the regular expression
1.7       aaron      82: .Ar fs .
1.5       angelos    83: The
1.7       aaron      84: .Fl safe
                     85: option disables file output
                     86: .Po
                     87: .Ic print Ic > ,
                     88: .Ic print Ic >>
                     89: .Pc ,
                     90: process creation
                     91: .Po
                     92: .Ar cmd Ic \&| getline ,
                     93: .Ic print \&| , system
                     94: .Pc
                     95: and access to the environment
                     96: .Pq Va ENVIRON .
                     97: This
                     98: is a first (and not very reliable) approximation to a
                     99: .Dq safe
                    100: version of
                    101: .Nm awk .
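The option and operand handling described above can be tried from a shell. This is a minimal sketch, assuming a POSIX shell with an awk on the PATH; the sample input and the variable names prefix and tag are invented for the illustration.

```shell
# -F sets the field separator; -v assigns prefix before the program runs.
out1=$(printf 'root:x:0\nbin:x:1\n' | awk -F: -v prefix='user=' '{ print prefix $1 }')
echo "$out1"

# An operand of the form var=value is an assignment, and "-" is standard input.
out2=$(printf 'x\n' | awk '{ print tag, $0 }' tag=first -)
echo "$out2"
```

The second command relies on the assignment being executed at the point where a file in that position would have been opened, so tag is already set when the first record is read.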
                    102: .Pp
                    103: An input line is normally made up of fields separated by whitespace,
1.1       tholo     104: or by regular expression
1.7       aaron     105: .Va FS .
1.1       tholo     106: The fields are denoted
1.7       aaron     107: .Va $1 , $2 , ... ,
                    108: while
                    109: .Va $0
1.1       tholo     110: refers to the entire line.
                    111: If
1.7       aaron     112: .Va FS
1.1       tholo     113: is null, the input line is split into one field per character.
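Field splitting can be illustrated as follows. This is a sketch with made-up input; it only uses the default whitespace splitting and a regular expression assigned to FS.

```shell
# Default splitting: runs of blanks and tabs separate fields.
out1=$(printf 'alpha  beta\tgamma\n' | awk '{ print NF ": " $1 "|" $3 }')
echo "$out1"

# FS as a regular expression: a comma followed by optional blanks.
out2=$(printf 'a, b,c\n' | awk 'BEGIN { FS = ",[ ]*" } { print $2 $3 }')
echo "$out2"
```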
1.7       aaron     114: .Pp
1.1       tholo     115: To compensate for inadequate implementation of storage management,
1.6       aaron     116: the
1.7       aaron     117: .Fl mr
1.1       tholo     118: option can be used to set the maximum size of the input record,
                    119: and the
1.7       aaron     120: .Fl mf
1.1       tholo     121: option to set the maximum number of fields.
1.7       aaron     122: .Pp
1.1       tholo     123: A pattern-action statement has the form
1.7       aaron     124: .Pp
                    125: .D1 Ar pattern Ic \&{ Ar action Ic \&}
                    126: .Pp
1.6       aaron     127: A missing
1.7       aaron     128: .Ic \&{ Ar action Ic \&}
1.1       tholo     129: means print the line;
                    130: a missing pattern always matches.
                    131: Pattern-action statements are separated by newlines or semicolons.
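The defaults for a missing action and a missing pattern might be exercised like this. It is an illustrative sketch only; the input lines are arbitrary.

```shell
# Pattern without an action: matching lines are printed.
out1=$(printf '1\n22\n333\n' | awk 'length($0) > 1')
echo "$out1"

# Action without a pattern: the action runs for every line.
out2=$(printf 'a\nb\n' | awk '{ print NR ": " $0 }')
echo "$out2"

# Two pattern-action statements separated by a semicolon.
out3=$(printf 'x\n' | awk '/x/ { print "first" }; { print "second" }')
echo "$out3"
```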
1.7       aaron     132: .Pp
1.1       tholo     133: An action is a sequence of statements.
                    134: A statement can be one of the following:
1.7       aaron     135: .Pp
                    136: .Bd -unfilled -offset indent
                    137: .Ic if ( Xo
                    138: .Ar expression ) statement \&
                    139: .Op Ic else Ar statement
                    140: .Xc
                    141: .Ic while ( Ar expression ) statement
                    142: .Ic for ( Xo
                    143: .Ar expression ; expression ; expression ) statement
                    144: .Xc
                    145: .Ic for ( Xo
                    146: .Ar var Ic in Ar array ) statement
                    147: .Xc
                    148: .Ic do Ar statement Ic while ( Ar expression )
                    149: .Ic break
                    150: .Ic continue
                    151: .Ic { Oo Ar statement ... Oc Ic \& }
                    152: .Ar expression Xo
                    153: .No "# commonly" \&
                    154: .Ar var Ic = Ar expression
                    155: .Xc
                    156: .Ic print Xo
                    157: .Op Ar expression-list
                    158: .Op Ic > Ns Ar expression
                    159: .Xc
                    160: .Ic printf Ar format Xo
                    161: .Op Ar ... , expression-list
                    162: .Op Ic > Ns Ar expression
                    163: .Xc
                    164: .Ic return Op Ar expression
                    165: .Ic next Xo
                    166: .No "# skip remaining patterns on this input line"
                    167: .Xc
                    168: .Ic nextfile Xo
                    169: .No "# skip rest of this file, open next, start at top"
                    170: .Xc
                    171: .Ic delete Ar array Ns Xo
                    172: .Ic \&[ Ns Ar expression Ns Ic \&]
                    173: .No \& "# delete an array element"
                    174: .Xc
                    175: .Ic delete Ar array Xo
                    176: .No "# delete all elements of array"
                    177: .Xc
                    178: .Ic exit Xo
                    179: .Op Ar expression
                    180: .No \& "# exit immediately; status is" Ar expression
                    181: .Xc
                    182: .Ed
                    183: .Pp
1.1       tholo     184: Statements are terminated by
                    185: semicolons, newlines or right braces.
                    186: An empty
1.7       aaron     187: .Ar expression-list
1.1       tholo     188: stands for
1.7       aaron     189: .Ar $0 .
                    190: String constants are quoted
                    191: .Li \&"" ,
1.1       tholo     192: with the usual C escapes recognized within.
                    193: Expressions take on string or numeric values as appropriate,
                    194: and are built using the operators
1.7       aaron     195: .Ic + \- * / % ^
                    196: (exponentiation), and concatenation (indicated by whitespace).
1.1       tholo     197: The operators
1.7       aaron     198: .Ic ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
1.1       tholo     199: are also available in expressions.
                    200: Variables may be scalars, array elements
                    201: (denoted
1.7       aaron     202: .Li x[i] )
1.1       tholo     203: or fields.
                    204: Variables are initialized to the null string.
                    205: Array subscripts may be any string,
                    206: not necessarily numeric;
                    207: this allows for a form of associative memory.
                    208: Multiple subscripts such as
1.7       aaron     209: .Li [i,j,k]
1.1       tholo     210: are permitted; the constituents are concatenated,
                    211: separated by the value of
1.7       aaron     212: .Va SUBSEP .
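A short sketch of string subscripts and multiple subscripts follows. The array names count, m and part are hypothetical, chosen only for this example.

```shell
# Count occurrences of each first field using string subscripts.
out1=$(printf 'a\nb\na\n' | awk '{ count[$1]++ } END { print count["a"], count["b"] }')
echo "$out1"

# A multiple subscript is stored as one string joined by SUBSEP.
out2=$(awk 'BEGIN {
    m[1,2] = "x"
    if ((1,2) in m) print "found"
    for (k in m) { split(k, part, SUBSEP); print part[1], part[2] }
}')
echo "$out2"
```

Splitting the key on SUBSEP recovers the individual subscripts, which is one common way to walk a simulated multi-dimensional array.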
                    213: .Pp
1.1       tholo     214: The
1.7       aaron     215: .Ic print
1.1       tholo     216: statement prints its arguments on the standard output
                    217: (or on a file if
1.7       aaron     218: .Ic > Ns Ar file
1.1       tholo     219: or
1.7       aaron     220: .Ic >> Ns Ar file
1.1       tholo     221: is present or on a pipe if
1.7       aaron     222: .Ic \&| Ar cmd
1.1       tholo     223: is present), separated by the current output field separator,
                    224: and terminated by the output record separator.
1.7       aaron     225: .Ar file
1.1       tholo     226: and
1.7       aaron     227: .Ar cmd
1.1       tholo     228: may be literal names or parenthesized expressions;
                    229: identical string values in different statements denote
                    230: the same open file.
                    231: The
1.7       aaron     232: .Ic printf
1.1       tholo     233: statement formats its expression list according to the format
                    234: (see
1.10      pvalchev  235: .Xr printf 3 ) .
1.1       tholo     236: The built-in function
1.7       aaron     237: .Fn close expr
1.1       tholo     238: closes the file or pipe
1.7       aaron     239: .Fa expr .
1.1       tholo     240: The built-in function
1.7       aaron     241: .Fn fflush expr
1.1       tholo     242: flushes any buffered output for the file or pipe
1.7       aaron     243: .Fa expr .
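The output statements can be sketched as below, assuming a sort utility is available on the PATH. The format string and data are examples only.

```shell
# printf follows the printf(3) conventions for widths and padding.
out1=$(awk 'BEGIN { printf "%-5s|%03d\n", "ab", 7 }')
echo "$out1"

# Identical command strings denote the same open pipe; close() ends it.
out2=$(awk 'BEGIN {
    print "b" | "sort"
    print "a" | "sort"
    close("sort")
}')
echo "$out2"
```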
                    244: .Pp
1.1       tholo     245: The mathematical functions
1.7       aaron     246: .Fn exp ,
                    247: .Fn log ,
                    248: .Fn sqrt ,
                    249: .Fn sin ,
                    250: .Fn cos ,
1.1       tholo     251: and
1.7       aaron     252: .Fn atan2
1.1       tholo     253: are built in.
                    254: Other built-in functions:
1.7       aaron     255: .Pp
                    256: .Bl -tag -width Fn
                    257: .It Fn length
1.1       tholo     258: the length of its argument
                    259: taken as a string,
                    260: or of
1.7       aaron     261: .Va $0
1.1       tholo     262: if no argument.
1.7       aaron     263: .It Fn rand
1.1       tholo     264: random number on [0,1).
1.7       aaron     265: .It Fn srand
1.1       tholo     266: sets seed for
1.7       aaron     267: .Fn rand
1.1       tholo     268: and returns the previous seed.
1.7       aaron     269: .It Fn int
                    270: truncates to an integer value.
                    271: .It Fn substr s m n
1.1       tholo     272: the
1.7       aaron     273: .Fa n Ns No -character
1.1       tholo     274: substring of
1.7       aaron     275: .Fa s
1.1       tholo     276: that begins at position
1.7       aaron     277: .Fa m
1.1       tholo     278: counted from 1.
1.7       aaron     279: .It Fn index s t
1.1       tholo     280: the position in
1.7       aaron     281: .Fa s
1.1       tholo     282: where the string
1.7       aaron     283: .Fa t
1.1       tholo     284: occurs, or 0 if it does not.
1.7       aaron     285: .It Fn match s r
1.1       tholo     286: the position in
1.7       aaron     287: .Fa s
1.1       tholo     288: where the regular expression
1.7       aaron     289: .Fa r
1.1       tholo     290: occurs, or 0 if it does not.
                    291: The variables
1.7       aaron     292: .Va RSTART
1.1       tholo     293: and
1.7       aaron     294: .Va RLENGTH
1.1       tholo     295: are set to the position and length of the matched string.
1.7       aaron     296: .It Fn split s a fs
1.1       tholo     297: splits the string
1.7       aaron     298: .Fa s
1.1       tholo     299: into array elements
1.7       aaron     300: .Va a[1] , a[2] , ... , a[n]
1.1       tholo     301: and returns
1.7       aaron     302: .Va n .
1.1       tholo     303: The separation is done with the regular expression
1.7       aaron     304: .Ar fs
1.1       tholo     305: or with the field separator
1.7       aaron     306: .Va FS
1.1       tholo     307: if
1.7       aaron     308: .Ar fs
1.1       tholo     309: is not given.
                    310: An empty string as field separator splits the string
                    311: into one array element per character.
1.7       aaron     312: .It Fn sub r t s
1.1       tholo     313: substitutes
1.7       aaron     314: .Fa t
1.1       tholo     315: for the first occurrence of the regular expression
1.7       aaron     316: .Fa r
1.1       tholo     317: in the string
1.7       aaron     318: .Fa s .
1.1       tholo     319: If
1.7       aaron     320: .Fa s
1.1       tholo     321: is not given,
1.7       aaron     322: .Va $0
1.1       tholo     323: is used.
1.7       aaron     324: .It Fn gsub r t s
1.1       tholo     325: same as
1.7       aaron     326: .Fn sub
1.1       tholo     327: except that all occurrences of the regular expression
                    328: are replaced;
1.7       aaron     329: .Fn sub
1.1       tholo     330: and
1.7       aaron     331: .Fn gsub
1.1       tholo     332: return the number of replacements.
1.7       aaron     333: .It Fn sprintf fmt expr ...
1.1       tholo     334: the string resulting from formatting
1.7       aaron     335: .Fa expr , ...
1.1       tholo     336: according to the
1.7       aaron     337: .Xr printf 3
1.1       tholo     338: format
1.7       aaron     339: .Fa fmt .
                    340: .It Fn system cmd
1.1       tholo     341: executes
1.7       aaron     342: .Fa cmd
                    343: and returns its exit status.
                    344: .It Fn tolower str
1.1       tholo     345: returns a copy of
1.7       aaron     346: .Fa str
1.1       tholo     347: with all upper-case characters translated to their
                    348: corresponding lower-case equivalents.
1.7       aaron     349: .It Fn toupper str
1.1       tholo     350: returns a copy of
1.7       aaron     351: .Fa str
1.1       tholo     352: with all lower-case characters translated to their
                    353: corresponding upper-case equivalents.
1.7       aaron     354: .El
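Several of the string functions listed above are exercised in the following sketch. The strings and the names s, t, c, n and part are arbitrary choices for the example.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print substr(s, 1, 5)            # 5 characters starting at position 1
    print index(s, "wor")            # position of a fixed string
    n = split("a:b:c", part, ":")    # n is the number of elements
    print n, part[2]
    t = s
    c = gsub(/o/, "0", t)            # c is the number of replacements
    print c, t
    if (match(s, /w[a-z]+/)) print RSTART, RLENGTH
    print toupper(s)
}')
echo "$out"
```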
                    355: .Pp
                    356: The
                    357: .Sq function
                    358: .Ic getline
1.1       tholo     359: sets
1.7       aaron     360: .Va $0
1.1       tholo     361: to the next input record from the current input file;
1.7       aaron     362: .Ic getline < Ar file
1.1       tholo     363: sets
1.7       aaron     364: .Va $0
1.1       tholo     365: to the next record from
1.7       aaron     366: .Ar file .
                    367: .Ic getline Va x
1.1       tholo     368: sets variable
1.7       aaron     369: .Va x
1.1       tholo     370: instead.
                    371: Finally,
1.7       aaron     372: .Ar cmd Ic \&| getline
1.1       tholo     373: pipes the output of
1.7       aaron     374: .Ar cmd
1.1       tholo     375: into
1.7       aaron     376: .Ic getline ;
1.1       tholo     377: each call of
1.7       aaron     378: .Ic getline
1.1       tholo     379: returns the next line of output from
1.7       aaron     380: .Ar cmd .
1.1       tholo     381: In all cases,
1.7       aaron     382: .Ic getline
1.1       tholo     383: returns 1 for a successful input,
                    384: 0 for end of file, and \-1 for an error.
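Reading from a command with getline might look like the following sketch. The command string is an arbitrary example, and the loop tests for a result greater than 0 so that an error return of \-1 does not loop forever.

```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)   # 1 per line read, 0 at end of file
        n++
    close(cmd)
    print n, line
}')
echo "$out"
```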
1.7       aaron     385: .Pp
1.1       tholo     386: Patterns are arbitrary Boolean combinations
                    387: (with
1.7       aaron     388: .Ic "! || &&" )
1.1       tholo     389: of regular expressions and
                    390: relational expressions.
                    391: Regular expressions are as in
1.12      jmc       392: .Xr egrep 1 .
1.1       tholo     393: Isolated regular expressions
                    394: in a pattern apply to the entire line.
                    395: Regular expressions may also occur in
                    396: relational expressions, using the operators
1.7       aaron     397: .Ic ~
1.1       tholo     398: and
1.7       aaron     399: .Ic !~ .
                    400: .Ic / Ns Ar re Ns Ic /
1.1       tholo     401: is a constant regular expression;
                    402: any string (constant or variable) may be used
                    403: as a regular expression, except in the position of an isolated regular expression
                    404: in a pattern.
1.7       aaron     405: .Pp
1.1       tholo     406: A pattern may consist of two patterns separated by a comma;
                    407: in this case, the action is performed for all lines
                    408: from an occurrence of the first pattern
                    409: through an occurrence of the second.
1.7       aaron     410: .Pp
1.1       tholo     411: A relational expression is one of the following:
1.7       aaron     412: .Bd -unfilled -offset indent
                    413: .Ar expression matchop regular-expression
                    414: .Ar expression relop expression
                    415: .Ar expression Ic in Ar array-name
                    416: .Ic \&( Ns Xo
                    417: .Ar expr , expr , \&... Ns Ic \&) in
                    418: .Ar \& array-name
                    419: .Xc
                    420: .Ed
                    421: where a
                    422: .Ar relop
                    423: is any of the six relational operators in C, and a
                    424: .Ar matchop
                    425: is either
                    426: .Ic ~
1.1       tholo     427: (matches)
                    428: or
1.7       aaron     429: .Ic !~
1.1       tholo     430: (does not match).
                    431: A conditional is an arithmetic expression,
                    432: a relational expression,
                    433: or a Boolean combination
                    434: of these.
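As an illustration of relational expressions used as patterns, consider this sketch. The input and the array name ok are invented for the example.

```shell
# A match expression combined with a numeric comparison.
out1=$(printf 'a 5\nb 20\nc x\n' | awk '$2 ~ /^[0-9]+$/ && $2 > 10 { print $1 }')
echo "$out1"

# Membership test against an array filled in BEGIN.
out2=$(printf 'a\nb\n' | awk 'BEGIN { ok["b"] = 1 } ($1 in ok)')
echo "$out2"
```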
1.7       aaron     435: .Pp
1.1       tholo     436: The special patterns
1.7       aaron     437: .Ic BEGIN
1.1       tholo     438: and
1.7       aaron     439: .Ic END
1.1       tholo     440: may be used to capture control before the first input line is read
                    441: and after the last.
1.7       aaron     442: .Ic BEGIN
1.1       tholo     443: and
1.7       aaron     444: .Ic END
1.1       tholo     445: do not combine with other patterns.
1.7       aaron     446: .Pp
1.1       tholo     447: Variable names with special meanings:
1.7       aaron     448: .Pp
                    449: .Bl -tag -width Va -compact
                    450: .It Va CONVFMT
1.1       tholo     451: conversion format used when converting numbers
1.3       millert   452: (default
1.7       aaron     453: .Qq Li %.6g )
                    454: .It Va FS
1.1       tholo     455: regular expression used to separate fields; also settable
                    456: by option
1.9       millert   457: .Fl F Ar fs .
1.7       aaron     458: .It Va NF
1.1       tholo     459: number of fields in the current record
1.7       aaron     460: .It Va NR
1.1       tholo     461: ordinal number of the current record
1.7       aaron     462: .It Va FNR
1.1       tholo     463: ordinal number of the current record in the current file
1.7       aaron     464: .It Va FILENAME
1.1       tholo     465: the name of the current input file
1.7       aaron     466: .It Va RS
1.1       tholo     467: input record separator (default newline)
1.7       aaron     468: .It Va OFS
1.1       tholo     469: output field separator (default blank)
1.7       aaron     470: .It Va ORS
1.1       tholo     471: output record separator (default newline)
1.7       aaron     472: .It Va OFMT
1.1       tholo     473: output format for numbers (default
1.7       aaron     474: .Qq Li %.6g )
                    475: .It Va SUBSEP
1.1       tholo     476: separates multiple subscripts (default octal 034)
1.7       aaron     477: .It Va ARGC
1.1       tholo     478: argument count, assignable
1.7       aaron     479: .It Va ARGV
1.1       tholo     480: argument array, assignable;
                    481: non-null members are taken as filenames
1.7       aaron     482: .It Va ENVIRON
1.1       tholo     483: array of environment variables; subscripts are names.
1.7       aaron     484: .El
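A few of these variables can be seen working together in the sketch below. Assigning a field to itself is a common idiom, used here so that the record is rebuilt with the new OFS.

```shell
out=$(printf 'a b\nc d e\n' | awk 'BEGIN { OFS = "-" } { $1 = $1; print NR, NF, $0 }')
echo "$out"
```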
                    485: .Pp
                    486: Functions may be defined (at the position of a pattern-action statement)
                    487: thusly:
                    488: .Pp
                    489: .Dl function foo(a, b, c) { ...; return x }
                    490: .Pp
1.1       tholo     491: Parameters are passed by value if scalar and by reference if array name;
                    492: functions may be called recursively.
                    493: Parameters are local to the function; all other variables are global.
                    494: Thus local variables may be created by providing excess parameters in
                    495: the function definition.
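The parameter rules above can be sketched with two small functions. The names fact, fill and sq are hypothetical; the extra parameter i in fill acts as a local variable, so the global i is left untouched.

```shell
out=$(awk '
function fact(n) { return n <= 1 ? 1 : n * fact(n - 1) }   # recursion
function fill(arr, n,    i) {      # arr is passed by reference, i is local
    for (i = 1; i <= n; i++)
        arr[i] = i * i
}
BEGIN {
    i = 99
    fill(sq, 3)
    print fact(5), sq[3], i
}')
echo "$out"
```

Separating the intended parameters from the local ones with extra spaces, as in the definition of fill, is a widely used convention rather than a language requirement.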
1.7       aaron     496: .Sh EXAMPLES
                    497: .Dl length($0) > 72
1.1       tholo     498: Print lines longer than 72 characters.
1.7       aaron     499: .Pp
                    500: .Dl { print $2, $1 }
1.1       tholo     501: Print first two fields in opposite order.
1.7       aaron     502: .Pp
                    503: .Bd -literal -offset indent
1.1       tholo     504: BEGIN { FS = ",[ \et]*|[ \et]+" }
                    505:       { print $2, $1 }
1.7       aaron     506: .Ed
1.1       tholo     507: Same, with input fields separated by comma and/or blanks and tabs.
1.7       aaron     508: .Pp
                    509: .Bd -literal -offset indent
                    510: { s += $1 }
                    511: END { print "sum is", s, " average is", s/NR }
                    512: .Ed
1.1       tholo     513: Add up first column, print sum and average.
1.7       aaron     514: .Pp
                    515: .Dl /start/, /stop/
1.1       tholo     516: Print all lines between start/stop pairs.
1.7       aaron     517: .Pp
                    518: .Bd -literal -offset indent
                    519: BEGIN { # Simulate echo(1)
                    520:         for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
                    521:         printf "\en"
                    522:         exit }
                    523: .Ed
                    524: .Sh SEE ALSO
                    525: .Xr lex 1 ,
                    526: .Xr sed 1
                    527: .Rs
                    528: .%A A. V. Aho
                    529: .%A B. W. Kernighan
                    530: .%A P. J. Weinberger
                    531: .%T The AWK Programming Language
                    532: .%I Addison-Wesley
                    533: .%D 1988
                    534: .%O ISBN 0-201-07981-X
                    535: .Re
1.8       aaron     536: .Sh HISTORY
1.13    ! millert   537: An
1.8       aaron     538: .Nm
1.13    ! millert   539: utility appeared in
        !           540: .At v7 .
1.7       aaron     541: .Sh BUGS
1.1       tholo     542: There are no explicit conversions between numbers and strings.
                    543: To force an expression to be treated as a number add 0 to it;
                    544: to force it to be treated as a string concatenate
1.7       aaron     545: .Li \&""
                    546: to it.
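The effect of forcing a numeric interpretation can be seen in this sketch. The values are arbitrary; x and y are assigned string constants, so the first comparison is done on strings.

```shell
out=$(awk 'BEGIN {
    x = "10"; y = "9"
    s = (x < y)            # string comparison: "10" sorts before "9"
    n = (x + 0 < y + 0)    # numeric comparison after adding 0
    print s, n
}')
echo "$out"
```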
                    547: .Pp
1.1       tholo     548: The scope rules for variables in functions are a botch;
                    549: the syntax is worse.