Annotation of src/usr.bin/awk/awk.1, Revision 1.15

1.15    ! jmc         1: .\"    $OpenBSD: awk.1,v 1.14 2003/09/02 18:50:06 jmc Exp $
1.7       aaron       2: .\" EX/EE is a Bd
1.11      jmc         3: .\"
                      4: .\" Copyright (C) Lucent Technologies 1997
                      5: .\" All Rights Reserved
1.12      jmc         6: .\"
1.11      jmc         7: .\" Permission to use, copy, modify, and distribute this software and
                      8: .\" its documentation for any purpose and without fee is hereby
                      9: .\" granted, provided that the above copyright notice appear in all
                     10: .\" copies and that both that the copyright notice and this
                     11: .\" permission notice and warranty disclaimer appear in supporting
                     12: .\" documentation, and that the name Lucent Technologies or any of
                     13: .\" its entities not be used in advertising or publicity pertaining
                     14: .\" to distribution of the software without specific, written prior
                     15: .\" permission.
1.12      jmc        16: .\"
1.11      jmc        17: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
                     18: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
                     19: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
                     20: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
                     21: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
                     22: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
                     23: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
                     24: .\" THIS SOFTWARE.
                     25: .\"
1.7       aaron      26: .Dd June 29, 1996
                     27: .Dt AWK 1
                     28: .Os
                     29: .Sh NAME
                     30: .Nm awk
                     31: .Nd pattern-directed scanning and processing language
                     32: .Sh SYNOPSIS
                     33: .Nm awk
                     34: .Op Fl F Ar fs
                     35: .Op Fl v Ar var=value
                     36: .Op Fl safe
                     37: .Op Fl mr Ar n
                     38: .Op Fl mf Ar n
                     39: .Op Ar prog | Fl f Ar progfile
                     40: .Ar
                     41: .Nm nawk
                     42: .Ar ...
                     43: .Sh DESCRIPTION
                     44: .Nm
1.1       tholo      45: scans each input
1.7       aaron      46: .Ar file
1.1       tholo      47: for lines that match any of a set of patterns specified literally in
1.7       aaron      48: .Ar prog
1.1       tholo      49: or in one or more files
                     50: specified as
1.7       aaron      51: .Fl f Ar progfile .
1.1       tholo      52: With each pattern
                     53: there can be an associated action that will be performed
                     54: when a line of a
1.7       aaron      55: .Ar file
1.1       tholo      56: matches the pattern.
                     57: Each line is matched against the
                     58: pattern portion of every pattern-action statement;
                     59: the associated action is performed for each matched pattern.
1.6       aaron      60: The file name
1.7       aaron      61: .Sq Pa \-
1.1       tholo      62: means the standard input.
                     63: Any
1.7       aaron      64: .Ar file
1.1       tholo      65: of the form
1.7       aaron      66: .Ar var=value
1.1       tholo      67: is treated as an assignment, not a filename,
                     68: and is executed at the time it would have been opened if it were a filename.
                     69: The option
1.7       aaron      70: .Fl v
1.1       tholo      71: followed by
1.7       aaron      72: .Ar var=value
1.1       tholo      73: is an assignment to be done before
1.7       aaron      74: .Ar prog
1.1       tholo      75: is executed;
                     76: any number of
1.7       aaron      77: .Fl v
1.1       tholo      78: options may be present.
                     79: The
1.7       aaron      80: .Fl F Ar fs
1.1       tholo      81: option defines the input field separator to be the regular expression
1.7       aaron      82: .Ar fs .
1.5       angelos    83: The
1.7       aaron      84: .Fl safe
                     85: option disables file output
                     86: .Po
                     87: .Ic print Ic > ,
                     88: .Ic print Ic >>
                     89: .Pc ,
                     90: process creation
                     91: .Po
                     92: .Ar cmd Ic \&| getline ,
                     93: .Ic print \&| , system
                     94: .Pc
                     95: and access to the environment
                     96: .Pq Va ENVIRON .
                     97: This
                     98: is a first (and not very reliable) approximation to a
                     99: .Dq safe
                    100: version of
                    101: .Nm awk .
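.Pp
.\" Editorial sketch, not part of revision 1.15; file1 and file2 are hypothetical names.
As a minimal sketch of the two kinds of assignment
(the operands
.Pa file1
and
.Pa file2
are hypothetical),
.Fl v
sets
.Va n
before the program starts, while the operand
.Li FS=:
takes effect only after
.Pa file1
has been read:
.Bd -literal -offset indent
awk -v n=2 '{ print $n }' file1 FS=: file2
.Ed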
                    102: .Pp
                    103: An input line is normally made up of fields separated by whitespace,
1.1       tholo     104: or by regular expression
1.7       aaron     105: .Va FS .
1.1       tholo     106: The fields are denoted
1.7       aaron     107: .Va $1 , $2 , ... ,
                    108: while
                    109: .Va $0
1.1       tholo     110: refers to the entire line.
                    111: If
1.7       aaron     112: .Va FS
1.1       tholo     113: is null, the input line is split into one field per character.
1.7       aaron     114: .Pp
1.1       tholo     115: To compensate for inadequate implementation of storage management,
1.6       aaron     116: the
1.7       aaron     117: .Fl mr
1.1       tholo     118: option can be used to set the maximum size of the input record,
                    119: and the
1.7       aaron     120: .Fl mf
1.1       tholo     121: option to set the maximum number of fields.
1.7       aaron     122: .Pp
1.1       tholo     123: A pattern-action statement has the form
1.7       aaron     124: .Pp
                    125: .D1 Ar pattern Ic \&{ Ar action Ic \&}
                    126: .Pp
1.6       aaron     127: A missing
1.7       aaron     128: .Ic \&{ Ar action Ic \&}
1.1       tholo     129: means print the line;
                    130: a missing pattern always matches.
                    131: Pattern-action statements are separated by newlines or semicolons.
1.7       aaron     132: .Pp
1.1       tholo     133: An action is a sequence of statements.
                    134: A statement can be one of the following:
1.7       aaron     135: .Bd -unfilled -offset indent
                    136: .Ic if ( Xo
                    137: .Ar expression ) statement \&
                    138: .Op Ic else Ar statement
                    139: .Xc
                    140: .Ic while ( Ar expression ) statement
                    141: .Ic for ( Xo
                    142: .Ar expression ; expression ; expression ) statement
                    143: .Xc
                    144: .Ic for ( Xo
                    145: .Ar var Ic in Ar array ) statement
                    146: .Xc
                    147: .Ic do Ar statement Ic while ( Ar expression )
                    148: .Ic break
                    149: .Ic continue
                    150: .Ic { Oo Ar statement ... Oc Ic \& }
                    151: .Ar expression Xo
                    152: .No "# commonly" \&
                    153: .Ar var Ic = Ar expression
                    154: .Xc
                    155: .Ic print Xo
                    156: .Op Ar expression-list
                    157: .Op Ic > Ns Ar expression
                    158: .Xc
                    159: .Ic printf Ar format Xo
                    160: .Op Ar ... , expression-list
                    161: .Op Ic > Ns Ar expression
                    162: .Xc
                    163: .Ic return Op Ar expression
                    164: .Ic next Xo
                    165: .No "# skip remaining patterns on this input line"
                    166: .Xc
                    167: .Ic nextfile Xo
                    168: .No "# skip rest of this file, open next, start at top"
                    169: .Xc
                    170: .Ic delete Ar array Ns Xo
                    171: .Ic \&[ Ns Ar expression Ns Ic \&]
                    172: .No \& "# delete an array element"
                    173: .Xc
                    174: .Ic delete Ar array Xo
                    175: .No "# delete all elements of array"
                    176: .Xc
                    177: .Ic exit Xo
                    178: .Op Ar expression
                    179: .No \& "# exit immediately; status is" Ar expression
                    180: .Xc
                    181: .Ed
                    182: .Pp
1.1       tholo     183: Statements are terminated by
                    184: semicolons, newlines or right braces.
                    185: An empty
1.7       aaron     186: .Ar expression-list
1.1       tholo     187: stands for
1.7       aaron     188: .Ar $0 .
                    189: String constants are quoted
                    190: .Li \&"" ,
1.1       tholo     191: with the usual C escapes recognized within.
                    192: Expressions take on string or numeric values as appropriate,
                    193: and are built using the operators
1.7       aaron     194: .Ic + \- * / % ^
                    195: (exponentiation), and concatenation (indicated by whitespace).
1.1       tholo     196: The operators
1.14      jmc       197: .Ic \&! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
1.1       tholo     198: are also available in expressions.
                    199: Variables may be scalars, array elements
                    200: (denoted
1.7       aaron     201: .Li x[i] )
1.1       tholo     202: or fields.
                    203: Variables are initialized to the null string.
                    204: Array subscripts may be any string,
                    205: not necessarily numeric;
                    206: this allows for a form of associative memory.
                    207: Multiple subscripts such as
1.7       aaron     208: .Li [i,j,k]
1.1       tholo     209: are permitted; the constituents are concatenated,
                    210: separated by the value of
1.7       aaron     211: .Va SUBSEP .
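.Pp
.\" Editorial sketch, not part of revision 1.15; the array name "count" is illustrative.
For instance, a minimal sketch that tallies how often each distinct first
field occurs (the array name
.Va count
is illustrative); the order in which
.Ic for ( Ar var Ic in Ar array )
visits the subscripts is unspecified:
.Bd -literal -offset indent
{ count[$1]++ }
END { for (k in count) print k, count[k] }
.Ed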
                    212: .Pp
1.1       tholo     213: The
1.7       aaron     214: .Ic print
1.1       tholo     215: statement prints its arguments on the standard output
                    216: (or on a file if
1.7       aaron     217: .Ic > Ns Ar file
1.1       tholo     218: or
1.7       aaron     219: .Ic >> Ns Ar file
1.1       tholo     220: is present or on a pipe if
1.7       aaron     221: .Ic \&| Ar cmd
1.1       tholo     222: is present), separated by the current output field separator,
                    223: and terminated by the output record separator.
1.7       aaron     224: .Ar file
1.1       tholo     225: and
1.7       aaron     226: .Ar cmd
1.1       tholo     227: may be literal names or parenthesized expressions;
                    228: identical string values in different statements denote
                    229: the same open file.
                    230: The
1.7       aaron     231: .Ic printf
1.1       tholo     232: statement formats its expression list according to the format
                    233: (see
1.10      pvalchev  234: .Xr printf 3 ) .
1.1       tholo     235: The built-in function
1.7       aaron     236: .Fn close expr
1.1       tholo     237: closes the file or pipe
1.7       aaron     238: .Fa expr .
1.1       tholo     239: The built-in function
1.7       aaron     240: .Fn fflush expr
1.1       tholo     241: flushes any buffered output for the file or pipe
1.7       aaron     242: .Fa expr .
                    243: .Pp
1.1       tholo     244: The mathematical functions
1.7       aaron     245: .Fn exp ,
                    246: .Fn log ,
                    247: .Fn sqrt ,
                    248: .Fn sin ,
                    249: .Fn cos ,
1.1       tholo     250: and
1.7       aaron     251: .Fn atan2
1.1       tholo     252: are built in.
                    253: Other built-in functions:
1.7       aaron     254: .Bl -tag -width Fn
                    255: .It Fn length
1.1       tholo     256: the length of its argument
                    257: taken as a string,
                    258: or of
1.7       aaron     259: .Va $0
1.1       tholo     260: if no argument.
1.7       aaron     261: .It Fn rand
1.1       tholo     262: random number in the half-open interval [0,1)
1.7       aaron     263: .It Fn srand
1.1       tholo     264: sets seed for
1.7       aaron     265: .Fn rand
1.1       tholo     266: and returns the previous seed.
1.7       aaron     267: .It Fn int
                    268: truncates to an integer value.
                    269: .It Fn substr s m n
1.1       tholo     270: the
1.7       aaron     271: .Fa n Ns No -character
1.1       tholo     272: substring of
1.7       aaron     273: .Fa s
1.1       tholo     274: that begins at position
1.7       aaron     275: .Fa m
1.1       tholo     276: counted from 1.
1.7       aaron     277: .It Fn index s t
1.1       tholo     278: the position in
1.7       aaron     279: .Fa s
1.1       tholo     280: where the string
1.7       aaron     281: .Fa t
1.1       tholo     282: occurs, or 0 if it does not.
1.7       aaron     283: .It Fn match s r
1.1       tholo     284: the position in
1.7       aaron     285: .Fa s
1.1       tholo     286: where the regular expression
1.7       aaron     287: .Fa r
1.1       tholo     288: occurs, or 0 if it does not.
                    289: The variables
1.7       aaron     290: .Va RSTART
1.1       tholo     291: and
1.7       aaron     292: .Va RLENGTH
1.1       tholo     293: are set to the position and length of the matched string.
1.7       aaron     294: .It Fn split s a fs
1.1       tholo     295: splits the string
1.7       aaron     296: .Fa s
1.1       tholo     297: into array elements
1.7       aaron     298: .Va a[1] , a[2] , ... , a[n]
1.1       tholo     299: and returns
1.7       aaron     300: .Va n .
1.1       tholo     301: The separation is done with the regular expression
1.7       aaron     302: .Ar fs
1.1       tholo     303: or with the field separator
1.7       aaron     304: .Va FS
1.1       tholo     305: if
1.7       aaron     306: .Ar fs
1.1       tholo     307: is not given.
                    308: An empty string as field separator splits the string
                    309: into one array element per character.
1.7       aaron     310: .It Fn sub r t s
1.1       tholo     311: substitutes
1.7       aaron     312: .Fa t
1.1       tholo     313: for the first occurrence of the regular expression
1.7       aaron     314: .Fa r
1.1       tholo     315: in the string
1.7       aaron     316: .Fa s .
1.1       tholo     317: If
1.7       aaron     318: .Fa s
1.1       tholo     319: is not given,
1.7       aaron     320: .Va $0
1.1       tholo     321: is used.
1.7       aaron     322: .It Fn gsub r t s
1.1       tholo     323: same as
1.7       aaron     324: .Fn sub
1.1       tholo     325: except that all occurrences of the regular expression
                    326: are replaced;
1.7       aaron     327: .Fn sub
1.1       tholo     328: and
1.7       aaron     329: .Fn gsub
1.1       tholo     330: return the number of replacements.
1.7       aaron     331: .It Fn sprintf fmt expr ...
1.1       tholo     332: the string resulting from formatting
1.7       aaron     333: .Fa expr , ...
1.1       tholo     334: according to the
1.7       aaron     335: .Xr printf 3
1.1       tholo     336: format
1.7       aaron     337: .Fa fmt .
                    338: .It Fn system cmd
1.1       tholo     339: executes
1.7       aaron     340: .Fa cmd
                    341: and returns its exit status.
                    342: .It Fn tolower str
1.1       tholo     343: returns a copy of
1.7       aaron     344: .Fa str
1.1       tholo     345: with all upper-case characters translated to their
                    346: corresponding lower-case equivalents.
1.7       aaron     347: .It Fn toupper str
1.1       tholo     348: returns a copy of
1.7       aaron     349: .Fa str
1.1       tholo     350: with all lower-case characters translated to their
                    351: corresponding upper-case equivalents.
1.7       aaron     352: .El
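.Pp
.\" Editorial sketch, not part of revision 1.15; the variable name "n" is illustrative.
As a minimal sketch of the substitution functions,
the following collapses each run of blanks and tabs in
.Va $0
to a single space and prints the number of replacements
.Fn gsub
reported, followed by the modified line
(the variable name
.Va n
is illustrative):
.Bd -literal -offset indent
{
        n = gsub(/[ \et]+/, " ")
        print n, $0
}
.Ed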
                    353: .Pp
                    354: The
                    355: .Sq function
                    356: .Ic getline
1.1       tholo     357: sets
1.7       aaron     358: .Va $0
1.1       tholo     359: to the next input record from the current input file;
1.7       aaron     360: .Ic getline < Ar file
1.1       tholo     361: sets
1.7       aaron     362: .Va $0
1.1       tholo     363: to the next record from
1.7       aaron     364: .Ar file .
                    365: .Ic getline Va x
1.1       tholo     366: sets variable
1.7       aaron     367: .Va x
1.1       tholo     368: instead.
                    369: Finally,
1.7       aaron     370: .Ar cmd Ic \&| getline
1.1       tholo     371: pipes the output of
1.7       aaron     372: .Ar cmd
1.1       tholo     373: into
1.7       aaron     374: .Ic getline ;
1.1       tholo     375: each call of
1.7       aaron     376: .Ic getline
1.1       tholo     377: returns the next line of output from
1.7       aaron     378: .Ar cmd .
1.1       tholo     379: In all cases,
1.7       aaron     380: .Ic getline
1.1       tholo     381: returns 1 for a successful input,
                    382: 0 for end of file, and \-1 for an error.
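.Pp
.\" Editorial sketch, not part of revision 1.15; data.txt and the variable names are hypothetical.
A minimal sketch of reading a file with
.Ic getline
(the file name
.Pa data.txt
and the variable names are hypothetical);
testing for a result greater than 0, rather than merely non-zero,
keeps the loop from spinning forever when
.Ic getline
returns \-1:
.Bd -literal -offset indent
BEGIN {
        file = "data.txt"
        while ((getline line < file) > 0)
                n++
        close(file)
        print n + 0
}
.Ed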
1.7       aaron     383: .Pp
1.1       tholo     384: Patterns are arbitrary Boolean combinations
                    385: (with
1.14      jmc       386: .Ic "\&! || &&" )
1.1       tholo     387: of regular expressions and
                    388: relational expressions.
                    389: Regular expressions are as in
1.12      jmc       390: .Xr egrep 1 .
1.1       tholo     391: Isolated regular expressions
                    392: in a pattern apply to the entire line.
                    393: Regular expressions may also occur in
                    394: relational expressions, using the operators
1.7       aaron     395: .Ic ~
1.1       tholo     396: and
1.7       aaron     397: .Ic !~ .
                    398: .Ic / Ns Ar re Ns Ic /
1.1       tholo     399: is a constant regular expression;
                    400: any string (constant or variable) may be used
                    401: as a regular expression, except in the position of an isolated regular expression
                    402: in a pattern.
1.7       aaron     403: .Pp
1.1       tholo     404: A pattern may consist of two patterns separated by a comma;
                    405: in this case, the action is performed for all lines
                    406: from an occurrence of the first pattern
1.15    ! jmc       407: through an occurrence of the second.
1.7       aaron     408: .Pp
1.1       tholo     409: A relational expression is one of the following:
1.7       aaron     410: .Bd -unfilled -offset indent
                    411: .Ar expression matchop regular-expression
                    412: .Ar expression relop expression
                    413: .Ar expression Ic in Ar array-name
                    414: .Ic \&( Ns Xo
                    415: .Ar expr , expr , \&... Ns Ic \&) in
                    416: .Ar \& array-name
                    417: .Xc
                    418: .Ed
1.15    ! jmc       419: .Pp
1.7       aaron     420: where a
                    421: .Ar relop
                    422: is any of the six relational operators in C, and a
                    423: .Ar matchop
                    424: is either
                    425: .Ic ~
1.1       tholo     426: (matches)
                    427: or
1.7       aaron     428: .Ic !~
1.1       tholo     429: (does not match).
                    430: A conditional is an arithmetic expression,
                    431: a relational expression,
                    432: or a Boolean combination
                    433: of these.
1.7       aaron     434: .Pp
1.1       tholo     435: The special patterns
1.7       aaron     436: .Ic BEGIN
1.1       tholo     437: and
1.7       aaron     438: .Ic END
1.1       tholo     439: may be used to capture control before the first input line is read
                    440: and after the last.
1.7       aaron     441: .Ic BEGIN
1.1       tholo     442: and
1.7       aaron     443: .Ic END
1.1       tholo     444: do not combine with other patterns.
1.7       aaron     445: .Pp
1.1       tholo     446: Variable names with special meanings:
1.7       aaron     447: .Pp
                    448: .Bl -tag -width Va -compact
                    449: .It Va CONVFMT
1.1       tholo     450: conversion format used when converting numbers
1.3       millert   451: (default
1.7       aaron     452: .Qq Li %.6g )
                    453: .It Va FS
1.1       tholo     454: regular expression used to separate fields; also settable
                    455: by option
1.9       millert   456: .Fl F Ar fs .
1.7       aaron     457: .It Va NF
1.1       tholo     458: number of fields in the current record
1.7       aaron     459: .It Va NR
1.1       tholo     460: ordinal number of the current record
1.7       aaron     461: .It Va FNR
1.1       tholo     462: ordinal number of the current record in the current file
1.7       aaron     463: .It Va FILENAME
1.1       tholo     464: the name of the current input file
1.7       aaron     465: .It Va RS
1.1       tholo     466: input record separator (default newline)
1.7       aaron     467: .It Va OFS
1.1       tholo     468: output field separator (default blank)
1.7       aaron     469: .It Va ORS
1.1       tholo     470: output record separator (default newline)
1.7       aaron     471: .It Va OFMT
1.1       tholo     472: output format for numbers (default
1.7       aaron     473: .Qq Li %.6g )
                    474: .It Va SUBSEP
1.1       tholo     475: separates multiple subscripts (default octal 034)
1.7       aaron     476: .It Va ARGC
1.1       tholo     477: argument count, assignable
1.7       aaron     478: .It Va ARGV
1.1       tholo     479: argument array, assignable;
                    480: non-null members are taken as filenames
1.7       aaron     481: .It Va ENVIRON
1.1       tholo     482: array of environment variables; subscripts are names.
1.7       aaron     483: .El
                    484: .Pp
                    485: Functions may be defined (at the position of a pattern-action statement)
                    486: as follows:
                    487: .Pp
                    488: .Dl function foo(a, b, c) { ...; return x }
                    489: .Pp
1.1       tholo     490: Parameters are passed by value if scalar and by reference if array name;
                    491: functions may be called recursively.
                    492: Parameters are local to the function; all other variables are global.
                    493: Thus local variables may be created by providing excess parameters in
                    494: the function definition.
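.Pp
.\" Editorial sketch, not part of revision 1.15; all names are illustrative.
A minimal sketch of both conventions (all names are illustrative):
the array
.Va arr
is passed by reference, and the extra parameters
.Va i
and
.Va s
serve as local variables.
The program adds up the fields of each input line.
.Bd -literal -offset indent
function sum(arr, n,    i, s) {
        for (i = 1; i <= n; i++)
                s += arr[i]
        return s + 0
}
{ n = split($0, f); print sum(f, n) }
.Ed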
1.7       aaron     495: .Sh EXAMPLES
                    496: .Dl length($0) > 72
1.1       tholo     497: Print lines longer than 72 characters.
1.7       aaron     498: .Pp
                    499: .Dl { print $2, $1 }
1.1       tholo     500: Print first two fields in opposite order.
1.7       aaron     501: .Bd -literal -offset indent
1.1       tholo     502: BEGIN { FS = ",[ \et]*|[ \et]+" }
                    503:       { print $2, $1 }
1.7       aaron     504: .Ed
1.1       tholo     505: Same, with input fields separated by comma and/or blanks and tabs.
1.7       aaron     506: .Bd -literal -offset indent
                    507: { s += $1 }
                    508: END { print "sum is", s, " average is", s/NR }
                    509: .Ed
1.1       tholo     510: Add up first column, print sum and average.
1.7       aaron     511: .Pp
                    512: .Dl /start/, /stop/
1.1       tholo     513: Print all lines from each start line through the next stop line, inclusive.
1.7       aaron     514: .Bd -literal -offset indent
                    515: BEGIN { # Simulate echo(1)
                    516:         for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
                    517:         printf "\en"
                    518:         exit }
                    519: .Ed
                    520: .Sh SEE ALSO
                    521: .Xr lex 1 ,
                    522: .Xr sed 1
                    523: .Rs
                    524: .%A A. V. Aho
                    525: .%A B. W. Kernighan
                    526: .%A P. J. Weinberger
                    527: .%T The AWK Programming Language
                    528: .%I Addison-Wesley
                    529: .%D 1988
                    530: .%O ISBN 0-201-07981-X
                    531: .Re
1.8       aaron     532: .Sh HISTORY
1.13      millert   533: An
1.8       aaron     534: .Nm
1.13      millert   535: utility appeared in
                    536: .At v7 .
1.7       aaron     537: .Sh BUGS
1.1       tholo     538: There are no explicit conversions between numbers and strings.
                    539: To force an expression to be treated as a number add 0 to it;
                    540: to force it to be treated as a string concatenate
1.7       aaron     541: .Li \&""
                    542: to it.
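.Pp
.\" Editorial sketch, not part of revision 1.15; the variable names are illustrative.
For example, in this minimal sketch (variable names are illustrative)
both messages are printed: as strings
.Dq 10
sorts before
.Dq 9 ,
but after adding 0 the values compare numerically.
.Bd -literal -offset indent
BEGIN {
        x = "10"; y = "9"
        if (x < y)         print "string comparison"
        if (x+0 > y+0)     print "numeric comparison"
}
.Ed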
                    543: .Pp
1.1       tholo     544: The scope rules for variables in functions are a botch;
                    545: the syntax is worse.