
Annotation of src/usr.bin/awk/awk.1, Revision 1.8

1.8     ! aaron       1: .\"    $OpenBSD: awk.1,v 1.7 2000/08/30 13:37:51 aaron Exp $
1.7       aaron       2: .\" EX/EE is a Bd
                      3: .Dd June 29, 1996
                      4: .Dt AWK 1
                      5: .Os
                      6: .Sh NAME
                      7: .Nm awk
                      8: .Nd pattern-directed scanning and processing language
                      9: .Sh SYNOPSIS
                     10: .Nm awk
                     11: .Op Fl F Ar fs
                     12: .Op Fl v Ar var=value
                     13: .Op Fl safe
                     14: .Op Fl mr Ar n
                     15: .Op Fl mf Ar n
                     16: .Op Ar prog | Fl f Ar progfile
                     17: .Ar
                     18: .Nm nawk
                     19: .Ar ...
                     20: .Sh DESCRIPTION
                     21: .Nm
1.1       tholo      22: scans each input
1.7       aaron      23: .Ar file
1.1       tholo      24: for lines that match any of a set of patterns specified literally in
1.7       aaron      25: .Ar prog
1.1       tholo      26: or in one or more files
                     27: specified as
1.7       aaron      28: .Fl f Ar progfile .
1.1       tholo      29: With each pattern
                     30: there can be an associated action that will be performed
                     31: when a line of a
1.7       aaron      32: .Ar file
1.1       tholo      33: matches the pattern.
                     34: Each line is matched against the
                     35: pattern portion of every pattern-action statement;
                     36: the associated action is performed for each matched pattern.
1.6       aaron      37: The file name
1.7       aaron      38: .Sq Pa \-
1.1       tholo      39: means the standard input.
                     40: Any
1.7       aaron      41: .Ar file
1.1       tholo      42: of the form
1.7       aaron      43: .Ar var=value
1.1       tholo      44: is treated as an assignment, not a filename,
                     45: and is executed at the time it would have been opened if it were a filename.
                     46: The option
1.7       aaron      47: .Fl v
1.1       tholo      48: followed by
1.7       aaron      49: .Ar var=value
1.1       tholo      50: is an assignment to be done before
1.7       aaron      51: .Ar prog
1.1       tholo      52: is executed;
                     53: any number of
1.7       aaron      54: .Fl v
1.1       tholo      55: options may be present.
                     56: The
1.7       aaron      57: .Fl F Ar fs
1.1       tholo      58: option defines the input field separator to be the regular expression
1.7       aaron      59: .Ar fs .
1.5       angelos    60: The
1.7       aaron      61: .Fl safe
                     62: option disables file output
                     63: .Po
                     64: .Ic print Ic > ,
                     65: .Ic print Ic >> ,
                     66: .Pc
                     67: process creation
                     68: .Po
                     69: .Ar cmd Ic \&| getline ,
                     70: .Ic print \&| , system
                     71: .Pc
                     72: and access to the environment
                     73: .Pq Va ENVIRON .
                     74: This
                     75: is a first (and not very reliable) approximation to a
                     76: .Dq safe
                     77: version of
                     78: .Nm awk .
                     79: .Pp
                     80: An input line is normally made up of fields separated by whitespace,
1.1       tholo      81: or by regular expression
1.7       aaron      82: .Va FS .
1.1       tholo      83: The fields are denoted
1.7       aaron      84: .Va $1 , $2 , ... ,
                     85: while
                     86: .Va $0
1.1       tholo      87: refers to the entire line.
                     88: If
1.7       aaron      89: .Va FS
1.1       tholo      90: is null, the input line is split into one field per character.
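As a minimal sketch of the field handling described above, the following shell fragment uses
.Fl F
to set the separator and
.Fl v
to assign a variable before the program runs.
The helper name first_and_last and the variable sep are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: print the first and last field of each
# colon-separated input line, joined by the string in "sep".
first_and_last() {
    # -F: makes ":" the field separator (FS);
    # -v sep=... is assigned before the program starts.
    awk -F: -v sep=' -> ' '{ print $1 sep $NF }'
}

# Usage: one passwd-style line on standard input.
printf 'root:x:0:0:/root:/bin/sh\n' | first_and_last
```

Here $NF is the field whose index is the number of fields, so it names the last field of the line.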
1.7       aaron      91: .Pp
1.1       tholo      92: To compensate for inadequate implementation of storage management,
1.6       aaron      93: the
1.7       aaron      94: .Fl mr
1.1       tholo      95: option can be used to set the maximum size of the input record,
                     96: and the
1.7       aaron      97: .Fl mf
1.1       tholo      98: option to set the maximum number of fields.
1.7       aaron      99: .Pp
1.1       tholo     100: A pattern-action statement has the form
1.7       aaron     101: .Pp
                    102: .D1 Ar pattern Ic \&{ Ar action Ic \&}
                    103: .Pp
1.6       aaron     104: A missing
1.7       aaron     105: .Ic \&{ Ar action Ic \&}
1.1       tholo     106: means print the line;
                    107: a missing pattern always matches.
                    108: Pattern-action statements are separated by newlines or semicolons.
1.7       aaron     109: .Pp
1.1       tholo     110: An action is a sequence of statements.
                    111: A statement can be one of the following:
1.7       aaron     112: .Pp
                    113: .Bd -unfilled -offset indent
                    114: .Ic if ( Xo
                    115: .Ar expression ) statement \&
                    116: .Op Ic else Ar statement
                    117: .Xc
                    118: .Ic while ( Ar expression ) statement
                    119: .Ic for ( Xo
                    120: .Ar expression ; expression ; expression ) statement
                    121: .Xc
                    122: .Ic for ( Xo
                    123: .Ar var Ic in Ar array ) statement
                    124: .Xc
                    125: .Ic do Ar statement Ic while ( Ar expression )
                    126: .Ic break
                    127: .Ic continue
                    128: .Ic { Oo Ar statement ... Oc Ic \& }
                    129: .Ar expression Xo
                    130: .No "# commonly" \&
                    131: .Ar var Ic = Ar expression
                    132: .Xc
                    133: .Ic print Xo
                    134: .Op Ar expression-list
                    135: .Op Ic > Ns Ar expression
                    136: .Xc
                    137: .Ic printf Ar format Xo
                    138: .Op Ar ... , expression-list
                    139: .Op Ic > Ns Ar expression
                    140: .Xc
                    141: .Ic return Op Ar expression
                    142: .Ic next Xo
                    143: .No "# skip remaining patterns on this input line"
                    144: .Xc
                    145: .Ic nextfile Xo
                    146: .No "# skip rest of this file, open next, start at top"
                    147: .Xc
                    148: .Ic delete Ar array Ns Xo
                    149: .Ic \&[ Ns Ar expression Ns Ic \&]
                    150: .No \& "# delete an array element"
                    151: .Xc
                    152: .Ic delete Ar array Xo
                    153: .No "# delete all elements of array"
                    154: .Xc
                    155: .Ic exit Xo
                    156: .Op Ar expression
                    157: .No \& "# exit immediately; status is" Ar expression
                    158: .Xc
                    159: .Ed
                    160: .Pp
1.1       tholo     161: Statements are terminated by
                    162: semicolons, newlines or right braces.
                    163: An empty
1.7       aaron     164: .Ar expression-list
1.1       tholo     165: stands for
1.7       aaron     166: .Ar $0 .
                    167: String constants are quoted
                    168: .Li \&"" ,
1.1       tholo     169: with the usual C escapes recognized within.
                    170: Expressions take on string or numeric values as appropriate,
                    171: and are built using the operators
1.7       aaron     172: .Ic + \- * / % ^
                    173: (exponentiation), and concatenation (indicated by whitespace).
1.1       tholo     174: The operators
1.7       aaron     175: .Ic ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
1.1       tholo     176: are also available in expressions.
                    177: Variables may be scalars, array elements
                    178: (denoted
1.7       aaron     179: .Li x[i] )
1.1       tholo     180: or fields.
                    181: Variables are initialized to the null string.
                    182: Array subscripts may be any string,
                    183: not necessarily numeric;
                    184: this allows for a form of associative memory.
                    185: Multiple subscripts such as
1.7       aaron     186: .Li [i,j,k]
1.1       tholo     187: are permitted; the constituents are concatenated,
                    188: separated by the value of
1.7       aaron     189: .Va SUBSEP .
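The following sketch illustrates associative arrays with multiple subscripts.
It assumes two whitespace-separated fields per input line; the helper name count_pairs and the array names are hypothetical.
Because the order of a for-in traversal is unspecified, the output is piped through sort(1).

```shell
# Hypothetical helper: count how often each ($1, $2) pair occurs.
count_pairs() {
    awk '
        { seen[$1, $2]++ }                # subscript is $1 SUBSEP $2
        END {
            for (k in seen) {
                split(k, part, SUBSEP)    # recover the two subscripts
                print part[1], part[2], seen[k]
            }
        }
    ' | sort
}

# Usage: the pair "a x" appears twice, "a y" once.
printf 'a x\na y\na x\n' | count_pairs
```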
                    190: .Pp
1.1       tholo     191: The
1.7       aaron     192: .Ic print
1.1       tholo     193: statement prints its arguments on the standard output
                    194: (or on a file if
1.7       aaron     195: .Ic > Ns Ar file
1.1       tholo     196: or
1.7       aaron     197: .Ic >> Ns Ar file
1.1       tholo     198: is present or on a pipe if
1.7       aaron     199: .Ic \&| Ar cmd
1.1       tholo     200: is present), separated by the current output field separator,
                    201: and terminated by the output record separator.
1.7       aaron     202: .Ar file
1.1       tholo     203: and
1.7       aaron     204: .Ar cmd
1.1       tholo     205: may be literal names or parenthesized expressions;
                    206: identical string values in different statements denote
                    207: the same open file.
                    208: The
1.7       aaron     209: .Ic printf
1.1       tholo     210: statement formats its expression list according to the format
                    211: (see
1.7       aaron     212: .Xr printf 3 ) .
1.1       tholo     213: The built-in function
1.7       aaron     214: .Fn close expr
1.1       tholo     215: closes the file or pipe
1.7       aaron     216: .Fa expr .
1.1       tholo     217: The built-in function
1.7       aaron     218: .Fn fflush expr
1.1       tholo     219: flushes any buffered output for the file or pipe
1.7       aaron     220: .Fa expr .
                    221: .Pp
1.1       tholo     222: The mathematical functions
1.7       aaron     223: .Fn exp ,
                    224: .Fn log ,
                    225: .Fn sqrt ,
                    226: .Fn sin ,
                    227: .Fn cos ,
1.1       tholo     228: and
1.7       aaron     229: .Fn atan2
1.1       tholo     230: are built in.
                    231: Other built-in functions:
1.7       aaron     232: .Pp
                    233: .Bl -tag -width Fn
                    234: .It Fn length
1.1       tholo     235: the length of its argument
                    236: taken as a string,
                    237: or of
1.7       aaron     238: .Va $0
1.1       tholo     239: if no argument.
1.7       aaron     240: .It Fn rand
1.1       tholo     241: random number on [0,1)
1.7       aaron     242: .It Fn srand
1.1       tholo     243: sets seed for
1.7       aaron     244: .Fn rand
1.1       tholo     245: and returns the previous seed.
1.7       aaron     246: .It Fn int
                    247: truncates to an integer value.
                    248: .It Fn substr s m n
1.1       tholo     249: the
1.7       aaron     250: .Fa n Ns No -character
1.1       tholo     251: substring of
1.7       aaron     252: .Fa s
1.1       tholo     253: that begins at position
1.7       aaron     254: .Fa m
1.1       tholo     255: counted from 1.
1.7       aaron     256: .It Fn index s t
1.1       tholo     257: the position in
1.7       aaron     258: .Fa s
1.1       tholo     259: where the string
1.7       aaron     260: .Fa t
1.1       tholo     261: occurs, or 0 if it does not.
1.7       aaron     262: .It Fn match s r
1.1       tholo     263: the position in
1.7       aaron     264: .Fa s
1.1       tholo     265: where the regular expression
1.7       aaron     266: .Fa r
1.1       tholo     267: occurs, or 0 if it does not.
                    268: The variables
1.7       aaron     269: .Va RSTART
1.1       tholo     270: and
1.7       aaron     271: .Va RLENGTH
1.1       tholo     272: are set to the position and length of the matched string.
1.7       aaron     273: .It Fn split s a fs
1.1       tholo     274: splits the string
1.7       aaron     275: .Fa s
1.1       tholo     276: into array elements
1.7       aaron     277: .Va a[1] , a[2] , ... , a[n]
1.1       tholo     278: and returns
1.7       aaron     279: .Va n .
1.1       tholo     280: The separation is done with the regular expression
1.7       aaron     281: .Ar fs
1.1       tholo     282: or with the field separator
1.7       aaron     283: .Va FS
1.1       tholo     284: if
1.7       aaron     285: .Ar fs
1.1       tholo     286: is not given.
                    287: An empty string as field separator splits the string
                    288: into one array element per character.
1.7       aaron     289: .It Fn sub r t s
1.1       tholo     290: substitutes
1.7       aaron     291: .Fa t
1.1       tholo     292: for the first occurrence of the regular expression
1.7       aaron     293: .Fa r
1.1       tholo     294: in the string
1.7       aaron     295: .Fa s .
1.1       tholo     296: If
1.7       aaron     297: .Fa s
1.1       tholo     298: is not given,
1.7       aaron     299: .Va $0
1.1       tholo     300: is used.
1.7       aaron     301: .It Fn gsub r t s
1.1       tholo     302: same as
1.7       aaron     303: .Fn sub
1.1       tholo     304: except that all occurrences of the regular expression
                    305: are replaced;
1.7       aaron     306: .Fn sub
1.1       tholo     307: and
1.7       aaron     308: .Fn gsub
1.1       tholo     309: return the number of replacements.
1.7       aaron     310: .It Fn sprintf fmt expr ...
1.1       tholo     311: the string resulting from formatting
1.7       aaron     312: .Fa expr , ...
1.1       tholo     313: according to the
1.7       aaron     314: .Xr printf 3
1.1       tholo     315: format
1.7       aaron     316: .Fa fmt .
                    317: .It Fn system cmd
1.1       tholo     318: executes
1.7       aaron     319: .Fa cmd
                    320: and returns its exit status.
                    321: .It Fn tolower str
1.1       tholo     322: returns a copy of
1.7       aaron     323: .Fa str
1.1       tholo     324: with all upper-case characters translated to their
                    325: corresponding lower-case equivalents.
1.7       aaron     326: .It Fn toupper str
1.1       tholo     327: returns a copy of
1.7       aaron     328: .Fa str
1.1       tholo     329: with all lower-case characters translated to their
                    330: corresponding upper-case equivalents.
1.7       aaron     331: .El
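A short sketch exercising several of the string functions listed above.
The helper name string_demo and the sample strings are hypothetical; the expected values in the comments follow from the definitions in the list.

```shell
# Hypothetical helper: exercise a few built-in string functions.
string_demo() {
    awk 'BEGIN {
        s = "hello, world"
        print substr(s, 1, 5)                 # hello
        print index(s, "world")               # 8
        if (match(s, /o+/))
            print RSTART, RLENGTH             # 5 1
        n = split("a:b:c", parts, ":")
        print n, parts[3]                     # 3 c
        t = s
        count = gsub(/o/, "0", t)             # modifies t in place
        print count, t                        # 2 hell0, w0rld
        print toupper(substr(s, 1, 1)) substr(s, 2)   # Hello, world
    }'
}

string_demo
```

Note that sub and gsub modify their third argument and return only the replacement count, so the copy t is used to leave s untouched.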
                    332: .Pp
                    333: The
                    334: .Sq function
                    335: .Ic getline
1.1       tholo     336: sets
1.7       aaron     337: .Va $0
1.1       tholo     338: to the next input record from the current input file;
1.7       aaron     339: .Ic getline < Ar file
1.1       tholo     340: sets
1.7       aaron     341: .Va $0
1.1       tholo     342: to the next record from
1.7       aaron     343: .Ar file .
                    344: .Ic getline Va x
1.1       tholo     345: sets variable
1.7       aaron     346: .Va x
1.1       tholo     347: instead.
                    348: Finally,
1.7       aaron     349: .Ar cmd Ic \&| getline
1.1       tholo     350: pipes the output of
1.7       aaron     351: .Ar cmd
1.1       tholo     352: into
1.7       aaron     353: .Ic getline ;
1.1       tholo     354: each call of
1.7       aaron     355: .Ic getline
1.1       tholo     356: returns the next line of output from
1.7       aaron     357: .Ar cmd .
1.1       tholo     358: In all cases,
1.7       aaron     359: .Ic getline
1.1       tholo     360: returns 1 for a successful input,
                    361: 0 for end of file, and \-1 for an error.
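The pipe form of getline can be sketched as follows.
The helper name getline_demo and the command string are hypothetical; the loop relies only on the return values described above (1 for success, 0 at end of file).

```shell
# Hypothetical helper: read the output of a command line by line.
getline_demo() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        while ((cmd | getline line) > 0)   # 1 per line read, 0 at end of output
            n++
        close(cmd)                         # close the pipe when done
        print n, line                      # line still holds the last line read
    }'
}

getline_demo
```

Comparing the result with 0 explicitly keeps the loop from spinning forever if getline returns \-1 on error.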
1.7       aaron     362: .Pp
1.1       tholo     363: Patterns are arbitrary Boolean combinations
                    364: (with
1.7       aaron     365: .Ic "! || &&" )
1.1       tholo     366: of regular expressions and
                    367: relational expressions.
                    368: Regular expressions are as in
1.7       aaron     369: .Xr egrep 1 .
1.1       tholo     370: Isolated regular expressions
                    371: in a pattern apply to the entire line.
                    372: Regular expressions may also occur in
                    373: relational expressions, using the operators
1.7       aaron     374: .Ic ~
1.1       tholo     375: and
1.7       aaron     376: .Ic !~ .
                    377: .Ic / Ns Ar re Ns Ic /
1.1       tholo     378: is a constant regular expression;
                    379: any string (constant or variable) may be used
                    380: as a regular expression, except in the position of an isolated regular expression
                    381: in a pattern.
1.7       aaron     382: .Pp
1.1       tholo     383: A pattern may consist of two patterns separated by a comma;
                    384: in this case, the action is performed for all lines
                    385: from an occurrence of the first pattern
                    386: through an occurrence of the second.
1.7       aaron     387: .Pp
1.1       tholo     388: A relational expression is one of the following:
1.7       aaron     389: .Bd -unfilled -offset indent
                    390: .Ar expression matchop regular-expression
                    391: .Ar expression relop expression
                    392: .Ar expression Ic in Ar array-name
                    393: .Ic \&( Ns Xo
                    394: .Ar expr , expr , \&... Ns Ic \&) in
                    395: .Ar \& array-name
                    396: .Xc
                    397: .Ed
                    398: where a
                    399: .Ar relop
                    400: is any of the six relational operators in C, and a
                    401: .Ar matchop
                    402: is either
                    403: .Ic ~
1.1       tholo     404: (matches)
                    405: or
1.7       aaron     406: .Ic !~
1.1       tholo     407: (does not match).
                    408: A conditional is an arithmetic expression,
                    409: a relational expression,
                    410: or a Boolean combination
                    411: of these.
1.7       aaron     412: .Pp
1.1       tholo     413: The special patterns
1.7       aaron     414: .Ic BEGIN
1.1       tholo     415: and
1.7       aaron     416: .Ic END
1.1       tholo     417: may be used to capture control before the first input line is read
                    418: and after the last.
1.7       aaron     419: .Ic BEGIN
1.1       tholo     420: and
1.7       aaron     421: .Ic END
1.1       tholo     422: do not combine with other patterns.
1.7       aaron     423: .Pp
1.1       tholo     424: Variable names with special meanings:
1.7       aaron     425: .Pp
                    426: .Bl -tag -width Va -compact
                    427: .It Va CONVFMT
1.1       tholo     428: conversion format used when converting numbers
1.3       millert   429: (default
1.7       aaron     430: .Qq Li %.6g )
                    431: .It Va FS
1.1       tholo     432: regular expression used to separate fields; also settable
                    433: by option
1.7       aaron     434: .Fl F Ar fs .
                    435: .It Va NF
1.1       tholo     436: number of fields in the current record
1.7       aaron     437: .It Va NR
1.1       tholo     438: ordinal number of the current record
1.7       aaron     439: .It Va FNR
1.1       tholo     440: ordinal number of the current record in the current file
1.7       aaron     441: .It Va FILENAME
1.1       tholo     442: the name of the current input file
1.7       aaron     443: .It Va RS
1.1       tholo     444: input record separator (default newline)
1.7       aaron     445: .It Va OFS
1.1       tholo     446: output field separator (default blank)
1.7       aaron     447: .It Va ORS
1.1       tholo     448: output record separator (default newline)
1.7       aaron     449: .It Va OFMT
1.1       tholo     450: output format for numbers (default
1.7       aaron     451: .Qq Li %.6g )
                    452: .It Va SUBSEP
1.1       tholo     453: separates multiple subscripts (default 034)
1.7       aaron     454: .It Va ARGC
1.1       tholo     455: argument count, assignable
1.7       aaron     456: .It Va ARGV
1.1       tholo     457: argument array, assignable;
                    458: non-null members are taken as filenames
1.7       aaron     459: .It Va ENVIRON
1.1       tholo     460: array of environment variables; subscripts are names.
1.7       aaron     461: .El
                    462: .Pp
                    463: Functions may be defined (at the position of a pattern-action statement)
                    464: as follows:
                    465: .Pp
                    466: .Dl function foo(a, b, c) { ...; return x }
                    467: .Pp
1.1       tholo     468: Parameters are passed by value if scalar and by reference if array name;
                    469: functions may be called recursively.
                    470: Parameters are local to the function; all other variables are global.
                    471: Thus local variables may be created by providing excess parameters in
                    472: the function definition.
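The parameter-passing rules above can be illustrated with a small sketch.
The function names fact and fill and the helper fact_demo are hypothetical; the extra parameter result acts as a local variable, and the array argument is modified in the caller.

```shell
# Hypothetical helper: show local variables and array arguments.
fact_demo() {
    awk '
        # "result" is an excess parameter, so it is local to fact().
        function fact(n,    result) {
            if (n <= 1) return 1
            result = n * fact(n - 1)       # recursive call
            return result
        }
        # Arrays are passed by reference: the caller sees this change.
        function fill(arr) { arr["x"] = 1 }
        BEGIN {
            result = "global"              # not touched by fact()
            a["x"] = 0
            fill(a)
            print fact(5), result, a["x"]
        }
    '
}

fact_demo
```

By convention the excess parameters are set off with extra whitespace in the parameter list, which makes the intent visible to readers.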
1.7       aaron     473: .Sh EXAMPLES
                    474: .Dl length($0) > 72
1.1       tholo     475: Print lines longer than 72 characters.
1.7       aaron     476: .Pp
                    477: .Dl { print $2, $1 }
1.1       tholo     478: Print first two fields in opposite order.
1.7       aaron     479: .Pp
                    480: .Bd -literal -offset indent
1.1       tholo     481: BEGIN { FS = ",[ \et]*|[ \et]+" }
                    482:       { print $2, $1 }
1.7       aaron     483: .Ed
1.1       tholo     484: Same, with input fields separated by comma and/or blanks and tabs.
1.7       aaron     485: .Pp
                    486: .Bd -literal -offset indent
                    487: { s += $1 }
                    488: END { print "sum is", s, " average is", s/NR }
                    489: .Ed
1.1       tholo     490: Add up first column, print sum and average.
1.7       aaron     491: .Pp
                    492: .Dl /start/, /stop/
1.1       tholo     493: Print all lines between start/stop pairs.
1.7       aaron     494: .Pp
                    495: .Bd -literal -offset indent
                    496: BEGIN { # Simulate echo(1)
                    497:         for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
                    498:         printf "\en"
                    499:         exit }
                    500: .Ed
                    501: .Sh SEE ALSO
                    502: .Xr lex 1 ,
                    503: .Xr sed 1
                    504: .Rs
                    505: .%A A. V. Aho
                    506: .%A B. W. Kernighan
                    507: .%A P. J. Weinberger
                    508: .%T The AWK Programming Language
                    509: .%I Addison-Wesley
                    510: .%D 1988
                    511: .%O ISBN 0-201-07981-X
                    512: .Re
1.8     ! aaron     513: .Sh HISTORY
        !           514: AT&T
        !           515: .Nm
        !           516: by B. W. Kernighan was updated for
        !           517: .Bx 4.4
        !           518: and again in 1996.
1.7       aaron     519: .Sh BUGS
1.1       tholo     520: There are no explicit conversions between numbers and strings.
                    521: To force an expression to be treated as a number add 0 to it;
                    522: to force it to be treated as a string concatenate
1.7       aaron     523: .Li \&""
                    524: to it.
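A minimal sketch of the two idioms just described.
The helper name conv_demo and the variable names are hypothetical; the values are assigned from string constants, so a plain comparison is done as strings.

```shell
# Hypothetical helper: force numeric or string treatment of a value.
conv_demo() {
    awk 'BEGIN {
        x = "10"; y = "9"
        as_string = (x < y)            # string comparison: "10" sorts before "9"
        as_number = (x + 0 < y + 0)    # adding 0 forces numeric comparison
        print as_string, as_number
        z = 10 ""                      # concatenating "" yields the string "10"
        print length(z)
    }'
}

conv_demo
```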
                    525: .Pp
1.1       tholo     526: The scope rules for variables in functions are a botch;
                    527: the syntax is worse.