Annotation of src/usr.bin/awk/awk.1, Revision 1.16
1.16 ! jmc 1: .\" $OpenBSD: awk.1,v 1.15 2003/11/24 10:58:08 jmc Exp $
1.7 aaron 2: .\" EX/EE is a Bd
1.11 jmc 3: .\"
4: .\" Copyright (C) Lucent Technologies 1997
5: .\" All Rights Reserved
1.12 jmc 6: .\"
1.11 jmc 7: .\" Permission to use, copy, modify, and distribute this software and
8: .\" its documentation for any purpose and without fee is hereby
9: .\" granted, provided that the above copyright notice appear in all
10: .\" copies and that both that the copyright notice and this
11: .\" permission notice and warranty disclaimer appear in supporting
12: .\" documentation, and that the name Lucent Technologies or any of
13: .\" its entities not be used in advertising or publicity pertaining
14: .\" to distribution of the software without specific, written prior
15: .\" permission.
1.12 jmc 16: .\"
1.11 jmc 17: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
18: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
19: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
20: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
21: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
22: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
23: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
24: .\" THIS SOFTWARE.
25: .\"
1.7 aaron 26: .Dd June 29, 1996
27: .Dt AWK 1
28: .Os
29: .Sh NAME
30: .Nm awk
31: .Nd pattern-directed scanning and processing language
32: .Sh SYNOPSIS
33: .Nm awk
1.16 ! jmc 34: .Op Fl safe
! 35: .Op Fl V
! 36: .Op Fl d Ns Op Ar n
1.7 aaron 37: .Op Fl F Ar fs
1.16 ! jmc 38: .Oo Fl v Ar var Ns =
! 39: .Ns Ar value Oc
! 40: .Ar prog | Fl f Ar progfile
1.7 aaron 41: .Ar
42: .Nm nawk
43: .Ar ...
44: .Sh DESCRIPTION
45: .Nm
1.1 tholo 46: scans each input
1.7 aaron 47: .Ar file
1.1 tholo 48: for lines that match any of a set of patterns specified literally in
1.7 aaron 49: .Ar prog
1.16 ! jmc 50: or in one or more files specified as
1.7 aaron 51: .Fl f Ar progfile .
1.16 ! jmc 52: With each pattern there can be an associated action that will be performed
1.1 tholo 53: when a line of a
1.7 aaron 54: .Ar file
1.1 tholo 55: matches the pattern.
56: Each line is matched against the
57: pattern portion of every pattern-action statement;
58: the associated action is performed for each matched pattern.
1.6 aaron 59: The file name
1.16 ! jmc 60: .Sq -
1.1 tholo 61: means the standard input.
62: Any
1.7 aaron 63: .Ar file
1.1 tholo 64: of the form
1.16 ! jmc 65: .Ar var Ns = Ns Ar value
1.1 tholo 66: is treated as an assignment, not a filename,
67: and is executed at the time it would have been opened if it were a filename.
1.16 ! jmc 68: .Pp
! 69: The options are as follows:
! 70: .Bl -tag -width Ds
! 71: .It Fl d Ns Op Ar n
! 72: Debug mode.
! 73: Set debug level to
! 74: .Ar n ,
! 75: or 1 if
! 76: .Ar n
! 77: is not specified.
! 78: A value greater than 1 causes
! 79: .Nm
! 80: to dump core on fatal errors.
! 81: .It Fl F Ar fs
! 82: Define the input field separator to be the regular expression
1.7 aaron 83: .Ar fs .
1.16 ! jmc 84: .It Fl f Ar progfile
! 85: Read program code from the specified file
! 86: .Ar progfile
! 87: instead of from the command line.
! 88: .It Fl safe
! 89: Disable file output
! 90: .Pf ( Ic print > ,
! 91: .Ic print >> ) ,
1.7 aaron 92: process creation
93: .Po
94: .Ar cmd Ic \&| getline ,
95: .Ic print \&| , system
96: .Pc
97: and access to the environment
98: .Pq Va ENVIRON .
99: This
1.16 ! jmc 100: is a first
! 101: .Pq and not very reliable
! 102: approximation to a
1.7 aaron 103: .Dq safe
104: version of
1.16 ! jmc 105: .Nm .
! 106: .It Fl V
! 107: Print the version number of
! 108: .Nm
! 109: to standard output and exit.
! 110: .It Fl v Ar var Ns = Ns Ar value
! 111: Assign
! 112: .Ar value
! 113: to variable
! 114: .Ar var
! 115: before
! 116: .Ar prog
! 117: is executed;
! 118: any number of
! 119: .Fl v
! 120: options may be present.
! 121: .El
1.7 aaron 122: .Pp
123: An input line is normally made up of fields separated by whitespace,
1.1 tholo 124: or by regular expression
1.7 aaron 125: .Va FS .
1.1 tholo 126: The fields are denoted
1.7 aaron 127: .Va $1 , $2 , ... ,
128: while
129: .Va $0
1.1 tholo 130: refers to the entire line.
131: If
1.7 aaron 132: .Va FS
1.1 tholo 133: is null, the input line is split into one field per character.
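.Pp
As a minimal sketch of field splitting, the following splits one record on colons and prints the field count along with two fields.
The sample record is made up for illustration, and a POSIX shell is assumed for the wrapper.
.Bd -literal -offset indent
```shell
# Split a colon-separated record with -F and print NF, $1 and $3.
# The input record is a made-up sample.
out=$(echo 'root:x:0:0' | awk -F: '{ print NF, $1, $3 }')
echo "$out"    # prints: 4 root 0
```
.Ed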
1.7 aaron 134: .Pp
1.1 tholo 135: A pattern-action statement has the form
1.7 aaron 136: .Pp
137: .D1 Ar pattern Ic \&{ Ar action Ic \&}
138: .Pp
1.6 aaron 139: A missing
1.7 aaron 140: .Ic \&{ Ar action Ic \&}
1.1 tholo 141: means print the line;
142: a missing pattern always matches.
143: Pattern-action statements are separated by newlines or semicolons.
1.7 aaron 144: .Pp
1.1 tholo 145: An action is a sequence of statements.
146: A statement can be one of the following:
1.7 aaron 147: .Bd -unfilled -offset indent
148: .Ic if ( Xo
149: .Ar expression ) statement \&
150: .Op Ic else Ar statement
151: .Xc
152: .Ic while ( Ar expression ) statement
153: .Ic for ( Xo
154: .Ar expression ; expression ; expression ) statement
155: .Xc
156: .Ic for ( Xo
157: .Ar var Ic in Ar array ) statement
158: .Xc
159: .Ic do Ar statement Ic while ( Ar expression )
160: .Ic break
161: .Ic continue
162: .Ic { Oo Ar statement ... Oc Ic \& }
163: .Ar expression Xo
164: .No "# commonly" \&
165: .Ar var Ic = Ar expression
166: .Xc
167: .Ic print Xo
168: .Op Ar expression-list
169: .Op Ic > Ns Ar expression
170: .Xc
171: .Ic printf Ar format Xo
172: .Op Ar ... , expression-list
173: .Op Ic > Ns Ar expression
174: .Xc
175: .Ic return Op Ar expression
176: .Ic next Xo
177: .No "# skip remaining patterns on this input line"
178: .Xc
179: .Ic nextfile Xo
180: .No "# skip rest of this file, open next, start at top"
181: .Xc
182: .Ic delete Ar array Ns Xo
183: .Ic \&[ Ns Ar expression Ns Ic \&]
184: .No \& "# delete an array element"
185: .Xc
186: .Ic delete Ar array Xo
187: .No "# delete all elements of array"
188: .Xc
189: .Ic exit Xo
190: .Op Ar expression
191: .No \& "# exit immediately; status is" Ar expression
192: .Xc
193: .Ed
194: .Pp
1.1 tholo 195: Statements are terminated by
196: semicolons, newlines or right braces.
197: An empty
1.7 aaron 198: .Ar expression-list
1.1 tholo 199: stands for
1.7 aaron 200: .Ar $0 .
201: String constants are quoted
202: .Li \&"" ,
1.1 tholo 203: with the usual C escapes recognized within.
204: Expressions take on string or numeric values as appropriate,
205: and are built using the operators
1.7 aaron 206: .Ic + \- * / % ^
207: (exponentiation), and concatenation (indicated by whitespace).
1.1 tholo 208: The operators
1.16 ! jmc 209: .Ic \&! ++ \-\- += \-= *= /= %= ^=
! 210: .Ic > >= < <= == != ?:
1.1 tholo 211: are also available in expressions.
212: Variables may be scalars, array elements
213: (denoted
1.7 aaron 214: .Li x[i] )
1.1 tholo 215: or fields.
216: Variables are initialized to the null string.
217: Array subscripts may be any string,
218: not necessarily numeric;
219: this allows for a form of associative memory.
220: Multiple subscripts such as
1.7 aaron 221: .Li [i,j,k]
1.1 tholo 222: are permitted; the constituents are concatenated,
223: separated by the value of
1.7 aaron 224: .Va SUBSEP .
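.Pp
The sketch below illustrates string subscripts and a two-part subscript.
The array names count and grid and their contents are hypothetical.
The two-part key is taken apart again by splitting it on SUBSEP.
.Bd -literal -offset indent
```shell
# Associative arrays: string subscripts and a multiple subscript.
# count, grid and their contents are illustrative only.
out=$(awk 'BEGIN {
    count["apple"] += 2          # subscripts may be arbitrary strings
    count["pear"]++
    grid["r", "c"] = 1           # stored under the key "r" SUBSEP "c"
    for (k in grid)
        n = split(k, part, SUBSEP)
    print count["apple"], count["pear"], n, part[1], part[2]
    if (("r", "c") in grid) print "found"
}')
echo "$out"
```
.Ed
The first output line is "2 1 2 r c" and the second is "found".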
225: .Pp
1.1 tholo 226: The
1.7 aaron 227: .Ic print
1.1 tholo 228: statement prints its arguments on the standard output
229: (or on a file if
1.7 aaron 230: .Ic > Ns Ar file
1.1 tholo 231: or
1.7 aaron 232: .Ic >> Ns Ar file
1.1 tholo 233: is present or on a pipe if
1.7 aaron 234: .Ic \&| Ar cmd
1.1 tholo 235: is present), separated by the current output field separator,
236: and terminated by the output record separator.
1.7 aaron 237: .Ar file
1.1 tholo 238: and
1.7 aaron 239: .Ar cmd
1.1 tholo 240: may be literal names or parenthesized expressions;
241: identical string values in different statements denote
242: the same open file.
243: The
1.7 aaron 244: .Ic printf
1.1 tholo 245: statement formats its expression list according to the format
246: (see
1.10 pvalchev 247: .Xr printf 3 ) .
1.1 tholo 248: The built-in function
1.7 aaron 249: .Fn close expr
1.1 tholo 250: closes the file or pipe
1.7 aaron 251: .Fa expr .
1.1 tholo 252: The built-in function
1.7 aaron 253: .Fn fflush expr
1.1 tholo 254: flushes any buffered output for the file or pipe
1.7 aaron 255: .Fa expr .
256: .Pp
1.1 tholo 257: The mathematical functions
1.7 aaron 258: .Fn exp ,
259: .Fn log ,
260: .Fn sqrt ,
261: .Fn sin ,
262: .Fn cos ,
1.1 tholo 263: and
1.7 aaron 264: .Fn atan2
1.1 tholo 265: are built in.
266: Other built-in functions:
1.7 aaron 267: .Bl -tag -width Fn
268: .It Fn length
1.16 ! jmc 269: The length of its argument
1.1 tholo 270: taken as a string,
271: or of
1.7 aaron 272: .Va $0
1.1 tholo 273: if no argument.
1.7 aaron 274: .It Fn rand
1.16 ! jmc 275: Random number on [0,1).
1.7 aaron 276: .It Fn srand
1.16 ! jmc 277: Sets seed for
1.7 aaron 278: .Fn rand
1.1 tholo 279: and returns the previous seed.
1.7 aaron 280: .It Fn int
1.16 ! jmc 281: Truncates to an integer value.
1.7 aaron 282: .It Fn substr s m n
1.16 ! jmc 283: The
1.7 aaron 284: .Fa n Ns No -character
1.1 tholo 285: substring of
1.7 aaron 286: .Fa s
1.1 tholo 287: that begins at position
1.7 aaron 288: .Fa m
1.1 tholo 289: counted from 1.
1.7 aaron 290: .It Fn index s t
1.16 ! jmc 291: The position in
1.7 aaron 292: .Fa s
1.1 tholo 293: where the string
1.7 aaron 294: .Fa t
1.1 tholo 295: occurs, or 0 if it does not.
1.7 aaron 296: .It Fn match s r
1.16 ! jmc 297: The position in
1.7 aaron 298: .Fa s
1.1 tholo 299: where the regular expression
1.7 aaron 300: .Fa r
1.1 tholo 301: occurs, or 0 if it does not.
302: The variables
1.7 aaron 303: .Va RSTART
1.1 tholo 304: and
1.7 aaron 305: .Va RLENGTH
1.1 tholo 306: are set to the position and length of the matched string.
1.7 aaron 307: .It Fn split s a fs
1.16 ! jmc 308: Splits the string
1.7 aaron 309: .Fa s
1.1 tholo 310: into array elements
1.7 aaron 311: .Va a[1] , a[2] , ... , a[n]
1.1 tholo 312: and returns
1.7 aaron 313: .Va n .
1.1 tholo 314: The separation is done with the regular expression
1.7 aaron 315: .Ar fs
1.1 tholo 316: or with the field separator
1.7 aaron 317: .Va FS
1.1 tholo 318: if
1.7 aaron 319: .Ar fs
1.1 tholo 320: is not given.
321: An empty string as field separator splits the string
322: into one array element per character.
1.7 aaron 323: .It Fn sub r t s
1.16 ! jmc 324: Substitutes
1.7 aaron 325: .Fa t
1.1 tholo 326: for the first occurrence of the regular expression
1.7 aaron 327: .Fa r
1.1 tholo 328: in the string
1.7 aaron 329: .Fa s .
1.1 tholo 330: If
1.7 aaron 331: .Fa s
1.1 tholo 332: is not given,
1.7 aaron 333: .Va $0
1.1 tholo 334: is used.
1.7 aaron 335: .It Fn gsub r t s
1.16 ! jmc 336: Same as
1.7 aaron 337: .Fn sub
1.1 tholo 338: except that all occurrences of the regular expression
339: are replaced;
1.7 aaron 340: .Fn sub
1.1 tholo 341: and
1.7 aaron 342: .Fn gsub
1.1 tholo 343: return the number of replacements.
1.7 aaron 344: .It Fn sprintf fmt expr ...
1.16 ! jmc 345: The string resulting from formatting
1.7 aaron 346: .Fa expr , ...
1.1 tholo 347: according to the
1.7 aaron 348: .Xr printf 3
1.1 tholo 349: format
1.7 aaron 350: .Fa fmt .
351: .It Fn system cmd
1.16 ! jmc 352: Executes
1.7 aaron 353: .Fa cmd
354: and returns its exit status.
355: .It Fn tolower str
1.16 ! jmc 356: Returns a copy of
1.7 aaron 357: .Fa str
1.1 tholo 358: with all upper-case characters translated to their
359: corresponding lower-case equivalents.
1.7 aaron 360: .It Fn toupper str
1.16 ! jmc 361: Returns a copy of
1.7 aaron 362: .Fa str
1.1 tholo 363: with all lower-case characters translated to their
364: corresponding upper-case equivalents.
1.7 aaron 365: .El
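.Pp
A short, hedged illustration of several of the string functions listed above.
The sample strings and variable names are made up for the example.
.Bd -literal -offset indent
```shell
# Exercise substr, index, match, split, gsub and toupper on sample strings.
out=$(awk 'BEGIN {
    s = "hello world"
    print substr(s, 1, 5)                  # hello
    print index(s, "world")                # 7
    if (match(s, /o w/))
        print RSTART, RLENGTH              # 5 3
    n = split("a:b:c", parts, ":")
    print n, parts[2]                      # 3 b
    t = s
    r = gsub(/o/, "0", t)                  # replaces every o in the copy t
    print r, t                             # 2 hell0 w0rld
    print toupper(s)                       # HELLO WORLD
}')
echo "$out"
```
.Ed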
366: .Pp
367: The
368: .Sq function
369: .Ic getline
1.1 tholo 370: sets
1.7 aaron 371: .Va $0
1.1 tholo 372: to the next input record from the current input file;
1.7 aaron 373: .Ic getline < Ar file
1.1 tholo 374: sets
1.7 aaron 375: .Va $0
1.1 tholo 376: to the next record from
1.7 aaron 377: .Ar file .
378: .Ic getline Va x
1.1 tholo 379: sets variable
1.7 aaron 380: .Va x
1.1 tholo 381: instead.
382: Finally,
1.7 aaron 383: .Ar cmd Ic \&| getline
1.1 tholo 384: pipes the output of
1.7 aaron 385: .Ar cmd
1.1 tholo 386: into
1.7 aaron 387: .Ic getline ;
1.1 tholo 388: each call of
1.7 aaron 389: .Ic getline
1.1 tholo 390: returns the next line of output from
1.7 aaron 391: .Ar cmd .
1.1 tholo 392: In all cases,
1.7 aaron 393: .Ic getline
1.1 tholo 394: returns 1 for a successful input,
395: 0 for end of file, and \-1 for an error.
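.Pp
The following sketch reads the output of a command one line at a time, using the return value of getline to stop at end of file.
The command string is a made-up sample, and the example will not work under the safe option because it creates a process.
.Bd -literal -offset indent
```shell
# Read each line produced by a command; count the lines and keep the last.
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"             # sample command, illustrative only
    while ((cmd | getline line) > 0)       # 1 on success, 0 at end of file
        n++
    close(cmd)
    print n, line
}')
echo "$out"    # prints: 2 two
```
.Ed
Comparing the return value with 0 explicitly also ends the loop on an error return of -1.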
1.7 aaron 396: .Pp
1.1 tholo 397: Patterns are arbitrary Boolean combinations
398: (with
1.14 jmc 399: .Ic "\&! || &&" )
1.1 tholo 400: of regular expressions and
401: relational expressions.
402: Regular expressions are as in
1.12 jmc 403: .Xr egrep 1 .
1.1 tholo 404: Isolated regular expressions
405: in a pattern apply to the entire line.
406: Regular expressions may also occur in
407: relational expressions, using the operators
1.7 aaron 408: .Ic ~
1.1 tholo 409: and
1.7 aaron 410: .Ic !~ .
411: .Ic / Ns Ar re Ns Ic /
1.1 tholo 412: is a constant regular expression;
413: any string (constant or variable) may be used
414: as a regular expression, except in the position of an isolated regular expression
415: in a pattern.
1.7 aaron 416: .Pp
1.1 tholo 417: A pattern may consist of two patterns separated by a comma;
418: in this case, the action is performed for all lines
419: from an occurrence of the first pattern
1.15 jmc 420: through an occurrence of the second.
1.7 aaron 421: .Pp
1.1 tholo 422: A relational expression is one of the following:
1.7 aaron 423: .Bd -unfilled -offset indent
424: .Ar expression matchop regular-expression
425: .Ar expression relop expression
426: .Ar expression Ic in Ar array-name
427: .Ic \&( Ns Xo
428: .Ar expr , expr , \&... Ns Ic \&) in
429: .Ar \& array-name
430: .Xc
431: .Ed
1.15 jmc 432: .Pp
1.7 aaron 433: where a
434: .Ar relop
435: is any of the six relational operators in C, and a
436: .Ar matchop
437: is either
438: .Ic ~
1.1 tholo 439: (matches)
440: or
1.7 aaron 441: .Ic !~
1.1 tholo 442: (does not match).
443: A conditional is an arithmetic expression,
444: a relational expression,
445: or a Boolean combination
446: of these.
1.7 aaron 447: .Pp
1.1 tholo 448: The special patterns
1.7 aaron 449: .Ic BEGIN
1.1 tholo 450: and
1.7 aaron 451: .Ic END
1.1 tholo 452: may be used to capture control before the first input line is read
453: and after the last.
1.7 aaron 454: .Ic BEGIN
1.1 tholo 455: and
1.7 aaron 456: .Ic END
1.1 tholo 457: do not combine with other patterns.
1.7 aaron 458: .Pp
1.1 tholo 459: Variable names with special meanings:
1.7 aaron 460: .Pp
1.16 ! jmc 461: .Bl -tag -width "FILENAME" -compact
! 462: .It Va ARGC
! 463: Argument count, assignable.
! 464: .It Va ARGV
! 465: Argument array, assignable;
! 466: non-null members are taken as filenames.
1.7 aaron 467: .It Va CONVFMT
1.16 ! jmc 468: Conversion format used when converting numbers
1.3 millert 469: (default
1.16 ! jmc 470: .Qq Li %.6g ) .
! 471: .It Va ENVIRON
! 472: Array of environment variables; subscripts are names.
! 473: .It Va FILENAME
! 474: The name of the current input file.
! 475: .It Va FNR
! 476: Ordinal number of the current record in the current file.
1.7 aaron 477: .It Va FS
1.16 ! jmc 478: Regular expression used to separate fields; also settable
1.1 tholo 479: by option
1.9 millert 480: .Fl F Ar fs .
1.7 aaron 481: .It Va NF
1.16 ! jmc 482: Number of fields in the current record.
1.7 aaron 483: .It Va NR
1.16 ! jmc 484: Ordinal number of the current record.
! 485: .It Va OFMT
! 486: Output format for numbers (default
! 487: .Qq Li %.6g ) .
1.7 aaron 488: .It Va OFS
1.16 ! jmc 489: Output field separator (default blank).
1.7 aaron 490: .It Va ORS
1.16 ! jmc 491: Output record separator (default newline).
! 492: .It Va RS
! 493: Input record separator (default newline).
1.7 aaron 494: .It Va SUBSEP
1.16 ! jmc 495: Separates multiple subscripts (default 034).
1.7 aaron 496: .El
497: .Pp
498: Functions may be defined (at the position of a pattern-action statement)
499: as follows:
500: .Pp
501: .Dl function foo(a, b, c) { ...; return x }
502: .Pp
1.16 ! jmc 503: Parameters are passed by value if scalar, and by reference if array name;
1.1 tholo 504: functions may be called recursively.
505: Parameters are local to the function; all other variables are global.
506: Thus local variables may be created by providing excess parameters in
507: the function definition.
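.Pp
As a sketch of the excess-parameter idiom, the hypothetical function sum below declares i and total as extra parameters so that they are local.
The array is passed by reference and the global i is left untouched.
.Bd -literal -offset indent
```shell
# Local variables via excess parameters; sum, v and the data are illustrative.
out=$(awk '
function sum(arr, n,    i, total) {    # i and total are never passed: locals
    for (i = 1; i <= n; i++)
        total += arr[i]
    return total
}
BEGIN {
    i = 99                             # global i, not modified by sum()
    n = split("1 2 3", v, " ")
    print sum(v, n), i
}')
echo "$out"    # prints: 6 99
```
.Ed
Separating the extra parameters with additional spaces is only a convention for readers.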
1.7 aaron 508: .Sh EXAMPLES
1.16 ! jmc 509: Print lines longer than 72 characters:
! 510: .Pp
1.7 aaron 511: .Dl length($0) > 72
1.16 ! jmc 512: .Pp
! 513: Print first two fields in opposite order:
1.7 aaron 514: .Pp
515: .Dl { print $2, $1 }
1.16 ! jmc 516: .Pp
! 517: Same, with input fields separated by commas and/or blanks and tabs:
1.7 aaron 518: .Bd -literal -offset indent
1.1 tholo 519: BEGIN { FS = ",[ \et]*|[ \et]+" }
520: { print $2, $1 }
1.7 aaron 521: .Ed
1.16 ! jmc 522: .Pp
! 523: Add up first column, print sum and average:
1.7 aaron 524: .Bd -literal -offset indent
525: { s += $1 }
526: END { print "sum is", s, " average is", s/NR }
527: .Ed
1.16 ! jmc 528: .Pp
! 529: Print all lines between start/stop pairs:
1.7 aaron 530: .Pp
531: .Dl /start/, /stop/
1.16 ! jmc 532: .Pp
! 533: Simulate echo(1):
1.7 aaron 534: .Bd -literal -offset indent
535: BEGIN { # Simulate echo(1)
536: for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
537: printf "\en"
538: exit }
539: .Ed
540: .Sh SEE ALSO
1.16 ! jmc 541: .Xr egrep 1 ,
1.7 aaron 542: .Xr lex 1 ,
1.16 ! jmc 543: .Xr sed 1 ,
! 544: .Xr printf 3
1.7 aaron 545: .Rs
546: .%A A. V. Aho
547: .%A B. W. Kernighan
548: .%A P. J. Weinberger
549: .%T The AWK Programming Language
550: .%I Addison-Wesley
551: .%D 1988
552: .%O ISBN 0-201-07981-X
553: .Re
1.8 aaron 554: .Sh HISTORY
1.13 millert 555: An
1.8 aaron 556: .Nm
1.13 millert 557: utility appeared in
558: .At v7 .
1.7 aaron 559: .Sh BUGS
1.1 tholo 560: There are no explicit conversions between numbers and strings.
561: To force an expression to be treated as a number, add 0 to it;
562: to force it to be treated as a string, concatenate
1.7 aaron 563: .Li \&""
564: to it.
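.Pp
A small sketch of why the conversion idiom matters.
The variables a and b hold string constants, so a plain comparison is a string comparison, while adding 0 forces a numeric one.
The variable names and messages are made up for the example.
.Bd -literal -offset indent
```shell
# String versus numeric comparison of the same two values.
out=$(awk 'BEGIN {
    a = "10"; b = "9"                  # string constants
    if (a < b)                         # compared as strings: "1" sorts before "9"
        print "string compare: 10 sorts before 9"
    if (a + 0 > b + 0)                 # adding 0 forces a numeric comparison
        print "numeric compare: 10 is greater than 9"
}')
echo "$out"
```
.Ed
Both messages are printed, one per line.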
565: .Pp
1.1 tholo 566: The scope rules for variables in functions are a botch;
567: the syntax is worse.