Annotation of src/usr.bin/awk/awk.1, Revision 1.13
1.13 ! millert 1: .\" $OpenBSD: awk.1,v 1.12 2003/06/10 09:12:09 jmc Exp $
1.7 aaron 2: .\" EX/EE is a Bd
1.11 jmc 3: .\"
4: .\" Copyright (C) Lucent Technologies 1997
5: .\" All Rights Reserved
1.12 jmc 6: .\"
1.11 jmc 7: .\" Permission to use, copy, modify, and distribute this software and
8: .\" its documentation for any purpose and without fee is hereby
9: .\" granted, provided that the above copyright notice appear in all
10: .\" copies and that both that the copyright notice and this
11: .\" permission notice and warranty disclaimer appear in supporting
12: .\" documentation, and that the name Lucent Technologies or any of
13: .\" its entities not be used in advertising or publicity pertaining
14: .\" to distribution of the software without specific, written prior
15: .\" permission.
1.12 jmc 16: .\"
1.11 jmc 17: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
18: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
19: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
20: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
21: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
22: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
23: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
24: .\" THIS SOFTWARE.
25: .\"
1.7 aaron 26: .Dd June 29, 1996
27: .Dt AWK 1
28: .Os
29: .Sh NAME
30: .Nm awk
31: .Nd pattern-directed scanning and processing language
32: .Sh SYNOPSIS
33: .Nm awk
34: .Op Fl F Ar fs
35: .Op Fl v Ar var=value
36: .Op Fl safe
37: .Op Fl mr Ar n
38: .Op Fl mf Ar n
39: .Op Ar prog | Fl f Ar progfile
40: .Ar
41: .Nm nawk
42: .Ar ...
43: .Sh DESCRIPTION
44: .Nm
1.1 tholo 45: scans each input
1.7 aaron 46: .Ar file
1.1 tholo 47: for lines that match any of a set of patterns specified literally in
1.7 aaron 48: .Ar prog
1.1 tholo 49: or in one or more files
50: specified as
1.7 aaron 51: .Fl f Ar progfile .
1.1 tholo 52: With each pattern
53: there can be an associated action that will be performed
54: when a line of a
1.7 aaron 55: .Ar file
1.1 tholo 56: matches the pattern.
57: Each line is matched against the
58: pattern portion of every pattern-action statement;
59: the associated action is performed for each matched pattern.
1.6 aaron 60: The file name
1.7 aaron 61: .Sq Pa \-
1.1 tholo 62: means the standard input.
63: Any
1.7 aaron 64: .Ar file
1.1 tholo 65: of the form
1.7 aaron 66: .Ar var=value
1.1 tholo 67: is treated as an assignment, not a filename,
68: and is executed at the time it would have been opened if it were a filename.
69: The option
1.7 aaron 70: .Fl v
1.1 tholo 71: followed by
1.7 aaron 72: .Ar var=value
1.1 tholo 73: is an assignment to be done before
1.7 aaron 74: .Ar prog
1.1 tholo 75: is executed;
76: any number of
1.7 aaron 77: .Fl v
1.1 tholo 78: options may be present.
79: The
1.7 aaron 80: .Fl F Ar fs
1.1 tholo 81: option defines the input field separator to be the regular expression
1.7 aaron 82: .Ar fs .
1.5 angelos 83: The
1.7 aaron 84: .Fl safe
85: option disables file output
86: .Po
87: .Ic print Ic > ,
88: .Ic print Ic >> ,
89: .Pc
90: process creation
91: .Po
92: .Ar cmd Ic \&| getline ,
93: .Ic print \&| , system
94: .Pc
95: and access to the environment
96: .Pq Va ENVIRON .
97: This
98: is a first (and not very reliable) approximation to a
99: .Dq safe
100: version of
101: .Nm awk .
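The option handling described above can be sketched as follows.
This is an illustration only, assuming a POSIX shell; the sample input and the variable names prefix and tag are made up for the example.

```shell
# Hypothetical sample input: colon-separated records.
input='root:0
daemon:1'

# -F sets the field separator; -v assigns before the program starts.
out=$(printf '%s\n' "$input" | awk -F: -v prefix='user=' '{ print prefix $1 }')
printf '%s\n' "$out"

# An operand of the form var=value is an assignment, performed when
# awk reaches it in the operand list; "-" then names the standard input.
out2=$(printf 'a\n' | awk '{ print tag, $1 }' tag=first -)
printf '%s\n' "$out2"
```

The first command prints one line per record with the prefix attached; the second prints the assigned value followed by the first field.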
102: .Pp
103: An input line is normally made up of fields separated by whitespace,
1.1 tholo 104: or by regular expression
1.7 aaron 105: .Va FS .
1.1 tholo 106: The fields are denoted
1.7 aaron 107: .Va $1 , $2 , ... ,
108: while
109: .Va $0
1.1 tholo 110: refers to the entire line.
111: If
1.7 aaron 112: .Va FS
1.1 tholo 113: is null, the input line is split into one field per character.
1.7 aaron 114: .Pp
1.1 tholo 115: To compensate for inadequate implementation of storage management,
1.6 aaron 116: the
1.7 aaron 117: .Fl mr
1.1 tholo 118: option can be used to set the maximum size of the input record,
119: and the
1.7 aaron 120: .Fl mf
1.1 tholo 121: option to set the maximum number of fields.
1.7 aaron 122: .Pp
1.1 tholo 123: A pattern-action statement has the form
1.7 aaron 124: .Pp
125: .D1 Ar pattern Ic \&{ Ar action Ic \&}
126: .Pp
1.6 aaron 127: A missing
1.7 aaron 128: .Ic \&{ Ar action Ic \&}
1.1 tholo 129: means print the line;
130: a missing pattern always matches.
131: Pattern-action statements are separated by newlines or semicolons.
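A small sketch of the two defaults just described, assuming a POSIX shell and made-up input lines:

```shell
out=$(printf 'alpha\nbeta\ngamma\n' | awk '
/^b/            # pattern with no action: print the matching line
{ n++ }         # action with no pattern: runs for every line
END { print n }')
printf '%s\n' "$out"
```

The first rule prints only the line beginning with b, while the second counts every input line.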
1.7 aaron 132: .Pp
1.1 tholo 133: An action is a sequence of statements.
134: A statement can be one of the following:
1.7 aaron 135: .Pp
136: .Bd -unfilled -offset indent
137: .Ic if ( Xo
138: .Ar expression ) statement \&
139: .Op Ic else Ar statement
140: .Xc
141: .Ic while ( Ar expression ) statement
142: .Ic for ( Xo
143: .Ar expression ; expression ; expression ) statement
144: .Xc
145: .Ic for ( Xo
146: .Ar var Ic in Ar array ) statement
147: .Xc
148: .Ic do Ar statement Ic while ( Ar expression )
149: .Ic break
150: .Ic continue
151: .Ic { Oo Ar statement ... Oc Ic \& }
152: .Ar expression Xo
153: .No "# commonly" \&
154: .Ar var Ic = Ar expression
155: .Xc
156: .Ic print Xo
157: .Op Ar expression-list
158: .Op Ic > Ns Ar expression
159: .Xc
160: .Ic printf Ar format Xo
161: .Op Ar ... , expression-list
162: .Op Ic > Ns Ar expression
163: .Xc
164: .Ic return Op Ar expression
165: .Ic next Xo
166: .No "# skip remaining patterns on this input line"
167: .Xc
168: .Ic nextfile Xo
169: .No "# skip rest of this file, open next, start at top"
170: .Xc
171: .Ic delete Ar array Ns Xo
172: .Ic \&[ Ns Ar expression Ns Ic \&]
173: .No \& "# delete an array element"
174: .Xc
175: .Ic delete Ar array Xo
176: .No "# delete all elements of array"
177: .Xc
178: .Ic exit Xo
179: .Op Ar expression
180: .No \& "# exit immediately; status is" Ar expression
181: .Xc
182: .Ed
183: .Pp
1.1 tholo 184: Statements are terminated by
185: semicolons, newlines or right braces.
186: An empty
1.7 aaron 187: .Ar expression-list
1.1 tholo 188: stands for
1.7 aaron 189: .Ar $0 .
190: String constants are quoted
191: .Li \&"" ,
1.1 tholo 192: with the usual C escapes recognized within.
193: Expressions take on string or numeric values as appropriate,
194: and are built using the operators
1.7 aaron 195: .Ic + \- * / % ^
196: (exponentiation), and concatenation (indicated by whitespace).
1.1 tholo 197: The operators
1.7 aaron 198: .Ic ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
1.1 tholo 199: are also available in expressions.
200: Variables may be scalars, array elements
201: (denoted
1.7 aaron 202: .Li x[i] )
1.1 tholo 203: or fields.
204: Variables are initialized to the null string.
205: Array subscripts may be any string,
206: not necessarily numeric;
207: this allows for a form of associative memory.
208: Multiple subscripts such as
1.7 aaron 209: .Li [i,j,k]
1.1 tholo 210: are permitted; the constituents are concatenated,
211: separated by the value of
1.7 aaron 212: .Va SUBSEP .
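The expression rules above can be tried with a short program.
This is a sketch only; the names x, s, grid, key and found are invented for the example.

```shell
out=$(awk 'BEGIN {
    x = 2 ^ 3 + 7 % 4          # exponentiation, then remainder: 8 + 3
    s = "a" "b" x              # concatenation is written by juxtaposition
    grid[1, 2] = "cell"        # the subscript stored is 1 SUBSEP 2
    key = 1 SUBSEP 2
    found = ((1, 2) in grid)   # 1 if the element exists
    print s, found, grid[key]
}')
printf '%s\n' "$out"
```

Building the key by hand with SUBSEP reaches the same element as the comma form.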
213: .Pp
1.1 tholo 214: The
1.7 aaron 215: .Ic print
1.1 tholo 216: statement prints its arguments on the standard output
217: (or on a file if
1.7 aaron 218: .Ic > Ns Ar file
1.1 tholo 219: or
1.7 aaron 220: .Ic >> Ns Ar file
1.1 tholo 221: is present or on a pipe if
1.7 aaron 222: .Ic \&| Ar cmd
1.1 tholo 223: is present), separated by the current output field separator,
224: and terminated by the output record separator.
1.7 aaron 225: .Ar file
1.1 tholo 226: and
1.7 aaron 227: .Ar cmd
1.1 tholo 228: may be literal names or parenthesized expressions;
229: identical string values in different statements denote
230: the same open file.
231: The
1.7 aaron 232: .Ic printf
1.1 tholo 233: statement formats its expression list according to the format
234: (see
1.10 pvalchev 235: .Xr printf 3 ) .
1.1 tholo 236: The built-in function
1.7 aaron 237: .Fn close expr
1.1 tholo 238: closes the file or pipe
1.7 aaron 239: .Fa expr .
1.1 tholo 240: The built-in function
1.7 aaron 241: .Fn fflush expr
1.1 tholo 242: flushes any buffered output for the file or pipe
1.7 aaron 243: .Fa expr .
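As an illustration of output to a pipe, the sketch below sends each line through sort(1) and closes the pipe before printing a summary.
It assumes a POSIX shell with sort available; the input is made up.

```shell
out=$(printf 'pear\napple\n' | awk '
{ print $1 | "sort" }          # identical strings share one open pipe
END {
    close("sort")              # wait for sort to write its output
    printf "%d lines\n", NR    # formats are as in printf(3)
}')
printf '%s\n' "$out"
```

Without the call to close, the sorted lines would not be guaranteed to appear before the summary.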
244: .Pp
1.1 tholo 245: The mathematical functions
1.7 aaron 246: .Fn exp ,
247: .Fn log ,
248: .Fn sqrt ,
249: .Fn sin ,
250: .Fn cos ,
1.1 tholo 251: and
1.7 aaron 252: .Fn atan2
1.1 tholo 253: are built in.
254: Other built-in functions:
1.7 aaron 255: .Pp
256: .Bl -tag -width Fn
257: .It Fn length
1.1 tholo 258: the length of its argument
259: taken as a string,
260: or of
1.7 aaron 261: .Va $0
1.1 tholo 262: if no argument.
1.7 aaron 263: .It Fn rand
1.1 tholo 264: random number on (0,1)
1.7 aaron 265: .It Fn srand
1.1 tholo 266: sets seed for
1.7 aaron 267: .Fn rand
1.1 tholo 268: and returns the previous seed.
1.7 aaron 269: .It Fn int
270: truncates to an integer value.
271: .It Fn substr s m n
1.1 tholo 272: the
1.7 aaron 273: .Fa n Ns No -character
1.1 tholo 274: substring of
1.7 aaron 275: .Fa s
1.1 tholo 276: that begins at position
1.7 aaron 277: .Fa m
1.1 tholo 278: counted from 1.
1.7 aaron 279: .It Fn index s t
1.1 tholo 280: the position in
1.7 aaron 281: .Fa s
1.1 tholo 282: where the string
1.7 aaron 283: .Fa t
1.1 tholo 284: occurs, or 0 if it does not.
1.7 aaron 285: .It Fn match s r
1.1 tholo 286: the position in
1.7 aaron 287: .Fa s
1.1 tholo 288: where the regular expression
1.7 aaron 289: .Fa r
1.1 tholo 290: occurs, or 0 if it does not.
291: The variables
1.7 aaron 292: .Va RSTART
1.1 tholo 293: and
1.7 aaron 294: .Va RLENGTH
1.1 tholo 295: are set to the position and length of the matched string.
1.7 aaron 296: .It Fn split s a fs
1.1 tholo 297: splits the string
1.7 aaron 298: .Fa s
1.1 tholo 299: into array elements
1.7 aaron 300: .Va a[1] , a[2] , ... , a[n]
1.1 tholo 301: and returns
1.7 aaron 302: .Va n .
1.1 tholo 303: The separation is done with the regular expression
1.7 aaron 304: .Ar fs
1.1 tholo 305: or with the field separator
1.7 aaron 306: .Va FS
1.1 tholo 307: if
1.7 aaron 308: .Ar fs
1.1 tholo 309: is not given.
310: An empty string as field separator splits the string
311: into one array element per character.
1.7 aaron 312: .It Fn sub r t s
1.1 tholo 313: substitutes
1.7 aaron 314: .Fa t
1.1 tholo 315: for the first occurrence of the regular expression
1.7 aaron 316: .Fa r
1.1 tholo 317: in the string
1.7 aaron 318: .Fa s .
1.1 tholo 319: If
1.7 aaron 320: .Fa s
1.1 tholo 321: is not given,
1.7 aaron 322: .Va $0
1.1 tholo 323: is used.
1.7 aaron 324: .It Fn gsub r t s
1.1 tholo 325: same as
1.7 aaron 326: .Fn sub
1.1 tholo 327: except that all occurrences of the regular expression
328: are replaced;
1.7 aaron 329: .Fn sub
1.1 tholo 330: and
1.7 aaron 331: .Fn gsub
1.1 tholo 332: return the number of replacements.
1.7 aaron 333: .It Fn sprintf fmt expr ...
1.1 tholo 334: the string resulting from formatting
1.7 aaron 335: .Fa expr , ...
1.1 tholo 336: according to the
1.7 aaron 337: .Xr printf 3
1.1 tholo 338: format
1.7 aaron 339: .Fa fmt .
340: .It Fn system cmd
1.1 tholo 341: executes
1.7 aaron 342: .Fa cmd
343: and returns its exit status.
344: .It Fn tolower str
1.1 tholo 345: returns a copy of
1.7 aaron 346: .Fa str
1.1 tholo 347: with all upper-case characters translated to their
348: corresponding lower-case equivalents.
1.7 aaron 349: .It Fn toupper str
1.1 tholo 350: returns a copy of
1.7 aaron 351: .Fa str
1.1 tholo 352: with all lower-case characters translated to their
353: corresponding upper-case equivalents.
1.7 aaron 354: .El
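Several of the built-in functions listed above, exercised on a made-up string.
This is a sketch under the assumption of a POSIX shell; the variable names are illustrative only.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                 # number of characters in s
    print substr(s, 1, 5)           # 5 characters starting at position 1
    print index(s, "wor")           # position of "wor" in s
    pos = match(s, /o+/)            # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH
    n = split("a:b:c", parts, ":")  # fills parts[1] .. parts[n]
    print n, parts[3]
    t = s
    count = gsub(/o/, "0", t)       # replaces every match in t
    print count, t
    print toupper(substr(s, 1, 1)) substr(s, 2)
    print sprintf("%03d", 7)        # printf(3)-style formatting
}')
printf '%s\n' "$out"
```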
355: .Pp
356: The
357: .Sq function
358: .Ic getline
1.1 tholo 359: sets
1.7 aaron 360: .Va $0
1.1 tholo 361: to the next input record from the current input file;
1.7 aaron 362: .Ic getline < Ar file
1.1 tholo 363: sets
1.7 aaron 364: .Va $0
1.1 tholo 365: to the next record from
1.7 aaron 366: .Ar file .
367: .Ic getline Va x
1.1 tholo 368: sets variable
1.7 aaron 369: .Va x
1.1 tholo 370: instead.
371: Finally,
1.7 aaron 372: .Ar cmd Ic \&| getline
1.1 tholo 373: pipes the output of
1.7 aaron 374: .Ar cmd
1.1 tholo 375: into
1.7 aaron 376: .Ic getline ;
1.1 tholo 377: each call of
1.7 aaron 378: .Ic getline
1.1 tholo 379: returns the next line of output from
1.7 aaron 380: .Ar cmd .
1.1 tholo 381: In all cases,
1.7 aaron 382: .Ic getline
1.1 tholo 383: returns 1 for a successful input,
384: 0 for end of file, and \-1 for an error.
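A common way to use the return value is to loop while getline reports success.
The sketch below assumes a POSIX shell; the command string and the names line, last and n are made up for the example.

```shell
out=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    # The parentheses keep the comparison from binding to the variable.
    while ((cmd | getline line) > 0) {
        n++
        last = line
    }
    close(cmd)
    print n, last
}')
printf '%s\n' "$out"
```

Testing for a value greater than 0 ends the loop on both end of file and error.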
1.7 aaron 385: .Pp
1.1 tholo 386: Patterns are arbitrary Boolean combinations
387: (with
1.7 aaron 388: .Ic "! || &&" )
1.1 tholo 389: of regular expressions and
390: relational expressions.
391: Regular expressions are as in
1.12 jmc 392: .Xr egrep 1 .
1.1 tholo 393: Isolated regular expressions
394: in a pattern apply to the entire line.
395: Regular expressions may also occur in
396: relational expressions, using the operators
1.7 aaron 397: .Ic ~
1.1 tholo 398: and
1.7 aaron 399: .Ic !~ .
400: .Ic / Ns Ar re Ns Ic /
1.1 tholo 401: is a constant regular expression;
402: any string (constant or variable) may be used
403: as a regular expression, except in the position of an isolated regular expression
404: in a pattern.
1.7 aaron 405: .Pp
1.1 tholo 406: A pattern may consist of two patterns separated by a comma;
407: in this case, the action is performed for all lines
408: from an occurrence of the first pattern
409: through an occurrence of the second.
1.7 aaron 410: .Pp
1.1 tholo 411: A relational expression is one of the following:
1.7 aaron 412: .Bd -unfilled -offset indent
413: .Ar expression matchop regular-expression
414: .Ar expression relop expression
415: .Ar expression Ic in Ar array-name
416: .Ic \&( Ns Xo
417: .Ar expr , expr , \&... Ns Ic \&) in
418: .Ar \& array-name
419: .Xc
420: .Ed
421: where a
422: .Ar relop
423: is any of the six relational operators in C, and a
424: .Ar matchop
425: is either
426: .Ic ~
1.1 tholo 427: (matches)
428: or
1.7 aaron 429: .Ic !~
1.1 tholo 430: (does not match).
431: A conditional is an arithmetic expression,
432: a relational expression,
433: or a Boolean combination
434: of these.
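The forms of relational expression above can be combined in patterns, for instance as in this sketch.
The input records and the array name vip are invented for illustration.

```shell
out=$(printf 'alice 30\nbob 17\ncarol 45\n' | awk '
BEGIN { vip["carol"] = 1 }
$2 >= 18 && $1 !~ /^a/ { print $1, "adult" }   # relop and matchop
$1 in vip              { print $1, "vip" }     # array membership
END { }')
printf '%s\n' "$out"
```

Only the third record satisfies either pattern, and it satisfies both.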
1.7 aaron 435: .Pp
1.1 tholo 436: The special patterns
1.7 aaron 437: .Ic BEGIN
1.1 tholo 438: and
1.7 aaron 439: .Ic END
1.1 tholo 440: may be used to capture control before the first input line is read
441: and after the last.
1.7 aaron 442: .Ic BEGIN
1.1 tholo 443: and
1.7 aaron 444: .Ic END
1.1 tholo 445: do not combine with other patterns.
1.7 aaron 446: .Pp
1.1 tholo 447: Variable names with special meanings:
1.7 aaron 448: .Pp
449: .Bl -tag -width Va -compact
450: .It Va CONVFMT
1.1 tholo 451: conversion format used when converting numbers
1.3 millert 452: (default
1.7 aaron 453: .Qq Li %.6g )
454: .It Va FS
1.1 tholo 455: regular expression used to separate fields; also settable
456: by option
1.9 millert 457: .Fl F Ar fs .
1.7 aaron 458: .It Va NF
1.1 tholo 459: number of fields in the current record
1.7 aaron 460: .It Va NR
1.1 tholo 461: ordinal number of the current record
1.7 aaron 462: .It Va FNR
1.1 tholo 463: ordinal number of the current record in the current file
1.7 aaron 464: .It Va FILENAME
1.1 tholo 465: the name of the current input file
1.7 aaron 466: .It Va RS
1.1 tholo 467: input record separator (default newline)
1.7 aaron 468: .It Va OFS
1.1 tholo 469: output field separator (default blank)
1.7 aaron 470: .It Va ORS
1.1 tholo 471: output record separator (default newline)
1.7 aaron 472: .It Va OFMT
1.1 tholo 473: output format for numbers (default
1.7 aaron 474: .Qq Li %.6g )
475: .It Va SUBSEP
1.1 tholo 476: separates multiple subscripts (default 034)
1.7 aaron 477: .It Va ARGC
1.1 tholo 478: argument count, assignable
1.7 aaron 479: .It Va ARGV
1.1 tholo 480: argument array, assignable;
481: non-null members are taken as filenames
1.7 aaron 482: .It Va ENVIRON
1.1 tholo 483: array of environment variables; subscripts are names.
1.7 aaron 484: .El
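A brief sketch of some of these variables in use, assuming a POSIX shell and made-up input:

```shell
out=$(printf 'a b c\nd e\n' | awk '
BEGIN { OFS = ":"; ORS = ";\n" }   # change output separators
{ print NR, NF, $NF }              # record number, field count, last field')
printf '%s\n' "$out"
```

Each output record uses the new field separator between items and ends with the new record separator.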
485: .Pp
486: Functions may be defined (at the position of a pattern-action statement)
487: as follows:
488: .Pp
489: .Dl function foo(a, b, c) { ...; return x }
490: .Pp
1.1 tholo 491: Parameters are passed by value if scalar and by reference if array name;
492: functions may be called recursively.
493: Parameters are local to the function; all other variables are global.
494: Thus local variables may be created by providing excess parameters in
495: the function definition.
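The parameter rules can be illustrated with a function that sums an array.
This is a sketch only: the function name sum and all variable names are hypothetical, and the extra spacing in the parameter list is just a convention for marking locals.

```shell
out=$(awk '
# i and total are excess parameters, used only as local variables.
function sum(arr, n,    i, total) {
    for (i = 1; i <= n; i++)
        total += arr[i]
    arr[0] = "seen"            # arrays are passed by reference
    return total
}
BEGIN {
    n = split("3 4 5", v)
    i = 99                     # global i is untouched by the call
    s = sum(v, n)
    print s, i, v[0]
}')
printf '%s\n' "$out"
```

The global i keeps its value because the function's i is a parameter, while the change to the array is visible to the caller.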
1.7 aaron 496: .Sh EXAMPLES
497: .Dl length($0) > 72
1.1 tholo 498: Print lines longer than 72 characters.
1.7 aaron 499: .Pp
500: .Dl { print $2, $1 }
1.1 tholo 501: Print first two fields in opposite order.
1.7 aaron 502: .Pp
503: .Bd -literal -offset indent
1.1 tholo 504: BEGIN { FS = ",[ \et]*|[ \et]+" }
505: { print $2, $1 }
1.7 aaron 506: .Ed
1.1 tholo 507: Same, with input fields separated by comma and/or blanks and tabs.
1.7 aaron 508: .Pp
509: .Bd -literal -offset indent
510: { s += $1 }
511: END { print "sum is", s, " average is", s/NR }
512: .Ed
1.1 tholo 513: Add up first column, print sum and average.
1.7 aaron 514: .Pp
515: .Dl /start/, /stop/
1.1 tholo 516: Print all lines between start/stop pairs.
1.7 aaron 517: .Pp
518: .Bd -literal -offset indent
519: BEGIN { # Simulate echo(1)
520: for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
521: printf "\en"
522: exit }
523: .Ed
524: .Sh SEE ALSO
525: .Xr lex 1 ,
526: .Xr sed 1
527: .Rs
528: .%A A. V. Aho
529: .%A B. W. Kernighan
530: .%A P. J. Weinberger
531: .%T The AWK Programming Language
532: .%I Addison-Wesley
533: .%D 1988
534: .%O ISBN 0-201-07981-X
535: .Re
1.8 aaron 536: .Sh HISTORY
1.13 ! millert 537: An
1.8 aaron 538: .Nm
1.13 ! millert 539: utility appeared in
! 540: .At v7 .
1.7 aaron 541: .Sh BUGS
1.1 tholo 542: There are no explicit conversions between numbers and strings.
543: To force an expression to be treated as a number add 0 to it;
544: to force it to be treated as a string concatenate
1.7 aaron 545: .Li \&""
546: to it.
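The two idioms can be seen side by side in the following sketch, which assumes a POSIX shell; the variable names are made up.

```shell
out=$(awk 'BEGIN {
    a = "10"; b = "9"
    print (a < b)              # both are strings: compared as text
    print (a + 0 < b + 0)      # adding 0 forces a numeric comparison
    n = 10; m = 9
    s1 = n ""; s2 = m ""       # concatenating "" forces strings
    print (s1 < s2)
}')
printf '%s\n' "$out"
```

Compared as text, 10 sorts before 9; compared as numbers, it does not.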
547: .Pp
1.1 tholo 548: The scope rules for variables in functions are a botch;
549: the syntax is worse.