Annotation of src/usr.bin/awk/awk.1, Revision 1.1
1.1 ! tholo 1: .de EX
! 2: .nf
! 3: .ft CW
! 4: ..
! 5: .de EE
! 6: .br
! 7: .fi
! 8: .ft 1
! 9: ..
! 10: awk
! 11: .TH AWK 1
! 12: .CT 1 files prog_other
! 13: .SH NAME
! 14: awk \- pattern-directed scanning and processing language
! 15: .SH SYNOPSIS
! 16: .B awk
! 17: [
! 18: .BI \-F
! 19: .I fs
! 20: ]
! 21: [
! 22: .BI \-v
! 23: .I var=value
! 24: ]
! 25: [
! 26: .BI \-mr n
! 27: ]
! 28: [
! 29: .BI \-mf n
! 30: ]
! 31: [
! 32: .I 'prog'
! 33: |
! 34: .BI \-f
! 35: .I progfile
! 36: ]
! 37: [
! 38: .I file ...
! 39: ]
! 40: .SH DESCRIPTION
! 41: .I Awk
! 42: scans each input
! 43: .I file
! 44: for lines that match any of a set of patterns specified literally in
! 45: .IR prog
! 46: or in one or more files
! 47: specified as
! 48: .B \-f
! 49: .IR progfile .
! 50: With each pattern
! 51: there can be an associated action that will be performed
! 52: when a line of a
! 53: .I file
! 54: matches the pattern.
! 55: Each line is matched against the
! 56: pattern portion of every pattern-action statement;
! 57: the associated action is performed for each matched pattern.
! 58: The file name
! 59: .B \-
! 60: means the standard input.
! 61: Any
! 62: .IR file
! 63: of the form
! 64: .I var=value
! 65: is treated as an assignment, not a filename,
! 66: and is executed at the time it would have been opened if it were a filename.
! 67: The option
! 68: .B \-v
! 69: followed by
! 70: .I var=value
! 71: is an assignment to be done before
! 72: .I prog
! 73: is executed;
! 74: any number of
! 75: .B \-v
! 76: options may be present.
! 77: The
! 78: .B \-F
! 79: .IR fs
! 80: option defines the input field separator to be the regular expression
! 81: .IR fs .
! 82: .PP
! 83: An input line is normally made up of fields separated by white space,
! 84: or by regular expression
! 85: .BR FS .
! 86: The fields are denoted
! 87: .BR $1 ,
! 88: .BR $2 ,
! 89: \&..., while
! 90: .B $0
! 91: refers to the entire line.
! 92: If
! 93: .BR FS
! 94: is null, the input line is split into one field per character.
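.PP
As a small illustration of field splitting, the following sketch runs
awk from the shell on made-up sample input; the shell variable names
and the input lines are assumptions chosen for the example only.

```shell
# Colon-separated input: -F sets FS, so $1 and $3 are the first and third fields.
fields=$(printf 'root:x:0:0\n' | awk -F: '{ print $1, $3, NF }')
echo "$fields"

# Default splitting: runs of blanks and tabs separate fields; leading blanks are ignored.
blanks=$(printf '  a   b\tc\n' | awk '{ print NF, $2 }')
echo "$blanks"
```

The first command should report three values taken from a four-field record, the second a three-field record.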
! 95: .PP
! 96: To compensate for inadequate implementation of storage management,
! 97: the
! 98: .B \-mr
! 99: option can be used to set the maximum size of the input record,
! 100: and the
! 101: .B \-mf
! 102: option to set the maximum number of fields.
! 103: .PP
! 104: A pattern-action statement has the form
! 105: .IP
! 106: .IB pattern " { " action " }
! 107: .PP
! 108: A missing
! 109: .BI { " action " }
! 110: means print the line;
! 111: a missing pattern always matches.
! 112: Pattern-action statements are separated by newlines or semicolons.
! 113: .PP
! 114: An action is a sequence of statements.
! 115: A statement can be one of the following:
! 116: .PP
! 117: .EX
! 118: .ta \w'\f(CWdelete array[expression]'u
! 119: .RS
! 120: .nf
! 121: .ft CW
! 122: if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP
! 123: while(\fI expression \fP)\fI statement\fP
! 124: for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP
! 125: for(\fI var \fPin\fI array \fP)\fI statement\fP
! 126: do\fI statement \fPwhile(\fI expression \fP)
! 127: break
! 128: continue
! 129: {\fR [\fP\fI statement ... \fP\fR] \fP}
! 130: \fIexpression\fP #\fR commonly\fP\fI var = expression\fP
! 131: print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
! 132: printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP
! 133: return\fR [ \fP\fIexpression \fP\fR]\fP
! 134: next #\fR skip remaining patterns on this input line\fP
! 135: nextfile #\fR skip rest of this file, open next, start at top\fP
! 136: delete\fI array\fP[\fI expression \fP] #\fR delete an array element\fP
! 137: delete\fI array\fP #\fR delete all elements of array\fP
! 138: exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; status is \fP\fIexpression\fP
! 139: .fi
! 140: .RE
! 141: .EE
! 142: .DT
! 143: .PP
! 144: Statements are terminated by
! 145: semicolons, newlines or right braces.
! 146: An empty
! 147: .I expression-list
! 148: stands for
! 149: .BR $0 .
! 150: String constants are quoted \&\f(CW"\ "\fR,
! 151: with the usual C escapes recognized within.
! 152: Expressions take on string or numeric values as appropriate,
! 153: and are built using the operators
! 154: .B + \- * / % ^
! 155: (exponentiation), and concatenation (indicated by white space).
! 156: The operators
! 157: .B
! 158: ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?:
! 159: are also available in expressions.
! 160: Variables may be scalars, array elements
! 161: (denoted
! 162: .IB x [ i ] )
! 163: or fields.
! 164: Variables are initialized to the null string.
! 165: Array subscripts may be any string,
! 166: not necessarily numeric;
! 167: this allows for a form of associative memory.
! 168: Multiple subscripts such as
! 169: .B [i,j,k]
! 170: are permitted; the constituents are concatenated,
! 171: separated by the value of
! 172: .BR SUBSEP .
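.PP
The way multiple subscripts collapse into one string can be seen in a
short sketch; the array name count and the keys used here are
hypothetical.

```shell
pairs=$(awk 'BEGIN {
    count["x", "y"] = 3              # stored under the single key "x" SUBSEP "y"
    key = "x" SUBSEP "y"             # build the same key by hand
    print count[key]                 # same element as count["x", "y"]
    if (("x", "y") in count)         # membership test with multiple subscripts
        print "present"
    n = split(key, part, SUBSEP)     # recover the constituents
    print n, part[1], part[2]
}')
echo "$pairs"
```

Because the key is an ordinary string, a subscript that itself contains the value of SUBSEP cannot be told apart from a multiple subscript.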
! 173: .PP
! 174: The
! 175: .B print
! 176: statement prints its arguments on the standard output
! 177: (or on a file if
! 178: .BI > file
! 179: or
! 180: .BI >> file
! 181: is present or on a pipe if
! 182: .BI | cmd
! 183: is present), separated by the current output field separator,
! 184: and terminated by the output record separator.
! 185: .I file
! 186: and
! 187: .I cmd
! 188: may be literal names or parenthesized expressions;
! 189: identical string values in different statements denote
! 190: the same open file.
! 191: The
! 192: .B printf
! 193: statement formats its expression list according to the format
! 194: (see
! 195: .IR printf (3)) .
! 196: The built-in function
! 197: .BI close( expr )
! 198: closes the file or pipe
! 199: .IR expr .
! 200: The built-in function
! 201: .BI fflush( expr )
! 202: flushes any buffered output for the file or pipe
! 203: .IR expr .
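.PP
A minimal sketch of print, printf and close together, assuming a
sort(1) command is available on the system; the data values are
invented for the example.

```shell
sorted=$(awk 'BEGIN {
    OFS = "-"                        # output field separator used by print
    print "b", 2 | "sort"            # both statements name the same string,
    print "a", 1 | "sort"            # so they write to the same open pipe
    close("sort")                    # sort sees end of input and emits its output
    printf "%s=%d\n", "done", 3      # printf ignores OFS and ORS
}')
echo "$sorted"
```

Closing the pipe before further output is what lets the sorted lines appear ahead of the final printf line.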
! 204: .PP
! 205: The mathematical functions
! 206: .BR exp ,
! 207: .BR log ,
! 208: .BR sqrt ,
! 209: .BR sin ,
! 210: .BR cos ,
! 211: and
! 212: .BR atan2
! 213: are built in.
! 214: Other built-in functions:
! 215: .TF length
! 216: .TP
! 217: .B length
! 218: the length of its argument
! 219: taken as a string,
! 220: or of
! 221: .B $0
! 222: if no argument.
! 223: .TP
! 224: .B rand
! 225: random number on (0,1)
! 226: .TP
! 227: .B srand
! 228: sets seed for
! 229: .B rand
! 230: and returns the previous seed.
! 231: .TP
! 232: .B int
! 233: truncates to an integer value
! 234: .TP
! 235: .BI substr( s , " m" , " n\fB)
! 236: the
! 237: .IR n -character
! 238: substring of
! 239: .I s
! 240: that begins at position
! 241: .IR m
! 242: counted from 1.
! 243: .TP
! 244: .BI index( s , " t" )
! 245: the position in
! 246: .I s
! 247: where the string
! 248: .I t
! 249: occurs, or 0 if it does not.
! 250: .TP
! 251: .BI match( s , " r" )
! 252: the position in
! 253: .I s
! 254: where the regular expression
! 255: .I r
! 256: occurs, or 0 if it does not.
! 257: The variables
! 258: .B RSTART
! 259: and
! 260: .B RLENGTH
! 261: are set to the position and length of the matched string.
! 262: .TP
! 263: .BI split( s , " a" , " fs\fB)
! 264: splits the string
! 265: .I s
! 266: into array elements
! 267: .IB a [1] ,
! 268: .IB a [2] ,
! 269: \&...,
! 270: .IB a [ n ] ,
! 271: and returns
! 272: .IR n .
! 273: The separation is done with the regular expression
! 274: .I fs
! 275: or with the field separator
! 276: .B FS
! 277: if
! 278: .I fs
! 279: is not given.
! 280: An empty string as field separator splits the string
! 281: into one array element per character.
! 282: .TP
! 283: .BI sub( r , " t" , " s\fB)
! 284: substitutes
! 285: .I t
! 286: for the first occurrence of the regular expression
! 287: .I r
! 288: in the string
! 289: .IR s .
! 290: If
! 291: .I s
! 292: is not given,
! 293: .B $0
! 294: is used.
! 295: .TP
! 296: .B gsub
! 297: same as
! 298: .B sub
! 299: except that all occurrences of the regular expression
! 300: are replaced;
! 301: .B sub
! 302: and
! 303: .B gsub
! 304: return the number of replacements.
! 305: .TP
! 306: .BI sprintf( fmt , " expr" , " ...\fB )
! 307: the string resulting from formatting
! 308: .I expr ...
! 309: according to the
! 310: .IR printf (3)
! 311: format
! 312: .I fmt
! 313: .TP
! 314: .BI system( cmd )
! 315: executes
! 316: .I cmd
! 317: and returns its exit status
! 318: .TP
! 319: .BI tolower( str )
! 320: returns a copy of
! 321: .I str
! 322: with all upper-case characters translated to their
! 323: corresponding lower-case equivalents.
! 324: .TP
! 325: .BI toupper( str )
! 326: returns a copy of
! 327: .I str
! 328: with all lower-case characters translated to their
! 329: corresponding upper-case equivalents.
! 330: .PD
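.PP
The string functions listed above can be exercised in one short
program; the string s and the other variable names are assumptions
made for this sketch.

```shell
strfuncs=$(awk 'BEGIN {
    s = "hello, world"
    print length(s)                          # number of characters in s
    print substr(s, 8, 5)                    # 5 characters starting at position 8
    print index(s, "lo")                     # position of the literal string "lo"
    m = match(s, /o+/)                       # also sets RSTART and RLENGTH
    print m, RSTART, RLENGTH
    n = split("a:b:c", a, ":")               # a[1], a[2], a[3]
    print n, a[3]
    t = s
    k = gsub(/o/, "0", t)                    # replace every o in the copy t
    print k, t
    print toupper(substr(s, 1, 5))
}')
echo "$strfuncs"
```

Note that gsub modifies its third argument in place, which is why the sketch works on a copy of s.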
! 331: .PP
! 332: The ``function''
! 333: .B getline
! 334: sets
! 335: .B $0
! 336: to the next input record from the current input file;
! 337: .B getline
! 338: .BI < file
! 339: sets
! 340: .B $0
! 341: to the next record from
! 342: .IR file .
! 343: .B getline
! 344: .I x
! 345: sets variable
! 346: .I x
! 347: instead.
! 348: Finally,
! 349: .IB cmd " | getline
! 350: pipes the output of
! 351: .I cmd
! 352: into
! 353: .BR getline ;
! 354: each call of
! 355: .B getline
! 356: returns the next line of output from
! 357: .IR cmd .
! 358: In all cases,
! 359: .B getline
! 360: returns 1 for a successful input,
! 361: 0 for end of file, and \-1 for an error.
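.PP
The return values of getline make a natural loop condition. The
following sketch reads the output of a command line by line; the
command string is a hypothetical stand-in for any program that writes
lines to its standard output.

```shell
lines=$(awk 'BEGIN {
    cmd = "echo one; echo two"            # hypothetical command
    while ((cmd | getline line) > 0)      # 1 per record read, 0 at end of output
        n++
    close(cmd)                            # close the pipe when done
    print n, line                         # line still holds the last record read
}')
echo "$lines"
```

Testing for a value greater than 0, rather than for a non-zero value, keeps the loop from spinning forever when getline returns \-1.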
! 362: .PP
! 363: Patterns are arbitrary Boolean combinations
! 364: (with
! 365: .BR "! || &&" )
! 366: of regular expressions and
! 367: relational expressions.
! 368: Regular expressions are as in
! 369: .IR egrep ;
! 370: see
! 371: .IR grep (1).
! 372: Isolated regular expressions
! 373: in a pattern apply to the entire line.
! 374: Regular expressions may also occur in
! 375: relational expressions, using the operators
! 376: .BR ~
! 377: and
! 378: .BR !~ .
! 379: .BI / re /
! 380: is a constant regular expression;
! 381: any string (constant or variable) may be used
! 382: as a regular expression, except in the position of an isolated regular expression
! 383: in a pattern.
! 384: .PP
! 385: A pattern may consist of two patterns separated by a comma;
! 386: in this case, the action is performed for all lines
! 387: from an occurrence of the first pattern
! 388: through an occurrence of the second.
! 389: .PP
! 390: A relational expression is one of the following:
! 391: .IP
! 392: .I expression matchop regular-expression
! 393: .br
! 394: .I expression relop expression
! 395: .br
! 396: .IB expression " in " array-name
! 397: .br
! 398: .BI ( expr , expr,... ") in " array-name
! 399: .PP
! 400: where a relop is any of the six relational operators in C,
! 401: and a matchop is either
! 402: .B ~
! 403: (matches)
! 404: or
! 405: .B !~
! 406: (does not match).
! 407: A conditional is an arithmetic expression,
! 408: a relational expression,
! 409: or a Boolean combination
! 410: of these.
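.PP
A short sketch of patterns built from match and relational operators;
the three input lines are invented for the example.

```shell
matched=$(printf '1 apple\n2 avocado\n3 banana\n' |
    awk '$2 ~ /^a/ && $1 > 1 { print $2 }   # matchop and relop joined with &&
         $2 !~ /^a/          { print "other:", $2 }')
echo "$matched"
```

Every line is tested against both patterns in turn, so a line could in principle trigger both actions; here no line does.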
! 411: .PP
! 412: The special patterns
! 413: .B BEGIN
! 414: and
! 415: .B END
! 416: may be used to capture control before the first input line is read
! 417: and after the last.
! 418: .B BEGIN
! 419: and
! 420: .B END
! 421: do not combine with other patterns.
! 422: .PP
! 423: Variable names with special meanings:
! 424: .TF FILENAME
! 425: .TP
! 426: .B CONVFMT
! 427: conversion format used when converting numbers (default
! 428: .BR "%.6g" )
! 429: .TP
! 430: .B FS
! 431: regular expression used to separate fields; also settable
! 432: by option
! 433: .BI \-F fs .
! 434: .TP
! 435: .BR NF
! 436: number of fields in the current record
! 437: .TP
! 438: .B NR
! 439: ordinal number of the current record
! 440: .TP
! 441: .B FNR
! 442: ordinal number of the current record in the current file
! 443: .TP
! 444: .B FILENAME
! 445: the name of the current input file
! 446: .TP
! 447: .B RS
! 448: input record separator (default newline)
! 449: .TP
! 450: .B OFS
! 451: output field separator (default blank)
! 452: .TP
! 453: .B ORS
! 454: output record separator (default newline)
! 455: .TP
! 456: .B OFMT
! 457: output format for numbers (default
! 458: .BR "%.6g" )
! 459: .TP
! 460: .B SUBSEP
! 461: separates multiple subscripts (default 034)
! 462: .TP
! 463: .B ARGC
! 464: argument count, assignable
! 465: .TP
! 466: .B ARGV
! 467: argument array, assignable;
! 468: non-null members are taken as filenames
! 469: .TP
! 470: .B ENVIRON
! 471: array of environment variables; subscripts are names.
! 472: .PD
! 473: .PP
! 474: Functions may be defined (at the position of a pattern-action statement) thus:
! 475: .IP
! 476: .B
! 477: function foo(a, b, c) { ...; return x }
! 478: .PP
! 479: Parameters are passed by value if scalar and by reference if array name;
! 480: functions may be called recursively.
! 481: Parameters are local to the function; all other variables are global.
! 482: Thus local variables may be created by providing excess parameters in
! 483: the function definition.
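.PP
The parameter-passing rules and the excess-parameter idiom can be
sketched as follows; the function name sum and all variable names are
hypothetical.

```shell
result=$(awk '
function sum(arr, n,    i, total) {   # i and total are excess parameters, hence local
    for (i = 1; i <= n; i++)
        total += arr[i]
    arr[0] = "touched"                # arrays are passed by reference
    n = 0                             # scalars are passed by value
    return total
}
BEGIN {
    v[1] = 2; v[2] = 3; v[3] = 5; count = 3
    s = sum(v, count)
    # count is unchanged, v[0] was set by the call, the global total was never touched
    print s, count, v[0], (total == "" ? "unset" : "set")
}')
echo "$result"
```

The extra white space before i and total in the parameter list is only a common convention for marking locals; awk does not require it.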
! 484: .SH EXAMPLES
! 485: .TP
! 486: .B
! 487: length($0) > 72
! 488: Print lines longer than 72 characters.
! 489: .TP
! 490: .B
! 491: { print $2, $1 }
! 492: Print first two fields in opposite order.
! 493: .PP
! 494: .EX
! 495: BEGIN { FS = ",[ \et]*|[ \et]+" }
! 496: { print $2, $1 }
! 497: .EE
! 498: .ns
! 499: .IP
! 500: Same, with input fields separated by comma and/or blanks and tabs.
! 501: .PP
! 502: .EX
! 503: .nf
! 504: { s += $1 }
! 505: END { print "sum is", s, " average is", s/NR }
! 506: .fi
! 507: .EE
! 508: .ns
! 509: .IP
! 510: Add up first column, print sum and average.
! 511: .TP
! 512: .B
! 513: /start/, /stop/
! 514: Print all lines between start/stop pairs.
! 515: .PP
! 516: .EX
! 517: .nf
! 518: BEGIN { # Simulate echo(1)
! 519: for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
! 520: printf "\en"
! 521: exit }
! 522: .fi
! 523: .EE
! 524: .SH SEE ALSO
! 525: .IR lex (1),
! 526: .IR sed (1)
! 527: .br
! 528: A. V. Aho, B. W. Kernighan, P. J. Weinberger,
! 529: .I
! 530: The AWK Programming Language,
! 531: Addison-Wesley, 1988. ISBN 0-201-07981-X
! 532: .SH BUGS
! 533: There are no explicit conversions between numbers and strings.
! 534: To force an expression to be treated as a number add 0 to it;
! 535: to force it to be treated as a string concatenate
! 536: \&\f(CW""\fP to it.
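.PP
A minimal sketch of both idioms, using made-up values: the same pair
of values compares differently depending on whether it is treated as
numbers or as strings.

```shell
cmp=$(awk 'BEGIN {
    a = 10; b = 9                    # numeric values
    r1 = (a < b)                     # numeric comparison: 10 < 9 is false
    r2 = (a "" < b "")               # forced to strings: "10" sorts before "9"
    x = "10"; y = "9"                # string constants
    r3 = (x < y)                     # string comparison
    r4 = (x + 0 < y + 0)             # forced to numbers
    print r1, r2, r3, r4
}')
echo "$cmp"
```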
! 537: .br
! 538: The scope rules for variables in functions are a botch;
! 539: the syntax is worse.