=================================================================== RCS file: /cvsrepo/anoncvs/cvs/src/usr.bin/awk/awk.1,v retrieving revision 1.6 retrieving revision 1.7 diff -c -r1.6 -r1.7 *** src/usr.bin/awk/awk.1 1999/06/05 01:21:18 1.6 --- src/usr.bin/awk/awk.1 2000/08/30 13:37:51 1.7 *************** *** 1,551 **** ! .\" $OpenBSD: awk.1,v 1.6 1999/06/05 01:21:18 aaron Exp $ ! .de EX ! .nf ! .ft CW ! .. ! .de EE ! .br ! .fi ! .ft 1 ! .. ! .TH AWK 1 ! .CT 1 files prog_other ! .SH NAME ! awk \- pattern-directed scanning and processing language ! .SH SYNOPSIS ! .B awk|nawk ! [ ! .BI \-F ! .I fs ! ] ! [ ! .BI \-v ! .I var=value ! ] ! [ ! .BI \-safe ! ] ! [ ! .BI \-mr n ! ] ! [ ! .BI \-mf n ! ] ! [ ! .I 'prog' ! | ! .BI \-f ! .I progfile ! ] ! [ ! .I file ... ! ] ! .SH DESCRIPTION ! .I Awk scans each input ! .I file for lines that match any of a set of patterns specified literally in ! .IR prog or in one or more files specified as ! .B \-f ! .IR progfile . With each pattern there can be an associated action that will be performed when a line of a ! .I file matches the pattern. Each line is matched against the pattern portion of every pattern-action statement; the associated action is performed for each matched pattern. The file name ! .B \- means the standard input. Any ! .IR file of the form ! .I var=value is treated as an assignment, not a filename, and is executed at the time it would have been opened if it were a filename. The option ! .B \-v followed by ! .I var=value is an assignment to be done before ! .I prog is executed; any number of ! .B \-v options may be present. The ! .B \-F ! .IR fs option defines the input field separator to be the regular expression ! .IR fs. The ! .B \-safe ! option disables file output (print >, print >>), process creation ! (cmd|getline, print |, system), and access to the environment (ENVIRON). This ! is a first (and not very reliable) approximation to a "safe" version of awk. ! .PP ! 
An input line is normally made up of fields separated by white space, or by regular expression ! .BR FS . The fields are denoted ! .BR $1 , ! .BR $2 , ! \&..., while ! .B $0 refers to the entire line. If ! .BR FS is null, the input line is split into one field per character. ! .PP To compensate for inadequate implementation of storage management, the ! .B \-mr option can be used to set the maximum size of the input record, and the ! .B \-mf option to set the maximum number of fields. ! .PP A pattern-action statement has the form ! .IP ! .IB pattern " { " action " } ! .PP A missing ! .BI { " action " } means print the line; a missing pattern always matches. Pattern-action statements are separated by newlines or semicolons. ! .PP An action is a sequence of statements. A statement can be one of the following: ! .PP ! .EX ! .ta \w'\f(CWdelete array[expression]'u ! .RS ! .nf ! .ft CW ! if(\fI expression \fP)\fI statement \fP\fR[ \fPelse\fI statement \fP\fR]\fP ! while(\fI expression \fP)\fI statement\fP ! for(\fI expression \fP;\fI expression \fP;\fI expression \fP)\fI statement\fP ! for(\fI var \fPin\fI array \fP)\fI statement\fP ! do\fI statement \fPwhile(\fI expression \fP) ! break ! continue ! {\fR [\fP\fI statement ... \fP\fR] \fP} ! \fIexpression\fP #\fR commonly\fP\fI var = expression\fP ! print\fR [ \fP\fIexpression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP ! printf\fI format \fP\fR[ \fP,\fI expression-list \fP\fR] \fP\fR[ \fP>\fI expression \fP\fR]\fP ! return\fR [ \fP\fIexpression \fP\fR]\fP ! next #\fR skip remaining patterns on this input line\fP ! nextfile #\fR skip rest of this file, open next, start at top\fP ! delete\fI array\fP[\fI expression \fP] #\fR delete an array element\fP ! delete\fI array\fP #\fR delete all elements of array\fP ! exit\fR [ \fP\fIexpression \fP\fR]\fP #\fR exit immediately; status is \fP\fIexpression\fP ! .fi ! .RE ! .EE ! .DT ! .PP Statements are terminated by semicolons, newlines or right braces. An empty ! 
.I expression-list stands for ! .BR $0 . ! String constants are quoted \&\f(CW"\ "\fR, with the usual C escapes recognized within. Expressions take on string or numeric values as appropriate, and are built using the operators ! .B + \- * / % ^ ! (exponentiation), and concatenation (indicated by white space). The operators ! .B ! ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?: are also available in expressions. Variables may be scalars, array elements (denoted ! .IB x [ i ] ) or fields. Variables are initialized to the null string. Array subscripts may be any string, not necessarily numeric; this allows for a form of associative memory. Multiple subscripts such as ! .B [i,j,k] are permitted; the constituents are concatenated, separated by the value of ! .BR SUBSEP . ! .PP The ! .B print statement prints its arguments on the standard output (or on a file if ! .BI > file or ! .BI >> file is present or on a pipe if ! .BI | cmd is present), separated by the current output field separator, and terminated by the output record separator. ! .I file and ! .I cmd may be literal names or parenthesized expressions; identical string values in different statements denote the same open file. The ! .B printf statement formats its expression list according to the format (see ! .IR printf (3)) . The built-in function ! .BI close( expr ) closes the file or pipe ! .IR expr . The built-in function ! .BI fflush( expr ) flushes any buffered output for the file or pipe ! .IR expr . ! .PP The mathematical functions ! .BR exp , ! .BR log , ! .BR sqrt , ! .BR sin , ! .BR cos , and ! .BR atan2 are built in. Other built-in functions: ! .TF length ! .TP ! .B length the length of its argument taken as a string, or of ! .B $0 if no argument. ! .TP ! .B rand random number on (0,1) ! .TP ! .B srand sets seed for ! .B rand and returns the previous seed. ! .TP ! .B int ! truncates to an integer value ! .TP ! .BI substr( s , " m" , " n\fB) the ! .IR n -character substring of ! 
.I s that begins at position ! .IR m counted from 1. ! .TP ! .BI index( s , " t" ) the position in ! .I s where the string ! .I t occurs, or 0 if it does not. ! .TP ! .BI match( s , " r" ) the position in ! .I s where the regular expression ! .I r occurs, or 0 if it does not. The variables ! .B RSTART and ! .B RLENGTH are set to the position and length of the matched string. ! .TP ! .BI split( s , " a" , " fs\fB) splits the string ! .I s into array elements ! .IB a [1] , ! .IB a [2] , ! \&..., ! .IB a [ n ] , and returns ! .IR n . The separation is done with the regular expression ! .I fs or with the field separator ! .B FS if ! .I fs is not given. An empty string as field separator splits the string into one array element per character. ! .TP ! .BI sub( r , " t" , " s\fB) substitutes ! .I t for the first occurrence of the regular expression ! .I r in the string ! .IR s . If ! .I s is not given, ! .B $0 is used. ! .TP ! .B gsub same as ! .B sub except that all occurrences of the regular expression are replaced; ! .B sub and ! .B gsub return the number of replacements. ! .TP ! .BI sprintf( fmt , " expr" , " ...\fB ) the string resulting from formatting ! .I expr ... according to the ! .IR printf (3) format ! .I fmt ! .TP ! .BI system( cmd ) executes ! .I cmd ! and returns its exit status ! .TP ! .BI tolower( str ) returns a copy of ! .I str with all upper-case characters translated to their corresponding lower-case equivalents. ! .TP ! .BI toupper( str ) returns a copy of ! .I str with all lower-case characters translated to their corresponding upper-case equivalents. ! .PD ! .PP ! The ``function'' ! .B getline sets ! .B $0 to the next input record from the current input file; ! .B getline ! .BI < file sets ! .B $0 to the next record from ! .IR file . ! .B getline ! .I x sets variable ! .I x instead. Finally, ! .IB cmd " | getline pipes the output of ! .I cmd into ! .BR getline ; each call of ! .B getline returns the next line of output from ! .IR cmd . 
In all cases, ! .B getline returns 1 for a successful input, 0 for end of file, and \-1 for an error. ! .PP Patterns are arbitrary Boolean combinations (with ! .BR "! || &&" ) of regular expressions and relational expressions. Regular expressions are as in ! .IR egrep ; ! see ! .IR grep (1). Isolated regular expressions in a pattern apply to the entire line. Regular expressions may also occur in relational expressions, using the operators ! .BR ~ and ! .BR !~ . ! .BI / re / is a constant regular expression; any string (constant or variable) may be used as a regular expression, except in the position of an isolated regular expression in a pattern. ! .PP A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern though an occurrence of the second. ! .PP A relational expression is one of the following: ! .IP ! .I expression matchop regular-expression ! .br ! .I expression relop expression ! .br ! .IB expression " in " array-name ! .br ! .BI ( expr , expr,... ") in " array-name ! .PP ! where a relop is any of the six relational operators in C, ! and a matchop is either ! .B ~ (matches) or ! .B !~ (does not match). A conditional is an arithmetic expression, a relational expression, or a Boolean combination of these. ! .PP The special patterns ! .B BEGIN and ! .B END may be used to capture control before the first input line is read and after the last. ! .B BEGIN and ! .B END do not combine with other patterns. ! .PP Variable names with special meanings: ! .TF FILENAME ! .TP ! .B CONVFMT conversion format used when converting numbers (default ! .BR "%.6g" ) ! .TP ! .B FS regular expression used to separate fields; also settable by option ! .BI \-F fs. ! .TP ! .BR NF number of fields in the current record ! .TP ! .B NR ordinal number of the current record ! .TP ! .B FNR ordinal number of the current record in the current file ! .TP ! 
.B FILENAME the name of the current input file ! .TP ! .B RS input record separator (default newline) ! .TP ! .B OFS output field separator (default blank) ! .TP ! .B ORS output record separator (default newline) ! .TP ! .B OFMT output format for numbers (default ! .BR "%.6g" ) ! .TP ! .B SUBSEP separates multiple subscripts (default 034) ! .TP ! .B ARGC argument count, assignable ! .TP ! .B ARGV argument array, assignable; non-null members are taken as filenames ! .TP ! .B ENVIRON array of environment variables; subscripts are names. ! .PD ! .PP ! Functions may be defined (at the position of a pattern-action statement) thus: ! .IP ! .B ! function foo(a, b, c) { ...; return x } ! .PP Parameters are passed by value if scalar and by reference if array name; functions may be called recursively. Parameters are local to the function; all other variables are global. Thus local variables may be created by providing excess parameters in the function definition. ! .SH EXAMPLES ! .TP ! .EX ! length($0) > 72 ! .EE Print lines longer than 72 characters. ! .TP ! .EX ! { print $2, $1 } ! .EE Print first two fields in opposite order. ! .PP ! .EX BEGIN { FS = ",[ \et]*|[ \et]+" } { print $2, $1 } ! .EE ! .ns ! .IP Same, with input fields separated by comma and/or blanks and tabs. ! .PP ! .EX ! .nf ! { s += $1 } ! END { print "sum is", s, " average is", s/NR } ! .fi ! .EE ! .ns ! .IP Add up first column, print sum and average. ! .TP ! .EX ! /start/, /stop/ ! .EE Print all lines between start/stop pairs. ! .PP ! .EX ! .nf ! BEGIN { # Simulate echo(1) ! for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i] ! printf "\en" ! exit } ! .fi ! .EE ! .SH SEE ALSO ! .IR lex (1), ! .IR sed (1) ! .br ! A. V. Aho, B. W. Kernighan, P. J. Weinberger, ! .I ! The AWK Programming Language, ! Addison-Wesley, 1988. ISBN 0-201-07981-X ! .SH BUGS There are no explicit conversions between numbers and strings. 
To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate ! \&\f(CW""\fP to it. ! .br The scope rules for variables in functions are a botch; the syntax is worse. --- 1,527 ---- ! .\" $OpenBSD: awk.1,v 1.7 2000/08/30 13:37:51 aaron Exp $ ! .\" EX/EE is a Bd ! .Dd June 29, 1996 ! .Dt AWK 1 ! .Os ! .Sh NAME ! .Nm awk ! .Nd pattern-directed scanning and processing language ! .Sh SYNOPSIS ! .Nm awk ! .Op Fl F Ar fs ! .Op Fl v Ar var=value ! .Op Fl safe ! .Op Fl mr Ar n ! .Op Fl mf Ar n ! .Op Ar prog | Fl f Ar progfile ! .Ar ! .Nm nawk ! .Ar ... ! .Sh DESCRIPTION ! .Nm scans each input ! .Ar file for lines that match any of a set of patterns specified literally in ! .Ar prog or in one or more files specified as ! .Fl f Ar progfile . With each pattern there can be an associated action that will be performed when a line of a ! .Ar file matches the pattern. Each line is matched against the pattern portion of every pattern-action statement; the associated action is performed for each matched pattern. The file name ! .Sq Pa \- means the standard input. Any ! .Ar file of the form ! .Ar var=value is treated as an assignment, not a filename, and is executed at the time it would have been opened if it were a filename. The option ! .Fl v followed by ! .Ar var=value is an assignment to be done before ! .Ar prog is executed; any number of ! .Fl v options may be present. The ! .Fl F Ar fs option defines the input field separator to be the regular expression ! .Ar fs . The ! .Fl safe ! option disables file output ! .Po ! .Ic print Ic > , ! .Ic print Ic >> , ! .Pc ! process creation ! .Po ! .Ar cmd Ic \&| getline , ! .Ic print \&| , system ! .Pc ! and access to the environment ! .Pq Va ENVIRON . ! This ! is a first (and not very reliable) approximation to a ! .Dq safe ! version of ! .Nm awk . ! .Pp ! An input line is normally made up of fields separated by whitespace, or by regular expression ! .Va FS . The fields are denoted ! 
.Va $1 , $2 , ... , ! while ! .Va $0 refers to the entire line. If ! .Va FS is null, the input line is split into one field per character. ! .Pp To compensate for inadequate implementation of storage management, the ! .Fl mr option can be used to set the maximum size of the input record, and the ! .Fl mf option to set the maximum number of fields. ! .Pp A pattern-action statement has the form ! .Pp ! .D1 Ar pattern Ic \&{ Ar action Ic \&} ! .Pp A missing ! .Ic \&{ Ar action Ic \&} means print the line; a missing pattern always matches. Pattern-action statements are separated by newlines or semicolons. ! .Pp An action is a sequence of statements. A statement can be one of the following: ! .Pp ! .Bd -unfilled -offset indent ! .Ic if ( Xo ! .Ar expression ) statement \& ! .Op Ic else Ar statement ! .Xc ! .Ic while ( Ar expression ) statement ! .Ic for ( Xo ! .Ar expression ; expression ; expression ) statement ! .Xc ! .Ic for ( Xo ! .Ar var Ic in Ar array ) statement ! .Xc ! .Ic do Ar statement Ic while ( Ar expression ) ! .Ic break ! .Ic continue ! .Ic { Oo Ar statement ... Oc Ic \& } ! .Ar expression Xo ! .No "# commonly" \& ! .Ar var Ic = Ar expression ! .Xc ! .Ic print Xo ! .Op Ar expression-list ! .Op Ic > Ns Ar expression ! .Xc ! .Ic printf Ar format Xo ! .Op Ar ... , expression-list ! .Op Ic > Ns Ar expression ! .Xc ! .Ic return Op Ar expression ! .Ic next Xo ! .No "# skip remaining patterns on this input line" ! .Xc ! .Ic nextfile Xo ! .No "# skip rest of this file, open next, start at top" ! .Xc ! .Ic delete Ar array Ns Xo ! .Ic \&[ Ns Ar expression Ns Ic \&] ! .No \& "# delete an array element" ! .Xc ! .Ic delete Ar array Xo ! .No "# delete all elements of array" ! .Xc ! .Ic exit Xo ! .Op Ar expression ! .No \& "# exit immediately; status is" Ar expression ! .Xc ! .Ed ! .Pp Statements are terminated by semicolons, newlines or right braces. An empty ! .Ar expression-list stands for ! .Ar $0 . ! String constants are quoted ! 
.Li \&"" , with the usual C escapes recognized within. Expressions take on string or numeric values as appropriate, and are built using the operators ! .Ic + \- * / % ^ ! (exponentiation), and concatenation (indicated by whitespace). The operators ! .Ic ! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?: are also available in expressions. Variables may be scalars, array elements (denoted ! .Li x[i] ) or fields. Variables are initialized to the null string. Array subscripts may be any string, not necessarily numeric; this allows for a form of associative memory. Multiple subscripts such as ! .Li [i,j,k] are permitted; the constituents are concatenated, separated by the value of ! .Va SUBSEP . ! .Pp The ! .Ic print statement prints its arguments on the standard output (or on a file if ! .Ic > Ns Ar file or ! .Ic >> Ns Ar file is present or on a pipe if ! .Ic \&| Ar cmd is present), separated by the current output field separator, and terminated by the output record separator. ! .Ar file and ! .Ar cmd may be literal names or parenthesized expressions; identical string values in different statements denote the same open file. The ! .Ic printf statement formats its expression list according to the format (see ! .Xr printf 3 . The built-in function ! .Fn close expr closes the file or pipe ! .Fa expr . The built-in function ! .Fn fflush expr flushes any buffered output for the file or pipe ! .Fa expr . ! .Pp The mathematical functions ! .Fn exp , ! .Fn log , ! .Fn sqrt , ! .Fn sin , ! .Fn cos , and ! .Fn atan2 are built in. Other built-in functions: ! .Pp ! .Bl -tag -width Fn ! .It Fn length the length of its argument taken as a string, or of ! .Va $0 if no argument. ! .It Fn rand random number on (0,1) ! .It Fn srand sets seed for ! .Fn rand and returns the previous seed. ! .It Fn int ! truncates to an integer value. ! .It Fn substr s m n the ! .Fa n Ns No -character substring of ! .Fa s that begins at position ! .Fa m counted from 1. ! .It Fn index s t the position in ! 
.Fa s where the string ! .Fa t occurs, or 0 if it does not. ! .It Fn match s r the position in ! .Fa s where the regular expression ! .Fa r occurs, or 0 if it does not. The variables ! .Va RSTART and ! .Va RLENGTH are set to the position and length of the matched string. ! .It Fn split s a fs splits the string ! .Fa s into array elements ! .Va a[1] , a[2] , ... , a[n] and returns ! .Va n . The separation is done with the regular expression ! .Ar fs or with the field separator ! .Va FS if ! .Ar fs is not given. An empty string as field separator splits the string into one array element per character. ! .It Fn sub r t s substitutes ! .Fa t for the first occurrence of the regular expression ! .Fa r in the string ! .Fa s . If ! .Fa s is not given, ! .Va $0 is used. ! .It Fn gsub r t s same as ! .Fn sub except that all occurrences of the regular expression are replaced; ! .Fn sub and ! .Fn gsub return the number of replacements. ! .It Fn sprintf fmt expr ... the string resulting from formatting ! .Fa expr , ... according to the ! .Xr printf 3 format ! .Fa fmt . ! .It Fn system cmd executes ! .Fa cmd ! and returns its exit status. ! .It Fn tolower str returns a copy of ! .Fa str with all upper-case characters translated to their corresponding lower-case equivalents. ! .It Fn toupper str returns a copy of ! .Fa str with all lower-case characters translated to their corresponding upper-case equivalents. ! .El ! .Pp ! The ! .Sq function ! .Ic getline sets ! .Va $0 to the next input record from the current input file; ! .Ic getline < Ar file sets ! .Va $0 to the next record from ! .Ar file . ! .Ic getline Va x sets variable ! .Va x instead. Finally, ! .Ar cmd Ic \&| getline pipes the output of ! .Ar cmd into ! .Ic getline ; each call of ! .Ic getline returns the next line of output from ! .Ar cmd . In all cases, ! .Ic getline returns 1 for a successful input, 0 for end of file, and \-1 for an error. ! .Pp Patterns are arbitrary Boolean combinations (with ! .Ic "! 
|| &&" ) of regular expressions and relational expressions. Regular expressions are as in ! .Xr egrep 1 . Isolated regular expressions in a pattern apply to the entire line. Regular expressions may also occur in relational expressions, using the operators ! .Ic ~ and ! .Ic !~ . ! .Ic / Ns Ar re Ns Ic / is a constant regular expression; any string (constant or variable) may be used as a regular expression, except in the position of an isolated regular expression in a pattern. ! .Pp A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern though an occurrence of the second. ! .Pp A relational expression is one of the following: ! .Bd -unfilled -offset indent ! .Ar expression matchop regular-expression ! .Ar expression relop expression ! .Ar expression Ic in Ar array-name ! .Ic \&( Ns Xo ! .Ar expr , expr , \&... Ns Ic \&) in ! .Ar \& array-name ! .Xc ! .Ed ! where a ! .Ar relop ! is any of the six relational operators in C, and a ! .Ar matchop ! is either ! .Ic ~ (matches) or ! .Ic !~ (does not match). A conditional is an arithmetic expression, a relational expression, or a Boolean combination of these. ! .Pp The special patterns ! .Ic BEGIN and ! .Ic END may be used to capture control before the first input line is read and after the last. ! .Ic BEGIN and ! .Ic END do not combine with other patterns. ! .Pp Variable names with special meanings: ! .Pp ! .Bl -tag -width Va -compact ! .It Va CONVFMT conversion format used when converting numbers (default ! .Qq Li %.6g ) ! .It Va FS regular expression used to separate fields; also settable by option ! .Fl fs . ! .It Va NF number of fields in the current record ! .It Va NR ordinal number of the current record ! .It Va FNR ordinal number of the current record in the current file ! .It Va FILENAME the name of the current input file ! .It Va RS input record separator (default newline) ! 
.It Va OFS output field separator (default blank) ! .It Va ORS output record separator (default newline) ! .It Va OFMT output format for numbers (default ! .Qq Li %.6g ) ! .It Va SUBSEP separates multiple subscripts (default 034) ! .It Va ARGC argument count, assignable ! .It Va ARGV argument array, assignable; non-null members are taken as filenames ! .It Va ENVIRON array of environment variables; subscripts are names. ! .El ! .Pp ! Functions may be defined (at the position of a pattern-action statement) ! thusly: ! .Pp ! .Dl function foo(a, b, c) { ...; return x } ! .Pp Parameters are passed by value if scalar and by reference if array name; functions may be called recursively. Parameters are local to the function; all other variables are global. Thus local variables may be created by providing excess parameters in the function definition. ! .Sh EXAMPLES ! .Dl length($0) > 72 Print lines longer than 72 characters. ! .Pp ! .Dl { print $2, $1 } Print first two fields in opposite order. ! .Pp ! .Bd -literal -offset indent BEGIN { FS = ",[ \et]*|[ \et]+" } { print $2, $1 } ! .Ed Same, with input fields separated by comma and/or blanks and tabs. ! .Pp ! .Bd -literal -offset indent ! { s += $1 } ! END { print "sum is", s, " average is", s/NR } ! .Ed Add up first column, print sum and average. ! .Pp ! .Dl /start/, /stop/ Print all lines between start/stop pairs. ! .Pp ! .Bd -literal -offset indent ! BEGIN { # Simulate echo(1) ! for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i] ! printf "\en" ! exit } ! .Ed ! .Sh SEE ALSO ! .Xr lex 1 , ! .Xr sed 1 ! .Rs ! .%A A. V. Aho ! .%A B. W. Kernighan ! .%A P. J. Weinberger ! .%T The AWK Programming Language ! .%I Addison-Wesley ! .%D 1988 ! .%O ISBN 0-201-07981-X ! .Re ! .Sh BUGS There are no explicit conversions between numbers and strings. To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate ! .Li \&"" ! to it. ! 
.Pp
  The scope rules for variables in functions are a botch; the syntax is worse.
+ .Sh HISTORY
+ AT&T
+ .Nm
+ by B. W. Kernighan was updated for
+ .Bx 4.4
+ and again in 1996.
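
The BUGS text in this revision says that adding 0 forces an expression to be treated as a number and that concatenating "" forces it to be treated as a string. The function paragraph says that excess parameters act as local variables. A minimal sketch of both idioms follows, assuming a POSIX shell and some awk on the PATH. The names compare, locals, bump, and tmp are hypothetical and do not appear in the manual.

```shell
#!/bin/sh
# compare A B: hypothetical helper, not from awk.1.
compare() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        # Concatenating "" makes both operands strings, so < compares text.
        s = ((a "") < (b "")) ? "string: a<b" : "string: a>=b"
        # Adding 0 makes both operands numbers, so < compares values.
        n = ((a + 0) < (b + 0)) ? "number: a<b" : "number: a>=b"
        print s
        print n
    }'
}

# locals: tmp is an excess parameter, so it is local to bump().
locals() {
    awk 'function bump(x,    tmp) { tmp = x + 1; return tmp }
         BEGIN {
             r = bump(41)
             # The global tmp was never assigned, so it still equals "".
             print r, (tmp == "" ? "unset" : "set")
         }'
}

compare 10 9
locals
```

With the arguments 10 and 9 the two comparisons disagree, which is why the manual recommends spelling out the intended type whenever a value could be read either way.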