version 1.15, 2003/11/24 10:58:08 |
version 1.16, 2003/12/12 19:50:55 |
|
|
.Nd pattern-directed scanning and processing language |
.Nd pattern-directed scanning and processing language |
.Sh SYNOPSIS |
.Sh SYNOPSIS |
.Nm awk |
.Nm awk |
.Op Fl F Ar fs |
|
.Op Fl v Ar var=value |
|
.Op Fl safe |
.Op Fl safe |
.Op Fl mr Ar n |
.Op Fl V |
.Op Fl mf Ar n |
.Op Fl d Ns Op Ar n |
.Op Ar prog | Fl f Ar progfile |
.Op Fl F Ar fs |
|
.Oo Fl v Ar var Ns = |
|
.Ns Ar value Oc |
|
.Ar prog | Fl f Ar progfile |
.Ar |
.Ar |
.Nm nawk |
.Nm nawk |
.Ar ... |
.Ar ... |
|
|
.Ar file |
.Ar file |
for lines that match any of a set of patterns specified literally in |
for lines that match any of a set of patterns specified literally in |
.Ar prog |
.Ar prog |
or in one or more files |
or in one or more files specified as |
specified as |
|
.Fl f Ar progfile . |
.Fl f Ar progfile . |
With each pattern |
With each pattern there can be an associated action that will be performed |
there can be an associated action that will be performed |
|
when a line of a |
when a line of a |
.Ar file |
.Ar file |
matches the pattern. |
matches the pattern. |
|
|
pattern portion of every pattern-action statement; |
pattern portion of every pattern-action statement; |
the associated action is performed for each matched pattern. |
the associated action is performed for each matched pattern. |
The file name |
The file name |
.Sq Pa \- |
.Sq - |
means the standard input. |
means the standard input. |
Any |
Any |
.Ar file |
.Ar file |
of the form |
of the form |
.Ar var=value |
.Ar var Ns = Ns Ar value |
is treated as an assignment, not a filename, |
is treated as an assignment, not a filename, |
and is executed at the time it would have been opened if it were a filename. |
and is executed at the time it would have been opened if it were a filename. |
The option |
.Pp |
.Fl v |
The options are as follows: |
followed by |
.Bl -tag -width Ds |
.Ar var=value |
.It Fl d Ns Op Ar n |
is an assignment to be done before |
Debug mode. |
.Ar prog |
Set debug level to |
is executed; |
.Ar n , |
any number of |
or 1 if |
.Fl v |
.Ar n |
options may be present. |
is not specified. |
The |
A value greater than 1 causes |
.Fl F Ar fs |
.Nm |
option defines the input field separator to be the regular expression |
to dump core on fatal errors. |
|
.It Fl F Ar fs |
|
Define the input field separator to be the regular expression |
.Ar fs . |
.Ar fs . |
The |
.It Fl f Ar filename |
.Fl safe |
Read program code from the specified file |
option disables file output |
.Ar filename |
.Po |
instead of from the command line. |
.Ic print Ic > , |
.It Fl safe |
.Ic print Ic >> , |
Disable file output |
.Pc |
.Pf ( Ic print > , |
|
.Ic print >> ) , |
process creation |
process creation |
.Po |
.Po |
.Ar cmd Ic \&| getline , |
.Ar cmd Ic \&| getline , |
|
|
and access to the environment |
and access to the environment |
.Pq Va ENVIRON . |
.Pq Va ENVIRON . |
This |
This |
is a first (and not very reliable) approximation to a |
is a first |
|
.Pq and not very reliable |
|
approximation to a |
.Dq safe |
.Dq safe |
version of |
version of |
.Nm awk . |
.Nm . |
|
.It Fl V |
|
Print the version number of |
|
.Nm |
|
to standard output and exit. |
|
.It Fl v Ar var Ns = Ns Ar value |
|
Assign |
|
.Ar value |
|
to variable |
|
.Ar var |
|
before |
|
.Ar prog |
|
is executed; |
|
any number of |
|
.Fl v |
|
options may be present. |
|
.El |
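As an illustration of the options listed above, the following sketch combines
.Fl F
and
.Fl v
in one invocation.
The sample input, the variable name prefix, and its value are assumptions made up for this example; they do not come from the manual.

```sh
# -F: makes ":" the input field separator.
# -v prefix=... assigns the variable before the program starts running.
out=$(printf 'alice:30\nbob:25\n' |
    awk -F: -v prefix='user=' '{ print prefix $1, $2 }')
echo "$out"
```

Each output line should consist of the prefix joined to the first field, then the second field, separated by the default output field separator.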
.Pp |
.Pp |
An input line is normally made up of fields separated by whitespace, |
An input line is normally made up of fields separated by whitespace, |
or by regular expression |
or by regular expression |
|
|
.Va FS |
.Va FS |
is null, the input line is split into one field per character. |
is null, the input line is split into one field per character. |
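A minimal sketch of the null
.Va FS
case described above, assuming a one-line sample input of "abc".
Splitting on an empty separator is the behaviour this manual documents; other awk implementations may treat an empty
.Va FS
differently.

```sh
# With FS set to the empty string, every character becomes its own field.
out=$(printf 'abc\n' | awk 'BEGIN { FS = "" } { print NF, $1, $3 }')
echo "$out"
```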
.Pp |
.Pp |
To compensate for inadequate implementation of storage management, |
|
the |
|
.Fl mr |
|
option can be used to set the maximum size of the input record, |
|
and the |
|
.Fl mf |
|
option to set the maximum number of fields. |
|
.Pp |
|
A pattern-action statement has the form |
A pattern-action statement has the form |
.Pp |
.Pp |
.D1 Ar pattern Ic \&{ Ar action Ic \&} |
.D1 Ar pattern Ic \&{ Ar action Ic \&} |
|
|
.Ic + \- * / % ^ |
.Ic + \- * / % ^ |
(exponentiation), and concatenation (indicated by whitespace). |
(exponentiation), and concatenation (indicated by whitespace). |
The operators |
The operators |
.Ic \&! ++ \-\- += \-= *= /= %= ^= > >= < <= == != ?: |
.Ic \&! ++ \-\- += \-= *= /= %= ^= |
|
.Ic > >= < <= == != ?: |
are also available in expressions. |
are also available in expressions. |
Variables may be scalars, array elements |
Variables may be scalars, array elements |
(denoted |
(denoted |
|
|
Other built-in functions: |
Other built-in functions: |
.Bl -tag -width Fn |
.Bl -tag -width Fn |
.It Fn length |
.It Fn length |
the length of its argument |
The length of its argument |
taken as a string, |
taken as a string, |
or of |
or of |
.Va $0 |
.Va $0 |
if no argument. |
if no argument. |
.It Fn rand |
.It Fn rand |
random number on (0,1) |
Random number on (0,1). |
.It Fn srand |
.It Fn srand |
sets seed for |
Sets seed for |
.Fn rand |
.Fn rand |
and returns the previous seed. |
and returns the previous seed. |
.It Fn int |
.It Fn int |
truncates to an integer value. |
Truncates to an integer value. |
.It Fn substr s m n |
.It Fn substr s m n |
the |
The |
.Fa n Ns No -character |
.Fa n Ns No -character |
substring of |
substring of |
.Fa s |
.Fa s |
|
|
.Fa m |
.Fa m |
counted from 1. |
counted from 1. |
.It Fn index s t |
.It Fn index s t |
the position in |
The position in |
.Fa s |
.Fa s |
where the string |
where the string |
.Fa t |
.Fa t |
occurs, or 0 if it does not. |
occurs, or 0 if it does not. |
.It Fn match s r |
.It Fn match s r |
the position in |
The position in |
.Fa s |
.Fa s |
where the regular expression |
where the regular expression |
.Fa r |
.Fa r |
|
|
.Va RLENGTH |
.Va RLENGTH |
are set to the position and length of the matched string. |
are set to the position and length of the matched string. |
.It Fn split s a fs |
.It Fn split s a fs |
splits the string |
Splits the string |
.Fa s |
.Fa s |
into array elements |
into array elements |
.Va a[1] , a[2] , ... , a[n] |
.Va a[1] , a[2] , ... , a[n] |
|
|
An empty string as field separator splits the string |
An empty string as field separator splits the string |
into one array element per character. |
into one array element per character. |
.It Fn sub r t s |
.It Fn sub r t s |
substitutes |
Substitutes |
.Fa t |
.Fa t |
for the first occurrence of the regular expression |
for the first occurrence of the regular expression |
.Fa r |
.Fa r |
|
|
.Va $0 |
.Va $0 |
is used. |
is used. |
.It Fn gsub r t s |
.It Fn gsub r t s |
same as |
Same as |
.Fn sub |
.Fn sub |
except that all occurrences of the regular expression |
except that all occurrences of the regular expression |
are replaced; |
are replaced; |
|
|
.Fn gsub |
.Fn gsub |
return the number of replacements. |
return the number of replacements. |
.It Fn sprintf fmt expr ... |
.It Fn sprintf fmt expr ... |
the string resulting from formatting |
The string resulting from formatting |
.Fa expr , ... |
.Fa expr , ... |
according to the |
according to the |
.Xr printf 3 |
.Xr printf 3 |
format |
format |
.Fa fmt . |
.Fa fmt . |
.It Fn system cmd |
.It Fn system cmd |
executes |
Executes |
.Fa cmd |
.Fa cmd |
and returns its exit status. |
and returns its exit status. |
.It Fn tolower str |
.It Fn tolower str |
returns a copy of |
Returns a copy of |
.Fa str |
.Fa str |
with all upper-case characters translated to their |
with all upper-case characters translated to their |
corresponding lower-case equivalents. |
corresponding lower-case equivalents. |
.It Fn toupper str |
.It Fn toupper str |
returns a copy of |
Returns a copy of |
.Fa str |
.Fa str |
with all lower-case characters translated to their |
with all lower-case characters translated to their |
corresponding upper-case equivalents. |
corresponding upper-case equivalents. |
|
|
.Pp |
.Pp |
Variable names with special meanings: |
Variable names with special meanings: |
.Pp |
.Pp |
.Bl -tag -width Va -compact |
.Bl -tag -width "FILENAME" -compact |
|
.It Va ARGC |
|
Argument count, assignable. |
|
.It Va ARGV |
|
Argument array, assignable; |
|
non-null members are taken as filenames. |
.It Va CONVFMT |
.It Va CONVFMT |
conversion format used when converting numbers |
Conversion format used when converting numbers |
(default |
(default |
.Qq Li %.6g ) |
.Qq Li %.6g ) . |
|
.It Va ENVIRON |
|
Array of environment variables; subscripts are names. |
|
.It Va FILENAME |
|
The name of the current input file. |
|
.It Va FNR |
|
Ordinal number of the current record in the current file. |
.It Va FS |
.It Va FS |
regular expression used to separate fields; also settable |
Regular expression used to separate fields; also settable |
by option |
by option |
.Fl F Ar fs . |
.Fl F Ar fs . |
.It Va NF |
.It Va NF |
number of fields in the current record |
Number of fields in the current record. |
.It Va NR |
.It Va NR |
ordinal number of the current record |
Ordinal number of the current record. |
.It Va FNR |
.It Va OFMT |
ordinal number of the current record in the current file |
Output format for numbers (default |
.It Va FILENAME |
.Qq Li %.6g ) . |
the name of the current input file |
|
.It Va RS |
|
input record separator (default newline) |
|
.It Va OFS |
.It Va OFS |
output field separator (default blank) |
Output field separator (default blank). |
.It Va ORS |
.It Va ORS |
output record separator (default newline) |
Output record separator (default newline). |
.It Va OFMT |
.It Va RS |
output format for numbers (default |
Input record separator (default newline). |
.Qq Li %.6g ) |
|
.It Va SUBSEP |
.It Va SUBSEP |
separates multiple subscripts (default 034) |
Separates multiple subscripts (default 034). |
.It Va ARGC |
|
argument count, assignable |
|
.It Va ARGV |
|
argument array, assignable; |
|
non-null members are taken as filenames |
|
.It Va ENVIRON |
|
array of environment variables; subscripts are names. |
|
.El |
.El |
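The following sketch shows a few of the variables above in use.
The two-line sample input and the choice of "-" for
.Va OFS
are illustrative assumptions.

```sh
# NR counts records, NF counts fields in the current record,
# and OFS is placed between the comma-separated arguments of print.
out=$(printf 'a b c\nd e\n' |
    awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }')
echo "$out"
```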
.Pp |
.Pp |
Functions may be defined (at the position of a pattern-action statement) |
Functions may be defined (at the position of a pattern-action statement) |
|
|
.Pp |
.Pp |
.Dl function foo(a, b, c) { ...; return x } |
.Dl function foo(a, b, c) { ...; return x } |
.Pp |
.Pp |
Parameters are passed by value if scalar and by reference if array name; |
Parameters are passed by value if scalar, and by reference if array name; |
functions may be called recursively. |
functions may be called recursively. |
Parameters are local to the function; all other variables are global. |
Parameters are local to the function; all other variables are global. |
Thus local variables may be created by providing excess parameters in |
Thus local variables may be created by providing excess parameters in |
the function definition. |
the function definition. |
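The parameter-passing rules above can be sketched as follows.
The function name fill, the array name sq, and the values used are hypothetical and chosen only for illustration.

```sh
out=$(awk '
# fill() receives the array by reference, so the caller sees its changes.
# The extra parameter i is never passed by the caller; it serves as a local.
function fill(arr, n,    i) {
    for (i = 1; i <= n; i++)
        arr[i] = i * i
}
BEGIN {
    i = 99              # global i, left untouched by the call below
    fill(sq, 3)
    print sq[1], sq[2], sq[3], i
}')
echo "$out"
```

Separating the extra parameters from the real ones with additional whitespace is a common convention, not a language requirement.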
.Sh EXAMPLES |
.Sh EXAMPLES |
|
Print lines longer than 72 characters: |
|
.Pp |
.Dl length($0) > 72 |
.Dl length($0) > 72 |
Print lines longer than 72 characters. |
|
.Pp |
.Pp |
|
Print first two fields in opposite order: |
|
.Pp |
.Dl { print $2, $1 } |
.Dl { print $2, $1 } |
Print first two fields in opposite order. |
.Pp |
|
Same, with input fields separated by comma and/or blanks and tabs: |
.Bd -literal -offset indent |
.Bd -literal -offset indent |
BEGIN { FS = ",[ \et]*|[ \et]+" } |
BEGIN { FS = ",[ \et]*|[ \et]+" } |
{ print $2, $1 } |
{ print $2, $1 } |
.Ed |
.Ed |
Same, with input fields separated by comma and/or blanks and tabs. |
.Pp |
|
Add up first column, print sum and average: |
.Bd -literal -offset indent |
.Bd -literal -offset indent |
{ s += $1 } |
{ s += $1 } |
END { print "sum is", s, " average is", s/NR } |
END { print "sum is", s, " average is", s/NR } |
.Ed |
.Ed |
Add up first column, print sum and average. |
|
.Pp |
.Pp |
|
Print all lines between start/stop pairs: |
|
.Pp |
.Dl /start/, /stop/ |
.Dl /start/, /stop/ |
Print all lines between start/stop pairs. |
.Pp |
|
Simulate echo(1): |
.Bd -literal -offset indent |
.Bd -literal -offset indent |
BEGIN { # Simulate echo(1) |
BEGIN { # Simulate echo(1) |
for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i] |
for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i] |
|
|
exit } |
exit } |
.Ed |
.Ed |
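A further sketch, not part of the original examples, exercising several of the built-in string functions described earlier.
The input line "hello world" and the array name parts are assumptions made for illustration.

```sh
out=$(echo 'hello world' | awk '{
    n = gsub(/o/, "0")          # replace every "o" in $0; n is the count
    print n, $0
    print substr($0, 1, 5), index($0, "w"), length($0), toupper($1)
    if (match($0, /w[a-z0-9]+/))    # sets RSTART and RLENGTH on success
        print RSTART, RLENGTH
    k = split("a:b:c", parts, ":")  # k is the number of elements created
    print k, parts[3]
}')
echo "$out"
```

Because
.Fn gsub
modifies
.Va $0
here, the fields are split again before
.Va $1
is passed to
.Fn toupper .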
.Sh SEE ALSO |
.Sh SEE ALSO |
|
.Xr egrep 1 , |
.Xr lex 1 , |
.Xr lex 1 , |
.Xr sed 1 |
.Xr sed 1 , |
|
.Xr printf 3 |
.Rs |
.Rs |
.%A A. V. Aho |
.%A A. V. Aho |
.%A B. W. Kernighan |
.%A B. W. Kernighan |