Annotation of src/usr.bin/awk/awk.1, Revision 1.62
1.62 ! jsg 1: .\" $OpenBSD: awk.1,v 1.61 2021/03/08 02:47:27 jsg Exp $
1.11 jmc 2: .\"
3: .\" Copyright (C) Lucent Technologies 1997
4: .\" All Rights Reserved
1.12 jmc 5: .\"
1.11 jmc 6: .\" Permission to use, copy, modify, and distribute this software and
7: .\" its documentation for any purpose and without fee is hereby
8: .\" granted, provided that the above copyright notice appear in all
9: .\" copies and that both that the copyright notice and this
10: .\" permission notice and warranty disclaimer appear in supporting
11: .\" documentation, and that the name Lucent Technologies or any of
12: .\" its entities not be used in advertising or publicity pertaining
13: .\" to distribution of the software without specific, written prior
14: .\" permission.
1.12 jmc 15: .\"
1.11 jmc 16: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
17: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
18: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
19: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
20: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
21: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
22: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
23: .\" THIS SOFTWARE.
24: .\"
1.62 ! jsg 25: .Dd $Mdocdate: March 8 2021 $
1.7 aaron 26: .Dt AWK 1
27: .Os
28: .Sh NAME
29: .Nm awk
30: .Nd pattern-directed scanning and processing language
31: .Sh SYNOPSIS
32: .Nm awk
1.16 jmc 33: .Op Fl safe
34: .Op Fl V
35: .Op Fl d Ns Op Ar n
1.7 aaron 36: .Op Fl F Ar fs
1.38 schwarze 37: .Op Fl v Ar var Ns = Ns Ar value
1.18 jmc 38: .Op Ar prog | Fl f Ar progfile
1.7 aaron 39: .Ar
40: .Sh DESCRIPTION
41: .Nm
1.1 tholo 42: scans each input
1.7 aaron 43: .Ar file
1.1 tholo 44: for lines that match any of a set of patterns specified literally in
1.7 aaron 45: .Ar prog
1.16 jmc 46: or in one or more files specified as
1.7 aaron 47: .Fl f Ar progfile .
1.16 jmc 48: With each pattern there can be an associated action that will be performed
1.1 tholo 49: when a line of a
1.7 aaron 50: .Ar file
1.1 tholo 51: matches the pattern.
52: Each line is matched against the
53: pattern portion of every pattern-action statement;
54: the associated action is performed for each matched pattern.
1.6 aaron 55: The file name
1.16 jmc 56: .Sq -
1.1 tholo 57: means the standard input.
58: Any
1.7 aaron 59: .Ar file
1.1 tholo 60: of the form
1.16 jmc 61: .Ar var Ns = Ns Ar value
1.1 tholo 62: is treated as an assignment, not a filename,
63: and is executed at the time it would have been opened if it were a filename.
1.16 jmc 64: .Pp
65: The options are as follows:
1.20 jmc 66: .Bl -tag -width "-safe "
1.16 jmc 67: .It Fl d Ns Op Ar n
68: Debug mode.
69: Set debug level to
70: .Ar n ,
71: or 1 if
72: .Ar n
73: is not specified.
74: A value greater than 1 causes
75: .Nm
76: to dump core on fatal errors.
77: .It Fl F Ar fs
78: Define the input field separator to be the regular expression
1.7 aaron 79: .Ar fs .
1.25 jmc 80: .It Fl f Ar progfile
1.16 jmc 81: Read program code from the specified file
1.25 jmc 82: .Ar progfile
1.16 jmc 83: instead of from the command line.
84: .It Fl safe
85: Disable file output
1.17 jmc 86: .Pf ( Ic print No > ,
87: .Ic print No >> ) ,
1.7 aaron 88: process creation
89: .Po
1.17 jmc 90: .Ar cmd | Ic getline ,
1.40 jmc 91: .Ic print | ,
1.17 jmc 92: .Ic system
1.7 aaron 93: .Pc
94: and access to the environment
1.17 jmc 95: .Pf ( Va ENVIRON ;
1.18 jmc 96: see the section on variables below).
1.17 jmc 97: This is a first
1.16 jmc 98: .Pq and not very reliable
99: approximation to a
1.7 aaron 100: .Dq safe
101: version of
1.16 jmc 102: .Nm .
103: .It Fl V
104: Print the version number of
105: .Nm
106: to standard output and exit.
107: .It Fl v Ar var Ns = Ns Ar value
108: Assign
109: .Ar value
110: to variable
111: .Ar var
112: before
113: .Ar prog
114: is executed;
115: any number of
116: .Fl v
117: options may be present.
118: .El
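.Pp
As a hedged illustration of how
.Fl v
differs from a
.Ar var Ns = Ns Ar value
operand, the sketch below assumes a POSIX shell.
The variable names a, b and x are made up for the example.
.Bd -literal -offset indent
```shell
# a is assigned before BEGIN runs; b=2 is an operand that has not
# been processed when BEGIN runs, so b is still the null string.
early=$(awk -v a=1 'BEGIN { print a, (b == "") }' b=2)
echo "$early"

# x=1 is processed just before standard input (-) is opened.
late=$(echo line | awk '{ print x, $0 }' x=1 -)
echo "$late"
```
.Ed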
1.7 aaron 119: .Pp
1.18 jmc 120: The input is normally made up of input lines
121: .Pq records
122: separated by newlines, or by the value of
123: .Va RS .
124: If
125: .Va RS
126: is null, then any number of blank lines are used as the record separator,
127: and newlines are used as field separators
128: (in addition to the value of
129: .Va FS ) .
130: This is convenient when working with multi-line records.
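.Pp
A minimal sketch of the multi-line record mode described above,
assuming a POSIX shell and invented sample input.
With a null
.Va RS ,
each blank-line-separated block is one record and each of its lines
is a field.
.Bd -literal -offset indent
```shell
# Two records separated by a blank line; each record spans two lines.
records=$( (echo "alice"; echo "555-1234"; echo ""; echo "bob"; echo "555-9876") |
    awk 'BEGIN { RS = "" } { print NR ": " $1 " has " NF " fields" }')
echo "$records"
```
.Ed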
131: .Pp
1.7 aaron 132: An input line is normally made up of fields separated by whitespace,
1.55 millert 133: or by the value of the field separator
134: .Va FS
135: at the time the line is read.
1.1 tholo 136: The fields are denoted
1.7 aaron 137: .Va $1 , $2 , ... ,
138: while
139: .Va $0
1.1 tholo 140: refers to the entire line.
1.55 millert 141: .Va FS
142: may be set to either a single character or a regular expression.
1.58 jmc 143: As a special case, if
1.55 millert 144: .Va FS
145: is a single space
146: .Pq the default ,
147: fields will be split by one or more whitespace characters.
1.1 tholo 148: If
1.7 aaron 149: .Va FS
1.1 tholo 150: is null, the input line is split into one field per character.
1.7 aaron 151: .Pp
1.18 jmc 152: Normally, any number of blanks separate fields.
153: In order to set the field separator to a single blank, use the
154: .Fl F
155: option with a value of
156: .Sq [\ \&] .
157: If a field separator of
158: .Sq t
159: is specified,
160: .Nm
161: treats it as if
162: .Sq \et
163: had been specified and uses
164: .Aq TAB
165: as the field separator.
166: In order to use a literal
167: .Sq t
168: as the field separator, use the
169: .Fl F
170: option with a value of
171: .Sq [t] .
1.55 millert 172: The field separator is usually set via the
173: .Fl F
174: option or from inside a
175: .Ic BEGIN
176: block so that it takes effect before the input is read.
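.Pp
The field-splitting rules above can be sketched as follows,
assuming a POSIX shell; the input strings are invented for illustration.
.Bd -literal -offset indent
```shell
# Default FS: a run of blanks is one separator, so "a  b" has 2 fields.
default_nf=$(echo "a  b" | awk '{ print NF }')
# FS = [ ]: every single blank separates, so an empty field appears.
single_nf=$(echo "a  b" | awk -F'[ ]' '{ print NF }')
# FS assigned in BEGIN takes effect before the first line is read.
second=$(echo "x:y:z" | awk 'BEGIN { FS = ":" } { print $2 }')
echo "$default_nf $single_nf $second"
```
.Ed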
1.18 jmc 177: .Pp
1.47 millert 178: A pattern-action statement has the form:
1.7 aaron 179: .Pp
180: .D1 Ar pattern Ic \&{ Ar action Ic \&}
181: .Pp
1.6 aaron 182: A missing
1.7 aaron 183: .Ic \&{ Ar action Ic \&}
1.1 tholo 184: means print the line;
185: a missing pattern always matches.
186: Pattern-action statements are separated by newlines or semicolons.
1.7 aaron 187: .Pp
1.18 jmc 188: Newlines are permitted after a terminating statement or following a comma
189: .Pq Sq ,\& ,
190: an open brace
191: .Pq Sq { ,
192: a logical AND
193: .Pq Sq && ,
194: a logical OR
195: .Pq Sq || ,
196: after the
197: .Sq do
198: or
199: .Sq else
200: keywords,
201: or after the closing parenthesis of an
202: .Sq if ,
203: .Sq for ,
204: or
205: .Sq while
206: statement.
207: Additionally, a backslash
208: .Pq Sq \e
209: can be used to escape a newline between tokens.
210: .Pp
1.1 tholo 211: An action is a sequence of statements.
212: A statement can be one of the following:
1.35 jmc 213: .Pp
214: .Bl -tag -width Ds -offset indent -compact
1.43 schwarze 215: .It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
216: .It Ic while Ar ( expression ) Ar statement
217: .It Ic for Ar ( expression ; expression ; expression ) statement
218: .It Ic for Ar ( var Ic in Ar array ) statement
219: .It Ic do Ar statement Ic while Ar ( expression )
1.35 jmc 220: .It Ic break
221: .It Ic continue
222: .It Xo Ic {
223: .Op Ar statement ...
224: .Ic }
225: .Xc
226: .It Xo Ar expression
227: .No # commonly
228: .Ar var No = Ar expression
1.7 aaron 229: .Xc
1.35 jmc 230: .It Xo Ic print
1.7 aaron 231: .Op Ar expression-list
1.17 jmc 232: .Op > Ns Ar expression
1.7 aaron 233: .Xc
1.35 jmc 234: .It Xo Ic printf Ar format
1.7 aaron 235: .Op Ar ... , expression-list
1.17 jmc 236: .Op > Ns Ar expression
1.7 aaron 237: .Xc
1.35 jmc 238: .It Ic return Op Ar expression
239: .It Xo Ic next
240: .No # skip remaining patterns on this input line
241: .Xc
242: .It Xo Ic nextfile
243: .No # skip rest of this file, open next, start at top
244: .Xc
245: .It Xo Ic delete
246: .Sm off
247: .Ar array Ic \&[ Ar expression Ic \&]
248: .Sm on
249: .No # delete an array element
1.7 aaron 250: .Xc
1.35 jmc 251: .It Xo Ic delete Ar array
252: .No # delete all elements of array
1.7 aaron 253: .Xc
1.35 jmc 254: .It Xo Ic exit
1.7 aaron 255: .Op Ar expression
1.46 deraadt 256: .No # exit processing, and perform
257: .Ic END
258: processing; status is
259: .Ar expression
1.7 aaron 260: .Xc
1.35 jmc 261: .El
1.7 aaron 262: .Pp
1.1 tholo 263: Statements are terminated by
264: semicolons, newlines or right braces.
265: An empty
1.7 aaron 266: .Ar expression-list
1.1 tholo 267: stands for
1.7 aaron 268: .Ar $0 .
269: String constants are quoted
270: .Li \&"" ,
1.20 jmc 271: with the usual C escapes recognized within
272: (see
273: .Xr printf 1
274: for a complete list of these).
1.1 tholo 275: Expressions take on string or numeric values as appropriate,
276: and are built using the operators
1.7 aaron 277: .Ic + \- * / % ^
1.20 jmc 278: .Pq exponentiation ,
279: and concatenation
280: .Pq indicated by whitespace .
1.1 tholo 281: The operators
1.16 jmc 282: .Ic \&! ++ \-\- += \-= *= /= %= ^=
1.59 millert 283: .Ic > >= < <= == != ?\&:
1.1 tholo 284: are also available in expressions.
285: Variables may be scalars, array elements
286: (denoted
1.7 aaron 287: .Li x[i] )
1.1 tholo 288: or fields.
289: Variables are initialized to the null string.
290: Array subscripts may be any string,
291: not necessarily numeric;
292: this allows for a form of associative memory.
293: Multiple subscripts such as
1.7 aaron 294: .Li [i,j,k]
1.1 tholo 295: are permitted; the constituents are concatenated,
296: separated by the value of
1.17 jmc 297: .Va SUBSEP
1.31 deraadt 298: .Pq see the section on variables below .
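.Pp
As a small sketch of multiple subscripts, assuming a POSIX shell:
the array name cell and its keys are hypothetical.
The subscripts are joined with
.Va SUBSEP
into a single string key, which
.Fn split
can take apart again.
.Bd -literal -offset indent
```shell
pairs=$(awk 'BEGIN {
    cell["r1", "c2"] = 42                 # key is "r1" SUBSEP "c2"
    for (k in cell) {
        split(k, part, SUBSEP)            # recover the two subscripts
        print part[1], part[2], cell[k]
    }
    print (("r1", "c2") in cell), (("r9", "c9") in cell)
}')
echo "$pairs"
```
.Ed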
1.7 aaron 299: .Pp
1.1 tholo 300: The
1.7 aaron 301: .Ic print
1.1 tholo 302: statement prints its arguments on the standard output
303: (or on a file if
1.47 millert 304: .Pf >\ \& Ar file
1.1 tholo 305: or
1.47 millert 306: .Pf >>\ \& Ar file
1.1 tholo 307: is present or on a pipe if
1.17 jmc 308: .Pf |\ \& Ar cmd
1.1 tholo 309: is present), separated by the current output field separator,
310: and terminated by the output record separator.
1.7 aaron 311: .Ar file
1.1 tholo 312: and
1.7 aaron 313: .Ar cmd
1.1 tholo 314: may be literal names or parenthesized expressions;
315: identical string values in different statements denote
316: the same open file.
317: The
1.7 aaron 318: .Ic printf
1.47 millert 319: statement formats its expression list according to the
320: .Ar format
1.1 tholo 321: (see
1.28 jmc 322: .Xr printf 1 ) .
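.Pp
A hedged sketch of output redirection to a pipe, assuming a POSIX shell and
.Xr sort 1
in the path.
Both
.Ic print
statements use the identical command string, so they write to the same pipe.
.Bd -literal -offset indent
```shell
sorted=$(awk 'BEGIN {
    print "pear"  | "sort"      # opens the pipe to sort
    print "apple" | "sort"      # same string, same open pipe
    close("sort")               # flush and wait for sort to finish
    printf "%s=%d%s", "n", 2, ORS
}')
echo "$sorted"
```
.Ed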
1.18 jmc 323: .Pp
324: Patterns are arbitrary Boolean combinations
325: (with
326: .Ic "\&! || &&" )
327: of regular expressions and
328: relational expressions.
1.22 jmc 329: .Nm
330: supports extended regular expressions
331: .Pq EREs .
332: See
333: .Xr re_format 7
334: for more information on regular expressions.
1.18 jmc 335: Isolated regular expressions
336: in a pattern apply to the entire line.
337: Regular expressions may also occur in
338: relational expressions, using the operators
339: .Ic ~
340: and
341: .Ic !~ .
1.44 schwarze 342: .Pf / Ar re Ns /
1.18 jmc 343: is a constant regular expression;
344: any string (constant or variable) may be used
345: as a regular expression, except in the position of an isolated regular expression
346: in a pattern.
347: .Pp
348: A pattern may consist of two patterns separated by a comma;
349: in this case, the action is performed for all lines
350: from an occurrence of the first pattern
351: through an occurrence of the second.
352: .Pp
353: A relational expression is one of the following:
1.35 jmc 354: .Pp
355: .Bl -tag -width Ds -offset indent -compact
356: .It Ar expression matchop regular-expression
357: .It Ar expression relop expression
358: .It Ar expression Ic in Ar array-name
359: .It Xo Ic \&( Ns
1.18 jmc 360: .Ar expr , expr , \&... Ns Ic \&) in
1.35 jmc 361: .Ar array-name
1.18 jmc 362: .Xc
1.35 jmc 363: .El
1.18 jmc 364: .Pp
365: where a
366: .Ar relop
367: is any of the six relational operators in C, and a
368: .Ar matchop
369: is either
370: .Ic ~
371: (matches)
372: or
373: .Ic !~
374: (does not match).
375: A conditional is an arithmetic expression,
376: a relational expression,
377: or a Boolean combination
378: of these.
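.Pp
The pattern forms above can be sketched like this,
assuming a POSIX shell and made-up input lines.
The first pattern combines a match operator with a relational operator;
the second uses the negated match operator.
.Bd -literal -offset indent
```shell
matches=$( (echo "alpha 7"; echo "beta 42"; echo "gamma x") |
    awk '$2 ~ /^[0-9]+$/ && $2 > 10 { print $1 }
         $2 !~ /^[0-9]+$/           { print $1, "is not numeric" }')
echo "$matches"
```
.Ed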
379: .Pp
1.46 deraadt 380: The special pattern
1.18 jmc 381: .Ic BEGIN
1.46 deraadt 382: may be used to capture control before the first input line is read.
383: The special pattern
1.18 jmc 384: .Ic END
1.46 deraadt 385: may be used to capture control after processing is finished.
1.18 jmc 386: .Ic BEGIN
387: and
388: .Ic END
389: do not combine with other patterns.
1.47 millert 390: They may appear multiple times in a program and execute
391: in the order they are read by
392: .Nm .
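.Pp
A minimal sketch of the ordering rule, assuming a POSIX shell:
both
.Ic BEGIN
actions run, in program order, before any input is read, and
.Ic END
runs after the last record regardless of where it appears.
.Bd -literal -offset indent
```shell
order=$(echo "data" | awk '
BEGIN { print "first BEGIN" }
END   { print "END after", NR, "record" }
BEGIN { print "second BEGIN" }
      { print "body:", $0 }')
echo "$order"
```
.Ed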
1.18 jmc 393: .Pp
394: Variable names with special meanings:
395: .Pp
1.20 jmc 396: .Bl -tag -width "FILENAME " -compact
1.18 jmc 397: .It Va ARGC
398: Argument count, assignable.
399: .It Va ARGV
400: Argument array, assignable;
401: non-null members are taken as filenames.
402: .It Va CONVFMT
403: Conversion format when converting numbers
404: (default
405: .Qq Li %.6g ) .
406: .It Va ENVIRON
407: Array of environment variables; subscripts are names.
408: .It Va FILENAME
409: The name of the current input file.
410: .It Va FNR
411: Ordinal number of the current record in the current file.
412: .It Va FS
1.55 millert 413: Regular expression used to separate fields (default whitespace);
414: also settable by option
415: .Fl F Ar fs .
1.18 jmc 416: .It Va NF
417: Number of fields in the current record.
418: .Va $NF
419: can be used to obtain the value of the last field in the current record.
420: .It Va NR
421: Ordinal number of the current record.
422: .It Va OFMT
423: Output format for numbers (default
424: .Qq Li %.6g ) .
425: .It Va OFS
426: Output field separator (default blank).
427: .It Va ORS
428: Output record separator (default newline).
429: .It Va RLENGTH
430: The length of the string matched by the
431: .Fn match
432: function.
433: .It Va RS
434: Input record separator (default newline).
1.49 millert 435: If empty, blank lines separate records.
436: If more than one character long,
437: .Va RS
438: is treated as a regular expression, and records are
439: separated by text matching the expression.
1.18 jmc 440: .It Va RSTART
441: The starting position of the string matched by the
442: .Fn match
443: function.
444: .It Va SUBSEP
445: Separates multiple subscripts (default 034).
446: .El
1.17 jmc 447: .Sh FUNCTIONS
448: The awk language has a variety of built-in functions:
1.30 jmc 449: arithmetic, string, input/output, general, and bit-operation.
450: .Pp
451: Functions may be defined (at the position of a pattern-action statement)
452: as follows:
453: .Pp
454: .Dl function foo(a, b, c) { ...; return x }
455: .Pp
456: Parameters are passed by value if scalar, and by reference if array name;
457: functions may be called recursively.
458: Parameters are local to the function; all other variables are global.
459: Thus local variables may be created by providing excess parameters in
460: the function definition.
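.Pp
The parameter rules above might be exercised as in this sketch,
which assumes a POSIX shell.
The function name fill and the variable names are hypothetical.
The extra parameter i is never passed by the caller,
so it serves as a local variable.
.Bd -literal -offset indent
```shell
result=$(awk '
function fill(arr, n,    i) {       # i: excess parameter used as a local
    for (i = 1; i <= n; i++)
        arr[i] = i * i              # visible to the caller (by reference)
    n = 0                           # not visible to the caller (by value)
}
BEGIN {
    n = 3
    fill(sq, n)
    print n, sq[3], (i == "")       # the global i was never touched
}')
echo "$result"
```
.Ed
.Pp
Separating the real parameters from the locals with extra spaces
is a common convention, not a requirement of the language.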
1.17 jmc 461: .Ss Arithmetic Functions
462: .Bl -tag -width "atan2(y, x)"
463: .It Fn atan2 y x
464: Return the arctangent of
465: .Fa y Ns / Ns Fa x
466: in radians.
467: .It Fn cos x
468: Return the cosine of
469: .Fa x ,
470: where
471: .Fa x
472: is in radians.
473: .It Fn exp x
474: Return the exponential of
475: .Fa x .
476: .It Fn int x
477: Return
478: .Fa x
479: truncated to an integer value.
480: .It Fn log x
481: Return the natural logarithm of
482: .Fa x .
1.7 aaron 483: .It Fn rand
1.17 jmc 484: Return a random number,
485: .Fa n ,
486: such that
487: .Sm off
488: .Pf 0 \*(Le Fa n No \*(Lt 1 .
489: .Sm on
1.53 tim 490: Random numbers are non-deterministic unless a seed is explicitly set with
491: .Fn srand .
1.17 jmc 492: .It Fn sin x
493: Return the sine of
494: .Fa x ,
495: where
496: .Fa x
497: is in radians.
498: .It Fn sqrt x
499: Return the square root of
500: .Fa x .
501: .It Fn srand expr
1.16 jmc 502: Sets seed for
1.7 aaron 503: .Fn rand
1.17 jmc 504: to
505: .Fa expr
1.1 tholo 506: and returns the previous seed.
1.17 jmc 507: If
508: .Fa expr
1.53 tim 509: is omitted,
510: .Fn rand
511: will return non-deterministic random numbers.
1.17 jmc 512: .El
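.Pp
A brief sketch of a few arithmetic functions, assuming a POSIX shell.
The seed value 42 is arbitrary.
Seeding twice with the same value should reproduce the same first draw.
.Bd -literal -offset indent
```shell
arith=$(awk 'BEGIN {
    srand(42); a = rand()           # seed explicitly, draw once
    srand(42); b = rand()           # same seed, same first draw
    print (a == b), (a >= 0 && a < 1), int(-3.7), int(sqrt(17))
}')
echo "$arith"
```
.Ed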
513: .Ss String Functions
514: .Bl -tag -width "split(s, a, fs)"
1.52 millert 515: .It Fn gensub r s h [t]
516: Search the target string
517: .Ar t
518: for matches of the regular expression
519: .Ar r .
520: If
521: .Ar h
522: is a string beginning with
523: .Ic g
524: or
525: .Ic G ,
526: then replace all matches of
527: .Ar r
528: with
529: .Ar s .
530: Otherwise,
531: .Ar h
532: is a number indicating which match of
533: .Ar r
534: to replace.
535: If no
536: .Ar t
537: is supplied,
538: .Va $0
539: is used instead.
540: .\"Within the replacement text
541: .\".Ar s ,
542: .\"the sequence
543: .\".Ar \en ,
544: .\"where
545: .\".Ar n
546: .\"is a digit from 1 to 9, may be used to indicate just the text that
547: .\"matched the
548: .\".Ar n Ap th
549: .\"parenthesized subexpression.
550: .\"The sequence
551: .\".Ic \e0
552: .\"represents the entire text, as does the character
553: .\".Ic & .
554: Unlike
555: .Fn sub
556: and
557: .Fn gsub ,
558: the modified string is returned as the result of the function,
559: and the original target is
560: .Em not
561: changed.
562: Note that
563: .Ar \en
564: sequences within the replacement string
565: .Ar s ,
566: as supported by GNU
567: .Nm ,
568: are
569: .Em not
570: supported at this time.
1.17 jmc 571: .It Fn gsub r t s
572: The same as
573: .Fn sub
574: except that all occurrences of the regular expression are replaced.
575: .Fn gsub
576: returns the number of replacements.
1.7 aaron 577: .It Fn index s t
1.16 jmc 578: The position in
1.7 aaron 579: .Fa s
1.1 tholo 580: where the string
1.7 aaron 581: .Fa t
1.1 tholo 582: occurs, or 0 if it does not.
1.17 jmc 583: .It Fn length s
584: The length of
585: .Fa s
586: taken as a string,
1.47 millert 587: number of elements in an array for an array argument,
588: or length of
1.17 jmc 589: .Va $0
590: if no argument is given.
1.7 aaron 591: .It Fn match s r
1.16 jmc 592: The position in
1.7 aaron 593: .Fa s
1.1 tholo 594: where the regular expression
1.7 aaron 595: .Fa r
1.1 tholo 596: occurs, or 0 if it does not.
1.17 jmc 597: The variable
1.7 aaron 598: .Va RSTART
1.17 jmc 599: is set to the starting position of the matched string
600: .Pq which is the same as the returned value
601: or zero if no match is found.
602: The variable
1.7 aaron 603: .Va RLENGTH
1.17 jmc 604: is set to the length of the matched string,
605: or \-1 if no match is found.
1.7 aaron 606: .It Fn split s a fs
1.16 jmc 607: Splits the string
1.7 aaron 608: .Fa s
1.1 tholo 609: into array elements
1.7 aaron 610: .Va a[1] , a[2] , ... , a[n]
1.1 tholo 611: and returns
1.7 aaron 612: .Va n .
1.1 tholo 613: The separation is done with the regular expression
1.7 aaron 614: .Ar fs
1.1 tholo 615: or with the field separator
1.7 aaron 616: .Va FS
1.1 tholo 617: if
1.7 aaron 618: .Ar fs
1.1 tholo 619: is not given.
620: An empty string as field separator splits the string
621: into one array element per character.
1.17 jmc 622: .It Fn sprintf fmt expr ...
623: The string resulting from formatting
624: .Fa expr , ...
625: according to the
1.28 jmc 626: .Xr printf 1
1.17 jmc 627: format
628: .Fa fmt .
1.7 aaron 629: .It Fn sub r t s
1.16 jmc 630: Substitutes
1.7 aaron 631: .Fa t
1.1 tholo 632: for the first occurrence of the regular expression
1.7 aaron 633: .Fa r
1.1 tholo 634: in the string
1.7 aaron 635: .Fa s .
1.1 tholo 636: If
1.7 aaron 637: .Fa s
1.1 tholo 638: is not given,
1.7 aaron 639: .Va $0
1.1 tholo 640: is used.
1.17 jmc 641: An ampersand
642: .Pq Sq &
643: in
644: .Fa t
645: is replaced with the text in
646: .Fa s
647: matched by regular expression
648: .Fa r .
649: A literal ampersand can be specified by preceding it with two backslashes
650: .Pq Sq \e\e .
651: A literal backslash can be specified by preceding it with another backslash
652: .Pq Sq \e\e .
1.7 aaron 653: .Fn sub
1.17 jmc 654: returns the number of replacements.
655: .It Fn substr s m n
656: Return at most the
657: .Fa n Ns -character
658: substring of
659: .Fa s
660: that begins at position
661: .Fa m
662: counted from 1.
663: If
664: .Fa n
665: is omitted, or if
666: .Fa n
667: specifies more characters than are left in the string,
668: the length of the substring is limited by the length of
669: .Fa s .
1.7 aaron 670: .It Fn tolower str
1.16 jmc 671: Returns a copy of
1.7 aaron 672: .Fa str
1.1 tholo 673: with all upper-case characters translated to their
674: corresponding lower-case equivalents.
1.7 aaron 675: .It Fn toupper str
1.16 jmc 676: Returns a copy of
1.7 aaron 677: .Fa str
1.1 tholo 678: with all lower-case characters translated to their
679: corresponding upper-case equivalents.
1.7 aaron 680: .El
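.Pp
Several of the string functions above are combined in this sketch,
which assumes a POSIX shell and an invented input line.
It avoids
.Fn gensub ,
which is an extension.
.Bd -literal -offset indent
```shell
strings=$(echo "hello world" | awk '{
    n = gsub(/o/, "0")              # edits $0 in place, returns the count
    print n, $0
    pos = match($0, /w.*d/)         # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH
    s = "world"
    sub(/or/, "[&]", s)             # & stands for the matched text
    print s, substr($0, 1, 5), index($0, "w"), toupper($1)
    print split("a:b:c", part, ":"), part[2], length(part[3])
}')
echo "$strings"
```
.Ed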
1.52 millert 681: .Ss Time Functions
682: This version of
683: .Nm
684: provides the following functions for obtaining and formatting time
685: stamps.
686: .Bl -tag -width indent
1.57 millert 687: .It Fn mktime datespec
688: Converts
689: .Fa datespec
690: into a timestamp in the same form as a value returned by
691: .Fn systime .
692: The
693: .Fa datespec
694: is a string composed of six or seven numbers separated by whitespace:
695: .Bd -literal -offset indent
696: YYYY MM DD HH MM SS [DST]
697: .Ed
698: .Pp
699: The fields in
700: .Fa datespec
701: are as follows:
702: .Bl -tag -width "YYYY"
1.60 millert 703: .It YYYY
1.57 millert 704: Year: a four-digit year, including the century.
705: .It MM
706: Month: a number from 1 to 12.
707: .It DD
708: Day: a number from 1 to 31.
709: .It HH
710: Hour: a number from 0 to 23.
711: .It MM
712: Minute: a number from 0 to 59.
713: .It SS
714: Second: a number from 0 to 60 (permitting a leap second).
715: .It DST
716: Daylight Saving Time: a positive value indicates that DST is in effect;
717: a zero value indicates that it is not.
718: If DST is not specified, or is negative,
719: .Fn mktime
720: will attempt to determine the correct value.
721: .El
1.52 millert 722: .It Fn strftime "[format [, timestamp]]"
723: Formats
724: .Ar timestamp
725: according to the string
726: .Ar format .
727: The format string may contain any of the conversion specifications described
728: in the
729: .Xr strftime 3
730: manual page, as well as any arbitrary text.
731: The
732: .Ar timestamp
733: must be in the same form as a value returned by
1.57 millert 734: .Fn mktime
735: and
1.52 millert 736: .Fn systime .
737: If
738: .Ar timestamp
739: is not specified, the current time is used.
740: If
741: .Ar format
742: is not specified, a default format equivalent to the output of
743: .Xr date 1
744: is used.
745: .It Fn systime
746: Returns the value of time in seconds since 0 hours, 0 minutes,
747: 0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
748: .El
1.17 jmc 749: .Ss Input/Output and General Functions
750: .Bl -tag -width "getline [var] < file"
751: .It Fn close expr
752: Closes the file or pipe
753: .Fa expr .
754: .Fa expr
755: should match the string that was used to open the file or pipe.
756: .It Ar cmd | Ic getline Op Va var
757: Read a record of input from a stream piped from the output of
758: .Ar cmd .
759: If
760: .Va var
761: is omitted, the variables
762: .Va $0
763: and
764: .Va NF
765: are set.
766: Otherwise
767: .Va var
768: is set.
769: If the stream is not open, it is opened.
770: As long as the stream remains open, subsequent calls
771: will read subsequent records from the stream.
772: The stream remains open until explicitly closed with a call to
773: .Fn close .
1.24 jmc 774: .Ic getline
775: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
776: .It Fn fflush [expr]
1.39 jmc 777: Flushes any buffered output for the file or pipe
1.24 jmc 778: .Fa expr ,
779: or all open files or pipes if
780: .Fa expr
781: is omitted.
1.17 jmc 782: .Fa expr
783: should match the string that was used to open the file or pipe.
784: .It Ic getline
785: Sets
786: .Va $0
787: to the next input record from the current input file.
788: This form of
789: .Ic getline
790: sets the variables
791: .Va NF ,
792: .Va NR ,
793: and
794: .Va FNR .
1.7 aaron 795: .Ic getline
1.17 jmc 796: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
797: .It Ic getline Va var
798: Sets variable
1.7 aaron 799: .Va var
1.17 jmc 800: to the next input record
801: from the current input file.
802: This form of
803: .Ic getline
804: sets the variables
805: .Va NR
806: and
807: .Va FNR .
808: .Ic getline
809: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
810: .It Xo
811: .Ic getline Op Va var
1.47 millert 812: .Pf <\ \& Ar file
1.17 jmc 813: .Xc
814: Sets
1.7 aaron 815: .Va $0
1.1 tholo 816: to the next record from
1.7 aaron 817: .Ar file .
1.17 jmc 818: If
819: .Va var
820: is omitted, the variables
821: .Va $0
822: and
823: .Va NF
824: are set.
825: Otherwise
826: .Va var
827: is set.
828: If
829: .Ar file
830: is not open, it is opened.
831: As long as the stream remains open, subsequent calls will read subsequent
832: records from
833: .Ar file .
834: .Ar file
835: remains open until explicitly closed with a call to
836: .Fn close .
837: .It Fn system cmd
838: Executes
839: .Fa cmd
840: and returns its exit status.
1.47 millert 841: This will be \-1 upon error,
842: .Ar cmd Ns 's
843: exit status upon a normal exit,
844: 256 +
845: .Em sig
846: if
847: .Fa cmd
848: was terminated by a signal, where
849: .Em sig
850: is the number of the signal,
851: or 512 +
852: .Em sig
853: if there was a core dump.
1.17 jmc 854: .El
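.Pp
A hedged sketch of reading from a command and checking an exit status,
assuming a POSIX shell.
The command strings are invented for the example, and none of this works under
.Fl safe .
.Bd -literal -offset indent
```shell
io=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)    # 1 per record, 0 at end of file
        n++
    close(cmd)                          # same string that opened the pipe
    status = system("exit 3")           # exit status of the command
    print n, line, status
}')
echo "$io"
```
.Ed
.Pp
Comparing the result of
.Ic getline
with 0 keeps the loop from spinning forever if an error returns \-1.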
1.30 jmc 855: .Ss Bit-Operation Functions
1.29 pyr 856: .Bl -tag -width "lshift(a, b)"
857: .It Fn compl x
858: Returns the bitwise complement of integer argument x.
859: .It Fn and x y
1.30 jmc 860: Performs a bitwise AND on integer arguments x and y.
1.29 pyr 861: .It Fn or x y
1.30 jmc 862: Performs a bitwise OR on integer arguments x and y.
1.29 pyr 863: .It Fn xor x y
1.30 jmc 864: Performs a bitwise Exclusive-OR on integer arguments x and y.
1.29 pyr 865: .It Fn lshift x n
1.39 jmc 866: Returns integer argument x shifted by n bits to the left.
1.29 pyr 867: .It Fn rshift x n
1.39 jmc 868: Returns integer argument x shifted by n bits to the right.
1.29 pyr 869: .El
1.50 millert 870: .Sh ENVIRONMENT
871: The following environment variables affect the execution of
872: .Nm :
873: .Bl -tag -width POSIXLY_CORRECT
874: .It Ev POSIXLY_CORRECT
875: When set, behave in accordance with the standard, even when it conflicts
876: with historical behavior.
877: .El
1.37 jmc 878: .Sh EXIT STATUS
879: .Ex -std awk
880: .Pp
881: But note that the
882: .Ic exit
883: expression can modify the exit status.
1.7 aaron 884: .Sh EXAMPLES
1.16 jmc 885: Print lines longer than 72 characters:
886: .Pp
1.7 aaron 887: .Dl length($0) > 72
1.16 jmc 888: .Pp
889: Print first two fields in opposite order:
1.7 aaron 890: .Pp
891: .Dl { print $2, $1 }
1.16 jmc 892: .Pp
1.47 millert 893: Same, with input fields separated by comma and/or spaces and tabs:
1.7 aaron 894: .Bd -literal -offset indent
1.1 tholo 895: BEGIN { FS = ",[ \et]*|[ \et]+" }
896: { print $2, $1 }
1.7 aaron 897: .Ed
1.16 jmc 898: .Pp
899: Add up first column, print sum and average:
1.7 aaron 900: .Bd -literal -offset indent
901: { s += $1 }
902: END { print "sum is", s, " average is", s/NR }
903: .Ed
1.16 jmc 904: .Pp
905: Print all lines between start/stop pairs:
1.7 aaron 906: .Pp
907: .Dl /start/, /stop/
1.16 jmc 908: .Pp
1.45 naddy 909: Simulate
910: .Xr echo 1 :
1.7 aaron 911: .Bd -literal -offset indent
912: BEGIN { # Simulate echo(1)
913: for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
914: printf "\en"
915: exit }
1.19 jmc 916: .Ed
917: .Pp
918: Print an error message to standard error:
919: .Bd -literal -offset indent
920: { print "error!" > "/dev/stderr" }
1.7 aaron 921: .Ed
1.59 millert 922: .Sh UNUSUAL FLOATING-POINT VALUES
923: .Nm
924: was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
925: and Infinity values, which are supported by all modern floating-point
926: hardware.
927: .Pp
928: Because
929: .Nm
930: uses
931: .Xr strtod 3
932: and
933: .Xr atof 3
934: to convert string values to double-precision floating-point values,
935: modern C libraries also convert strings starting with
936: .Dv inf
937: and
938: .Dv nan
939: into infinity and NaN values respectively.
940: This led to strange results,
941: with something like this:
942: .Pp
943: .Li echo nancy | awk '{ print $1 + 0 }'
944: .Pp
945: printing
946: .Dv nan
947: instead of zero.
948: .Pp
949: .Nm
950: now follows GNU
951: .Nm ,
952: and prefilters string values before attempting
953: to convert them to numbers, as follows:
954: .Bl -tag -width Ds
955: .It Hexadecimal values
956: Hexadecimal values (allowed since C99) convert to zero, as they did
957: prior to C99.
958: .It NaN values
959: The two strings
960: .Dq +NAN
961: and
962: .Dq -NAN
963: (case independent) convert to NaN.
964: No others do.
965: (NaNs can have signs.)
966: .It Infinity values
967: The two strings
968: .Dq +INF
969: and
970: .Dq -INF
971: (case independent) convert to positive and negative infinity, respectively.
972: No others do.
973: .El
1.7 aaron 974: .Sh SEE ALSO
1.42 tedu 975: .Xr cut 1 ,
1.52 millert 976: .Xr date 1 ,
1.47 millert 977: .Xr grep 1 ,
1.7 aaron 978: .Xr lex 1 ,
1.20 jmc 979: .Xr printf 1 ,
1.16 jmc 980: .Xr sed 1 ,
1.52 millert 981: .Xr strftime 3 ,
1.23 jmc 982: .Xr re_format 7 ,
983: .Xr script 7
1.61 jsg 984: .Rs
985: .\" 4.4BSD USD:16
1.62 ! jsg 986: .\".%R Computing Science Technical Report
! 987: .\".%N 68
! 988: .\".%D July 1978
1.61 jsg 989: .%A A. V. Aho
990: .%A P. J. Weinberger
991: .%A B. W. Kernighan
992: .%T AWK \(em A Pattern Scanning and Processing Language
1.62 ! jsg 993: .%J Software \(em Practice and Experience
! 994: .%V 9:4
! 995: .%P pp. 267-279
! 996: .%D April 1979
1.61 jsg 997: .Re
1.7 aaron 998: .Rs
999: .%A A. V. Aho
1000: .%A B. W. Kernighan
1001: .%A P. J. Weinberger
1002: .%T The AWK Programming Language
1003: .%I Addison-Wesley
1004: .%D 1988
1005: .%O ISBN 0-201-07981-X
1006: .Re
1.26 jmc 1007: .Sh STANDARDS
1008: The
1009: .Nm
1010: utility is compliant with the
1.33 jmc 1011: .St -p1003.1-2008
1.50 millert 1012: specification except that consecutive backslashes in the replacement
1013: string argument for
1014: .Fn sub
1015: and
1016: .Fn gsub
1.51 millert 1017: are not collapsed and a slash
1018: .Pq Ql /
1019: does not need to be escaped in a bracket expression.
1.53 tim 1020: Also, the behaviour of
1021: .Fn rand
1022: and
1023: .Fn srand
1024: has been changed to support non-deterministic random numbers.
1.26 jmc 1025: .Pp
1026: The flags
1027: .Op Fl \&dV
1028: and
1029: .Op Fl safe ,
1.56 millert 1030: support for regular expressions in
1031: .Va RS ,
1.52 millert 1032: as well as the functions
1033: .Fn fflush ,
1034: .Fn gensub ,
1035: .Fn compl ,
1036: .Fn and ,
1037: .Fn or ,
1038: .Fn xor ,
1039: .Fn lshift ,
1040: .Fn rshift ,
1.57 millert 1041: .Fn mktime ,
1.52 millert 1042: .Fn strftime
1043: and
1044: .Fn systime
1.26 jmc 1045: are extensions to that specification.
1.8 aaron 1046: .Sh HISTORY
1.13 millert 1047: An
1.8 aaron 1048: .Nm
1.13 millert 1049: utility appeared in
1050: .At v7 .
1.7 aaron 1051: .Sh BUGS
1.1 tholo 1052: There are no explicit conversions between numbers and strings.
1053: To force an expression to be treated as a number add 0 to it;
1054: to force it to be treated as a string concatenate
1.7 aaron 1055: .Li \&""
1056: to it.
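.Pp
The two idioms can be sketched as follows, assuming a POSIX shell;
the variable names are arbitrary.
Adding 0 forces a numeric comparison and
concatenating the null string forces a string comparison.
.Bd -literal -offset indent
```shell
conv=$(awk 'BEGIN {
    x = "10"; y = "9"                   # string constants
    print (x < y), (x + 0 < y + 0)      # string order, then numeric order
    a = 10; b = 9                       # numbers
    print (a < b), ((a "") < (b ""))    # numeric order, then string order
}')
echo "$conv"
```
.Ed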
1057: .Pp
1.1 tholo 1058: The scope rules for variables in functions are a botch;
1059: the syntax is worse.
1.47 millert 1060: .Pp
1061: Only eight-bit character sets are handled correctly.