Annotation of src/usr.bin/awk/awk.1, Revision 1.65
1.65 ! millert 1: .\" $OpenBSD: awk.1,v 1.64 2023/09/15 15:07:08 jsg Exp $
1.11 jmc 2: .\"
3: .\" Copyright (C) Lucent Technologies 1997
4: .\" All Rights Reserved
1.12 jmc 5: .\"
1.11 jmc 6: .\" Permission to use, copy, modify, and distribute this software and
7: .\" its documentation for any purpose and without fee is hereby
8: .\" granted, provided that the above copyright notice appear in all
9: .\" copies and that both that the copyright notice and this
10: .\" permission notice and warranty disclaimer appear in supporting
11: .\" documentation, and that the name Lucent Technologies or any of
12: .\" its entities not be used in advertising or publicity pertaining
13: .\" to distribution of the software without specific, written prior
14: .\" permission.
1.12 jmc 15: .\"
1.11 jmc 16: .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
17: .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
18: .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
19: .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
20: .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
21: .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
22: .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
23: .\" THIS SOFTWARE.
24: .\"
1.65 ! millert 25: .Dd $Mdocdate: September 15 2023 $
1.7 aaron 26: .Dt AWK 1
27: .Os
28: .Sh NAME
29: .Nm awk
30: .Nd pattern-directed scanning and processing language
31: .Sh SYNOPSIS
32: .Nm awk
1.16 jmc 33: .Op Fl safe
34: .Op Fl V
35: .Op Fl d Ns Op Ar n
1.65 ! millert 36: .Op Fl F Ar fs | Fl -csv
1.38 schwarze 37: .Op Fl v Ar var Ns = Ns Ar value
1.18 jmc 38: .Op Ar prog | Fl f Ar progfile
1.7 aaron 39: .Ar
40: .Sh DESCRIPTION
41: .Nm
1.1 tholo 42: scans each input
1.7 aaron 43: .Ar file
1.1 tholo 44: for lines that match any of a set of patterns specified literally in
1.7 aaron 45: .Ar prog
1.16 jmc 46: or in one or more files specified as
1.7 aaron 47: .Fl f Ar progfile .
1.16 jmc 48: With each pattern there can be an associated action that will be performed
1.1 tholo 49: when a line of a
1.7 aaron 50: .Ar file
1.1 tholo 51: matches the pattern.
52: Each line is matched against the
53: pattern portion of every pattern-action statement;
54: the associated action is performed for each matched pattern.
1.6 aaron 55: The file name
1.16 jmc 56: .Sq -
1.1 tholo 57: means the standard input.
58: Any
1.7 aaron 59: .Ar file
1.1 tholo 60: of the form
1.16 jmc 61: .Ar var Ns = Ns Ar value
1.1 tholo 62: is treated as an assignment, not a filename,
63: and is executed at the time it would have been opened if it were a filename.
1.16 jmc 64: .Pp
65: The options are as follows:
1.20 jmc 66: .Bl -tag -width "-safe "
1.65 ! millert 67: .It Fl -csv
! 68: Process records using the (more or less) standard comma-separated values
! 69: .Pq CSV
! 70: format instead of the input field separator.
! 71: When the
! 72: .Fl -csv
! 73: option is specified, attempts to change the input field separator
! 74: or record separator are ignored.
1.16 jmc 75: .It Fl d Ns Op Ar n
76: Debug mode.
77: Set debug level to
78: .Ar n ,
79: or 1 if
80: .Ar n
81: is not specified.
82: A value greater than 1 causes
83: .Nm
84: to dump core on fatal errors.
85: .It Fl F Ar fs
86: Define the input field separator to be the regular expression
1.7 aaron 87: .Ar fs .
1.25 jmc 88: .It Fl f Ar progfile
1.16 jmc 89: Read program code from the specified file
1.25 jmc 90: .Ar progfile
1.16 jmc 91: instead of from the command line.
92: .It Fl safe
93: Disable file output
1.17 jmc 94: .Pf ( Ic print No > ,
95: .Ic print No >> ) ,
1.7 aaron 96: process creation
97: .Po
1.17 jmc 98: .Ar cmd | Ic getline ,
1.40 jmc 99: .Ic print | ,
1.17 jmc 100: .Ic system
1.7 aaron 101: .Pc
102: and access to the environment
1.17 jmc 103: .Pf ( Va ENVIRON ;
1.18 jmc 104: see the section on variables below).
1.17 jmc 105: This is a first
1.16 jmc 106: .Pq and not very reliable
107: approximation to a
1.7 aaron 108: .Dq safe
109: version of
1.16 jmc 110: .Nm .
111: .It Fl V
112: Print the version number of
113: .Nm
114: to standard output and exit.
115: .It Fl v Ar var Ns = Ns Ar value
116: Assign
117: .Ar value
118: to variable
119: .Ar var
120: before
121: .Ar prog
122: is executed;
123: any number of
124: .Fl v
125: options may be present.
126: .El
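.Pp
As a minimal sketch of how
.Fl F
and
.Fl v
combine (the input line and the variable name
.Va tag
are made up for illustration):
.Bd -literal -offset indent
echo 'alice:42' | awk -F: -v tag=user '{ print tag, $1, $2 }'
.Ed
.Pp
This should print
.Ql user alice 42 .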
1.7 aaron 127: .Pp
1.18 jmc 128: The input is normally made up of input lines
129: .Pq records
130: separated by newlines, or by the value of
131: .Va RS .
132: If
133: .Va RS
134: is null, then any number of blank lines are used as the record separator,
135: and newlines are used as field separators
136: (in addition to the value of
137: .Va FS ) .
138: This is convenient when working with multi-line records.
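.Pp
For example, with made-up input in which a blank line separates two records,
setting
.Va RS
to the empty string makes
.Nm
count fields across the lines of each record:
.Bd -literal -offset indent
{ echo 'a b'; echo 'c'; echo; echo 'd'; } |
    awk 'BEGIN { RS = "" } { print NR ": " NF }'
.Ed
.Pp
The first record spans two lines and has three fields,
so this should print
.Ql 1: 3
followed by
.Ql 2: 1 .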
139: .Pp
1.7 aaron 140: An input line is normally made up of fields separated by whitespace,
1.55 millert 141: or by the value of the field separator
142: .Va FS
143: at the time the line is read.
1.1 tholo 144: The fields are denoted
1.7 aaron 145: .Va $1 , $2 , ... ,
146: while
147: .Va $0
1.1 tholo 148: refers to the entire line.
1.55 millert 149: .Va FS
150: may be set to either a single character or a regular expression.
1.58 jmc 151: As a special case, if
1.55 millert 152: .Va FS
153: is a single space
154: .Pq the default ,
155: fields will be split by one or more whitespace characters.
1.1 tholo 156: If
1.7 aaron 157: .Va FS
1.1 tholo 158: is null, the input line is split into one field per character.
1.7 aaron 159: .Pp
1.18 jmc 160: Normally, any number of blanks separate fields.
161: In order to set the field separator to a single blank, use the
162: .Fl F
163: option with a value of
164: .Sq [\ \&] .
165: If a field separator of
166: .Sq t
167: is specified,
168: .Nm
169: treats it as if
170: .Sq \et
171: had been specified and uses
172: .Aq TAB
173: as the field separator.
174: In order to use a literal
175: .Sq t
176: as the field separator, use the
177: .Fl F
178: option with a value of
179: .Sq [t] .
1.55 millert 180: The field separator is usually set via the
181: .Fl F
182: option or from inside a
183: .Ic BEGIN
184: block so that it takes effect before the input is read.
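.Pp
A small sketch of the difference between the default separator and a single
blank, using a made-up line with two spaces between the letters:
.Bd -literal -offset indent
echo 'a  b' | awk '{ print NF }'
echo 'a  b' | awk -F'[ ]' '{ print NF }'
.Ed
.Pp
The first command should print 2.
The second should print 3, because the two adjacent blanks delimit an
empty field.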
1.18 jmc 185: .Pp
1.47 millert 186: A pattern-action statement has the form:
1.7 aaron 187: .Pp
188: .D1 Ar pattern Ic \&{ Ar action Ic \&}
189: .Pp
1.6 aaron 190: A missing
1.7 aaron 191: .Ic \&{ Ar action Ic \&}
1.1 tholo 192: means print the line;
193: a missing pattern always matches.
194: Pattern-action statements are separated by newlines or semicolons.
1.7 aaron 195: .Pp
1.18 jmc 196: Newlines are permitted after a terminating statement or following a comma
197: .Pq Sq ,\& ,
198: an open brace
199: .Pq Sq { ,
200: a logical AND
201: .Pq Sq && ,
202: a logical OR
203: .Pq Sq || ,
204: after the
205: .Sq do
206: or
207: .Sq else
208: keywords,
209: or after the closing parenthesis of an
210: .Sq if ,
211: .Sq for ,
212: or
213: .Sq while
214: statement.
215: Additionally, a backslash
216: .Pq Sq \e
217: can be used to escape a newline between tokens.
218: .Pp
1.1 tholo 219: An action is a sequence of statements.
220: A statement can be one of the following:
1.35 jmc 221: .Pp
222: .Bl -tag -width Ds -offset indent -compact
1.43 schwarze 223: .It Ic if Ar ( expression ) Ar statement Op Ic else Ar statement
224: .It Ic while Ar ( expression ) Ar statement
225: .It Ic for Ar ( expression ; expression ; expression ) statement
226: .It Ic for Ar ( var Ic in Ar array ) statement
227: .It Ic do Ar statement Ic while Ar ( expression )
1.35 jmc 228: .It Ic break
229: .It Ic continue
230: .It Xo Ic {
231: .Op Ar statement ...
232: .Ic }
233: .Xc
234: .It Xo Ar expression
235: .No # commonly
236: .Ar var No = Ar expression
1.7 aaron 237: .Xc
1.35 jmc 238: .It Xo Ic print
1.7 aaron 239: .Op Ar expression-list
1.17 jmc 240: .Op > Ns Ar expression
1.7 aaron 241: .Xc
1.35 jmc 242: .It Xo Ic printf Ar format
1.7 aaron 243: .Op Ar ... , expression-list
1.17 jmc 244: .Op > Ns Ar expression
1.7 aaron 245: .Xc
1.35 jmc 246: .It Ic return Op Ar expression
247: .It Xo Ic next
248: .No # skip remaining patterns on this input line
249: .Xc
250: .It Xo Ic nextfile
251: .No # skip rest of this file, open next, start at top
252: .Xc
253: .It Xo Ic delete
254: .Sm off
255: .Ar array Ic \&[ Ar expression Ic \&]
256: .Sm on
257: .No # delete an array element
1.7 aaron 258: .Xc
1.35 jmc 259: .It Xo Ic delete Ar array
260: .No # delete all elements of array
1.7 aaron 261: .Xc
1.35 jmc 262: .It Xo Ic exit
1.7 aaron 263: .Op Ar expression
1.46 deraadt 264: .No # exit processing, and perform
265: .Ic END
266: processing; status is
267: .Ar expression
1.7 aaron 268: .Xc
1.35 jmc 269: .El
1.7 aaron 270: .Pp
1.1 tholo 271: Statements are terminated by
272: semicolons, newlines or right braces.
273: An empty
1.7 aaron 274: .Ar expression-list
1.1 tholo 275: stands for
1.7 aaron 276: .Ar $0 .
277: String constants are quoted
278: .Li \&"" ,
1.20 jmc 279: with the usual C escapes recognized within
280: (see
281: .Xr printf 1
282: for a complete list of these).
1.1 tholo 283: Expressions take on string or numeric values as appropriate,
284: and are built using the operators
1.7 aaron 285: .Ic + \- * / % ^
1.20 jmc 286: .Pq exponentiation ,
287: and concatenation
288: .Pq indicated by whitespace .
1.1 tholo 289: The operators
1.16 jmc 290: .Ic \&! ++ \-\- += \-= *= /= %= ^=
1.59 millert 291: .Ic > >= < <= == != ?\&:
1.1 tholo 292: are also available in expressions.
293: Variables may be scalars, array elements
294: (denoted
1.7 aaron 295: .Li x[i] )
1.1 tholo 296: or fields.
297: Variables are initialized to the null string.
298: Array subscripts may be any string,
299: not necessarily numeric;
300: this allows for a form of associative memory.
301: Multiple subscripts such as
1.7 aaron 302: .Li [i,j,k]
1.1 tholo 303: are permitted; the constituents are concatenated,
304: separated by the value of
1.17 jmc 305: .Va SUBSEP
1.31 deraadt 306: .Pq see the section on variables below .
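.Pp
As an illustration (the array name
.Va a
and the helper array
.Va p
are arbitrary), a combined subscript can be taken apart again with
.Fn split
and
.Va SUBSEP :
.Bd -literal -offset indent
awk 'BEGIN {
	a["x", "y"] = 1
	for (k in a) { split(k, p, SUBSEP); print p[1], p[2] }
}'
.Ed
.Pp
This should print
.Ql x y .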
1.7 aaron 307: .Pp
1.1 tholo 308: The
1.7 aaron 309: .Ic print
1.1 tholo 310: statement prints its arguments on the standard output
311: (or on a file if
1.47 millert 312: .Pf >\ \& Ar file
1.1 tholo 313: or
1.47 millert 314: .Pf >>\ \& Ar file
1.1 tholo 315: is present or on a pipe if
1.17 jmc 316: .Pf |\ \& Ar cmd
1.1 tholo 317: is present), separated by the current output field separator,
318: and terminated by the output record separator.
1.7 aaron 319: .Ar file
1.1 tholo 320: and
1.7 aaron 321: .Ar cmd
1.1 tholo 322: may be literal names or parenthesized expressions;
323: identical string values in different statements denote
324: the same open file.
325: The
1.7 aaron 326: .Ic printf
1.47 millert 327: statement formats its expression list according to the
328: .Ar format
1.1 tholo 329: (see
1.28 jmc 330: .Xr printf 1 ) .
1.18 jmc 331: .Pp
332: Patterns are arbitrary Boolean combinations
333: (with
334: .Ic "\&! || &&" )
335: of regular expressions and
336: relational expressions.
1.22 jmc 337: .Nm
338: supports extended regular expressions
339: .Pq EREs .
340: See
341: .Xr re_format 7
342: for more information on regular expressions.
1.18 jmc 343: Isolated regular expressions
344: in a pattern apply to the entire line.
345: Regular expressions may also occur in
346: relational expressions, using the operators
347: .Ic ~
348: and
349: .Ic !~ .
1.44 schwarze 350: .Pf / Ar re Ns /
1.18 jmc 351: is a constant regular expression;
352: any string (constant or variable) may be used
353: as a regular expression, except in the position of an isolated regular expression
354: in a pattern.
355: .Pp
356: A pattern may consist of two patterns separated by a comma;
357: in this case, the action is performed for all lines
358: from an occurrence of the first pattern
359: through an occurrence of the second.
360: .Pp
361: A relational expression is one of the following:
1.35 jmc 362: .Pp
363: .Bl -tag -width Ds -offset indent -compact
364: .It Ar expression matchop regular-expression
365: .It Ar expression relop expression
366: .It Ar expression Ic in Ar array-name
367: .It Xo Ic \&( Ns
1.18 jmc 368: .Ar expr , expr , \&... Ns Ic \&) in
1.35 jmc 369: .Ar array-name
1.18 jmc 370: .Xc
1.35 jmc 371: .El
1.18 jmc 372: .Pp
373: where a
374: .Ar relop
375: is any of the six relational operators in C, and a
376: .Ar matchop
377: is either
378: .Ic ~
379: (matches)
380: or
381: .Ic !~
382: (does not match).
383: A conditional is an arithmetic expression,
384: a relational expression,
385: or a Boolean combination
386: of these.
387: .Pp
1.46 deraadt 388: The special pattern
1.18 jmc 389: .Ic BEGIN
1.46 deraadt 390: may be used to capture control before the first input line is read.
391: The special pattern
1.18 jmc 392: .Ic END
1.46 deraadt 393: may be used to capture control after processing is finished.
1.18 jmc 394: .Ic BEGIN
395: and
396: .Ic END
397: do not combine with other patterns.
1.47 millert 398: They may appear multiple times in a program and execute
399: in the order they are read by
400: .Nm .
1.18 jmc 401: .Pp
402: Variable names with special meanings:
403: .Pp
1.20 jmc 404: .Bl -tag -width "FILENAME " -compact
1.18 jmc 405: .It Va ARGC
406: Argument count, assignable.
407: .It Va ARGV
408: Argument array, assignable;
409: non-null members are taken as filenames.
410: .It Va CONVFMT
411: Conversion format when converting numbers
412: (default
413: .Qq Li %.6g ) .
414: .It Va ENVIRON
415: Array of environment variables; subscripts are names.
416: .It Va FILENAME
417: The name of the current input file.
418: .It Va FNR
419: Ordinal number of the current record in the current file.
420: .It Va FS
1.55 millert 421: Regular expression used to separate fields (default whitespace);
422: also settable by option
1.63 jmc 423: .Fl F Ar fs .
1.18 jmc 424: .It Va NF
425: Number of fields in the current record.
426: .Va $NF
427: can be used to obtain the value of the last field in the current record.
428: .It Va NR
429: Ordinal number of the current record.
430: .It Va OFMT
431: Output format for numbers (default
432: .Qq Li %.6g ) .
433: .It Va OFS
434: Output field separator (default blank).
435: .It Va ORS
436: Output record separator (default newline).
437: .It Va RLENGTH
438: The length of the string matched by the
439: .Fn match
440: function.
441: .It Va RS
442: Input record separator (default newline).
1.49 millert 443: If empty, blank lines separate records.
444: If more than one character long,
445: .Va RS
446: is treated as a regular expression, and records are
447: separated by text matching the expression.
1.18 jmc 448: .It Va RSTART
449: The starting position of the string matched by the
450: .Fn match
451: function.
452: .It Va SUBSEP
453: Separates multiple subscripts (default 034).
454: .El
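.Pp
A brief sketch using several of these variables on a made-up input line:
.Bd -literal -offset indent
echo 'a b c' | awk 'BEGIN { OFS = "-" } { print NR, NF, $NF }'
.Ed
.Pp
This should print
.Ql 1-3-c .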
1.17 jmc 455: .Sh FUNCTIONS
456: The awk language has a variety of built-in functions:
1.30 jmc 457: arithmetic, string, input/output, general, and bit-operation.
458: .Pp
459: Functions may be defined (at the position of a pattern-action statement)
460: as follows:
461: .Pp
462: .Dl function foo(a, b, c) { ...; return x }
463: .Pp
464: Parameters are passed by value if scalar, and by reference if array name;
465: functions may be called recursively.
466: Parameters are local to the function; all other variables are global.
467: Thus local variables may be created by providing excess parameters in
468: the function definition.
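.Pp
For instance, in the following sketch the hypothetical function
.Fn sum
declares
.Va i
and
.Va s
as extra parameters so that they are local;
the global
.Va s
remains uninitialized after the call:
.Bd -literal -offset indent
awk 'function sum(n,    i, s) {
	for (i = 1; i <= n; i++) s += i
	return s
}
BEGIN { print sum(4), (s == "") }'
.Ed
.Pp
This should print
.Ql 10 1 .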
1.17 jmc 469: .Ss Arithmetic Functions
470: .Bl -tag -width "atan2(y, x)"
471: .It Fn atan2 y x
472: Return the arctangent of
473: .Fa y Ns / Ns Fa x
474: in radians.
475: .It Fn cos x
476: Return the cosine of
477: .Fa x ,
478: where
479: .Fa x
480: is in radians.
481: .It Fn exp x
482: Return the exponential of
483: .Fa x .
484: .It Fn int x
485: Return
486: .Fa x
487: truncated to an integer value.
488: .It Fn log x
489: Return the natural logarithm of
490: .Fa x .
1.7 aaron 491: .It Fn rand
1.17 jmc 492: Return a random number,
493: .Fa n ,
494: such that
495: .Sm off
496: .Pf 0 \*(Le Fa n No \*(Lt 1 .
497: .Sm on
1.53 tim 498: Random numbers are non-deterministic unless a seed is explicitly set with
499: .Fn srand .
1.17 jmc 500: .It Fn sin x
501: Return the sine of
502: .Fa x ,
503: where
504: .Fa x
505: is in radians.
506: .It Fn sqrt x
507: Return the square root of
508: .Fa x .
509: .It Fn srand expr
1.16 jmc 510: Sets seed for
1.7 aaron 511: .Fn rand
1.17 jmc 512: to
513: .Fa expr
1.1 tholo 514: and returns the previous seed.
1.17 jmc 515: If
516: .Fa expr
1.53 tim 517: is omitted,
518: .Fn rand
519: will return non-deterministic random numbers.
1.17 jmc 520: .El
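.Pp
A minimal sketch of making
.Fn rand
repeatable by seeding it explicitly (the seed value 1 is arbitrary):
.Bd -literal -offset indent
awk 'BEGIN { srand(1); a = rand(); srand(1); b = rand(); print (a == b) }'
.Ed
.Pp
Because both calls follow the same seed, this should print 1.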
521: .Ss String Functions
522: .Bl -tag -width "split(s, a, fs)"
1.52 millert 523: .It Fn gensub r s h [t]
524: Search the target string
525: .Ar t
526: for matches of the regular expression
527: .Ar r .
528: If
529: .Ar h
530: is a string beginning with
531: .Ic g
532: or
533: .Ic G ,
534: then replace all matches of
535: .Ar r
536: with
537: .Ar s .
538: Otherwise,
539: .Ar h
540: is a number indicating which match of
541: .Ar r
542: to replace.
543: If no
544: .Ar t
545: is supplied,
546: .Va $0
547: is used instead.
548: .\"Within the replacement text
549: .\".Ar s ,
550: .\"the sequence
551: .\".Ar \en ,
552: .\"where
553: .\".Ar n
554: .\"is a digit from 1 to 9, may be used to indicate just the text that
555: .\"matched the
556: .\".Ar n Ap th
557: .\"parenthesized subexpression.
558: .\"The sequence
559: .\".Ic \e0
560: .\"represents the entire text, as does the character
561: .\".Ic & .
562: Unlike
563: .Fn sub
564: and
565: .Fn gsub ,
566: the modified string is returned as the result of the function,
567: and the original target is
568: .Em not
569: changed.
570: Note that
571: .Ar \en
572: sequences within the replacement string
573: .Ar s ,
574: as supported by GNU
575: .Nm ,
576: are
577: .Em not
578: supported at this time.
1.17 jmc 579: .It Fn gsub r t s
580: The same as
581: .Fn sub
582: except that all occurrences of the regular expression are replaced.
583: .Fn gsub
584: returns the number of replacements.
1.7 aaron 585: .It Fn index s t
1.16 jmc 586: The position in
1.7 aaron 587: .Fa s
1.1 tholo 588: where the string
1.7 aaron 589: .Fa t
1.1 tholo 590: occurs, or 0 if it does not.
1.17 jmc 591: .It Fn length s
592: The length of
593: .Fa s
594: taken as a string,
1.47 millert 595: the number of elements in an array for an array argument,
596: or the length of
1.17 jmc 597: .Va $0
598: if no argument is given.
1.7 aaron 599: .It Fn match s r
1.16 jmc 600: The position in
1.7 aaron 601: .Fa s
1.1 tholo 602: where the regular expression
1.7 aaron 603: .Fa r
1.1 tholo 604: occurs, or 0 if it does not.
1.17 jmc 605: The variable
1.7 aaron 606: .Va RSTART
1.17 jmc 607: is set to the starting position of the matched string
608: .Pq which is the same as the returned value
609: or zero if no match is found.
610: The variable
1.7 aaron 611: .Va RLENGTH
1.17 jmc 612: is set to the length of the matched string,
613: or \-1 if no match is found.
1.7 aaron 614: .It Fn split s a fs
1.16 jmc 615: Splits the string
1.7 aaron 616: .Fa s
1.1 tholo 617: into array elements
1.7 aaron 618: .Va a[1] , a[2] , ... , a[n]
1.1 tholo 619: and returns
1.7 aaron 620: .Va n .
1.1 tholo 621: The separation is done with the regular expression
1.7 aaron 622: .Ar fs
1.1 tholo 623: or with the field separator
1.7 aaron 624: .Va FS
1.1 tholo 625: if
1.7 aaron 626: .Ar fs
1.1 tholo 627: is not given.
628: An empty string as field separator splits the string
629: into one array element per character.
1.17 jmc 630: .It Fn sprintf fmt expr ...
631: The string resulting from formatting
632: .Fa expr , ...
633: according to the
1.28 jmc 634: .Xr printf 1
1.17 jmc 635: format
636: .Fa fmt .
1.7 aaron 637: .It Fn sub r t s
1.16 jmc 638: Substitutes
1.7 aaron 639: .Fa t
1.1 tholo 640: for the first occurrence of the regular expression
1.7 aaron 641: .Fa r
1.1 tholo 642: in the string
1.7 aaron 643: .Fa s .
1.1 tholo 644: If
1.7 aaron 645: .Fa s
1.1 tholo 646: is not given,
1.7 aaron 647: .Va $0
1.1 tholo 648: is used.
1.17 jmc 649: An ampersand
650: .Pq Sq &
651: in
652: .Fa t
653: is replaced in string
654: .Fa s
655: with the text matched by regular expression
656: .Fa r .
657: A literal ampersand can be specified by preceding it with two backslashes
658: .Pq Sq \e\e .
659: A literal backslash can be specified by preceding it with another backslash
660: .Pq Sq \e\e .
1.7 aaron 661: .Fn sub
1.17 jmc 662: returns the number of replacements.
663: .It Fn substr s m n
664: Return at most the
665: .Fa n Ns -character
666: substring of
667: .Fa s
668: that begins at position
669: .Fa m
670: counted from 1.
671: If
672: .Fa n
673: is omitted, or if
674: .Fa n
675: specifies more characters than are left in the string,
676: the length of the substring is limited by the length of
677: .Fa s .
1.7 aaron 678: .It Fn tolower str
1.16 jmc 679: Returns a copy of
1.7 aaron 680: .Fa str
1.1 tholo 681: with all upper-case characters translated to their
682: corresponding lower-case equivalents.
1.7 aaron 683: .It Fn toupper str
1.16 jmc 684: Returns a copy of
1.7 aaron 685: .Fa str
1.1 tholo 686: with all lower-case characters translated to their
687: corresponding upper-case equivalents.
1.7 aaron 688: .El
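.Pp
The following sketch, run on made-up input, exercises
.Fn gsub ,
.Fn sub
with an ampersand in the replacement, and
.Fn match :
.Bd -literal -offset indent
echo 'hello world' | awk '{
	n = gsub(/o/, "0")		# replace every "o", count them
	sub(/w/, "[&]")			# "&" stands for the matched text
	print n, $0
	if (match($0, /l+/)) print RSTART, RLENGTH
}'
.Ed
.Pp
This should print
.Ql 2 hell0 [w]0rld
followed by
.Ql 3 2 .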
1.52 millert 689: .Ss Time Functions
690: This version of
691: .Nm
692: provides the following functions for obtaining and formatting time
693: stamps.
694: .Bl -tag -width indent
1.57 millert 695: .It Fn mktime datespec
696: Converts
697: .Fa datespec
698: into a timestamp in the same form as a value returned by
699: .Fn systime .
700: The
701: .Fa datespec
702: is a string composed of six or seven numbers separated by whitespace:
703: .Bd -literal -offset indent
704: YYYY MM DD HH MM SS [DST]
705: .Ed
706: .Pp
707: The fields in
708: .Fa datespec
709: are as follows:
710: .Bl -tag -width "YYYY"
1.60 millert 711: .It YYYY
1.57 millert 712: Year: a four-digit year, including the century.
713: .It MM
714: Month: a number from 1 to 12.
715: .It DD
716: Day: a number from 1 to 31.
717: .It HH
718: Hour: a number from 0 to 23.
719: .It MM
720: Minute: a number from 0 to 59.
721: .It SS
722: Second: a number from 0 to 60 (permitting a leap second).
723: .It DST
724: Daylight Saving Time: a positive or zero value indicates that
725: DST is or is not in effect, respectively.
726: If DST is not specified, or is negative,
727: .Fn mktime
728: will attempt to determine the correct value.
729: .El
1.52 millert 730: .It Fn strftime "[format [, timestamp]]"
731: Formats
732: .Ar timestamp
733: according to the string
734: .Ar format .
735: The format string may contain any of the conversion specifications described
736: in the
737: .Xr strftime 3
738: manual page, as well as any arbitrary text.
739: The
740: .Ar timestamp
741: must be in the same form as a value returned by
1.57 millert 742: .Fn mktime
743: and
1.52 millert 744: .Fn systime .
745: If
746: .Ar timestamp
747: is not specified, the current time is used.
748: If
749: .Ar format
750: is not specified, a default format equivalent to the output of
751: .Xr date 1
752: is used.
753: .It Fn systime
754: Returns the value of time in seconds since 0 hours, 0 minutes,
755: 0 seconds, January 1, 1970, Coordinated Universal Time (UTC).
756: .El
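.Pp
As a sketch of a round trip through these functions (the date is arbitrary
and is interpreted in the local time zone):
.Bd -literal -offset indent
awk 'BEGIN {
	t = mktime("2023 09 15 12 00 00")
	print strftime("%Y-%m-%d %H:%M:%S", t)
}'
.Ed
.Pp
This should print
.Ql 2023-09-15 12:00:00 .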
1.17 jmc 757: .Ss Input/Output and General Functions
758: .Bl -tag -width "getline [var] < file"
759: .It Fn close expr
760: Closes the file or pipe
761: .Fa expr .
762: .Fa expr
763: should match the string that was used to open the file or pipe.
764: .It Ar cmd | Ic getline Op Va var
765: Read a record of input from a stream piped from the output of
766: .Ar cmd .
767: If
768: .Va var
769: is omitted, the variables
770: .Va $0
771: and
772: .Va NF
773: are set.
774: Otherwise
775: .Va var
776: is set.
777: If the stream is not open, it is opened.
778: As long as the stream remains open, subsequent calls
779: will read subsequent records from the stream.
780: The stream remains open until explicitly closed with a call to
781: .Fn close .
1.24 jmc 782: .Ic getline
783: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
784: .It Fn fflush [expr]
1.39 jmc 785: Flushes any buffered output for the file or pipe
1.24 jmc 786: .Fa expr ,
787: or all open files or pipes if
788: .Fa expr
789: is omitted.
1.17 jmc 790: .Fa expr
791: should match the string that was used to open the file or pipe.
792: .It Ic getline
793: Sets
794: .Va $0
795: to the next input record from the current input file.
796: This form of
797: .Ic getline
798: sets the variables
799: .Va NF ,
800: .Va NR ,
801: and
802: .Va FNR .
1.7 aaron 803: .Ic getline
1.17 jmc 804: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
805: .It Ic getline Va var
806: Sets
1.7 aaron 807: .Va var
1.17 jmc 808: to the next input record
809: from the current input file.
810: This form of
811: .Ic getline
812: sets the variables
813: .Va NR
814: and
815: .Va FNR .
816: .Ic getline
817: returns 1 for a successful input, 0 for end of file, and \-1 for an error.
818: .It Xo
819: .Ic getline Op Va var
1.47 millert 820: .Pf <\ \& Ar file
1.17 jmc 821: .Xc
822: Reads
1.7 aaron 823: the next record
1.1 tholo 824: from
1.7 aaron 825: .Ar file .
1.17 jmc 826: If
827: .Va var
828: is omitted, the variables
829: .Va $0
830: and
831: .Va NF
832: are set.
833: Otherwise
834: .Va var
835: is set.
836: If
837: .Ar file
838: is not open, it is opened.
839: As long as the stream remains open, subsequent calls will read subsequent
840: records from
841: .Ar file .
842: .Ar file
843: remains open until explicitly closed with a call to
844: .Fn close .
845: .It Fn system cmd
846: Executes
847: .Fa cmd
848: and returns its exit status.
1.47 millert 849: This will be \-1 upon error,
850: .Ar cmd Ns 's
851: exit status upon a normal exit,
852: 256 +
853: .Em sig
854: if
855: .Fa cmd
856: was terminated by a signal, where
857: .Em sig
858: is the number of the signal,
859: or 512 +
860: .Em sig
861: if there was a core dump.
1.17 jmc 862: .El
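.Pp
A minimal sketch of reading from a command and closing the stream afterwards
(the command string and the variable name
.Va line
are arbitrary):
.Bd -literal -offset indent
awk 'BEGIN {
	cmd = "echo one; echo two"
	while ((cmd | getline line) > 0)
		n++
	close(cmd)
	print n, line
}'
.Ed
.Pp
This should print
.Ql 2 two ;
comparing the result of
.Ic getline
with 0 keeps the loop from spinning forever if an error returns \-1.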
1.30 jmc 863: .Ss Bit-Operation Functions
1.29 pyr 864: .Bl -tag -width "lshift(a, b)"
865: .It Fn compl x
866: Returns the bitwise complement of integer argument x.
867: .It Fn and x y
1.30 jmc 868: Performs a bitwise AND on integer arguments x and y.
1.29 pyr 869: .It Fn or x y
1.30 jmc 870: Performs a bitwise OR on integer arguments x and y.
1.29 pyr 871: .It Fn xor x y
1.30 jmc 872: Performs a bitwise Exclusive-OR on integer arguments x and y.
1.29 pyr 873: .It Fn lshift x n
1.39 jmc 874: Returns integer argument x shifted by n bits to the left.
1.29 pyr 875: .It Fn rshift x n
1.39 jmc 876: Returns integer argument x shifted by n bits to the right.
1.29 pyr 877: .El
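.Pp
A short sketch with arbitrary operands:
.Bd -literal -offset indent
awk 'BEGIN { print and(12, 10), or(12, 10), xor(12, 10), lshift(1, 4) }'
.Ed
.Pp
This should print
.Ql 8 14 6 16 .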
1.50 millert 878: .Sh ENVIRONMENT
879: The following environment variables affect the execution of
880: .Nm :
881: .Bl -tag -width POSIXLY_CORRECT
882: .It Ev POSIXLY_CORRECT
883: When set, behave in accordance with the standard, even when it conflicts
884: with historical behavior.
885: .El
1.37 jmc 886: .Sh EXIT STATUS
887: .Ex -std awk
888: .Pp
889: But note that the
890: .Ic exit
891: expression can modify the exit status.
1.7 aaron 892: .Sh EXAMPLES
1.16 jmc 893: Print lines longer than 72 characters:
894: .Pp
1.7 aaron 895: .Dl length($0) > 72
1.16 jmc 896: .Pp
897: Print first two fields in opposite order:
1.7 aaron 898: .Pp
899: .Dl { print $2, $1 }
1.16 jmc 900: .Pp
1.47 millert 901: Same, with input fields separated by comma and/or spaces and tabs:
1.7 aaron 902: .Bd -literal -offset indent
1.1 tholo 903: BEGIN { FS = ",[ \et]*|[ \et]+" }
904: { print $2, $1 }
1.7 aaron 905: .Ed
1.16 jmc 906: .Pp
907: Add up first column, print sum and average:
1.7 aaron 908: .Bd -literal -offset indent
909: { s += $1 }
910: END { print "sum is", s, " average is", s/NR }
911: .Ed
1.16 jmc 912: .Pp
913: Print all lines between start/stop pairs:
1.7 aaron 914: .Pp
915: .Dl /start/, /stop/
1.16 jmc 916: .Pp
1.45 naddy 917: Simulate
918: .Xr echo 1 :
1.7 aaron 919: .Bd -literal -offset indent
920: BEGIN { # Simulate echo(1)
921: for (i = 1; i < ARGC; i++) printf "%s ", ARGV[i]
922: printf "\en"
923: exit }
1.19 jmc 924: .Ed
925: .Pp
926: Print an error message to standard error:
927: .Bd -literal -offset indent
928: { print "error!" > "/dev/stderr" }
1.7 aaron 929: .Ed
1.59 millert 930: .Sh UNUSUAL FLOATING-POINT VALUES
931: .Nm
932: was designed before IEEE 754 arithmetic defined Not-A-Number (NaN)
933: and Infinity values, which are supported by all modern floating-point
934: hardware.
935: .Pp
936: Because
937: .Nm
938: uses
939: .Xr strtod 3
940: and
941: .Xr atof 3
942: to convert string values to double-precision floating-point values,
943: modern C libraries also convert strings starting with
944: .Dv inf
945: and
946: .Dv nan
947: into infinity and NaN values respectively.
948: This led to strange results,
949: with something like this:
950: .Pp
951: .Li echo nancy | awk '{ print $1 + 0 }'
952: .Pp
953: printing
954: .Dv nan
955: instead of zero.
956: .Pp
957: .Nm
958: now follows GNU
959: .Nm ,
960: and prefilters string values before attempting
961: to convert them to numbers, as follows:
962: .Bl -tag -width Ds
963: .It Hexadecimal values
964: Hexadecimal values (allowed since C99) convert to zero, as they did
965: prior to C99.
966: .It NaN values
967: The two strings
968: .Dq +NAN
969: and
970: .Dq -NAN
971: (case independent) convert to NaN.
972: No others do.
973: (NaNs can have signs.)
974: .It Infinity values
975: The two strings
976: .Dq +INF
977: and
978: .Dq -INF
979: (case independent) convert to positive and negative infinity, respectively.
980: No others do.
981: .El
1.7 aaron 982: .Sh SEE ALSO
1.42 tedu 983: .Xr cut 1 ,
1.52 millert 984: .Xr date 1 ,
1.47 millert 985: .Xr grep 1 ,
1.7 aaron 986: .Xr lex 1 ,
1.20 jmc 987: .Xr printf 1 ,
1.16 jmc 988: .Xr sed 1 ,
1.52 millert 989: .Xr strftime 3 ,
1.23 jmc 990: .Xr re_format 7 ,
991: .Xr script 7
1.61 jsg 992: .Rs
993: .\" 4.4BSD USD:16
1.62 jsg 994: .\".%R Computing Science Technical Report
995: .\".%N 68
996: .\".%D July 1978
1.61 jsg 997: .%A A. V. Aho
998: .%A P. J. Weinberger
999: .%A B. W. Kernighan
1000: .%T AWK \(em A Pattern Scanning and Processing Language
1.62 jsg 1001: .%J Software \(em Practice and Experience
1002: .%V 9:4
1003: .%P pp. 267-279
1004: .%D April 1979
1.61 jsg 1005: .Re
1.7 aaron 1006: .Rs
1007: .%A A. V. Aho
1008: .%A B. W. Kernighan
1009: .%A P. J. Weinberger
1010: .%T The AWK Programming Language
1011: .%I Addison-Wesley
1.64 jsg 1012: .%D 2024
1013: .%O ISBN 0-13-826972-6
1.7 aaron 1014: .Re
1.26 jmc 1015: .Sh STANDARDS
1016: The
1017: .Nm
1018: utility is compliant with the
1.33 jmc 1019: .St -p1003.1-2008
1.50 millert 1020: specification except that consecutive backslashes in the replacement
1021: string argument for
1022: .Fn sub
1023: and
1024: .Fn gsub
1.51 millert 1025: are not collapsed and a slash
1026: .Pq Ql /
1027: does not need to be escaped in a bracket expression.
1.53 tim 1028: Also, the behaviour of
1029: .Fn rand
1030: and
1031: .Fn srand
1032: has been changed to support non-deterministic random numbers.
1.26 jmc 1033: .Pp
1034: The flags
1035: .Op Fl \&dV
1036: and
1037: .Op Fl safe ,
1.56 millert 1038: support for regular expressions in
1039: .Va RS ,
1.52 millert 1040: as well as the functions
1041: .Fn fflush ,
1042: .Fn gensub ,
1043: .Fn compl ,
1044: .Fn and ,
1045: .Fn or ,
1046: .Fn xor ,
1047: .Fn lshift ,
1048: .Fn rshift ,
1.57 millert 1049: .Fn mktime ,
1.52 millert 1050: .Fn strftime
1051: and
1052: .Fn systime
1.26 jmc 1053: are extensions to that specification.
1.8 aaron 1054: .Sh HISTORY
1.13 millert 1055: An
1.8 aaron 1056: .Nm
1.13 millert 1057: utility appeared in
1058: .At v7 .
1.7 aaron 1059: .Sh BUGS
1.1 tholo 1060: There are no explicit conversions between numbers and strings.
1061: To force an expression to be treated as a number add 0 to it;
1062: to force it to be treated as a string concatenate
1.7 aaron 1063: .Li \&""
1064: to it.
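.Pp
For example, in this sketch two string constants compare as strings until
0 is added to each:
.Bd -literal -offset indent
awk 'BEGIN { a = "10"; b = "9"; print (a < b), (a + 0 < b + 0) }'
.Ed
.Pp
This should print
.Ql 1 0 .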
1065: .Pp
1.1 tholo 1066: The scope rules for variables in functions are a botch;
1067: the syntax is worse.
1.47 millert 1068: .Pp
1.65 ! millert 1069: Input is expected to be UTF-8 encoded.
! 1070: Other multibyte character sets are not handled.