Annotation of src/usr.bin/awk/FIXES, Revision 1.59
1.1 tholo 1: /****************************************************************
1.6 kstailey 2: Copyright (C) Lucent Technologies 1997
1.1 tholo 3: All Rights Reserved
4:
5: Permission to use, copy, modify, and distribute this software and
6: its documentation for any purpose and without fee is hereby
7: granted, provided that the above copyright notice appear in all
8: copies and that both that the copyright notice and this
9: permission notice and warranty disclaimer appear in supporting
1.6 kstailey 10: documentation, and that the name Lucent Technologies or any of
11: its entities not be used in advertising or publicity pertaining
12: to distribution of the software without specific, written prior
13: permission.
14:
15: LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
16: INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
17: IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
18: SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
19: WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
20: IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
21: ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
22: THIS SOFTWARE.
1.1 tholo 23: ****************************************************************/
24:
1.51 millert 25: This file lists all bug fixes, changes, etc., made since the
26: second edition of the AWK book was published in September 2023.
1.45 millert 27:
1.59 ! millert 28: May 4, 2024:
! 29: Fixed a use-after-free bug with ARGV for "delete ARGV".
! 30: Also ENVtab is no longer global. Thanks to Benjamin Sturz
! 31: for spotting the ARGV issue and Todd Miller for the fix.
! 32:
! 33: May 3, 2024:
! 34: Remove warnings when compiling with g++. Thanks to Arnold Robbins.
! 35:
1.58 millert 36: Apr 22, 2024:
1.59 ! millert 37: Fixed regex engine gototab reallocation issue that was
! 38: introduced during the Nov 24 rewrite. Thanks to Arnold Robbins.
1.58 millert 39: Fixed a scan bug in split in the case the separator is a single
1.59 ! millert 40: character. Thanks to Oguz Ismail for spotting the issue.
1.58 millert 41:
42: Mar 10, 2024:
1.59 ! millert 43: Fixed use-after-free bug in fnematch due to adjbuf invalidating
! 44: the pointers to buf. Thanks to GitHub user caffe3 for spotting
1.58 millert 45: the issue and providing a fix, and to Miguel Pineiro Jr.
46: for the alternative fix.
47: MAX_UTF_BYTES in fnematch has been replaced with awk_mb_cur_max.
48: Thanks to Miguel Pineiro Jr.
49:
1.57 millert 50: Jan 22, 2024:
51: Restore the ability to compile with g++. Thanks to
52: Arnold Robbins.
53:
54: Dec 24, 2023:
1.58 millert 55: Fix dereference-after-free problem in matchop when the first
56: argument is a function call. Thanks to Oguz Ismail Uysal.
1.57 millert 57: Fix inconsistent handling of --csv and FS set in the
58: command line. Thanks to Wilbert van der Poel.
1.58 millert 59: Casting changes to int for is* functions.
1.57 millert 60:
1.56 millert 61: Nov 27, 2023:
1.58 millert 62: Fix exit status of system on macOS. Update to REGRESS.
1.56 millert 63: Thanks to Arnold Robbins.
64: Fix inconsistent handling of -F and --csv, and loss of csv
1.57 millert 65: mode when FS is set.
1.56 millert 66:
1.55 millert 67: Nov 24, 2023:
68: Fix issue #199: gototab improvements to dynamically resize the
69: table, qsort and bsearch to improve the lookup speed as the
1.58 millert 70: table gets larger for multibyte input. Thanks to Arnold Robbins.
1.55 millert 71:
72: Nov 23, 2023:
73: Fix Issue #169, related to escape sequences in strings.
74: Thanks to GitHub user rajeevvp.
75: Fix Issue #147, reported by GitHub user drawkula, and fixed
76: by Miguel Pineiro Jr.
77:
78: Nov 20, 2023:
1.58 millert 79: Rewrite of fnematch to fix a number of issues, including
1.54 millert 80: extraneous output, out-of-bounds access, number of bytes
81: to push back after a failed match, etc.
1.58 millert 82: Thanks to Miguel Pineiro Jr.
1.54 millert 83:
1.55 millert 84: Nov 15, 2023:
1.58 millert 85: Man page edit, regression test fixes. Thanks to Arnold Robbins.
86: Consolidation of sub and gsub into dosub, removing duplicate
87: code. Thanks to Miguel Pineiro Jr.
1.54 millert 88: gcc replaced with cc everywhere.
89:
1.53 millert 90: Oct 30, 2023:
1.58 millert 91: Multiple fixes and a minor code cleanup.
92: Disabled UTF-8 for non-multibyte locales, such as C or POSIX.
93: Fixed a bad char * cast that caused incorrect results on big-endian
94: systems. Also fixed an out-of-bounds read for empty CCL.
95: Fixed a buffer overflow in substr with UTF-8 strings.
96: Many thanks to Todd C Miller.
1.53 millert 97:
1.52 millert 98: Sep 24, 2023:
99: fnematch and getrune have been overhauled to solve issues around
1.58 millert 100: Unicode FS and RS. Also fixed gsub null match issue with Unicode.
101: Big thanks to Arnold Robbins.
1.52 millert 102:
1.51 millert 103: Sep 12, 2023:
104: Fixed a length error in u8_byte2char that set RSTART to an
105: incorrect (cannot happen) value for the EOL match(str, /$/).
1.50 millert 106:
1.49 millert 107:
1.51 millert 108: -----------------------------------------------------------------
1.48 millert 109:
1.51 millert 110: [This entry is a summary, not a precise list of changes.]
1.46 millert 111:
1.51 millert 112: Added --csv option to enable processing of comma-separated
113: values (CSV) input. When --csv is enabled, fields are separated
114: by commas, fields may be quoted with double quotes ("), and
115: fields may contain embedded newlines.
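
The quoting rules described above can be illustrated outside awk. The
following is a minimal Python sketch using the standard csv module, which
follows similar conventions (comma separator, double-quoted fields,
embedded newlines inside quotes). It is not awk's implementation, and the
sample input is made up for illustration.

```python
import csv
import io

# Hypothetical input: the second field of the second record is quoted
# and contains both a comma and an embedded newline.
data = 'id,note\n1,"hello, world\nsecond line"\n'

# newline="" keeps line endings untouched so the csv module can handle
# newlines inside quoted fields itself.
records = list(csv.reader(io.StringIO(data, newline="")))

for record in records:
    print(len(record), record)
```

Both records come out with two fields: the comma and the newline inside
the quotes stay part of the field instead of splitting it.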
1.46 millert 116:
1.51 millert 117: If no explicit separator argument is provided, split() uses
118: the setting of --csv to determine how fields are split.
1.44 millert 119:
1.51 millert 120: Strings may now contain UTF-8 code points (not necessarily
121: characters). Functions that operate on characters, like
122: length, substr, index, match, etc., use UTF-8, so the length
123: of a string of 3 emojis is 3, not 12 as it would be if bytes
124: were counted.
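
The difference between counting code points and counting bytes can be
checked with a short Python sketch. This only illustrates the arithmetic
in the paragraph above; it assumes three emoji that each take 4 bytes in
UTF-8 and says nothing about how awk stores strings internally.

```python
# Three emoji, each a single code point that encodes to 4 bytes in UTF-8.
s = "\U0001F600\U0001F601\U0001F602"

code_points = len(s)                  # counts code points
byte_count = len(s.encode("utf-8"))   # counts encoded bytes

print(code_points, byte_count)
```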
1.43 millert 125:
1.58 millert 126: Regular expressions are processed as UTF-8.
1.6 kstailey 127:
1.51 millert 128: Unicode literals can be written as \u followed by one
129: to eight hexadecimal digits. These may appear in strings and
130: regular expressions.
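
As a rough illustration of what such a literal denotes, the Python sketch
below maps the hexadecimal digits to a code point and then to its UTF-8
bytes. The helper name decode_u_escape is hypothetical, and the sketch
does not reproduce awk's parser (for example, how awk decides where the
digits end).

```python
def decode_u_escape(hex_digits: str) -> str:
    # hex_digits is the run of one to eight hexadecimal digits that
    # follows the backslash-u prefix (hypothetical helper, not awk code).
    if not 1 <= len(hex_digits) <= 8:
        raise ValueError("expected one to eight hexadecimal digits")
    # int() rejects non-hex input; chr() rejects values above U+10FFFF.
    return chr(int(hex_digits, 16))


ch = decode_u_escape("1F600")
print(ch.encode("utf-8"))
```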