git: Complete upgrade of gnu grep 2.14 => 2.20
John Marino
marino at crater.dragonflybsd.org
Fri Oct 10 15:53:39 PDT 2014
commit 51ddd709576b6d603cb35ff07b103a739f875a02
Author: John Marino <draco at marino.st>
Date: Fri Oct 10 23:33:40 2014 +0200
Complete upgrade of gnu grep 2.14 => 2.20
** 2.20 Bug fixes
grep --max-count=N FILE would no longer stop reading after the Nth match.
I.e., while grep would still print the correct output, it would continue
reading until end of input, and hence, potentially forever.
[bug introduced in grep-2.19]
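One way to observe the fixed behavior is to feed grep an endless input: the pipeline below can only finish if grep stops reading after the second match. This is a minimal sketch, assuming a POSIX shell with the `yes` utility available; the helper name `first_two_matches` is made up for illustration.

```shell
# Sketch: 'yes' writes "y" lines forever, so this pipeline terminates
# only if grep exits once it has printed two matches.
# (first_two_matches is a hypothetical helper name.)
first_two_matches() {
    # '|| true' hides the non-zero status 'yes' gets when its pipe closes,
    # which would otherwise matter under 'set -o pipefail'.
    yes | grep --max-count=2 y || true
}

out=$(first_two_matches)
printf '%s\n' "$out"
```

On a grep affected by the 2.19 regression the same command would print the two lines and then keep reading indefinitely.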
A command like echo aa|grep -E 'a(b$|c$)' would mistakenly
report the input as a matched line. [bug introduced in grep-2.19]
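The expected result can be checked from grep's exit status, which is 0 when a line matches and 1 when none does. A small sketch, assuming a POSIX shell; the helper `check_line` is hypothetical and not part of the commit.

```shell
# Sketch: the pattern a(b$|c$) requires "ab" or "ac" at the end of the line.
# check_line is a hypothetical helper that reports whether its argument matches.
check_line() {
    if printf '%s\n' "$1" | grep -E 'a(b$|c$)' >/dev/null; then
        echo match
    else
        echo no-match
    fi
}

aa_result=$(check_line aa)   # the input from the bug report
ac_result=$(check_line ac)   # control input that really ends in "ac"
echo "aa: $aa_result, ac: $ac_result"
```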
** 2.20 Changes in behavior
grep --exclude-dir='FOO/' now excludes the directory FOO.
Previously, the trailing slash meant the option was ineffective.
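The change can be tried on a throwaway directory tree. The sketch below assumes GNU grep 2.20 or later and `mktemp -d`; the file names and the search word are invented for the example.

```shell
# Sketch: build a scratch tree with one file inside FOO and one outside,
# then search recursively while excluding 'FOO/' (note the trailing slash).
work=$(mktemp -d)
mkdir "$work/FOO"
echo needle > "$work/FOO/skipped.txt"
echo needle > "$work/kept.txt"

# -l lists only the names of files that contain a match.
found=$(cd "$work" && grep -r -l --exclude-dir='FOO/' needle .)
echo "$found"

rm -rf "$work"
```

With the old behavior the file under FOO would also be listed, because the option was ineffective when given with a trailing slash.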
** 2.19 Improvements
Performance has improved, typically by 10% and in some cases by a
factor of 200. However, performance of grep -P in UTF-8 locales has
gotten worse as part of the fix for the crashes mentioned below.
** 2.19 Bug fixes
grep no longer mishandles patterns like [a-[.z.]], and no longer
mishandles patterns like [^a] in locales that have multicharacter
collating sequences so that [^a] can match a string of two characters.
grep no longer mishandles an empty pattern at the end of a pattern list.
[bug introduced in grep-2.5]
grep -C NUM now outputs separators consistently even when NUM is zero,
and similarly for grep -A NUM and grep -B NUM.
[bug present since "the beginning"]
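A short check of the separator behavior with zero lines of context, as a sketch assuming GNU grep 2.19 or later and its default "--" group separator; the sample input is invented.

```shell
# Sketch: two matching lines separated by one non-matching line.
# With -C 0 no context lines are printed, but the two matches are still
# separate groups, so a separator line is expected between them.
out=$(printf 'hit\nmiss\nhit\n' | grep -C 0 hit)
printf '%s\n' "$out"
```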
grep -f no longer mishandles patterns containing NUL bytes.
[bug introduced in grep-2.11]
Plain grep, grep -E, and grep -F now treat encoding errors in patterns
the same way the GNU regular expression matcher treats them, with respect
to whether the errors can match parts of multibyte characters in data.
[bug present since "the beginning"]
grep -w no longer mishandles a potential match adjacent to a letter that
takes up two or more bytes in a multibyte encoding.
Similarly, the patterns '\<', '\>', '\b', and '\B' no longer
mishandle word-boundary matches in multibyte locales.
[bug present since "the beginning"]
grep -P now reports an error and exits when given invalid UTF-8 data.
Previously it was unreliable, and sometimes crashed or looped.
[bug introduced in grep-2.16]
grep -P now works with -w and -x and backreferences. Before,
echo aa|grep -Pw '(.)\1' would fail to match, yet
echo aa|grep -Pw '(.)\2' would match.
grep -Pw now works like grep -w in that the matched string has to be
preceded and followed by non-word components or the beginning and end
of the line (as opposed to word boundaries before). Before, this
echo a@@a| grep -Pw @@ would match, yet this
echo a@@a| grep -w @@ would not. Now, they both fail to match,
per the documentation on how grep's -w works.
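The plain -w half of this comparison can be reproduced without PCRE support. The sketch below assumes a POSIX shell and GNU grep; it does not exercise -P, since that depends on how grep was built, and the helper `w_check` is hypothetical.

```shell
# Sketch: -w requires the match to be preceded and followed by
# non-word characters or by the beginning/end of the line.
# w_check is a hypothetical helper that reports whether its argument matches.
w_check() {
    if printf '%s\n' "$1" | grep -w '@@' >/dev/null; then
        echo match
    else
        echo no-match
    fi
}

glued=$(w_check 'a@@a')     # "@@" sits between two letters
spaced=$(w_check 'a @@ a')  # "@@" is surrounded by spaces
echo "a@@a: $glued, a @@ a: $spaced"
```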
grep -i no longer mishandles patterns containing titlecase characters.
For example, in a locale containing the titlecase character
'ǈ' (U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J),
'grep -i ǈ' now matches both 'Ǉ' (U+01C7 LATIN CAPITAL LETTER LJ)
and 'ǉ' (U+01C9 LATIN SMALL LETTER LJ).
** 2.18 Bug fixes
grep no longer mishandles patterns like [^^-~] in unibyte locales.
[bug introduced in grep-2.8]
grep -i in a multibyte, non-UTF8 locale could be up to 200 times slower
than in 2.16. [bug introduced in grep-2.17]
** 2.17 Improvements
grep -i in a multibyte locale is now typically 10 times faster
for patterns that do not contain \ or [.
grep (without -i) in a multibyte locale is now up to 7 times faster
when processing many matched lines.
** 2.16 Bug fixes
The fix to make \s and \S work with multi-byte white space broke
the use of each shortcut whenever followed by a repetition operator.
For example, \s*, \s+, \s? and \s{3} would all malfunction in a
multi-byte locale. [bug introduced in grep-2.15]
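The four repetition forms named above can be exercised as follows. This is a sketch assuming GNU grep (\s is a GNU extension) and using -E so that +, ? and {3} act as operators; it runs in whatever locale the shell is using, so it only reproduces the original failure scenario if that locale is multibyte. The helper `count_lines` and the sample lines are invented.

```shell
# Sketch: count how many of three sample lines each pattern matches.
# The samples contain zero, one, and three spaces between "a" and "b".
# count_lines is a hypothetical helper name.
count_lines() {
    printf 'ab\na b\na   b\n' | grep -E -c "$1"
}

star=$(count_lines 'a\s*b')     # zero or more whitespace characters
plus=$(count_lines 'a\s+b')     # one or more
opt=$(count_lines 'a\s?b')      # zero or one
three=$(count_lines 'a\s{3}b')  # exactly three
echo "$star $plus $opt $three"
```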
The fix to make grep -P work better with UTF-8 made it possible for
grep to evoke a larger set of PCRE errors, some of which could trigger
an abort. E.g., this would abort:
printf '\x82'|LC_ALL=en_US.UTF-8 grep -P y
Now grep handles arbitrary PCRE errors. [bug introduced in grep-2.15]
Handle very long lines (2GiB and longer) on systems with a deficient
read system call.
** 2.15 Bug fixes
grep's \s and \S failed to work with multi-byte white space characters.
For example, \s would fail to match a non-breaking space, and this
would print nothing: printf '\xc2\xa0' | LC_ALL=en_US.UTF-8 grep '\s'
A related bug is that \S would mistakenly match an invalid multibyte
character. For example, the following would match:
printf '\x82\n' | LC_ALL=en_US.UTF-8 grep '^\S$'
[bug present since grep-2.6]
grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin)
when converting an input string containing certain 4-byte UTF-8
sequences to lower case. The conversions to wchar_t and back to
a UTF-8 multibyte string did not take surrogate pairs into account.
[bug present since at least grep-2.6, though the segfault is new with 2.13]
grep -E would segfault when given a regexp like '([^.]*[M]){1,2}'
for any multibyte character M. [bug introduced in grep-2.6, which would
segfault, but 2.7 and 2.8 had no problem, and 2.9 through 2.14 would
hit a failed assertion.]
grep -F would get stuck in an infinite loop when given a search string
that is an invalid byte sequence in the current locale and that matches
the bytes of the input twice on a line. Now grep fails with exit status 1.
grep -P could misbehave. While multi-byte mode is only supported by PCRE
with UTF-8 locales, grep did not activate it. This would cause failures
to match multibyte characters against some regular expressions, especially
those including the '.' or '\p' metacharacters.
** 2.15 New features
grep -P can now use a just-in-time compiler to greatly speed up matches.
This feature is transparent to the user; no flag is required to enable
it. It is only available if the corresponding support in the PCRE
library is detected when grep is compiled.
Summary of changes:
contrib/grep/README.DELETED | 4 +-
contrib/grep/README.DRAGONFLY | 14 +-
gnu/usr.bin/grep/Makefile | 2 +-
gnu/usr.bin/grep/Makefile.inc0 | 14 -
gnu/usr.bin/grep/egrep/Makefile | 8 +-
gnu/usr.bin/grep/egrep/egrep | 11 +
gnu/usr.bin/grep/fgrep/Makefile | 8 +-
gnu/usr.bin/grep/fgrep/fgrep | 11 +
gnu/usr.bin/grep/grep/Makefile | 21 +-
gnu/usr.bin/grep/grep/grep.1 | 38 +-
gnu/usr.bin/grep/libgrep/Makefile | 21 -
gnu/usr.bin/grep/libgreputils/Makefile | 22 +-
gnu/usr.bin/grep/libgreputils/alloca.h | 2 +-
gnu/usr.bin/grep/libgreputils/config.h | 265 ++-
gnu/usr.bin/grep/libgreputils/configmake.h | 4 +-
gnu/usr.bin/grep/libgreputils/{fcntl.h => ctype.h} | 342 +---
.../grep/libgreputils/{fcntl.h => dirent.h} | 437 ++---
gnu/usr.bin/grep/libgreputils/fcntl.h | 18 +-
gnu/usr.bin/grep/libgreputils/getopt.h | 4 +-
gnu/usr.bin/grep/libgreputils/{fcntl.h => iconv.h} | 341 +---
gnu/usr.bin/grep/libgreputils/inttypes.h | 1452 +++++++++++++++
.../grep/libgreputils/{fcntl.h => langinfo.h} | 431 ++---
.../grep/libgreputils/{fcntl.h => locale.h} | 423 ++---
gnu/usr.bin/grep/libgreputils/stdio.h | 1665 +++++++++++++++++
gnu/usr.bin/grep/libgreputils/stdlib.h | 1276 +++++++++++++
gnu/usr.bin/grep/libgreputils/string.h | 1341 ++++++++++++++
gnu/usr.bin/grep/libgreputils/sys/stat.h | 8 +-
.../grep/libgreputils/{fcntl.h => sys/time.h} | 438 ++---
gnu/usr.bin/grep/libgreputils/sys/types.h | 54 +
gnu/usr.bin/grep/libgreputils/unistd.h | 1869 ++++++++++++++++++++
gnu/usr.bin/grep/libgreputils/unistr.h | 2 +-
gnu/usr.bin/grep/libgreputils/unitypes.h | 2 +-
gnu/usr.bin/grep/libgreputils/uniwidth.h | 2 +-
gnu/usr.bin/grep/libgreputils/wchar.h | 1340 ++++++++++++++
.../grep/libgreputils/{fcntl.h => wctype.h} | 733 +++++---
35 files changed, 10455 insertions(+), 2168 deletions(-)
delete mode 100644 gnu/usr.bin/grep/Makefile.inc0
create mode 100755 gnu/usr.bin/grep/egrep/egrep
create mode 100755 gnu/usr.bin/grep/fgrep/fgrep
delete mode 100644 gnu/usr.bin/grep/libgrep/Makefile
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => ctype.h} (60%)
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => dirent.h} (66%)
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => iconv.h} (63%)
create mode 100644 gnu/usr.bin/grep/libgreputils/inttypes.h
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => langinfo.h} (61%)
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => locale.h} (65%)
create mode 100644 gnu/usr.bin/grep/libgreputils/stdio.h
create mode 100644 gnu/usr.bin/grep/libgreputils/stdlib.h
create mode 100644 gnu/usr.bin/grep/libgreputils/string.h
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => sys/time.h} (64%)
create mode 100644 gnu/usr.bin/grep/libgreputils/sys/types.h
create mode 100644 gnu/usr.bin/grep/libgreputils/unistd.h
create mode 100644 gnu/usr.bin/grep/libgreputils/wchar.h
copy gnu/usr.bin/grep/libgreputils/{fcntl.h => wctype.h} (55%)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/51ddd709576b6d603cb35ff07b103a739f875a02
--
DragonFly BSD source repository