git: libc/regex: Replace old regex library with modified TRE
John Marino
marino at crater.dragonflybsd.org
Thu Aug 6 15:15:30 PDT 2015
commit 6af9a77b394698e42f3a7ec6126497a3fc2fd470
Author: John Marino <draco at marino.st>
Date: Thu Aug 6 23:26:49 2015 +0200
libc/regex: Replace old regex library with modified TRE
The existing DragonFly regex library has several limitations, including
lack of wide character support and no collation ability because it is
locked to the POSIX/C locale. It is also slow and fails a number of
tests in the AT&T Research regex test suite:
basic : TEST testregex, 539 tests, 0 errors
categorize : TEST testregex, 20 tests, 0 errors
nullsubexpr : TEST testregex, 84 tests, 31 errors
leftassoc : TEST testregex, 12 tests, 12 errors
rightassoc : TEST testregex, 24 tests, 0 errors
forcedassoc : TEST testregex, 48 tests, 8 errors
repetition : TEST testregex, 129 tests, 37 errors
With the modified TRE library it achieves these scores (elevated with new
regnexec support):
basic : TEST testregex, 808 tests, 0 errors
categorize : TEST testregex, 26 tests, 0 errors
nullsubexpr : TEST testregex, 172 tests, 0 errors
leftassoc : TEST testregex, 12 tests, 12 errors
rightassoc : TEST testregex, 36 tests, 0 errors
forcedassoc : TEST testregex, 84 tests, 0 errors
repetition : TEST testregex, 241 tests, 0 errors
Here's proof that the regex library is now locale sensitive:
> env LANG=C sed /abandonn[a-z]/d fwl-sort-C.txt
a
abandonnâmes
abandonnât
abandonnâtes
abandonnèrent
abandonné
abandonnée
abandonnées
abandonnés
abord
abords
absence
> env LANG=fr_FR sed /abandonn[a-z]/d fwl-sort-C.txt
a
abord
abords
absence
accepta
acceptai
acceptaient
acceptais
acceptait
acceptant
acceptas
acceptasse
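
The sed runs above can be approximated in C. The following is a minimal sketch using only the standard setlocale(3), regcomp(3) and regexec(3) calls; the helper name line_matches_in_locale is hypothetical and is not part of this commit. It assumes the locale has to be selected before the pattern is compiled so that the bracket expression [a-z] is interpreted under that locale. Whether an accented character falls inside [a-z] under fr_FR depends on the platform's locale data and regex implementation.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libc): report whether `line` matches
 * the POSIX basic regular expression `pattern` under the named locale.
 * Returns 1 on a match, 0 on no match, and -1 on error (locale not
 * available, pattern does not compile, or regexec fails).
 */
int
line_matches_in_locale(const char *locale_name, const char *pattern,
    const char *line)
{
	regex_t re;
	int rc;

	/* Select the locale first; note this changes it process-wide. */
	if (setlocale(LC_ALL, locale_name) == NULL)
		return (-1);

	/* cflags without REG_EXTENDED means a basic RE, as in plain sed. */
	if (regcomp(&re, pattern, REG_NOSUB) != 0)
		return (-1);

	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);

	if (rc == 0)
		return (1);
	if (rc == REG_NOMATCH)
		return (0);
	return (-1);
}
```

A caller could compare line_matches_in_locale("C", "abandonn[a-z]", line) with the same call using an installed French locale name for each line of the word list. The exact locale name to pass (for example with or without a codeset suffix) is system dependent.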
Several new functions have been added to libc:
variations of regcomp: regcomp_l,
regncomp, regncomp_l,
regwcomp, regwcomp_l,
regnwcomp, regnwcomp_l
variations of regexec: regnexec, regwexec, regwnexec
The regex.3 and re_format.7 man pages have been updated and symlinked
accordingly.
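
The commit message does not show the prototypes of the new functions; the updated regex.3 is the reference for those. Assuming, as the names suggest, that the "n" variants take an explicit length instead of relying on NUL termination, the sketch below shows roughly what length-bounded matching means, written with portable POSIX calls only. The helper match_first_n is hypothetical. Unlike a native length-aware call, this copy-based approach costs an allocation per call and cannot handle input with embedded NUL bytes.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libc): match the extended regular
 * expression `pattern` against only the first `len` bytes of `buf`.
 * Returns 1 on a match, 0 on no match, and -1 on error.
 */
int
match_first_n(const char *pattern, const char *buf, size_t len)
{
	regex_t re;
	char *copy;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);

	/* Make a NUL-terminated copy of the bounded region. */
	copy = malloc(len + 1);
	if (copy == NULL) {
		regfree(&re);
		return (-1);
	}
	memcpy(copy, buf, len);
	copy[len] = '\0';

	rc = regexec(&re, copy, 0, NULL, 0);
	free(copy);
	regfree(&re);

	if (rc == 0)
		return (1);
	if (rc == REG_NOMATCH)
		return (0);
	return (-1);
}
```

For example, match_first_n("needle", "hay needle", 6) only sees "hay ne", so the pattern is not found even though it occurs later in the buffer.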
Summary of changes:
 include/Makefile                          |    2 +-
 include/regex.h                           |  120 --
 lib/libc/Makefile.inc                     |    2 +-
 lib/libc/regex/COPYRIGHT                  |   56 -
 lib/libc/regex/Makefile.inc               |   19 -
 lib/libc/regex/Symbol.map                 |    6 -
 lib/libc/regex/WHATSNEW                   |   94 --
 lib/libc/regex/cname.h                    |  139 ---
 lib/libc/regex/engine.c                   | 1186 -------------------
 lib/libc/regex/regcomp.c                  | 1823 -----------------------------
 lib/libc/regex/regerror.c                 |  170 ---
 lib/libc/regex/regex2.h                   |  193 ---
 lib/libc/regex/regexec.c                  |  229 ----
 lib/libc/regex/regfree.c                  |   85 --
 lib/libc/regex/utils.h                    |   54 -
 lib/libc/tre-regex/Makefile.inc           |   42 +
 lib/libc/tre-regex/Symbol.map             |   23 +
 lib/libc/tre-regex/cname.h                |  140 +++
 lib/libc/tre-regex/config.h               |  259 ++++
 lib/libc/{regex => tre-regex}/re_format.7 |  354 +++++-
 lib/libc/{regex => tre-regex}/regex.3     |  327 +++++-
 lib/libc/tre-regex/regex.h                |  232 ++++
 lib/libc/tre-regex/tre.h                  |    8 +
23 files changed, 1321 insertions(+), 4242 deletions(-)
delete mode 100644 include/regex.h
delete mode 100644 lib/libc/regex/COPYRIGHT
delete mode 100644 lib/libc/regex/Makefile.inc
delete mode 100644 lib/libc/regex/Symbol.map
delete mode 100644 lib/libc/regex/WHATSNEW
delete mode 100644 lib/libc/regex/cname.h
delete mode 100644 lib/libc/regex/engine.c
delete mode 100644 lib/libc/regex/regcomp.c
delete mode 100644 lib/libc/regex/regerror.c
delete mode 100644 lib/libc/regex/regex2.h
delete mode 100644 lib/libc/regex/regexec.c
delete mode 100644 lib/libc/regex/regfree.c
delete mode 100644 lib/libc/regex/utils.h
create mode 100644 lib/libc/tre-regex/Makefile.inc
create mode 100644 lib/libc/tre-regex/Symbol.map
create mode 100644 lib/libc/tre-regex/cname.h
create mode 100644 lib/libc/tre-regex/config.h
rename lib/libc/{regex => tre-regex}/re_format.7 (56%)
rename lib/libc/{regex => tre-regex}/regex.3 (69%)
create mode 100644 lib/libc/tre-regex/regex.h
create mode 100644 lib/libc/tre-regex/tre.h
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/6af9a77b394698e42f3a7ec6126497a3fc2fd470
--
DragonFly BSD source repository