Opened 5 years ago
#13156 new Bugs
Not word boundary - \b vs. NOT \B are not the same
Reported by: | anonymous | Owned by: | John Maddock |
---|---|---|---|
Milestone: | To Be Determined | Component: | regex |
Version: | Boost 1.64.0 | Severity: | Showstopper |
Keywords: | Not Word Boundary \B | Cc: | |
Description
I am reposting this from another source.
In theory, \B should match at every position where \b does not, and nowhere else.
In Boost.Regex this does not hold at the beginning or the end of the string.
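To make the expected relationship concrete, here is a minimal sketch that does not use any regex engine. It assumes ASCII word characters ([A-Za-z0-9_]) and treats positions outside the string as non-word, which is the usual Perl-style definition. The helper names (`is_word_char`, `is_word_boundary`, `mark_positions`, `demo`) and the short target `" ss "` are made up for illustration; they are not part of Boost.Regex and are not the target used in the test results.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Assumption: "word" means ASCII [A-Za-z0-9_] (default "C" locale).
inline bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// True if pos (0..s.size()) is a word boundary, i.e. exactly one of the
// neighbouring characters is a word character. Positions before the start
// and after the end of the string count as non-word.
inline bool is_word_boundary(const std::string& s, std::size_t pos) {
    const bool before = pos > 0 && is_word_char(s[pos - 1]);
    const bool after = pos < s.size() && is_word_char(s[pos]);
    return before != after;
}

// Emulates replacing every \b (want_boundary == true) or every \B
// (want_boundary == false) match with `marker`.
inline std::string mark_positions(const std::string& s, bool want_boundary,
                                  const std::string& marker = "<>") {
    std::string out;
    for (std::size_t pos = 0; pos <= s.size(); ++pos) {
        if (is_word_boundary(s, pos) == want_boundary) {
            out += marker;
        }
        if (pos < s.size()) {
            out += s[pos];
        }
    }
    return out;
}

// Hypothetical target " ss ": \b marks the two edges of the word, \B marks
// every other position, including the start and the end of the string.
inline void demo() {
    assert(mark_positions(" ss ", true) == " <>ss<> ");
    assert(mark_positions(" ss ", false) == "<> s<>s <>");
}
```

Under this definition every position is matched by exactly one of \b and \B, so a \B replacement that leaves the start or the end of a string like this unmarked is inconsistent with the same engine's \b result.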
Below is a list of test results from a few Perl-like engines. Apparently, this was fixed in Perl v5.22 (the results below are from v5.20). The only engines that seem to handle this correctly are PHP and ECMAScript.
Target = ' ssssssssssssss '
Replacement = '<>'

==================================================

PHP 7.03

\b = ' <>ssssssssssssss<> '
\B = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?<!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\B) = ' <>ssssssssssssss<> '
(?<!\B) = ' <>ssssssssssssss<> '

==================================================

Perl 5.20

\b = ' <>ssssssssssssss<> '
\B = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s '
(?!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?<!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\B) = ' <>ssssssssssssss<> '
(?<!\B) = ' <>ssssssssssssss<> '

==================================================

Boost 1.64

\b = ' <>ssssssssssssss<> '
\B = ' <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s '
(?!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?<!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\B) = '<> <>ssssssssssssss<> <>'
(?<!\B) = '<> <>ssssssssssssss<> <>'

==================================================

JavaScript

\b = ' <>ssssssssssssss<> '
\B = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\b) = '<> <> <> s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s<>s <>'
(?!\B) = ' <>ssssssssssssss<> '