Opened 6 years ago

#12619 new Bugs

Boost.Regex partial_match fails (see also Ticket #11776 feature request)

Reported by: Dr. Robert van Engelen <engelen@…>
Owned by: John Maddock
Milestone: To Be Determined
Component: regex
Version: Boost 1.61.0
Severity: Problem
Keywords: partial_match
Cc:

Description

Boost.Regex is a great library that we use extensively. I am re-raising Ticket #11776 as a bug. The partial_match implementation is broken: regex repetitions (*, +) may behave lazily or greedily depending on the size of the input text buffer. This is very unfortunate, because partial_match provides the only mechanism for searching streaming input text without buffering the entire text. Restricting the regex to simple forms without repetitions (*, +) is not a viable workaround. There are use cases in which we must take interactive input (i.e. buffer one character at a time) or process large files in which the text matching the pattern may not fit in the currently allocated buffer. In those cases the matcher does not produce the longest match and, worse, we cannot tell whether the buffer must be enlarged to continue looking for the longest match.

A correct partial_match algorithm should flag the result as a partial match instead of a full match for as long as backtracking on a repetition in the regex is still possible given the partial input text. With this change, matching "abc.*123" may require the whole input, but in this case that is OK! We need this flexibility in the matcher when using a buffering approach.
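
To make the requested rule concrete, here is a toy sketch in standard C++. It is not Boost.Regex code and says nothing about how Boost.Regex is implemented. The names ToyResult, toy_match and toy_classify are hypothetical, and the pattern grammar is an assumption made for illustration: literal characters, each optionally followed by '+', matched only at the start of the buffer. The sketch reports a partial match whenever any matching attempt touched the end of the buffer, even if a full match was also found.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not a Boost.Regex type.
enum class ToyResult { NoMatch, Partial, Full };

// Anchored backtracking matcher for the toy grammar described above.
// Sets hit_end when any attempt reached the end of the buffer, meaning
// that more input could still change or extend the result.
static bool toy_match(const std::string& pat, std::size_t i,
                      const std::string& buf, std::size_t j, bool& hit_end) {
    if (i == pat.size()) return true;  // whole pattern consumed
    const char c = pat[i];
    const bool plus = i + 1 < pat.size() && pat[i + 1] == '+';
    if (!plus) {
        if (j == buf.size()) { hit_end = true; return false; }  // out of input
        return buf[j] == c && toy_match(pat, i + 1, buf, j + 1, hit_end);
    }
    std::size_t k = 0;  // length of the greedy run of c starting at j
    while (j + k < buf.size() && buf[j + k] == c) ++k;
    if (j + k == buf.size()) hit_end = true;  // the run could still grow
    for (std::size_t n = k; n >= 1; --n)      // backtrack from the longest run
        if (toy_match(pat, i + 2, buf, j + n, hit_end)) return true;
    return false;
}

// Applies the requested rule: if the end of the buffer was touched, the
// result is reported as partial, so the caller knows to read more input.
ToyResult toy_classify(const std::string& pat, const std::string& buf) {
    bool hit_end = false;
    const bool full = toy_match(pat, 0, buf, 0, hit_end);
    if (hit_end) return ToyResult::Partial;
    return full ? ToyResult::Full : ToyResult::NoMatch;
}
```

Under this rule, toy_classify("ab+", "abb") is Partial because the run of b's reaches the buffer end, while toy_classify("ab+", "abbc") is Full because the repetition is known to have stopped.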

Unfortunately, the workaround suggested by the Boost.Regex documentation, which is to check whether the pattern matched the input up to the buffer end (indicating a partial match), does not always work.

Attachments (1)

boostbug.cpp (2.5 KB) - added by Dr. Robert van Engelen <engelen@…> 6 years ago.
Small example to demonstrate the issue

Change History (1)

by Dr. Robert van Engelen <engelen@…>, 6 years ago

Attachment: boostbug.cpp added

Small example to demonstrate the issue
