Opened 10 years ago

Closed 10 years ago

#7758 closed Bugs (fixed)

Regex will create stack error after upgrade of boost

Reported by: david.ecker@…
Owned by: John Maddock
Milestone: To Be Determined
Component: regex
Version: Boost 1.52.0
Severity: Problem
Keywords:
Cc:

Description

The regex .*?\r\n(version .*?\r\n) causes a stack error with Boost 1.44 and 1.52. It worked with Boost 1.31.

I used the following method to get the tokens: boost::sregex_iterator a_RegIterator( a_InputString.begin(), a_InputString.end(), a_Expression, a_RegExMatchFlags );

With the match flag match_single_line set, it works too. The error seems to show up only if the input text is rather large (100 KB...).

Attachments (2)

BoostRegexTest.cpp (1.7 KB) - added by david.ecker@… 10 years ago.
cpp file
Session.txt (225.4 KB) - added by david.ecker@… 10 years ago.
Input File

Change History (7)

by david.ecker@…, 10 years ago

Attachment: BoostRegexTest.cpp added

cpp file

by david.ecker@…, 10 years ago

Attachment: Session.txt added

Input File

comment:1 by david.ecker@…, 10 years ago

I am testing using Visual Studio 2010, 64bit.

comment:2 by John Maddock, 10 years ago

Resolution: invalid
Status: new → closed

The program throws a std::runtime_error because the match was taking too long to find - without a try-catch block around the code there is a stack error (I don't understand that one), but with a try-catch block the exception is caught and everything is behaving as expected.
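The try-catch arrangement described above might look roughly like the sketch below. It is an illustration, not the code from the attachment: `run_user_regex` and `MatchResult` are hypothetical names, and the alias `rx` points at `std` so the sketch builds without Boost. With Boost.Regex, include `<boost/regex.hpp>` and alias `boost` instead; the exception raised when a match takes too long derives from `std::runtime_error`, so the same handler applies.

```cpp
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: std::regex stands in for Boost.Regex here.
// For Boost, include <boost/regex.hpp> and write `namespace rx = boost;`.
namespace rx = std;

// Hypothetical result type: either the matches or an error message.
struct MatchResult {
    bool ok = false;
    std::vector<std::string> matches;
    std::string error;
};

// Hypothetical helper: compiles a user-supplied pattern, iterates over all
// matches, and reports any failure instead of letting the exception escape.
MatchResult run_user_regex(const std::string& pattern, const std::string& input) {
    MatchResult result;
    try {
        const rx::regex expr(pattern);  // throws on a malformed pattern
        for (rx::sregex_iterator it(input.begin(), input.end(), expr), end;
             it != end; ++it) {
            result.matches.push_back(it->str());
        }
        result.ok = true;
    } catch (const std::runtime_error& e) {
        // Covers both compile errors and "match too complex" errors.
        result.matches.clear();
        result.error = e.what();
    }
    return result;
}
```

A caller would check `ok` before using `matches` and could show `error` to the user who entered the expression.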

Is there a reason for the leading .* in the expression? It serves no good purpose other than to make the expression harder to match - you can get the same information from match_results::prefix() and way better performance.
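As a rough sketch of that suggestion, the leading .*? can be dropped and the skipped text read from prefix() on each match. The names `Token` and `extract_versions` are hypothetical, and `std::regex` is used as a stand-in so the example builds without Boost; `boost::sregex_iterator` and `boost::match_results::prefix()` offer the same interface.

```cpp
#include <regex>
#include <string>
#include <vector>

// Assumption: std::regex stands in for Boost.Regex here.
// For Boost, include <boost/regex.hpp> and write `namespace rx = boost;`.
namespace rx = std;

// Hypothetical record for one match.
struct Token {
    std::string before;   // text skipped before this match, from prefix()
    std::string version;  // contents of capture group 1
};

// The expression from the ticket without its leading ".*?".
std::vector<Token> extract_versions(const std::string& input) {
    static const rx::regex expr(R"(\r\n(version .*?\r\n))");
    std::vector<Token> tokens;
    for (rx::sregex_iterator it(input.begin(), input.end(), expr), end;
         it != end; ++it) {
        Token t;
        t.before = it->prefix().str();  // what the leading ".*?" used to consume
        t.version = (*it)[1].str();
        tokens.push_back(t);
    }
    return tokens;
}
```

Because the matcher no longer has to grow a lazy repeat from every starting position, it only needs to locate the literal line break followed by "version".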

comment:3 by david.ecker@…, 10 years ago

Resolution: invalid (cleared)
Status: closed → reopened

I haven't posted the whole code. Catching the exception is not really the problem. The user is able to enter his own regex for a defined job; I just posted one as an example. With this regex he describes which parts of the text are to be extracted, replaced, or used further on.

The main problem is that a large number of differently configured jobs already exist, because this worked fast with Boost 1.31. I can't really upgrade Boost, because doing so would break existing user-defined regex jobs.

comment:4 by anonymous, 10 years ago

Investigating some more I believe you are correct - I'm testing a fix now (there's an optimization heuristic that's used almost everywhere else, but not for .* for some strange reason).

In the meantime, try building both Boost and your code with BOOST_REGEX_RECURSIVE defined, as the issue doesn't occur in the stack-recursive implementation. Be aware, however, that the stack-recursive implementation isn't really considered safe to use with VC10, as it's harder to safely trap stack overflows (and throw an exception) than it was in earlier Visual Studio releases. Of course, nearly every other regex implementation out there is stack-recursive and also has this issue...

comment:5 by John Maddock, 10 years ago

Resolution: fixed
Status: reopened → closed

(In [81705]) Add missing optimization for leading .* repeats. Fixes #7758.
