Opened 10 years ago
Closed 10 years ago
#7959 closed Bugs (wontfix)
regex lookahead fails at about 95,000 chars
| Reported by: | | Owned by: | John Maddock |
|---|---|---|---|
| Milestone: | To Be Determined | Component: | regex |
| Version: | Boost 1.52.0 | Severity: | Problem |
| Keywords: | | Cc: | |
Description
A search using a negative lookahead fails if the matcher must traverse about 95,000 characters without a match.
Test pattern: start((?!end).)*stuff
Input:
start nothing end start some stuff end start nothing end start more stuff end
This will match the two start-end regions containing "stuff". However, insert about 95000 random characters into the first start-end region, and the search fails.
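To make the failing case reproducible, the input can be generated rather than pasted. The following is a minimal sketch; `make_input` is a hypothetical helper name, and a fixed `'x'` filler is assumed in place of the random characters so that the filler never contains "end" or "stuff".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the ticket): returns the sample input with
// `filler` extra characters inserted into the first start-end region.
// A fixed 'x' stands in for the reporter's random characters, so the input is
// deterministic and the filler contains neither "end" nor "stuff".
std::string make_input(std::size_t filler) {
    std::string text = "start nothing";
    text.append(filler, 'x');
    text += " end start some stuff end start nothing end start more stuff end";
    return text;
}
```

`make_input(0)` gives back the original input, while something like `make_input(95000)` searched with the test pattern should exercise the failing case described here.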
Debugging on VC2008 shows a Microsoft C++ exception: boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<std::runtime_error> >
I traced as far as finding that the error occurs on the first call to match_all_states().
If you catch the exception you will see that it says:
"Ran out of stack space trying to match the regular expression."
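A minimal sketch of catching that exception and reading its message follows. The exception type shown in the debugger derives from `std::runtime_error`, so catching that base class is enough. `search_or_report` is a hypothetical wrapper name; it takes the search as a callable so the error handling stays separate from the regex call itself.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical wrapper: runs a callable that performs the search and returns
// whether a match was found. If the callable throws std::runtime_error, the
// message is stored in `error` and the search is reported as "no match".
template <class Search>
bool search_or_report(Search search, std::string& error) {
    error.clear();
    try {
        return search();
    } catch (const std::runtime_error& e) {
        error = e.what();
        return false;
    }
}

// Intended use with Boost.Regex (needs <boost/regex.hpp>):
//   boost::regex re("start((?!end).)*stuff");
//   boost::smatch m;
//   std::string err;
//   bool found = search_or_report(
//       [&] { return boost::regex_search(text, m, re); }, err);
```

After a failed call, `err` holds the message text, which makes it possible to tell "no match" apart from "matcher gave up".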
Basically Regex sets an upper limit on how much memory it will grab when trying to find a match. If you increase the values of BOOST_REGEX_MAX_BLOCKS and/or BOOST_REGEX_BLOCKSIZE in boost/regex/user.hpp and then rebuild everything (including the library), then the issue will disappear, or at least shift to larger texts before you hit the limit. Obviously if you set BOOST_REGEX_MAX_BLOCKS to something like INT_MAX then there is effectively no limit; whether you think that's a good idea I'll leave up to you!
But basically this is by design.
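For reference, the overrides in boost/regex/user.hpp might look like the sketch below. The numbers are illustrative assumptions only, not defaults or recommended values.

```cpp
// Sketch of overrides placed in boost/regex/user.hpp.
// The values are illustrative assumptions, not defaults or recommendations.
#define BOOST_REGEX_BLOCKSIZE 8192
#define BOOST_REGEX_MAX_BLOCKS 4096
```

As noted above, the library and everything that uses it need to be rebuilt after changing these.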