Boost C++ Libraries: Ticket #3299: boost regex regex_search crash https://svn.boost.org/trac10/ticket/3299
<p> I use a file, tmp.txt, larger than 5 MB, in which only one line contains bbbbbbbb. </p>
<p> I use the regex "[a-z].*(xxxxx)" and call regex_search. </p>
<p> Boost then throws an exception. </p>
Steven Watanabe Fri, 31 Jul 2009 19:29:09 GMT component changed; owner set https://svn.boost.org/trac10/ticket/3299#comment:1 https://svn.boost.org/trac10/ticket/3299#comment:1
<ul> <li><strong>owner</strong> set to <span class="trac-author">John Maddock</span> </li> <li><strong>component</strong> <span class="trac-field-old">None</span> → <span class="trac-field-new">regex</span> </li> </ul>
<p> Can you provide minimal code and input that reproduces the problem? What exception is thrown exactly? </p>
Ticket ufwt@… Tue, 04 Aug 2009 13:17:20 GMT <link>https://svn.boost.org/trac10/ticket/3299#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:2</guid>
<description> <p> Crash example: </p>
<blockquote> <p> boost::regex re("[a-z].*(d_notexiststring)",boost::regex::perl|boost::regex::no_except); </p> </blockquote>
<blockquote> <p> std::string s="f"; s.append(1024*100,'a'); s.append("dd"); boost::match_results&lt;std::string::const_iterator&gt; what; boost::match_flag_type flags = boost::match_default; </p> </blockquote>
<blockquote> <p> std::string::const_iterator start, end; start = s.begin(); end = s.end(); boost::regex_search(start,end,what,re,flags); </p> </blockquote>
<p> Exception: </p>
<blockquote> <p> uncaught exception of type boost::exception_detail::clone_impl&lt;boost::exception_detail::error_info_injector&lt;std::runtime_error&gt; &gt; </p> </blockquote>
<ul><li>Memory exhausted </li></ul><p> Raised when state_count &gt; max_state_count. </p>
</description> <category>Ticket</category> </item>
<item> <dc:creator>John Maddock</dc:creator> <pubDate>Thu, 06 Aug 2009
16:22:32 GMT</pubDate> <title>status changed https://svn.boost.org/trac10/ticket/3299#comment:3 https://svn.boost.org/trac10/ticket/3299#comment:3
<ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul>
<p> The code doesn't crash: it throws an exception that it's documented to throw. </p>
<p> In this case it's because the number of states visited in the FSM has grown too large and it bails out to prevent an "eternal" match attempt. </p>
<p> Perl manages to optimise this particular expression, but if you make it trivially more complex, say '[a-z].*(xxxxx)|x', then it takes a very long time indeed to return from a match attempt on your string. </p>
<p> I'll leave this open for now though, as it looks like this special case can be optimised a little more. </p>
<p> John. </p>
Ticket ufwt@… Thu, 06 Aug 2009 16:26:26 GMT <link>https://svn.boost.org/trac10/ticket/3299#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:4</guid>
<description> <p> So I use try/catch to keep the program from crashing! </p> </description> <category>Ticket</category> </item>
<item> <author>ufwt@…</author> <pubDate>Thu, 06 Aug 2009 16:31:02 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/3299#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:5</guid>
<description> <p> I also changed re_repeat's max so that re_repeat's max - min &lt; 1024. Is it OK to change this? </p> </description> <category>Ticket</category> </item>
<item> <dc:creator>anonymous</dc:creator> <pubDate>Fri, 07 Aug 2009 11:24:38 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/3299#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:6</guid>
<description> <p> "I also changed re_repeat's max so that re_repeat's max - min &lt; 1024. Is it OK to change this?" </p>
<p> Sorry, I don't understand what you are asking.
Change what code precisely? </p> </description> <category>Ticket</category> </item>
<item> <author>ufwt@…</author> <pubDate>Sat, 08 Aug 2009 13:20:33 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/3299#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:7</guid>
<description> <p> I changed re_repeat's max. Code: </p>
<p> void changeRegexRepeatNum(boost::regex &amp;regex,size_t max) { </p>
<blockquote> <p> boost::re_detail::re_syntax_base* state=regex.get_data().m_first_state; while(state){ </p>
<blockquote> <p> switch(state-&gt;type){ </p>
<blockquote> <p> case boost::re_detail::syntax_element_rep: case boost::re_detail::syntax_element_dot_rep: case boost::re_detail::syntax_element_char_rep: case boost::re_detail::syntax_element_short_set_rep: case boost::re_detail::syntax_element_long_set_rep: { </p>
<blockquote> <p> boost::re_detail::re_repeat *repeat=static_cast&lt;boost::re_detail::re_repeat*&gt;(state); if(repeat-&gt;max - repeat-&gt;min &gt;max){ </p>
<blockquote> <p> repeat-&gt;max = repeat-&gt;min + max; </p> </blockquote>
<p> } </p> </blockquote>
<p> } break; default: </p>
<blockquote> <p> break; </p> </blockquote> </blockquote>
<p> } state = state-&gt;next.p; </p> </blockquote>
<p> } </p> </blockquote>
<p> } </p>
</description> <category>Ticket</category> </item>
<item> <dc:creator>anonymous</dc:creator> <pubDate>Sat, 08 Aug 2009 16:52:12 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/3299#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:8</guid>
<description> <p> Sure you can do that, but I'm not sure why you would want to? </p>
<p> John.
</p> </description> <category>Ticket</category> </item>
<item> <author>ufwt@…</author> <pubDate>Sun, 09 Aug 2009 08:15:48 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/3299#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/3299#comment:9</guid>
<description> <p> Because my text is very long, doing this makes regex_search run faster! </p> </description> <category>Ticket</category> </item>
<item> <dc:creator>John Maddock</dc:creator> <pubDate>Tue, 02 Mar 2010 17:02:11 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/3299#comment:10 https://svn.boost.org/trac10/ticket/3299#comment:10
<ul> <li><strong>status</strong> <span class="trac-field-old">assigned</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">wontfix</span> </li> </ul> Ticket
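The explanation given in comment 3 (the matcher gives up once the number of visited states grows too large, rather than running "eternally") can be illustrated with a toy sketch. This is not Boost.Regex code: `budgeted_search`, `ticket_input`, and the `max_states` parameter are hypothetical names invented for this illustration, and the matcher only handles the fixed shape `[a-z].*(needle)` by naive backtracking. It assumes nothing about how Boost actually counts states; it only shows why the reporter's input makes the work grow roughly quadratically and why a budget turns that into an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Toy backtracking search for the fixed pattern shape  [a-z].*(needle).
// NOT Boost.Regex: `max_states` is a hypothetical budget, only analogous
// in spirit to the max_state_count check mentioned in the ticket.
bool budgeted_search(const std::string& text, const std::string& needle,
                     std::size_t max_states) {
    assert(!needle.empty());
    std::size_t states = 0;
    for (std::size_t start = 0; start < text.size(); ++start) {
        if (text[start] < 'a' || text[start] > 'z') continue;  // [a-z]
        // Greedy ".*": try the longest run first, then back off one
        // character at a time, testing the needle at each position.
        for (std::size_t pos = text.size(); pos > start; --pos) {
            if (++states > max_states)
                throw std::runtime_error("state budget exhausted");
            if (text.compare(pos, needle.size(), needle) == 0) return true;
        }
    }
    return false;
}

// Rebuilds the input string from the reporter's crash example.
std::string ticket_input() {
    std::string s = "f";
    s.append(1024 * 100, 'a');
    s.append("dd");
    return s;
}
```

With the reporter's input and a needle that never occurs, every one of the roughly 100,000 start positions backtracks over the whole remaining string, so a call such as `budgeted_search(ticket_input(), "d_notexiststring", 1000000)` runs out of budget and throws `std::runtime_error`. A caller that expects such inputs would wrap the call in `try`/`catch`, as the reporter describes doing in comment 4.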