Opened 16 years ago

Closed 16 years ago

#858 closed Bugs (Invalid)

regex_match throws exception

Reported by: krankurs
Owned by: John Maddock
Milestone:
Component: regex
Version: None
Severity:
Keywords:
Cc:

Description

Running:
MS Windows XP with Service Pack 2, boost_1_33_1,
MS VS 2005, VC++ console app, 
BOOST_REGEX_DYN_LINK;
Multi-threaded Debug DLL (/MDd)



boost::regex site_url_regex("http(s?)://((\b(?:\d{1,3}\.){3}\d{1,3}\b)|(www(.[a-z0-9]+)+.mydomain.com))");


boost::regex_match("http://www.appwebsite1.mydomain.com", site_url_regex);

returns true; (as expected)

boost::regex_match("http://www.appwebsite1.mydomain.com/1234", site_url_regex);

returns false; (as expected)

boost::regex_match("http://www.appwebsite1.mydomain.com/12345678/12345678/", site_url_regex);

throws an exception (expected false)

Thank you,
Leonid

Change History (2)

comment:1 by nobody, 16 years ago

You forgot to escape the backslashes in the regex string literal!
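As a small illustration of why this matters, the sketch below compares what a C++ string literal actually contains with and without the extra backslash. The function names are hypothetical and exist only for this example. In a plain string literal the compiler consumes `\b` as the backspace control character, so the regex engine never sees a word-boundary escape.

```cpp
#include <string>

// "\b" in a C++ string literal is a single backspace character (0x08),
// not the two-character regex escape for a word boundary.
std::string unescaped_word_boundary() {
    return "\b";
}

// "\\b" yields a backslash followed by 'b', which is what the regex
// engine needs to receive in order to treat it as a word boundary.
std::string escaped_word_boundary() {
    return "\\b";
}
```

The same reasoning applies to `\d` and `\.` in the reported pattern: each backslash has to be doubled in the source code for it to reach the regex engine.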

comment:2 by John Maddock, 16 years ago

Status: assigned → closed

The behaviour is deliberate and by design: there are some Perl-style regular expressions that take effectively "forever" to match. Boost.Regex keeps track of how many states the machine has visited and, if the number appears to be growing out of control, throws an exception.

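In this case, as well as the backslashes needing to be escaped, the problem is with (.[a-z0-9]+)+, which is to all intents and purposes the same as the well-known pathological case (.+)+. Because the unescaped . matches any character, including those in [a-z0-9], the same text can be divided between repetitions in an exponential number of ways, and a failing match has to try them all.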
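To make the blow-up concrete, here is a toy backtracking matcher with a step budget. It is only a sketch of the general idea and not Boost.Regex's actual implementation. It hard-codes the pattern (a+)+b as a stand-in for the nested-repeat shape, and the names `groups_from` and `toy_match` and the default budget of 100000 are assumptions made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Toy backtracking matcher for the fixed pattern (a+)+b, anchored at both ends.
// It counts every attempt and throws once a budget is exceeded.
// Illustration only: this is not how Boost.Regex is implemented.
bool groups_from(const std::string& text, std::size_t pos,
                 std::size_t& steps, std::size_t budget) {
    if (++steps > budget) {
        throw std::runtime_error("match abandoned: step budget exceeded");
    }
    // Try every possible length for the current a+ group.
    for (std::size_t end = pos; end < text.size() && text[end] == 'a'; ++end) {
        const std::size_t next = end + 1;
        // Option 1: this was the last group, so 'b' must be the final character.
        if (next + 1 == text.size() && text[next] == 'b') return true;
        // Option 2: another a+ group starts right after this one.
        if (groups_from(text, next, steps, budget)) return true;
    }
    return false;
}

// Hypothetical entry point; the default budget is an arbitrary choice.
bool toy_match(const std::string& text, std::size_t budget = 100000) {
    std::size_t steps = 0;
    return groups_from(text, 0, steps, budget);
}
```

With this sketch, `toy_match("aaab")` succeeds after a handful of steps. Forty 'a' characters with no trailing 'b' would need on the order of 2^40 attempts, so the budget runs out and the function throws instead of running indefinitely.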

I assume you meant to use (\\.[a-z0-9]+)+ so as to match a literal ".", and this would be OK.
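A minimal sketch of the corrected pattern follows. It uses `std::regex` with its default ECMAScript grammar as a stand-in so that it builds without Boost. With Boost.Regex the same string literal would be passed to `boost::regex`, and I have not verified this exact code against boost_1_33_1. The helper name `is_site_url` is hypothetical. Escaping the dots in `.mydomain.com` is an additional assumption about the reporter's intent and goes beyond the fix suggested above.

```cpp
#include <regex>
#include <string>

// Returns true when the whole string is an http(s) URL for either a dotted
// IPv4-style address or a www.<labels>.mydomain.com host, with no path.
bool is_site_url(const std::string& url) {
    // Every regex backslash is doubled so that it survives the C++ string
    // literal. The dots meant as literal dots are escaped as well.
    static const std::regex site_url_regex(
        "http(s?)://((\\b(?:\\d{1,3}\\.){3}\\d{1,3}\\b)|"
        "(www(\\.[a-z0-9]+)+\\.mydomain\\.com))");
    return std::regex_match(url, site_url_regex);
}
```

Because `[a-z0-9]+` can no longer overlap with the separator it follows, there is only one way to split a host name into labels, so the URL with the long path is rejected quickly instead of triggering the exception.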

John Maddock