#360 closed Support Requests (invalid)

Regex

Reported by:	nobody	Owned by:	John Maddock
Milestone:		Component:	regex
Version:	None	Severity:	Problem
Keywords:		Cc:

Description (last modified by John Maddock)

PROBLEM:
========

If I use the following code:

lv_rcode = regcomp(&lr_re, ms_RegExp, REG_EXTENDED);

lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);

to find the following text: 

cm582172

using the following regular expression:

[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

The text is found.

However, if I instead use the following code:

boost::regex regx(config.vRegx.at(iCnt).c_str());
flags = boost::match_default;
boost::regex_search(start, end, what, regx, flags);

with the same regular expression, it does not find the 
string unless I change the expression to:

[a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

The only difference is the lowercase alpha range [a-z].

My question:

Is there a way to make regexec case sensitive for 
the alpha range, as the regex_search function is?

Thanks,

Matias

Change History (2)

comment:1 by Daryle Walker, 15 years ago

Component:	None → regex
Severity:	→ Problem

comment:2 by John Maddock, 15 years ago

Description:	modified (diff)
Resolution:	None → invalid
Status:	assigned → closed

This occurs because, by default, POSIX regular expressions treat character ranges like [A-Z] as locale sensitive, and will match any character that collates within that range. For most locales the characters "c" and "m" do collate within that range, and hence they match. You can get more Perl-like behaviour by setting REG_NOCOLLATE as well as REG_EXTENDED when compiling the expression.
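
To illustrate what "Perl-like" range matching means here, the following is a minimal sketch, not code from the ticket. It assumes std::regex as a stand-in for boost::regex so that it builds without Boost, and the helper name found is hypothetical. No collate flag is passed, so a range such as [A-Z] is compared by character code and does not include "c" or "m".

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Boost or of the ticket): returns true
// when `pattern` matches somewhere in `text`.
// Assumption: std::regex with its default ECMAScript syntax stands in for
// boost::regex. std::regex::collate is NOT set, so character ranges are
// compared by character code rather than by locale collation order.
bool found(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    return std::regex_search(text, re);
}
```

With this helper, found("[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*", "cm582172") returns false, while the same call with [a-z] in place of [A-Z] returns true, which mirrors the regex_search behaviour described in the report.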
