Opened 18 years ago

Closed 15 years ago

#360 closed Support Requests (invalid)

Regex

Reported by: nobody
Owned by: John Maddock
Milestone:
Component: regex
Version: None
Severity: Problem
Keywords:
Cc:

Description (last modified by John Maddock)

PROBLEM:
========

If I use the following code:

lv_rcode = regcomp(&lr_re, ms_RegExp, REG_EXTENDED);

lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);

to find the following text: 

cm582172

Using the following regular expression

[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

The text is found.

However if I use instead the following code

boost::regex regx(config.vRegx.at(iCnt).c_str());
flags = boost::match_default;
boost::regex_search(start, end, what, regx, flags);

With the same regular expression it does not find the 
string unless I use the following regular expression:

[a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*

The difference is the lower case of the alpha range [a-z]

My question:

Is there a way to make regexec case sensitive for the alpha 
range, as the regex_search function is?

Thanks,

Matias

Change History (2)

comment:1 by Daryle Walker, 15 years ago

Component: None → regex
Severity: Problem

comment:2 by John Maddock, 15 years ago

Description: modified (diff)
Resolution: None → invalid
Status: assigned → closed

This occurs because, by default, POSIX regular expressions treat character ranges like [A-Z] as locale sensitive, and will match any character that collates within that range. For most locales the characters "cm" do collate within that range, and hence they match. You can get more Perl-like behaviour by setting REG_NOCOLLATE as well as REG_EXTENDED when compiling the expression.
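
To make the collation point concrete, here is a minimal sketch using only the C++ standard library. It is not how Boost.Regex implements ranges; collates_within is a hypothetical helper introduced purely for illustration. It asks whether a character sorts between the two endpoints of a range under a given locale's collation order.

```cpp
#include <locale>

// Hypothetical helper (not part of Boost.Regex): returns true when `c`
// sorts between `lo` and `hi` under the collation rules of `loc`.
bool collates_within(const std::locale& loc, char lo, char c, char hi)
{
    const std::collate<char>& coll =
        std::use_facet<std::collate<char> >(loc);

    // compare() returns a negative, zero, or positive value, like strcmp,
    // but ordered by the locale's collation rules.
    const bool not_below = coll.compare(&lo, &lo + 1, &c, &c + 1) <= 0;
    const bool not_above = coll.compare(&c, &c + 1, &hi, &hi + 1) <= 0;
    return not_below && not_above;
}
```

In the classic "C" locale, collates_within(std::locale::classic(), 'A', 'c', 'Z') is false, because comparison there follows character codes and 'c' sorts after 'Z'. In a locale whose collation interleaves upper and lower case, the same call may return true, which is the effect described above for [A-Z] matching "cm".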
