Opened 18 years ago
Closed 15 years ago
#360 closed Support Requests (invalid)
Regex
| Reported by: | nobody | Owned by: | John Maddock |
|---|---|---|---|
| Milestone: | Component: | regex | |
| Version: | None | Severity: | Problem |
| Keywords: | Cc: |
Description
PROBLEM:
========
If I use the following code:
lv_rcode = regcomp(&lr_re, ms_RegExp, REG_EXTENDED);
lv_rcode = regexec(&lr_re, &lv_string[0], 1, &pm, 0);
to find the following text:
cm582172
using the following regular expression:
[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
the text is found.
However, if I instead use the following code:
boost::regex regx(config.vRegx.at(iCnt).c_str());
flags = boost::match_default;
boost::regex_search(start, end, what, regx, flags)
with the same regular expression, it does not find the
string unless I use the following regular expression:
[a-z]{2}[[:space:]]*[0-9]{6}[[:space:]]*
The only difference is the lowercase alpha range [a-z].
My question:
Is there a way to make regexec case sensitive for
the alpha range, as the regex_search function is?
Thanks,
Matias
Change History (2)
comment:1 by , 15 years ago
| Component: | None → regex |
|---|---|
| Severity: | → Problem |
comment:2 by , 15 years ago
| Description: | modified (diff) |
|---|---|
| Resolution: | None → invalid |
| Status: | assigned → closed |

This occurs because, by default, POSIX regular expressions treat character ranges such as [A-Z] as locale sensitive: a range matches any character that collates within it. In most locales the characters "c" and "m" collate within [A-Z], and hence they match. You can get more Perl-like behaviour by setting REG_NOCOLLATE as well as REG_EXTENDED when compiling the expression.
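For comparison, here is a minimal sketch of the non-collating behaviour, where a range such as [A-Z] covers only the character codes between 'A' and 'Z'. It uses std::regex from the C++ standard library as a stand-in rather than the Boost.Regex calls from the ticket, and the helper name `found` is hypothetical. It assumes the default "C" locale and leaves std::regex::collate unset, so ranges are not locale sensitive.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the ticket): reports whether `pattern`,
// compiled with the POSIX extended grammar, matches anywhere in `text`.
// std::regex stands in for Boost.Regex here. Because the
// std::regex::collate flag is not set, ranges like [A-Z] are not
// locale sensitive.
bool found(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern, std::regex::extended);
    return std::regex_search(text, re);
}
```

Under these assumptions, `found("[A-Z]{2}[[:space:]]*[0-9]{6}[[:space:]]*", "cm582172")` is false and the [a-z] variant is true, which is the same result the reporter saw from regex_search. If both cases should match regardless of collation settings, spelling the range out as [A-Za-z] avoids relying on locale behaviour at all.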