Opened 10 years ago

Closed 10 years ago

#7935 closed Bugs (invalid)

Exception if specify character range with certain code points (if collation used)

Reported by: anonymous
Owned by: John Maddock
Milestone: To Be Determined
Component: regex
Version: Boost 1.51.0
Severity: Problem
Keywords:
Cc:

Description

If you try to compile the regex:

[\x{0080}-\x{10C6}]

the regex code will throw an exception if boost::regex::collate is specified. However, the expressions:

[\x{0080}-\x{10C5}]

or

[\x{0080}-\x{10D0}]

are just fine. I think the issue is related to 0x10C6 not being an assigned Unicode code point, but I could be wrong.

Sample code:

boost::wregex regexText( L"[\\x{0080}-\\x{10C6}]", boost::regex::collate );

O/S: Win 7 (x64), Compiler: VS 2010 (SP1)

Change History (1)

comment:1 by John Maddock, 10 years ago

Resolution: invalid
Status: new → closed

I don't believe this is a bug - or at least not one that can be fixed.

When the collate flag is set, the behavior is determined entirely by the underlying OS and the locale in effect. If the underlying APIs believe that the character 0x10D0 collates before 0x80, then the library is correct in saying that the range is invalid. In short, the collate flag is only useful where the endpoints of the range come from the same block of characters in the current locale.
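
To illustrate the point about locale-dependent ordering, here is a minimal sketch that asks the standard library's collation facet whether two range endpoints are in order for a given locale. It is not Boost.Regex's implementation, which may derive the ordering differently (for example from sort keys in its traits class); `range_is_ordered` is a hypothetical helper name introduced only for this example.

```cpp
#include <locale>
#include <string>

// Hypothetical helper, not part of Boost.Regex.
// Returns true when `first` does not collate after `last` in `loc`,
// i.e. when [first-last] would be a non-empty range under that
// locale's collation order (which need not match code-point order).
bool range_is_ordered(wchar_t first, wchar_t last,
                      const std::locale& loc = std::locale())
{
    const std::collate<wchar_t>& coll =
        std::use_facet<std::collate<wchar_t>>(loc);

    // compare() returns a negative value, zero, or a positive value,
    // ordered by the locale's collation rules.
    return coll.compare(&first, &first + 1, &last, &last + 1) <= 0;
}
```

With `std::locale::classic()` the comparison should follow plain code-point order, so `range_is_ordered(0x0080, 0x10C6, std::locale::classic())` is expected to be true. Passing the user's locale instead (for example `std::locale("")`, assuming the environment names a valid locale) may give a different answer, and that result depends on the OS and the locale data installed.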
