Opened 16 years ago

Closed 16 years ago

#696 closed Support Requests (Works For Me)

Replacing IsBasicLatin in extended syntax

Reported by: nobody
Owned by: nobody
Milestone:
Component: None
Version: None
Severity:
Keywords:
Cc:

Description

I'm trying to create an expression that replaces
\p{IsBasicLatin}, that is, [\x00-\x7F]. I can do it
with Perl syntax, but not with 'extended' or 'awk'. It
simply doesn't find what I want it to find.

The expressions I've tried are:

[\\x00-\\x7F]
[\\x{00}-\\x{7F}]
[[.NUL.]-[.DEL.]]

I don't understand the problem, or how I can make it 
work.

Thanks, Moddy.

Change History (1)

comment:1 by John Maddock, 16 years ago

Status: assigned → closed

In the docs for POSIX regular expressions here:
file:///c:/data/boost/develop/boost/libs/regex/doc/syntax_extended.html

It says: 

Character ranges:

For example [a-c] will match any single character in the
range 'a' to 'c'.  By default, for POSIX-Extended regular
expressions, a character x is within the range y to z, if it
collates within that range; THIS RESULTS IN LOCALE-SPECIFIC
BEHAVIOUR.  This behavior can be turned off by unsetting
the collate option flag - in which case whether a character
appears within a range is determined by comparing the code
points of the characters only.

So use boost::regex::extended & ~boost::regex::collate as
the syntax type to force character ranges to be independent
of the locale.
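
As a rough illustration of the code-point behaviour described
above, here is a minimal sketch. It uses std::regex from the
C++ standard library as a stand-in, not Boost.Regex itself:
std::regex has flags with the same names (extended, collate),
but there collate is opt-in, so plain std::regex::extended
already compares character codes. The helper names
ascii_range_regex, is_basic_latin and run_examples are made up
for this example, and the range starts at \x01 rather than
\x00 to keep an embedded NUL out of the pattern string.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Build a POSIX-extended regex whose bracket range is written with the raw
// characters 0x01 and 0x7F, so no escape handling inside [...] is required.
// Assumption: std::regex is only a stand-in for Boost.Regex here. The collate
// flag is deliberately NOT set, so range membership is decided by character
// code rather than by locale collation order.
inline const std::regex& ascii_range_regex() {
    static const std::regex re(std::string("[") + '\x01' + "-" + '\x7F' + "]*",
                               std::regex::extended);
    return re;
}

// True if every character of s lies in the range 0x01..0x7F.
inline bool is_basic_latin(const std::string& s) {
    return std::regex_match(s, ascii_range_regex());
}

// A few sanity checks on hypothetical inputs.
inline void run_examples() {
    assert(is_basic_latin("plain ASCII text"));
    assert(is_basic_latin(""));              // empty input matches "*"
    assert(!is_basic_latin("caf\xC3\xA9"));  // UTF-8 bytes of e-acute are > 0x7F
}
```

With Boost.Regex, the corresponding step would be to pass
boost::regex::extended & ~boost::regex::collate as the flags
when constructing the expression, as suggested above.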

HTH, John.