Opened 10 years ago
Closed 10 years ago
#7744 closed Bugs (fixed)
make_u32regex() performs insufficient UTF-8 validation
Reported by: | anonymous | Owned by: | John Maddock |
---|---|---|---|
Milestone: | To Be Determined | Component: | regex |
Version: | Boost 1.52.0 | Severity: | Problem |
Keywords: | Cc: |
Description
The program below segfaults on the regular expression ".*\xf6.*". AFAIK the maximum value allowed as the leading byte of a 4-byte sequence is 0xF4, so 0xF6 can never appear in valid UTF-8. I would expect an exception instead of a crash.
The regular expression ".*\xe4.*" is created without an exception. However, 0xE4 starts a 3-byte sequence and no trailing (continuation) bytes are present. I would expect an exception here too.
We use Boost 1.52.0 together with ICU 50.1. The behavior is the same on Linux and Windows.
    #include <boost/regex/icu.hpp>

    int main(void)
    {
        // this line does not throw an exception although this is not valid UTF-8
        boost::u32regex(boost::make_u32regex(".*\xe4.*"));

        // this line segfaults
        boost::u32regex(boost::make_u32regex(".*\xf6.*"));

        return 0;
    }
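For reference, the kind of check I would expect can be sketched as a standalone validator. This is not the Boost.Regex implementation and not the fix that was committed; `is_valid_utf8` and `require_valid_utf8` are hypothetical names, and the byte ranges are the well-formed UTF-8 ranges from RFC 3629 as I understand them. It rejects both patterns from the program above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Boost): returns true if `s` is
// well-formed UTF-8 according to the byte ranges in RFC 3629.
bool is_valid_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t trail = 0;               // continuation bytes expected
        unsigned char lo = 0x80, hi = 0xBF;  // range for the first continuation byte

        if (lead <= 0x7F) {                  // ASCII
            ++i;
            continue;
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            trail = 1;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            trail = 2;
            if (lead == 0xE0) lo = 0xA0;       // exclude overlong encodings
            else if (lead == 0xED) hi = 0x9F;  // exclude UTF-16 surrogates
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            trail = 3;
            if (lead == 0xF0) lo = 0x90;       // exclude overlong encodings
            else if (lead == 0xF4) hi = 0x8F;  // exclude values above U+10FFFF
        } else {
            return false;  // 0x80..0xC1 and 0xF5..0xFF are never valid lead bytes
        }

        if (n - i - 1 < trail) return false;   // sequence is truncated
        for (std::size_t k = 1; k <= trail; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if (c < lo || c > hi) return false;
            lo = 0x80;                         // later continuation bytes use
            hi = 0xBF;                         // the full 0x80..0xBF range
        }
        i += trail + 1;
    }
    return true;
}

// Hypothetical wrapper: throws instead of letting malformed input through.
void require_valid_utf8(const std::string& pattern)
{
    if (!is_valid_utf8(pattern))
        throw std::invalid_argument("pattern is not well-formed UTF-8");
}
```

With this check, ".*\xe4.*" is rejected because the byte after 0xE4 is not a continuation byte, and ".*\xf6.*" is rejected because 0xF6 is above the largest valid lead byte, 0xF4. Calling something like `require_valid_utf8` on the pattern before passing it to `boost::make_u32regex` would give an exception in both cases.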
Fixed in Trunk rev #81614