id,summary,reporter,owner,description,type,status,milestone,component,version,severity,resolution,keywords,cc
7744,make_u32regex() performs insufficient UTF-8 validation,anonymous,John Maddock,"The program below segfaults for the regular expression "".*\xf6.*"". AFAIK the maximum value allowed as the leading byte of a 4-byte sequence is 0xF4, so I would expect an exception.

The regular expression "".*\xe4.*"" is created without an exception. However, 0xE4 starts a 3-byte sequence and no trailing bytes are present, so I would expect an exception here too.

We use Boost 1.52.0 together with ICU 50.1. The behavior is the same on Linux and Windows.

{{{
#include <boost/regex/icu.hpp>

int main(void)
{
    // this line does not throw an exception although the pattern is not valid UTF-8
    boost::u32regex(boost::make_u32regex("".*\xe4.*""));
    // this line segfaults
    boost::u32regex(boost::make_u32regex("".*\xf6.*""));
    return 0;
}
}}}",Bugs,closed,To Be Determined,regex,Boost 1.52.0,Problem,fixed,,