id: 9473
summary: make_u32regex() accepts illegal UTF-8
reporter: Peter Klotz
owner: John Maddock
type: Bugs
status: closed
milestone: To Be Determined
component: regex
version: Boost 1.54.0
severity: Problem
resolution: fixed
keywords:
cc:
description:

The attached example shows that make_u32regex() accepts two kinds of illegal UTF-8.

It accepts codepoints reserved for UTF-16 surrogate pairs encoded as 3-byte UTF-8 characters, e.g. "\xed\xa0\x80" representing U+D800.

It accepts overlong UTF-8 encodings where the codepoint value has been extended to the left with additional zero bits, e.g. "\xc0\x80" representing U+0000 whereas its correct 1-byte encoding is "\x00".

Boost.Locale already contains code to protect against overlong encodings (see method width() in https://svn.boost.org/svn/boost/trunk/boost/locale/utf.hpp).
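
For illustration, the two checks this report asks for can be sketched as a standalone validator. This is a minimal sketch only: `is_strict_utf8` is a hypothetical helper name, it is not part of Boost.Regex or Boost.Locale, and it is not the change that was applied to resolve this ticket. It assumes the input is a byte string that should be rejected as a whole if any sequence is malformed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Boost API). Returns true only if `s` is
// well-formed UTF-8: no overlong encodings, no UTF-16 surrogate code
// points (U+D800..U+DFFF), and nothing above U+10FFFF.
bool is_strict_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;       // total bytes in this sequence
        std::uint32_t cp = 0;      // decoded code point
        std::uint32_t min_cp = 0;  // smallest value that needs `len` bytes

        if (lead < 0x80)                { len = 1; cp = lead;        min_cp = 0x0; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // sequence truncated at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range

        i += len;
    }
    return true;
}
```

With this sketch, both inputs quoted in the report, "\xed\xa0\x80" and "\xc0\x80", are rejected, while ordinary ASCII text and a correctly encoded multi-byte character such as "\xe2\x82\xac" (U+20AC) are accepted.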