Boost C++ Libraries: Ticket #9473: make_u32regex() accepts illegal UTF-8
https://svn.boost.org/trac10/ticket/9473
<p> The attached example shows that make_u32regex() accepts two kinds of illegal UTF-8. </p>
<p> First, it accepts code points reserved for UTF-16 surrogate pairs when they are encoded as 3-byte UTF-8 sequences, e.g. "\xed\xa0\x80", which represents U+D800. </p>
<p> Second, it accepts overlong UTF-8 encodings, in which the code point value is padded on the left with additional zero bits, e.g. "\xc0\x80", which represents U+0000, whose correct 1-byte encoding is "\x00". </p>
<p> Boost.Locale already contains code to protect against overlong encodings (see method width() in <a class="ext-link" href="https://svn.boost.org/svn/boost/trunk/boost/locale/utf.hpp">https://svn.boost.org/svn/boost/trunk/boost/locale/utf.hpp</a>). </p>
Peter Klotz <peter.klotz@…> Thu, 05 Dec 2013 11:23:30 GMT attachment set
<ul> <li><strong>attachment</strong> → <span class="trac-field-new">main.cpp</span> </li> </ul>
John Maddock Thu, 19 Dec 2013 10:52:03 GMT status changed; resolution set https://svn.boost.org/trac10/ticket/9473#comment:1
<ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul>
<p> Fixed in Git develop. </p>
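<p> The two kinds of illegal input reported in this ticket can be illustrated with a small standalone check. This is only a sketch under stated assumptions: is_strict_utf8() is a hypothetical helper written for this illustration. It is not the code from the Boost.Regex fix, and it is not the Boost.Locale implementation. It decodes each sequence, then rejects it if the code point could have been encoded in fewer bytes (overlong) or falls in the surrogate range U+D800..U+DFFF. </p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Boost): returns true only if `s` is
// well-formed UTF-8, rejecting overlong forms and surrogate code points.
bool is_strict_utf8(const std::string& s) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const std::uint32_t min_cp[] = {0, 0x0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        std::uint32_t cp = 0;

        // Determine the sequence length from the lead byte.
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // truncated sequence

        // Accumulate the payload bits of each continuation byte (10xxxxxx).
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp[len]) return false;              // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range

        i += len;
    }
    return true;
}
```

<p> With this sketch, both examples from the report, "\xed\xa0\x80" and "\xc0\x80", are rejected, while the 1-byte encoding "\x00" (passed with an explicit length of 1) is accepted. </p>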