Boost C++ Libraries: Ticket #11981: boost::archive::xml_woarchive with locale doesn't work https://svn.boost.org/trac10/ticket/11981 <p> The new locale library seems to have a bug (release note: "Implemented generic codecvt facet and add general purpose utf8_codecvt facet"). </p> <pre class="wiki">#include &lt;string&gt;
#include &lt;locale&gt;
#include &lt;fstream&gt;
#include &lt;boost/serialization/string.hpp&gt;
#include &lt;boost/serialization/nvp.hpp&gt;
#include &lt;boost/archive/xml_woarchive.hpp&gt;
#include &lt;boost/archive/xml_wiarchive.hpp&gt;

int wmain(int argc, wchar_t* argv[])
{
    std::locale::global(std::locale("japanese"));

    std::wofstream wofs("output.xml");
    boost::archive::xml_woarchive oa(wofs); // exception in 1.60
    oa &lt;&lt; boost::serialization::make_nvp("string", std::string("日本語文字列"));
    wofs.close();

    std::string str;
    std::wifstream wifs("output.xml");
    boost::archive::xml_wiarchive ia(wifs);
    ia &gt;&gt; boost::serialization::make_nvp("string", str);
    wifs.close();

    return 0;
}
</pre><p> With Boost 1.60 and Visual Studio 2013, this code throws an exception: "invalid multbyte/wide char conversion". </p> <p> The exception does not occur in Boost 1.59, but the code produces invalid XML: the encoding is SJIS rather than UTF-8. </p> <p> In Boost 1.57, it produces valid UTF-8 encoded XML. 
</p> JeremyS3DS Wed, 18 May 2016 23:37:27 GMT severity changed https://svn.boost.org/trac10/ticket/11981#comment:1 <ul> <li><strong>severity</strong> <span class="trac-field-old">Problem</span> → <span class="trac-field-new">Regression</span> </li> </ul> Ticket
Artyom Beilis Thu, 19 May 2016 05:59:45 GMT owner, component changed https://svn.boost.org/trac10/ticket/11981#comment:2 <ul> <li><strong>owner</strong> changed from <span class="trac-author">Artyom Beilis</span> to <span class="trac-author">Robert Ramey</span> </li> <li><strong>component</strong> <span class="trac-field-old">locale</span> → <span class="trac-field-new">serialization</span> </li> </ul> Ticket
anonymous Sat, 28 May 2016 04:44:15 GMT <link>https://svn.boost.org/trac10/ticket/11981#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11981#comment:3</guid> <description> <p> Running this code with Boost 1.61 (Visual Studio 2013, x86, Windows 10) calls abort(). </p> <p> Assertion failed: std::codecvt_base::ok == r, file D:\Project\boost_1_61_0\boost/archive/iterators/wchar_from_mb.hpp, line 175 </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Robert Ramey</dc:creator> <pubDate>Sun, 29 May 2016 17:37:20 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/11981#comment:4 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">wontfix</span> </li> </ul> <p> The reason for the regression is that I improved the test. That is, the problem was always there but was not tested as exhaustively as it is now. 
When you say the encoding is SJIS, what do you mean? The test uses UTF-8 characters. I've had a lot of problems with this test on various platforms, so any information you can give would be appreciated. </p> <p> I'm marking this "wontfix", but that's not entirely true: I would like to say "can't fix", but that choice is not offered. </p> Ticket anonymous Fri, 10 Jun 2016 03:20:23 GMT <link>https://svn.boost.org/trac10/ticket/11981#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11981#comment:5</guid> <description> <blockquote class="citation"> <p> When you say the encoding is SJIS, what do you mean? The test uses UTF-8 characters. </p> </blockquote> <p> The output XML should be UTF-8 encoded, but in Boost 1.59 it was not. In Boost 1.61, abort() is called and no output XML is produced. </p> <p> I want to use a newer Boost but can't because of this bug. Is it possible to use Boost 1.57 for serialization and Boost 1.61 for everything else? How can we mix different versions? </p> <p> Platform: Windows 10 Japanese 64-bit, Visual Studio 2013 Update 5. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 16 Jun 2016 14:00:44 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/11981#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/11981#comment:6</guid> <description> <p> If I copy these files from Boost 1.57.0 into Boost 1.61.0, the test code works well. 
</p> <p> boost\serialization\pfto.hpp<br /> boost\archive\iterators\mb_from_wchar.hpp<br /> boost\archive\iterators\wchar_from_mb.hpp </p> </description> <category>Ticket</category> </item> <item> <dc:creator>matsu</dc:creator> <pubDate>Fri, 29 Sep 2017 10:38:08 GMT</pubDate> <title>status, version changed; resolution deleted https://svn.boost.org/trac10/ticket/11981#comment:7 <ul> <li><strong>status</strong> <span class="trac-field-old">closed</span> → <span class="trac-field-new">reopened</span> </li> <li><strong>version</strong> <span class="trac-field-old">Boost 1.60.0</span> → <span class="trac-field-new">Boost 1.65.0</span> </li> <li><strong>resolution</strong> <span class="trac-field-deleted">wontfix</span> </li> </ul> <p> This problem still exists in Boost 1.65.1. Platform: Windows 10 Japanese 64-bit, Visual Studio 2017 Update 1. </p> <p> abort() is called at this line: </p> <pre class="wiki">oa &lt;&lt; boost::serialization::make_nvp("string", std::string("日本語文字列")); </pre><p> This is because a std::string is not always UTF-8 encoded. The encoding depends on the locale; in my case it is Shift_JIS. However, utf8_codecvt_facet is always used in xml_woarchive_impl.ipp. </p> <p> I replaced utf8_codecvt_facet with mbstowcs_s and it works well. mbstowcs_s consults the current locale and converts accordingly. 
</p> <p> xml_woarchive_impl.ipp </p> <pre class="wiki">#define BOOST_NO_UTF8 // my change

#ifdef BOOST_NO_UTF8
#include &lt;stdlib.h&gt;
#else
#include &lt;boost/archive/iterators/wchar_from_mb.hpp&gt;
#endif

// copy chars to output escaping to xml and widening characters as we go
template&lt;class InputIterator&gt;
void save_iterator(std::wostream &amp;os, InputIterator begin, InputIterator end){
#ifdef BOOST_NO_UTF8
    std::size_t len = end - begin + 1;
    std::vector&lt;wchar_t&gt; dst(len);
    if (::mbstowcs_s(&amp;len, dst.data(), len, begin, len - 1) != 0) {
        throw std::system_error(errno, std::system_category());
    }
    std::copy(
        dst.data(),
        dst.data() + len - 1,
        boost::archive::iterators::ostream_iterator&lt;wchar_t&gt;(os)
    );
#else
    typedef iterators::wchar_from_mb&lt;
        iterators::xml_escape&lt;InputIterator&gt;
    &gt; xmbtows;
    std::copy(
        xmbtows(begin),
        xmbtows(end),
        boost::archive::iterators::ostream_iterator&lt;wchar_t&gt;(os)
    );
#endif
}
</pre><p> xml_wiarchive_impl.ipp </p> <pre class="wiki">#define BOOST_NO_UTF8 // my change

#ifdef BOOST_NO_UTF8
#include &lt;stdlib.h&gt;
#else
#include &lt;boost/archive/iterators/wchar_from_mb.hpp&gt;
#endif

void copy_to_ptr(char * s, const std::wstring &amp; ws){
#ifdef BOOST_NO_UTF8
    std::size_t len = ws.size() * sizeof(wchar_t) + 1;
    if (::wcstombs_s(&amp;len, s, len, ws.c_str(), len - 1) != 0) {
        throw std::system_error(errno, std::system_category());
    }
#else
    std::copy(
        iterators::mb_from_wchar&lt;std::wstring::const_iterator&gt;(
            ws.begin()
        ),
        iterators::mb_from_wchar&lt;std::wstring::const_iterator&gt;(
            ws.end()
        ),
        s
    );
    s[ws.size()] = 0;
#endif
}

template&lt;class Archive&gt;
BOOST_WARCHIVE_DECL void
xml_wiarchive_impl&lt;Archive&gt;::load(std::string &amp; s){
    std::wstring ws;
    bool result = gimpl-&gt;parse_string(is, ws);
    if(! result)
        boost::serialization::throw_exception(
            xml_archive_exception(xml_archive_exception::xml_archive_parsing_error)
        );
    #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101))
    if(NULL != s.data())
    #endif
        s.resize(0);
#ifdef BOOST_NO_UTF8
    std::size_t len = ws.size() * sizeof(wchar_t) + 1;
    s.resize(len);
    if (::wcstombs_s(&amp;len, &amp;s[0], len, ws.c_str(), _TRUNCATE) != 0) {
        throw std::system_error(errno, std::system_category());
    }
    s.resize(len - 1);
#else
    s.reserve(ws.size());
    std::copy(
        iterators::mb_from_wchar&lt;std::wstring::iterator&gt;(
            ws.begin()
        ),
        iterators::mb_from_wchar&lt;std::wstring::iterator&gt;(
            ws.end()
        ),
        std::back_inserter(s)
    );
#endif
}
</pre> Ticket
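<p> To illustrate why a UTF-8-only conversion rejects locale-encoded input, here is a minimal C++ sketch. It assumes that "日本語" is stored as the Shift_JIS bytes 93 FA 96 7B 8C EA (its usual Shift_JIS encoding, as far as I know) and compares it with the UTF-8 bytes for the same text. The helper looks_like_utf8 and the function check_examples are hypothetical names, not part of Boost, and the check is deliberately simplified: it only inspects lead and continuation byte patterns. </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Boost API): simplified structural UTF-8 check.
// It verifies lead/continuation byte patterns only; it does not reject
// overlong encodings or surrogate code points.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;                       // continuation bytes expected
        if (c < 0x80)                extra = 0;      // 0xxxxxxx: ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1;      // 110xxxxx
        else if ((c & 0xF0) == 0xE0) extra = 2;      // 1110xxxx
        else if ((c & 0xF8) == 0xF0) extra = 3;      // 11110xxx
        else return false;                           // stray continuation or invalid lead byte
        if (i + extra >= s.size()) return false;     // sequence truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) return false;    // continuation must be 10xxxxxx
        }
        i += extra + 1;
    }
    return true;
}

// Assumed byte values for the text "日本語" in each encoding.
void check_examples() {
    const std::string sjis("\x93\xFA\x96\x7B\x8C\xEA");              // Shift_JIS (assumed)
    const std::string utf8("\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E");  // UTF-8
    assert(!looks_like_utf8(sjis));  // 0x93 cannot start a UTF-8 sequence
    assert(looks_like_utf8(utf8));
}
```

<p> Calling check_examples() from main exercises both cases. A narrow string that fails such a check would need a locale-aware conversion (as in the mbstowcs_s change above) or a conversion to UTF-8 before it reaches a UTF-8-only code path. </p>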