Opened 7 years ago
Last modified 5 years ago
#11981 reopened Bugs
boost::archive::xml_woarchive with locale doesn't work
Reported by: | anonymous | Owned by: | Robert Ramey |
---|---|---|---|
Milestone: | To Be Determined | Component: | serialization |
Version: | Boost 1.65.0 | Severity: | Regression |
Keywords: | locale, xml_woarchive, serialization | Cc: |
Description
The new locale library seems to have a bug. The change in question: "Implemented generic codecvt facet and add general purpose utf8_codecvt facet"
    #include <string>
    #include <locale>
    #include <fstream>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/nvp.hpp>
    #include <boost/archive/xml_woarchive.hpp>
    #include <boost/archive/xml_wiarchive.hpp>

    int wmain(int argc, wchar_t* argv[])
    {
        std::locale::global(std::locale("japanese"));

        std::wofstream wofs("output.xml");
        boost::archive::xml_woarchive oa(wofs);
        // exception in 1.60
        oa << boost::serialization::make_nvp("string", std::string("日本語文字列"));
        wofs.close();

        std::string str;
        std::wifstream wifs("output.xml");
        boost::archive::xml_wiarchive ia(wifs);
        ia >> boost::serialization::make_nvp("string", str);
        wifs.close();

        return 0;
    }
An exception occurs in boost 1.60 in Visual Studio 2013. "invalid multbyte/wide char conversion".
This exception doesn't occur in boost 1.59, but there the code produces invalid XML: the encoding is SJIS, not UTF-8.
In boost 1.57, it makes valid UTF-8 encoding xml.
Change History (7)
comment:1 by , 6 years ago
Severity: | Problem → Regression |
---|
comment:2 by , 6 years ago
Component: | locale → serialization |
---|---|
Owner: | changed from | to
comment:3 by , 6 years ago
comment:4 by , 6 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
The reason for the regression is that I improved the test. That is, the problem was always there but was not tested as exhaustively as it is now. When you say the encoding is SJIS, what do you mean? The test uses UTF-8 characters. I've had a lot of problems with this test on various platforms, so any information you can give would be appreciated.
I'm marking this "won't fix", but that's not entirely true. I would like to say "can't fix", but that choice is not offered.
comment:5 by , 6 years ago
When you say the encoding is SJIS what do you mean? The test uses UTF-8 characters.
The output XML should be UTF-8 encoded, but in boost 1.59 it was not. In boost 1.61, abort() is called and no output XML is produced.
I want to use a newer boost, but I can't because of this bug. Is it possible to use boost 1.57 for serialization and boost 1.61 for everything else? How can we mix different versions?
Platform: Windows 10 Japanese 64bit, Visual Studio 2013 Update 5.
comment:6 by , 6 years ago
If I copy these files from boost 1.57.0 into boost 1.61.0, the test code works well.
boost\serialization\pfto.hpp
boost\archive\iterators\mb_from_wchar.hpp
boost\archive\iterators\wchar_from_mb.hpp
comment:7 by , 5 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
Version: | Boost 1.60.0 → Boost 1.65.0 |
This problem still exists in boost 1.65.1. Platform: Windows 10 Japanese 64bit, Visual Studio 2017 Update 1.
abort() is called at this line:
oa << boost::serialization::make_nvp("string", std::string("日本語文字列"));
This is because a std::string is not always UTF-8 encoded. The encoding depends on the locale; in my case it is Shift_JIS. But xml_woarchive_impl.ipp always uses utf8_codecvt_facet.
I replaced utf8_codecvt_facet with mbstowcs_s, and it works well. mbstowcs_s consults the current locale and converts accordingly.
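To illustrate why a UTF-8-only conversion rejects such a string, here is a minimal sketch of a structural UTF-8 check. The function name `looks_like_utf8` is hypothetical and not part of Boost, and the byte values are what I believe the UTF-8 and Shift_JIS encodings of "日本語" to be. It only checks the lead/continuation byte pattern, which is enough to show that Shift_JIS bytes are not accepted as UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Boost API): returns true if `s` follows the
// UTF-8 lead/continuation byte pattern. Overlong forms and surrogates are
// not rejected; this is an illustration, not a full validator.
bool looks_like_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;                    // continuation bytes expected
        if (c < 0x80)                extra = 0;   // ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1;   // 2-byte sequence
        else if ((c & 0xF0) == 0xE0) extra = 2;   // 3-byte sequence
        else if ((c & 0xF8) == 0xF0) extra = 3;   // 4-byte sequence
        else return false;                        // stray continuation or invalid lead
        if (i + extra >= s.size()) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(s[i + k]);
            if ((t & 0xC0) != 0x80) return false; // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}

// "日本語" as raw bytes in the two encodings (assumed values).
const std::string kUtf8Sample = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";
const std::string kSjisSample = "\x93\xFA\x96\x7B\x8C\xEA";
```

With these samples, the UTF-8 bytes pass the check and the Shift_JIS bytes fail at the very first byte, because 0x93 has the bit pattern of a continuation byte rather than a lead byte.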
xml_woarchive_impl.ipp
    #define BOOST_NO_UTF8 // my change

    #ifdef BOOST_NO_UTF8
    #include <stdlib.h>
    #else
    #include <boost/archive/iterators/wchar_from_mb.hpp>
    #endif

    // copy chars to output escaping to xml and widening characters as we go
    template<class InputIterator>
    void save_iterator(std::wostream &os, InputIterator begin, InputIterator end){
    #ifdef BOOST_NO_UTF8
        std::size_t len = end - begin + 1;
        std::vector<wchar_t> dst(len);
        if (::mbstowcs_s(&len, dst.data(), len, begin, len - 1) != 0) {
            throw std::system_error(errno, std::system_category());
        }
        std::copy(
            dst.data(),
            dst.data() + len - 1,
            boost::archive::iterators::ostream_iterator<wchar_t>(os)
        );
    #else
        typedef iterators::wchar_from_mb<
            iterators::xml_escape<InputIterator>
        > xmbtows;
        std::copy(
            xmbtows(begin),
            xmbtows(end),
            boost::archive::iterators::ostream_iterator<wchar_t>(os)
        );
    #endif
    }
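The change above relies on mbstowcs_s, which is a Microsoft CRT function. As a rough portable sketch of the same idea (widening according to the current C locale instead of assuming UTF-8), the standard std::mbsrtowcs can be used. The helper name `widen_with_locale` is hypothetical and is not part of Boost or of the patch; input with embedded NUL characters is not handled.

```cpp
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: widen a narrow string according to the current
// C locale (LC_CTYPE), rather than assuming the bytes are UTF-8.
std::wstring widen_with_locale(const std::string& narrow) {
    std::mbstate_t state = std::mbstate_t();
    const char* src = narrow.c_str();

    // First pass: with a null destination, mbsrtowcs only counts the wide
    // characters needed (excluding the terminator) and leaves src untouched.
    const std::size_t count = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (count == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("invalid multibyte sequence for the current locale");
    }

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<wchar_t> buf(count + 1);
    src = narrow.c_str();
    state = std::mbstate_t();
    std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
    return std::wstring(buf.data(), count);
}
```

Measuring first avoids guessing the buffer size. Like the patch, this sketch does no XML escaping.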
xml_wiarchive_impl.ipp
    #define BOOST_NO_UTF8 // my change

    #ifdef BOOST_NO_UTF8
    #include <stdlib.h>
    #else
    #include <boost/archive/iterators/wchar_from_mb.hpp>
    #endif

    void copy_to_ptr(char * s, const std::wstring & ws){
    #ifdef BOOST_NO_UTF8
        std::size_t len = ws.size() * sizeof(wchar_t) + 1;
        if (::wcstombs_s(&len, s, len, ws.c_str(), len - 1) != 0) {
            throw std::system_error(errno, std::system_category());
        }
    #else
        std::copy(
            iterators::mb_from_wchar<std::wstring::const_iterator>(
                ws.begin()
            ),
            iterators::mb_from_wchar<std::wstring::const_iterator>(
                ws.end()
            ),
            s
        );
        s[ws.size()] = 0;
    #endif
    }

    template<class Archive>
    BOOST_WARCHIVE_DECL void
    xml_wiarchive_impl<Archive>::load(std::string & s){
        std::wstring ws;
        bool result = gimpl->parse_string(is, ws);
        if(! result)
            boost::serialization::throw_exception(
                xml_archive_exception(xml_archive_exception::xml_archive_parsing_error)
            );
    #if BOOST_WORKAROUND(_RWSTD_VER, BOOST_TESTED_AT(20101))
        if(NULL != s.data())
    #endif
            s.resize(0);
    #ifdef BOOST_NO_UTF8
        std::size_t len = ws.size() * sizeof(wchar_t) + 1;
        s.resize(len);
        if (::wcstombs_s(&len, &s[0], len, ws.c_str(), _TRUNCATE) != 0) {
            throw std::system_error(errno, std::system_category());
        }
        s.resize(len - 1);
    #else
        s.reserve(ws.size());
        std::copy(
            iterators::mb_from_wchar<std::wstring::iterator>(
                ws.begin()
            ),
            iterators::mb_from_wchar<std::wstring::iterator>(
                ws.end()
            ),
            std::back_inserter(s)
        );
    #endif
    }
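The load-side change sizes its buffer as ws.size() * sizeof(wchar_t) + 1, which is an estimate of the narrow length rather than a measured one. A possible portable sketch of the reverse conversion uses the standard std::wcsrtombs to measure the required size first. The helper name `narrow_with_locale` is hypothetical and not part of Boost or of the patch; input with embedded NUL characters is not handled.

```cpp
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: narrow a wide string according to the current
// C locale (LC_CTYPE).
std::string narrow_with_locale(const std::wstring& wide) {
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = wide.c_str();

    // First pass: with a null destination, wcsrtombs returns the number of
    // bytes the result needs (excluding the terminator).
    const std::size_t bytes = std::wcsrtombs(nullptr, &src, 0, &state);
    if (bytes == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("wide character not representable in the current locale");
    }

    // Second pass: convert into a buffer with room for the terminator.
    std::vector<char> buf(bytes + 1);
    src = wide.c_str();
    state = std::mbstate_t();
    std::wcsrtombs(buf.data(), &src, buf.size(), &state);
    return std::string(buf.data(), bytes);
}
```

Because the size is measured, the result does not depend on how many bytes the locale's encoding needs per character.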
If we run this code with boost 1.61 in Visual Studio 2013 (x86) on Windows 10, abort() is called with the following assertion failure:
Assertion failed: std::codecvt_base::ok == r, file D:\Project\boost_1_61_0\boost/archive/iterators/wchar_from_mb.hpp, line 175