Opened 7 years ago

Last modified 7 years ago

#11529 reopened Bugs

regression: boost::archive::archive_exception exception during serialization non latin1 strings to xml

Reported by: nikolay@… Owned by: Robert Ramey
Milestone: To Be Determined Component: serialization
Version: Boost 1.58.0 Severity: Regression
Keywords: Cc:

Description

Just try to compile and run the following example:

#include <string>
#include <tchar.h>
#include <boost/archive/xml_woarchive.hpp>
#include <boost/filesystem/fstream.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/string.hpp>

int _tmain(int argc, _TCHAR* argv[])
{
	boost::filesystem::wofstream ofs("c:\\test.xml");

	std::string s1 = "kkk";
	std::wstring w1 = L"kkk";
	std::wstring w2 = L"апр"; // some non-Latin-1 (for example Russian) letters

	boost::archive::xml_woarchive oa(ofs);
	oa << boost::serialization::make_nvp("key1", s1);
	oa << boost::serialization::make_nvp("key2", w1);
	oa << boost::serialization::make_nvp("key3", w2); // the exception is thrown here
	return 0;
}

This code worked with 1.38 + VS2005, 1.44 + VS2005, 1.52 + VS2005, 1.52 + VS2012 and 1.52 + VS2013. After updating from Boost 1.52 to Boost 1.58 (both VS2013), I noticed that my app crashes during serialization. I tried this code on 1.57 + VS2012 (unfortunately I don't have 1.57 compiled for VS2013) and it works, so the problem appeared between 1.57 and 1.58. If needed, I can provide crash dumps.

Attachments (3)

impl.patch (918 bytes ) - added by nikolay@… 7 years ago.
Patch
library_status.html (36.8 KB ) - added by Robert Ramey 7 years ago.
serialization test matrix output
links.html (138.3 KB ) - added by Robert Ramey 7 years ago.
library test results


Change History (22)

comment:1 by nikolay@…, 7 years ago

Summary: regression: boost::archive::archive_exception exception during serialization → regression: boost::archive::archive_exception exception during serialization non latin1 strings to xm

comment:2 by nikolay@…, 7 years ago

Are there any plans for when this problem will be fixed? I have found another problem, with deserialization (it is broken as well), and I guess the two problems are related.

by nikolay@…, 7 years ago

Attachment: impl.patch added

Patch

comment:3 by nikolay@…, 7 years ago

I have attached a patch which fixes both problems (with serialization and with deserialization).

Could somebody commit the changes to the Boost repository?

Code I used to test deserialization:

	boost::filesystem::wifstream ifs("c:\\test.xml");

	std::string s1;
	std::wstring w1;
	std::wstring w2;

	boost::archive::xml_wiarchive ia(ifs);
	ia >> boost::serialization::make_nvp("key1", s1);
	ia >> boost::serialization::make_nvp("key2", w1);
	ia >> boost::serialization::make_nvp("key3", w2);

	return 0;

comment:4 by Robert Ramey, 7 years ago

Resolution: fixed
Status: new → closed

I've looked at the patch and it modifies a change made last December. Still, it looks good to me, so I'm putting it in the development branch.

comment:5 by Robert Ramey, 7 years ago

I applied your patch and it makes a test fail with GCC 5.1: GCC reports that a malloc allocation has been modified after being freed. I haven't been able to find the source of the problem. The test is test_array_xml_archive, so I had to back the patch out.

It's not obvious to me that your test code actually throws an exception - this is where the test fails.

Note that we invoke imbue in a rather non-obvious way. You might want to check that.

Note that as the code stands now, everything passes on the Clang compiler and GCC, as well as the others in the Boost test matrix.
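For readers unfamiliar with the imbue remark: a wide file stream converts wchar_t to bytes through the codecvt facet of its locale, and the default "C" locale generally cannot convert characters such as the Cyrillic ones in the report. The following is only a minimal sketch on plain standard streams, not the code Boost.Serialization uses internally. The helper name write_wide_as_utf8 is hypothetical, and std::codecvt_utf8 is an assumption chosen for illustration (it is deprecated since C++17 but still widely available).

```cpp
#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <fstream>
#include <iterator>
#include <locale>
#include <string>

// Hypothetical helper: writes `text` to `path` through a wide stream whose
// locale carries a UTF-8 conversion facet, then returns the raw bytes on disk.
std::string write_wide_as_utf8(const std::string& path, const std::wstring& text)
{
    {
        std::wofstream out;
        // Imbue before opening so the file buffer uses the facet from the
        // first character on.
        out.imbue(std::locale(std::locale::classic(),
                              new std::codecvt_utf8<wchar_t>));
        out.open(path.c_str(), std::ios::binary | std::ios::trunc);
        out << text;
    }   // leaving the scope closes the stream and flushes the converted bytes

    // Read the file back as plain bytes to see what was actually written.
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Called as write_wide_as_utf8("out.txt", L"апр"), the helper should return the six UTF-8 bytes of the three Cyrillic letters; without a suitable facet the same write typically leaves the stream in a failed state.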

comment:6 by nikolay@…, 7 years ago

As I understand it, you just closed this ticket without any fix? Am I right?

If so, I would like to reopen the ticket, as it is a real problem: starting with 1.58 it is not possible to use Boost.Serialization with non-Latin-1 strings. It is a regression. Yes, my patch is not ideal :-) but I attached it for information only (as a starting point for investigating the problem). If some other test cases fail with my patch, then I believe another patch has to be written that fixes all the problems. Maybe the best choice would even be to revert the code to 1.57, add more test cases (for serialization of non-Latin-1 strings) and then attempt the refactoring again.

Just a reminder: if you cannot reproduce the problem with the program listed in the ticket description, I can collect a memory dump with the exception call stack. I can also test your patches. Even if you don't have Visual Studio, I can compile boost-wserialization.dll myself and try a fix.

comment:7 by Robert Ramey, 7 years ago

How about this:

a) you roll in your patch

b) run the test program test_array_xml_archive

c) verify the malloc problem

d) trace down the problem

e) modify your patch

f) run the serialization test suite and verify it passes all tests

g) re-submit the tested patch.

Robert Ramey

comment:8 by Andrey <nikolay@…>, 7 years ago

Actually I don't think it is a good idea, for several reasons:

  • I am not familiar with the Boost.Serialization code. So even if I fix the current issue (and all test cases pass), the fix could still cause other issues which are not covered by test cases.
  • It may also take a very long time to find a proper fix for the current problem, as the code is new to me and I have ~30 minutes of free time daily.
  • I am not familiar with git, so I cannot even obtain the latest Boost development sources to start investigating.

I believe you as author of this library can find a proper fix in short time.

comment:9 by Andrey <nikolay@…>, 7 years ago

Resolution: fixed
Status: closed → reopened

comment:10 by Robert Ramey, 7 years ago

"I believe you as author of this library can find a proper fix in short time."

LOL - why do you believe that? FYI, I DID spend time on this - and not a short time. That's why I need help with this! I don't think that it's related to the serialization library itself, but rather to the gcc implementation of code convert facets. These things are a bitch to track down. I just don't know what to do here. I certainly can't integrate a patch which creates test failures. So this will have to stay in limbo for a while. I'll leave it open for you.

comment:11 by Andrey <nikolay@…>, 7 years ago

Unfortunately, I am a Windows-only developer and have never worked with gcc before. I can try to investigate the issue if some test fails under Visual Studio. I tested my sample with Windows Page Heap enabled for the process (to catch any read/write access to already freed memory), but I was unable to catch this issue. So possibly it is a real GCC bug that needs to be reported to the GCC team.

comment:12 by Robert Ramey, 7 years ago

I've got good news for you.

I have spent (even) more time on the issue. I've concluded that your patch is basically correct. Including the patch triggers a seemingly unrelated problem in one of the tests on the GCC compiler. This problem doesn't occur on any other compilers tested. But now that I've seen it and can repeat it, I can fix it. (This is a little trickier than one might think.) So, if you have a little patience, it will be rewarded. You can roll your patch into your current version of the library and rebuild. The next version of the library, 1.60, will include your patch and a fix for the side-effect problem.

comment:13 by Andrey <nikolay@…>, 7 years ago

Very good news. Thanks a lot.

I have already compiled Boost 1.58 with my patch and it now passes all my tests. That is good enough for me for now. Anyway, I will update to Boost 1.60 as soon as it is released.

comment:14 by Andrey <nikolay@…>, 7 years ago

And one question: is it possible to add some new test cases to the library for serialization/deserialization of non-Latin-1 strings, to catch similar bugs while developing the library?
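The shape of such a check can be sketched on plain standard streams. This is only an illustration under stated assumptions: a real test would push the string through xml_woarchive and xml_wiarchive and follow the library's own test conventions, whereas the hypothetical helper round_trips_as_utf8 below uses std::codecvt_utf8 (deprecated since C++17) and no archive classes at all.

```cpp
#include <codecvt>   // std::codecvt_utf8 (deprecated since C++17)
#include <fstream>
#include <locale>
#include <string>

// Hypothetical test helper: writes `original` through a UTF-8-converting wide
// stream, reads it back, and reports whether the text survived unchanged.
// Assumes `original` contains no newline, since it is read back with getline.
bool round_trips_as_utf8(const std::string& path, const std::wstring& original)
{
    const std::locale utf8(std::locale::classic(),
                           new std::codecvt_utf8<wchar_t>);
    {
        std::wofstream out;
        out.imbue(utf8);
        out.open(path.c_str(), std::ios::binary | std::ios::trunc);
        out << original;
        if (!out.flush())
            return false;   // conversion or I/O failure while writing
    }

    std::wifstream in;
    in.imbue(utf8);
    in.open(path.c_str(), std::ios::binary);
    std::wstring restored;
    std::getline(in, restored);
    return restored == original;
}
```

A test would call the helper once with an ASCII string such as L"kkk" and once with a non-Latin-1 string such as L"апр", expecting true in both cases.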

by Robert Ramey, 7 years ago

Attachment: library_status.html added

serialization test matrix output

by Robert Ramey, 7 years ago

Attachment: links.html added

library test results

comment:15 by Robert Ramey, 7 years ago

I'd certainly be happy to look at them.

Do you run the serialization test suite on your own equipment? This would be very helpful and isn't very hard.

a) build some boost tools: b2, process_jam_log, library_status

b) cd libs/serialization/test

c) invoke library_test ...

d) deal with setting up of b2 user config, site config, environment variables etc.

Of course this is a pain in the rear. And it probably "wastes" a workday. But in my view, for an enterprise which depends upon boost software, it's a very good investment. It verifies that the boost libraries you use actually pass all tests on your local combination of compiler, compiler version, operating system etc.

If you want to submit a few tests, I'd like it if you looked at the tests I have and followed the scheme that I use, so I don't have to adapt them. This also means

a) not introducing dependencies on other boost libraries

b) using "test_tools.hpp"

c) using the macros for stream settings and archive settings. This means that serialization of one data structure can be tested with all archive types without any extra effort.

Alternatively, you can add tests to the CMakeLists.txt file that you find there. This isn't as elaborate as the Bjam setup. But it does work well for me when I develop/debug using the IDE - which is most of the time.

My usual regimen is:

a) get a complaint at this site

b) make a test for it

c) put it into the test_z file

d) debug it and make fixes

e) build and test with jam, using the gcc compiler - I need to use at least two compilers, more would be better

f) maybe add to the official test suite - update CMakeLists.txt and jam files

g) test again

h) upload to the develop branch - cross fingers

i) await results on the develop branch

j) merge develop into release

Other points.

a) You really should have cygwin installed.

b) You really should become familiar with GIT. It's included in cygwin. For Windows you can use the SourceTree GUI for git, which I find convenient.

So that's all you need to become a boost developer!

Robert Ramey

comment:16 by Robert Ramey, 7 years ago

Resolution: fixed
Status: reopened → closed

I believe I have fixed this to everyone's satisfaction

comment:17 by nikolay@…, 7 years ago

Resolution: fixed
Status: closed → reopened

Updated Boost to 1.60. The problem is NOT fixed! The code is the same as in the issue description: the boost::archive::archive_exception exception is thrown when trying to serialize a non-Latin-1 string.

comment:18 by nikolay@…, 7 years ago

The same issue with deserialization.

My patch fixes both problems.

comment:19 by anonymous, 7 years ago

Summary: regression: boost::archive::archive_exception exception during serialization non latin1 strings to xm → regression: boost::archive::archive_exception exception during serialization non latin1 strings to xml