Opened 12 years ago
Last modified 8 years ago
#5341 reopened Patches
Patch to improve shared library behavior with serialization
Reported by: | Owned by: | Robert Ramey | |
---|---|---|---|
Milestone: | To Be Determined | Component: | serialization |
Version: | Boost 1.47.0 | Severity: | Showstopper |
Keywords: | serialization shared_libraries dlls | Cc: |
Description
There are some issues that I have come across when using serialization with objects across shared libraries. I have made a patch that resolves the issues I have encountered, and will hopefully make serialization finally be friendly across multiple shared libraries.
The primary issue that I had was if there is a serializer created for an object, there can be some shared libraries that have a pointer serializer registered, and others that don't. This is actually quite common if you have the base class of an object declared in a separate library, since serializng a base class with base_object doesn't instanciate a pointer serializer.
Here is one case where this would prove to be an issue. Let's say the first time you wrote out a type it had no pointer serializer registered. It would write in the type info that tracking is not required and write out the object. Later, an object of the same type is written out from another shared library, but in this library a pointer serializer is registered. In this case, it will write out the object id, followed by the object itself. When that file is read in, it will read that the type has tracking disabled with the type information, and read the first instance just fine. However, on the second instance it still thinks that tracking is disabled, and will fail to read in the object id, even though it was written out.
I solved this issue by re-writing the archive serializer map to manage both pointer and non-pointer serializers. The map itself stores the non-pointer serializers in a multiset, so the serializers for all shared libraries are stored. When a iserializer/oserializer is constructed it is registered for that object type, and when it is destructed it is unregistered. pointer_iserializer/pointer_oserializer also register and unregister themselves with the map, but the behavior is different. What it will do is look at all non-pointer serializers of the proper type currently registered, and for any that don't have any pointer serializer registered for them it will use the pointer serializer currently being registered. When a pointer serializer is unregistered, it will search for an equivalent pointer serialier registered for another serializer for the same type (that was instanciated in a different library), or NULL if there is no such pointer serializer, and replace all instances of the pointer serializer being unregistered with that. Additionally, if a non-pointer serializer is being registered, it will look for an existing pointer serializer for that type to use. (it will quickly be replaced if there is a pointer serializer for that type instanciated in that library)
The effect at the end of the day is that for any given serializer, either all libraries or no libraries will have a pointer serializer registered for the corresponding non-pointer serializer. Since unregistration of pointer serializers tries to look for an equivalent to take its place, it should handle dynamic loading and unloading of libraries properly. Also, it should add no runtime cost during the actual usage of the serializers, only during the registration and unregistration.
The other issue I was having was I needed to create my own archive, but there was no way to export the archive serializer map symbols unless I put it in the boost serialization library, since it used the BOOST_ARCHIVE_OR_WARCHIVE_DECL() to generate the dllexport/dllimport declspecs. I changed it to use a technique described here. (http://support.microsoft.com/kb/168958) This allows customization for whether the symbols are imported and exported for each instantioation of archive_serializer_map. I also had to change all of the current archives to use this new method to register the archive serializer map.
I tried to keep consistent with your coding style and techniques. Since both this patch and the concrete to abstract patch I submitted earlier modify iserializer.hpp, I have also attached a combined patch that applies both.
Attachments (5)
Change History (26)
by , 12 years ago
Attachment: | shared library patch.zip added |
---|
by , 12 years ago
Attachment: | combined shared library patch.zip added |
---|
Zip file containing patches for both this issue and #5340
comment:1 by , 12 years ago
hmmm - this would require quite a bit of consideration. A couple of initial questions
a) Did you re-run all the tests using the library_test script? b) I didn't really follow the whole explanation, but I'm considering giving you the benefit of the doubt given that you seem to really understand it. Probably better than I do as I haven't looked at it in a long time. c) I looked at your link to the microsoft website. I don't see how this is different than the BOOST_ARCHIVE_OR_WARCHIVE_DECL() .. macros - except for lack of portability. d) I'm thinking you've addressed a concern that I've had since the library started being used as part of DLL which I never considered when I wrote it. So I'm interested in your work.
But I've come to understand that this is quite a subtle topic that I've had a hard time to get even partly right - so don't be disappointed if show a healthy skepticism as I've often thought I had something addressed only to find that there is some new wrinkle which comes out after the fact. And then there is he problem that it's way too easy to make outstanding archives obsolete. An error of this type has caused me untold pain.
Dont' give up on this
Robert Ramey
comment:2 by , 12 years ago
To answer your questions:
a) I haven't re-run them yet, but when I get a chance I will look into it and reply with the results.
b) If there's anything in particular that you aren't sure about, I'll try my best to clarify.
c) The problem with the previous usage of BOOST_ARCHIVE_OR_WARCHIVE_DECL() was that it required all template instantiations of the map to be imported/export in exactly the same way. By using the method described by Microsoft, it allows you to specify different import/export settings per template instantiation. This is a requirement if you want to create an archive compiled outside of the boost_serialization dll.
I can certainly understand your skepticism. I've had to deal with a lot of subtle issues with serialization in the codebase I'm working in, which is very large and complex, and spread out across many DLLs. (it was so much simpler before we started using DLLs... but not using them had its own set of problems) After going through how the registration system works, and performing a lot of both thought and real world experiments, I'm pretty sure it won't cause any backwards compatibility issues. However, I know that despite my best efforts I still could have missed something, so this will certainly take a lot of testing in other use cases to verify.
I'm curious, do any of the tests in library_test use files saved from earlier versions of boost serialization? If so, then that can help put some fears to rest, though we can't be completely sure until it's used in more real world situations.
Aaron
comment:3 by , 12 years ago
I have run the regression tests, and they have all passed with the patches in "shared library patch.zip".
Unfortunately, when I run the tests with the concrete to abstract class fix (submitted separately in #5340 and included with "combined shared library patch.zip") the shared pointer test fails to compile, since the base class doesn't have a serialize function. When I have time I can see if I can remove that dependency and get it to compile, but at least the shared library patches by themselves all compile and check out.
comment:4 by , 12 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
"c) The problem with the previous usage of BOOST_ARCHIVE_OR_WARCHIVE_DECL() was that it required all template instantiations of the map to be imported/export in exactly the same way. By using the method described by Microsoft, it allows you to specify different import/export settings per template instantiation. This is a requirement if you want to create an archive compiled outside of the boost_serialization dll. "
I looked at this again and I can't see how this differs from using the BOOST_ARCHIVE macros above. It could be that the BOOST_ARCHIVE macros are not implemented correctly ... but the idea and method is identical as far as I can tell. In addition to the properly declaring export and import, they support the boost autolib fuctionality. If you want to investigate the implementation of these macros, feel free, but there is no way I'm going to do this in a different way than all the other boost libraries do.
Robert Ramey
comment:5 by , 12 years ago
The problem with the current method is all instantiations of archive_serializer_map are always set to export for the boost serialization DLL, and set to import for all other libraries, which causes issues due to the fact that it’s templated. This makes it impossible to create your own archive outside of the boost serialization DLL. I will lay out every step of why this is the case.
I will first start with how template instantiation works. When you instantiate a templated class, it is as if you are creating a completely new, unrelated type. For example, let’s say you have a template class Foo. If you have a template instantiation for type A and type B, you have Foo<A> and Foo<B>. As far as the compiler is concerned, you might as well have separately created two unrelated classes, Foo_A and Foo_B. Additionally, these separate templated types are created in whatever library they are instantiated in. For example, let’s say you declare your templated Foo type in Library1, and instantiate Foo<A> in Library2 and Foo<B> in Library3. This is equivalent to creating nothing in Library1, creating a class Foo_A in Library2, and a completely unrelated class Foo_B in Library3.
Now I will cover how DLL symbol importing and exporting work in the Microsoft compiler. When compiling a library, you must declare all symbols you want to expose to external libraries to export. When compiling code that utilizes that library, all declarations of those symbols must be declared to import. Exported symbols must have an implementation, while imported symbols may not have an implementation. Choosing whether to export or import a symbol is typically done with a macro, which resolves to export when compiling the library, and import when compiling other libraries and executables that utilize that library.
Now in this case we must combine both templated classes and DLL importing and exporting to fully understand the archive_serializer_map symbol exporting. In this case, let’s again use the Foo class used above. The way the archive_serializer_map is implemented, it is the same as saying that all instantiations of the Foo templated class exports symbols when compiling Library1, and imports symbols when compiling anything else. This causes problems when you instantiate Foo<A> in Library2 and Foo<B> in Library3. In both these cases, the compiler thinks it needs to import the symbols for both Foo<A> and Foo<B>, since it is set up to export from Library1, and import everywhere else. However, since Foo<A> and Foo<B> are being created in different libraries, you actually want Foo<A> to export symbols when in Library2, and Foo<B> to export symbols when in Library3. The method that I use in the patch achieves this, and allows you to specify for each template instantiation of the archive_serializer_map whether it’s importing or exporting symbols independently.
I will admit that this is different from how the rest of boost works, but at the same time the rest of boost doesn’t export DLL symbols for templated classes, so it will be different no matter what you do. The way it’s currently set up, based on what I laid out above, you can only create archives in the boost DLL, since it will always export the symbols in the boost DLL library and import the symbols in all other libraries regardless of where the template is instantiated. My changes in my patch allow you to export those symbols in other libraries. If you refuse to change it, then you must accept the fact that users will never be able to create their own archives when using DLLs. Ever.
Also, I am wondering why you marked this as “won’t fix”. Your issue was with the DLL importing/exporting. While this is necessary if you want to create your own archives, there are other aspects of this patch that don’t rely on that. The other archive_serializer_map and basic_serializer_map changes deal with the serializers being instantiated in multiple libraries, which is completely unrelated. Like I said before, please ask me if you need clarification for any details, but please don’t close this just because you don’t understand it.
comment:6 by , 12 years ago
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
OK, I'm still not getting it. I'll spend some more time looking at it in more detail.
Robert Ramey
follow-up: 8 comment:7 by , 12 years ago
I spent a little time looking at a couple of these patches in two files picked more or less at random;
first basic_serializer_map.cpp
a) it looks like you've excluded the possibility that the serialization for some class A may be instantiated in more than one DLL or in the mainline and DLL. The comment you eliminated describes the "trap" which catches this ODR violation. I had to disable the trap because too many users violate this rule and don't think it's a problem (i disagree) and they've found it too burdensome to re-factor their code to avoid violation. In the future I may re-enable the trap with a method to suppress it for those who have problems with it. So eliminating the extensive comment was a mistake.
b) the adustment to use equal_range was clever. I have to admit I wasn't aware of this algorithm in the STL. In fact, I can't find it in the documention for G++. This would make it a non-starter for portable code.
c) you're function "equal_range" is just an implementation detail and would never need to be exported in any case. A better strategy would be to use inline code as "type_compare" does in that same file.
d) the change from returning a bool to void was of course correct and I'll consider including it
second file:text_iarchive.hpp
At first I thought this just moved the ...ARCHIVE macro from the header to to the *.cpp file. which of course would be superflous. Deeper investigation revealed that there was more involved with the definition of macros in other files and I lost track of it. It did remind me that the explicit instantiation of the template archive_serializer_map<file> was a hack which I had to add to compensate for some other improvement/change. I realize that this is not really satisfactory - though I think my reason are different than yours - not to say yours are invalid. I'm think about some required improvements to the library and addressing this would be a good idea.
In general, I know you've invested a lot of effort in investigating and patching the library, but including these patches - even assuming they are all correct - would require days of work on my part which I can't spare right now.
On the other hand, I don't want to dissuade people from helping me out on this and have incorporated many fixes and enhancements in the past (though not all of course). If you're not too discouraged, you might consider separating your patches in smaller bites. e.g.
i) fix for archive_map instantiation. Ideally it should not be necessary to include this in these archive.cpp files as it is an implementation detail. My next enhancement will be to formalize the concepts of the archive classes so that it's easier to create correct versions of one's own archive classes and the current setup conflicts with that. So I'm not totally adverse to this.
i) improvements in efficiency of archive_map. Of course, as noted above you're improvements wouldn't work for me. But mixing multiple changes in the same set of patches makes the job of evaluating them very hard for me.
... etc. I don't know what else is in here.
also remember that the code is years old to me and to consider something like this I have to spend a lot of time re-acquainting myself with it. In a sense you'll have to gift wrap it for me. I realize that this maybe not be fair (and probably isn't) but it's just a fact of life.
I don't know if you're still interested in this, but if you are I'll share with you a list of the issues that I hope to see addressed in the near future with the eye that you might want to contribute.
misc note: I use cygwin GCC to test my code with the gcc compiler etc. It's invaluable to get the code to pass on all compilers - which of course is a necessity for us. It's also easy to download and install.
I've spent a lot of time on this as I know you've spent a lot of time of your own and I want to respect that. You've earned the right to a fair hearing.
Robert Ramey
comment:8 by , 12 years ago
Replying to anonymous:
b) the adustment to use equal_range was clever. I have to admit I wasn't aware of this algorithm in the STL. In fact, I can't find it in the documention for G++. This would make it a non-starter for portable code.
equal_range is part of the standard. (25.3.3.3 lib.equal.range) GCC had better support.
comment:9 by , 12 years ago
a) It does handle when the serializer for class A is instantiated in multiple DLLs. The map type has changed from a set to a multiset, which allows multiple serializer instances for the same extended_type_info. In order to understand this, you will also have to look at archive_serializer_map. I wrote this patch because my use case was encountering the situation of having multiple serializers instantiated for type A, and it was failing because the pointer serializers for class A was set in one DLL, but not in another, causing confusion as to whether or not the instances for that type should be tracked.
Here's a step by step example of what happens at various stages of initialization and sutdown:
- A serializer for class A is instantiated in a new DLL.
- The serializer is registered with the archive_serializer_map. Since it is a multiset, it will be in addition to existing serializers for type A.
- It looks at an existing serializer registered for class A. If it has a pointer serializer, it re-uses that pointer serializer for the new serializer being registered.
- If a pointer serializer for class A IS NOT instantiated in the same DLL, everything works since it is using a pointer serializer from a different DLL, keeping the state consistent.
- If a pointer serializer for class A IS instantiated in the same DLL. (NOTE: This is guaranteed to happen after the serializer is instantiated)
- It assigns itself to the serializer of A. This ensures that the serializer for class A always prefers the pointer serializer in the same DLL it's instantiated in.
- It looks at the existing instances of serializers for class A, and if they have no pointer serializer it uses the current pointer serializer being initialized.
- A pointer serializer for class A is destroyed.
- It looks at the existing instances of serializers for class A, and tries to find a pointer serializer that isn't the one being destroyed.
- For all serializers for class A, if it is using the pointer serializer being destroyed it uses the replacement found in step a. If no replacement was found, NULL is assigned.
- A serializer for class A is destroyed, it is simply removed from the archive_serializer_map.
These steps ensure that either all serializers for class A have a corresponding pointer serializer or none do, eliminating the case where some serializers for class A would have a pointer serializer and some wouldn't. The trap is no longer necessary, as this solution specifically handles the case that was breaking before.
b) As already mentioned by Steven, it is part of the standard. Here's the documentation for it. (http://www.cplusplus.com/reference/stl/multiset/equal_range/)
c) equal_range is used by the new implementation of archive_serializer_map for the operations that look at all existing serializers for a specific type.
The additional macros in text_iarchive.hpp are necessary to properly export the symbols for the archive_serializer_map. This is what I was explaining earlier, where you need to specify per library whether or not to export the symbols for the archive_serializer_map in order to allow creating custom archive types. The way it was before, it was hard coded to only export from the boost serialization DLL.
It might be possible to eliminate the need for a separate archive_serializer_map instantiation for every archive type, but that has two requirements. First, it would require a lookup based on the archive's extended_type_info, adding an additional lookup every time you need to lookup a serializer in the archive_serailizer_map. Second, the serializer type hierarchy would need to be redone to allow querying the pointer serializer without knowing the archive type.
As for my patch, I can probably make two separate patches: one for the changes supporting multiple serializer instantiations, and one for the different DLL import/export changes. I won't be able to split it up any further and have it still function, though.
I would like to see the list of issues you want to see addressed. We rely heavily on serialization in the codebase I'm working with, so having the opportunity to fix issues we may run into later will be a win-win.
comment:10 by , 12 years ago
b) OK you guys have me here.
a) you're beginning to convince me you know more about this than I do. My concern was the case where serialization code for class A was instantiated in two different DLLS where there was no guarantee that they would be the same code.
I have had problems with code which tracks pointers in one module and doesn't in some other module. It looks like you've considered that.
I'm still not seeing the import/export issue changes.
Here are the things that I'm currently most concerned with:
a) Implement concept checking for serialization and especially archive concepts. Lack of this latter has lead to some ambiguities in how to use the existing code to make new archive classes.
b) compile and test and include as examples submitted code for yaml and json archives
c) create archives for gui editors - three flavors MFC, QT and WX windows. These would work like this:
given: class A {
T1 m_a1; T2 m_a2; ... template<class Archive> void serialize(Archive &ar, unsigned int version){
ar << NVP(m_a1); ...
}
};
one could do the following create a dialog template. Controls would be named/tagged "m_a1", ...
create dialogbox db; boost::archive::mfc_oarchive oa(db); copy to dialog box oa << a; user edits data boost::archive::mfc_oarchive ia(db); ia >> a; get edited data.
I once made such a thing but it was too much of a hack
d) re-implement, document and test the concept of "archive helpers" which did dynamically what the current code does statically for shared_ptr. I realize now that I had it more right the first time.
e) I don't want to keep cluttering up the archive concept with hacks like ar.reset_object_address. So I want to create the concept of "archive manipulators" similar to the standard io manipulators. In this way, not every archive would have to implement these extra functions. Currently they don't only because they are implemented in the base class - but I want it to be easier to make archive classes which don't depend on the current implementation. An example of an archive manipulator would be the pair, set_tracking(off)/restore_tracking. This would permit the creation of an archive which would be better suited to sending transactions over the net. Basically, it would keep the archive concept from continually growing.
f) I want to trap NaN(s) in archives other than binary. This could be overridden with a new flag on archive opening.
I have a few more items on my list but I can't decipher them anymore.
Robert Ramey
comment:11 by , 11 years ago
I'm taking a more extensive look at your patch and I'm feeling more comfortable with it. I see now that I didn't really understand what it was about. I would characterize it as:
a) improving the tracking of eti record created. I recognize that you're now more knowledgeable about this situation than I currently am. So if I don't see any obvious problems, I'll likely be checking it into the trunk to see how it flies.
b) handling an issue with decl(import) and decl(export) as it relates to eti maps. I'm still looking at this.
It's easier for me to understand if I consider these orthogonal issues. I see that the changes are not as extensive as I had originally thought from the large number of files altered. I'll keep you posted.
Robert Ramey
comment:12 by , 11 years ago
That is pretty much the gist of it. When I have a chance (most likely tomorrow) I can set up a simple example of where the dll import/export fails with the current behavior.
If you want to try it out before I can get that set up, it's fairly simple. First, compile unpatched boost as DLLs. Next, setup a simple tester and copy one of the existing archives (like text_iarchive) into that tester and rename it. (say, to my_iarchive) If you compile that against the boost DLLs, it should fail with this error. (http://msdn.microsoft.com/en-us/library/62688esh.aspx) If you then apply my patch and recompile, it should succeed. (note that your renamed my_archive shouldn't have any of the dll import/export macros since it's in an executable rather than a DLL)
Also, just to double check, I want to make sure you're applying the first set of patches (shared library patch.zip) and not the second set. (combined shared library patch.zip) The second one includes my patches in issue #5340, since both patches modify iserializer.hpp and cannot be applied separately. As I mentioned in comment 3, the combined patch still has an issue, so I will re-submit it when I resolve it. The normal patch (shared library patch.zip) should be fine, though, and passes all the regression tests. I will verify that it passes all the tests with gcc when I have a chance.
by , 11 years ago
Attachment: | test dll.zip added |
---|
Tester demonstrating DLL compile issues with archive_serializer_map
comment:13 by , 11 years ago
I have attached a simple tester to demonstrate the issues with DLL linking. (test dll.zip) This tester defines a separate archive (just a copy of text_iarchive) and demonstrates how it will not compile with unpatched boost, but will compile once my patches are applied. I detailed why this is the case in comment 5.
If you want to use the solution and project files I provided, first you'll have to have Visual Studio 2010 installed. If you don't own Visual Studio 2010, you can download the free express version here. (http://www.microsoft.com/express/Downloads/#2010-Visual-CPP) It should install beside any existing Visual Studio installation gracefully. Additionally, you can re-use my source files and set up the project yourself, making sure it links to the DLL build of boost.
I set up my project assuming you're compiling boost with the following bjam parameters: variant=debug link=shared threading=multi address-model=32 --toolset=msvc-10.0 stage
To compile the test, extract "test dll.zip" in the "libs\serialization" directory. Open the "test dll.sln" solution and build with the debug/win32 configuration. With unpatched boost, it fails with the following errors:
1>c:\users\abarany.navteq\desktop\boosttest\boost_1_46_1\boost\archive\impl\archive_serializer_map.ipp(45): error C2491: 'boost::archive::detail::archive_serializer_map<Archive>::insert' : definition of dllimport function not allowed
1>c:\users\abarany.navteq\desktop\boosttest\boost_1_46_1\boost\archive\impl\archive_serializer_map.ipp(57): error C2491: 'boost::archive::detail::archive_serializer_map<Archive>::erase' : definition of dllimport function not allowed
1>c:\users\abarany.navteq\desktop\boosttest\boost_1_46_1\boost\archive\impl\archive_serializer_map.ipp(67): error C2491: 'boost::archive::detail::archive_serializer_map<Archive>::find' : definition of dllimport function not allowed
If you apply my patches and rebuild boost, when you rebuild the test dll solution the build should succeed.
comment:14 by , 11 years ago
I have posted an updated patch which fixes a couple compile errors in GCC. (I was missing a couple "typename" keywords that Microsoft's compiler didn't complain about) All of the serialization regression tests pass under gcc with my patchset.
I did run into one issue when running the regression tests due to cygwin in the test test_utf8_codecvt.cpp. It is checking to see if the character from the wifstream is WEOF, but wchar_t is a 2 byte integer while WEOF is stored in a 4 byte integer with the value -1. When it does the comparison, it expands the wchar_t to 4 bytes, but since it's an unsigned value it pads the extra 2 bytes with 0s, causing the comparison to fail and infinite loops until it runs out of memory. By changing that comparison, I was able to get the test to succeed.
comment:15 by , 11 years ago
Cc: | added |
---|---|
Severity: | Problem → Showstopper |
Version: | Boost 1.46.0 → Boost 1.47.0 |
This is a pretty serious problem, I've run into something related at work. Marking as showstopper, as attention needs to be paid here.
comment:16 by , 11 years ago
Out of curiosity, have you tried applying my patch to see if it resolves your problems? (the best one to apply would be "shared library patch.2.zip") If additional people can verify that it works for them, it will hopefully make it more likely to be accepted.
comment:17 by , 11 years ago
Cc: | removed |
---|
Aaron: one of your other patches that was accepted and applied appx 4 weeks ago fixed our immediate problems at work. I'll verify that this patch is valid in the morning. Understanding the problem domain I can confirm at the least that the problem is, in fact, a problem.
by , 11 years ago
Attachment: | issue 5340 and 5341 patch v2.zip added |
---|
Updated combined patch for this issue and #5340.
comment:18 by , 11 years ago
I have submitted the updated combined patch that combines the fix for this issue and #5340, which fixes the compile issues that it had earlier.
comment:19 by , 9 years ago
Aaron or Robert: I've run across this bug while switching serialization over to boost. We currently have several shared libraries that handle serialization, and this was actually triggered in our unit tests - in particular because I had added archives that serialized this particular object type as a pointer. Unfortunately, it took me the better part of the day to track down this.
I have not yet tried to apply these patches to determine if they solve the issue (seems to be difficult under Windows).
I assume this was simply not applied yet due to time to investigate further, is that still the case?
comment:20 by , 9 years ago
Sorry for the late reply, but last I heard Robert didn't have time to investigate and integrate the patches to his satisfaction. Since I haven't heard anything since, I assume that is still the case.
To apply the patches under Windows, you can try running GNU patch. (http://gnuwin32.sourceforge.net/packages/patch.htm) You may have to fiddle with the command line parameters a bit to have it find the path to the files that it needs to patch.
comment:21 by , 8 years ago
We have observed the same problem here and it is a showstopper for our application. Is the library state unmaintained?
Zip file containing the patches.