Opened 9 years ago

Closed 9 years ago

Last modified 8 years ago

#9612 closed Bugs (invalid)

Memory Access Violation when recursively serializing classes

Reported by: Chris Rusby <chris.rusby@…> Owned by: Robert Ramey
Milestone: To Be Determined Component: serialization
Version: Boost 1.55.0 Severity: Problem
Keywords: Cc:

Description

See attached example. I have a class which contains one member variable: a shared pointer to its base class. If the actual object type is the same as the object being serialized, the serialization libraries crash. It also only occurs if they are in separate modules.

Attachments (3)

CrashTest.zip (6.5 KB ) - added by Chris Rusby <chris.rusby@…> 9 years ago.
0001-Changes-to-singleton-to-allow-extern-templates-on-si.patch (5.6 KB ) - added by Brandon Kohn 9 years ago.
Patch to workaround issue
CrashTestDLLFileFixes.zip (2.3 KB ) - added by Brandon Kohn 9 years ago.
Changes to files in test to work with patch

Download all attachments as: .zip

Change History (20)

by Chris Rusby <chris.rusby@…>, 9 years ago

Attachment: CrashTest.zip added

comment:1 by Brandon Kohn, 9 years ago

It looks like there are two instances of the basic_iserializer<xml_iarchive, TestObject> which live in the CrashTestDLL and then in main application. The one in the dll has a bpis member where the one in the main app has a null member. To figure this out I set a breakpoint on baseic_iserializer.hpp: basic_iserializer::set_bpis (around line 64 in boost 1.53). When it loads the dll you'll see these singletons being setup with the pointer iserializer.

Then in the test case code add:

    BOOST_AUTO( const &cinst, (boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::xml_iarchive, TestObject> >::get_const_instance()) );

    BOOST_CHECK(cinst.get_bpis_ptr() != 0);//! This is null!

I used the debugger to copy the value of m_bpis from the singleton in the DLL into the one in the main application and it then loads.

Last edited 9 years ago by Brandon Kohn (previous) (diff)

comment:2 by Chris Rusby <chris.rusby@…>, 9 years ago

Yes - I agree 100% with your analysis: I should probably have added in the original ticket that it doesn't crash on Linux or when I'm not serializing across a DLL boundary.

Am I doing something wrong (I don't think I am), or is this a boost serialization bug? It strikes me that the definition of the basic_iserialization class might be missing a BOOST_DLLEXPORT macro to ensure it's exported from CrashTestDLL and hence not duplicated?

Thanks,

Chris

comment:3 by Brandon Kohn, 9 years ago

The singleton<iserializer<Archive,TestObject> > is the thing being duplicated, and it is exported. BOOST_DLL_EXPORT is (in msvc) always defined as _declspec(dllexport), and is used on the static public interface of the singleton type. Near as I can tell, this means it will be exported from every DLL or exe in which it is included. I don't know enough about what could be done differently to be able to call it a bug. I'm surprised we haven't run into an issue like this example in our codebase as we use a lot of DLLs. I suppose we tend to concentrate our use of serialization in the same DLLs as where the types are exported. Perhaps that could be a workaround?

comment:4 by Robert Ramey, 9 years ago

Resolution: invalid
Status: newclosed

I've seen problems occur with DLLS when the "same" code is included in two different DLLS.

This can occur when an inline serialization function is in a header which is used by several DLLS. Each inline code generation ripples back to include stuff in the library which ends up being duplicated. example

Instead of

// X.hpp
struct X {
  template<class Archive>
  serialize(Archive & ar, unsigned int file_version){
    ar & ...
 };

Use

// X.hpp
struct X {
  template<class Archive>
  serialize(Archive & ar, unsigned int file_version);
};
// X.cpp
// use explicit instantiation
template X::serialization<boost::serialization::text_iarchive>( boost::serialization::text_iarchive & ar, unsigned int file_version){
  ar >>  ..
}
template X::serialization<boost::serialization::text_iarchive>( boost::serialization::text_oarchive & ar, unsigned int file_version){
  ar <<  ..
}

This will leave all the serialization code for X in one and only one place. Thereby diminishing DLL hell and making DLLS smaller.

If someone want's to write an article on tis subject - organizing code in DLLS, plugins using polymorphic versions of archives, etc. I'll consider including it in the documentation.

Robert Ramey

by Brandon Kohn, 9 years ago

Patch to workaround issue

by Brandon Kohn, 9 years ago

Attachment: CrashTestDLLFileFixes.zip added

Changes to files in test to work with patch

comment:5 by Brandon Kohn, 9 years ago

I've looked into this a bit more and I think there is a way to support this case for compilers which support extern templates. That includes msvc back to 8.0.

The trick is to combine the dllexport/import with the extern for the explicit instantiations of the singleton's for the i/oserialzer instances. This requires changes to singleton.hpp (moved the exports to the class level.) Was there a particular reason the exports were put on the members?

I put together a patch. Check it out Chris.

Note: There is a warning by the vc12 compiler c4910 : http://msdn.microsoft.com/en-us/library/bb531392%28v=vs.90%29.aspx

I agree with the comment on that page that this shouldn't be a warning. Export and extern are not mutually exclusive and should be able to modify the same thing. In spite of the warning, the compiled code does the right thing in all the versions I tested. (vc11, vc12).

comment:6 by Robert Ramey, 9 years ago

I've taken a cursory look at your patches.

I've spent quite a bit of time over the years looking at this problem and concluded there is no robust, maintainable solution other than to follow the advice I gave above. Following the practice outlined above will avoid problems with the same code appearing in different modules having the possibility of conflicting. It also supports plug-in DLLS in an effective way. And it eliminates unnecessary code bloat. Trying to support that which is really an application design mistake is a fools errand that never ends.

So I wouldn't expect such a patch to appear in the library.

Robert Ramey

comment:7 by Brandon Kohn, 9 years ago

Hi Robert,

Did you look at his example? It follows your suggestions and still fails. I don't see where there is a design mistake.

As for the patch, I think you should consider changing the way the singleton is exported so doing externs like this is possible. I believe the risks are minimal.

Cheers,

Brandon

comment:8 by Robert Ramey, 9 years ago

I took another look at your example. It is not a simple example or test. it has serialization of cycles, boost shared pointers, implementation across DLLs and serialization through a base class pointer and xml. Each of these is sort a "corner case" in that it required separate coding and attention to these aspects so that the features can be combined in an orthogonal way.

I also took another look at your patch. This involves Export facility which is another very murky area which touches upon C++ instatiation, visibility across compilers and other stuff.

So introducing such a patch under these circumstances has a strong possibility of turning into a very major investment of time.

comment:9 by Brandon Kohn, 9 years ago

Strictly speaking, all serialization library needs to do to support this fix is to modify the way the singleton is exported. Currently the export specifier is on the static members of singleton. Moving the specifier to the class level allows users to do their own exports of the instantiations. The change works with all supported msvc compilers (i.e. since vc8) and on gcc 4.8.2. Perhaps it would be worthwhile to make the change on develop and see how the test runners perform? If anything I would say the mixing of extern/dllexport in the extern.hpp macros is the more risky proposition. However, it's easy enough for a user to implement those if needed.

Why was the export implemented on the members instead of at the class level initially? Was there some compiler which did not handle this? I know clang doesn't now, but it's logged as a bug.

Last edited 9 years ago by Brandon Kohn (previous) (diff)

comment:10 by Chris Rusby <chris.rusby@…>, 9 years ago

Based on the above, I am of the opinion that I (the user) need to tell the boost serialization libraries whether to export the implementation of the instantiated template from the DLL. I envisage adding a new macro that I'd use something like this:

BOOST_CLASS_EXPORT_KEY3(TestObject, "TestObject", CRASHTESTDLL_EXPORTS)

This would declspec(dllexport) the instantiated TestObject serialization templates from the DLL (because CRASHTESTDLL_EXPORTS would be defined), and from the main program would declspec(dllimport) the serialization templates and hence import from the DLL. Hence there would be one and only one implementation of iserializer<Archive,TestObject?> (which seems to be the root cause of my problem).

I've not verified that this works, I'm afraid, but I hope to soon. If it does work, though, the advantage is that it's not going to modify existing code so is much lower risk.

Thanks,

Chris

PS In answer to your (Robert) original question, when I understand it fully i.e. when this is fixed and I understand why/how it's fixed, I'd be more than happy to do some documentation for you.

comment:11 by Chris Rusby <chris.rusby@…>, 9 years ago

Hi again,

I've just made some small serialization changes in our code-base in a much less edge-casey way.

If I understand the cause of the problem, it arises because I have a class defined in DLL A, and I'm serializing a shared pointer to this within classes in DLL B and DLL C - and this isn't happy (windows _and_ linux).

I've not yet managed to simplify the test case sufficiently to be able to upload it, but I hope to do this in the next few days.

On the one hand, I think Brandon's fix looks good (I will try it out fully tomorrow) - but on the other hand I can see why Robert's nervous about putting this live as it's fairly fundamental!

However, maybe as a compromise we could take Brandon's changes to singleton.hpp? These look fairly innocuous, but they would allow me (or anybody else who comes across this problem) to make changes as per extern.hpp in our own code-base. Would this work?

Thanks,

Chris

comment:12 by Chris Rusby <chris.rusby@…>, 9 years ago

I have finally figured out the cause of the 2nd crash. Here's what was going on.

I was serializing in/out (correctly) shared_ptr<const Object>. In a completely different part of the codebase, somebody else was serializing out a shared_ptr<const Object> but serializing it back in as a shared_ptr<Object>.

If I tested my code, it worked fine. However, if both the other code and mine were used in the same serialized file I got a null pointer exception (same place as above).

I think the erroneous code was messing up the type registration in some way. It would be really nice if this could be caught at compile time - although I must admit, I can't think of any way to do this. Maybe the library could throw a nicer runtime error though? I guess the downside is performance?

Thanks,

Chris

comment:13 by Robert Ramey, 9 years ago

If you look into basic_serializer_map.cpp you'll find a passage of commented out code which I implemented just to trap this type of situation. I believe that uncommenting this code would be effective in your situation.

I had to comment it out because so many people just ignored this issue and instantiated code all over the place and this prevented their code from compiling. They complained a lot that the serialization library was "buggy" because of this. So I had to comment it out.

Probably this should be re-enabled but permit the check to be suppressed with a run-time switch. But then this is also a hassle to implement so I just left it commented out.

You might want to experiment with enabling this code and see what impact it would have on your application and perhaps think about how to conditionally enable it. - assuming you've got nothing else to do.

Robert Ramey

comment:14 by anonymous, 9 years ago

This looks interesting, I'll try it next week.

Thanks for your help,

Chris

comment:15 by anonymous, 8 years ago

Hi again,

We've come up against a situation which cannot easily be worked around, and hence I'm having to revisit this. I do believe that Brandon's fix is good. In fact, my current plan is to patch in his change to just singleton.hpp in our boost install; the rest can be managed within our codebase.

I don't see an reason why the BOOST_DLLEXPORT needs to be on each function, rather than on the class itself. If it could be moved, then it would resolve my issue.

Note that, with C++ 11's comes extern templates, and this is the "proper" fix for this issue as then there will only be one singleton implementation.

Thanks,

Chris

comment:16 by vincent.legoll@…, 8 years ago

Hello, I don't know if this is the same bug I'm getting, but it looks at least related.

I'm getting: /usr/include/boost/serialization/singleton.hpp:132:13: runtime error: reference binding to null pointer of type 'const boost::archive::detail::extra_detail::guid_initializer<CombatEvent>'

I'm compiling freeorion (www.freeorion.org) with clang++-3.6 and boost version 1.55.0.2 from debian

Is there anything else or new on the subject that I can try ?

comment:17 by Chris, 8 years ago

Sorry - not the same bug; this one impacts windows only.

Note: See TracTickets for help on using tickets.