#9601 closed Bugs (fixed)
Unable to load archives with pointer instances which contain self reference cycles in 1.55
Reported by: | Brandon Kohn | Owned by: | Robert Ramey |
---|---|---|---|
Milestone: | To Be Determined | Component: | serialization |
Version: | Boost 1.55.0 | Severity: | Problem |
Keywords: | Cc: |
Description
First the original pointer is loaded like so:
// because the following operation could move the items // don't use co after this // add to list of serialized objects so that we can properly handle // cyclic strucures //! Note: t's value is undefined at this point. It's whatever value it had when serialization was called. object_id_vector.push_back(aobject(t, cid)); // remember that that the address of these elements could change // when we make another call so don't use the address bpis_ptr->load_object_ptr( ar, t, m_pending.version ); BOOST_ASSERT(NULL != t); //! Note: The object_id_vector element's address is not updated until the load is complete. [1] object_id_vector[ui].address = t;
load_object_ptr will eventually lead to:
template<class Archive, class T> BOOST_DLLEXPORT void pointer_iserializer<Archive, T>::load_object_ptr( basic_iarchive & ar, void * & x, const unsigned int file_version ) const { ... //! This serializes an instance of T ar_impl >> boost::serialization::make_nvp(NULL, * t); }
If t contains a cyclic reference to itself then the following is called:
// extra line to evade borland issue const bool tracking = co.tracking_level; // if we're tracking and the pointer has already been read if(tracking && ! track(ar, t)) // we're done return bpis_ptr;
track will check if the oid is already loaded and assign the address of the previously loaded pointer.
bool basic_iarchive_impl::track( basic_iarchive & ar, void * & t ){ ... // if its a reference to a old object if(object_id_type(object_id_vector.size()) > oid){ // we're done //! Note this line: *** [2] t = object_id_vector[oid].address; return false; } return true; }
Recall back at the original scope [1]: remember that that the address of these elements could change
// when we make another call so don't use the address bpis_ptr->load_object_ptr( ar, t, m_pending.version ); BOOST_ASSERT(NULL != t); //! The assignment to object_id_vector[ui].address takes place after the cycle is loaded. //! This means means the value assigned to t at [2] is undefined. object_id_vector[ui].address = t;
Here's a unit test that illustrates the issue:
namespace TestPointerSerializationReferenceCycle { struct PointerHolder { PointerHolder(PointerHolder* ptr = nullptr) : Ptr(ptr) {} PointerHolder* Ptr; template <typename Archive> void serialize(Archive& ar, const unsigned int v) { ar & Ptr; } }; } BOOST_AUTO_TEST_CASE(TestPointerReferenceCycleLoad) { using namespace TestPointerSerializationReferenceCycle; std::stringstream buff(std::stringstream::in | std::stringstream::out); //! Write it. { PointerHolder* pPtrHldr = new PointerHolder(); pPtrHldr->Ptr = pPtrHldr; boost::archive::text_oarchive ar(buff); ar & pPtrHldr; } //! Read it. { buff.seekg(0); PointerHolder* pPtrHldr = reinterpret_cast<PointerHolder*>(0xBAADF00D); boost::archive::text_iarchive ar(buff); ar & pPtrHldr; BOOST_CHECK(pPtrHldr != nullptr); if (pPtrHldr != nullptr) { BOOST_CHECK(pPtrHldr->Ptr == pPtrHldr);//! At this point pPtrHldr->Ptr == 0xBAADFOOD. } } }
Attachments (3)
Change History (14)
follow-up: 2 comment:1 by , 9 years ago
Status: | new → assigned |
---|
comment:2 by , 9 years ago
Replying to ramey:
boy - I bet this cost a lot of effort to figure out.
Yeah, it's fun stuff :). Anyone who doesn't appreciate the intricacies of implementing a serialization library needs to spend time debugging one.
Note that the test set includes a specific test for cyclic pointers which is passing.
I'm thinking that the difference between that test and your test isn't so much a cycle as a self referential object.
If you mean serialization\test\test_cyclic_ptrs.cpp, I'll give it a more thorough look.
We've been using boost serialization at my work for our application files and have just recently moved from 1.53 to 1.55 to support vc12. The use case where I found this was more complex than the unit-test as it involved polymorphic types held in shared_ptr instances with the cyclic reference being held inside a std::map member variable. The unit-test was just the most simple cycle I could think of which illustrated the issue.
I'm also thinking that the correct fix is to tweak the code so that it addresses the special case of a self referential object such as yours. *snip* This is predicated on my guess that this is really just a corner case I failed to consider and not some fundamental issue which requires a major overhaul. If I'm wrong about this, feel free to let me know.
My suspicion is that it would affect any cycle, but I've been wrong before ;). I'm already having a think about how I would fix it.
One note is that it worked in 1.53. It seems to have stopped working when a change was made to :
// remember that that the address of these elements could change // when we make another call so don't use the address bpis_ptr->load_object_ptr( ar, t,//! used to be object_id_vector[ui].address m_pending.version );
And then in iserializer.hpp (the old version before SHA: 8af0e20bb091631a153147587d7958492be623ae):
BOOST_DLLEXPORT void pointer_iserializer<Archive, T>::load_object_ptr( basic_iarchive & ar, void * & x, const unsigned int file_version ) const { Archive & ar_impl = boost::serialization::smart_cast_reference<Archive &>(ar); auto_ptr_with_deleter< T > ap(heap_allocator< T >::invoke()); if(NULL == ap.get()) boost::serialization::throw_exception(std::bad_alloc()) ; T * t = ap.get(); x = t;//! This would update the object_id_vector[ui].address member with the right ptr.
The new code doesn't do this assignment before deserializing. So my first idea for a fix would be to create a version of load_object_ptr with an overload to update the address member of that vector as before.
There is a comment:
// remember that that the address of these elements could change // when we make another call so don't use the address
It makes me wonder if there's something funky that can happen to that address which I might be missing.
comment:3 by , 9 years ago
"One note is that it worked in 1.53. It seems to have stopped working when a change was made to : ..."
That's it almost for sure. I made this change about a year ago when some guy writing a book claimed that the library was slow. Damn if his case wasn't slow - After wracking my brain, I did "fix" this and everything passed so I declared victory. The slowness was due to checking the whole list of tracked pointers rather than the most recent ones. (IIRC). or something like that
remember that that the address of these elements could change when we make another call so don't use the address
This comment reminds us that caching some address, calling lower level serialization, might change the address of the object originally pointed to. So don't cache the address - or if you do, note that it expires when one calls to a lower level.
Thanks for looking into this. I know it's a bitch.
Robert Ramey
comment:4 by , 9 years ago
Yep, I put together a build with a few changes to test and it works. Let me put together a clean patch as a proposal. My current working copy contains an unadopted patch to workaround a previous issue, so I need to get a clean checkout and make the changes on top of that for the patch.
Cheers,
Brandon Kohn
by , 9 years ago
Attachment: | 0001-Added-new-method-for-loading-tracked-pointers-which-.patch added |
---|
Proposed patch to resolve issue
by , 9 years ago
Attachment: | 0001-Added-unit-test-for-regression-testing-of-trac9601-i.patch added |
---|
Proposed unit test for the issue
comment:5 by , 9 years ago
I've put up two patches: one with a proposed fix and another with changes to test which cover the unit-test I did above. Let me know if there's anything you would like done differently and thanks for letting me lend a hand.
I suppose it might be prudent for me to flesh out the unit-test a bit more with more complex cases involving base types.
Cheers,
Brandon
P.S. Also let me know if something is funky from the patches, I'm not a diff guru ;).
by , 9 years ago
Attachment: | 0001-Cleaned-up-some-tab-space-for-consistency.patch added |
---|
Added new cycle unit-test with multiple base types.
comment:6 by , 9 years ago
With these patches I've managed to run these tests successfully on vc8 (with some changes for friendship to interface_archive), vc11, vc12 and gcc 4.5.3.
There is still a lingering question as to whether the patch is a sufficient fix. That comment that the cached value can change suggests a use-case which might still break. Do you have an idea of what the conditions for that case might be?
comment:7 by , 9 years ago
I've spent a fair amount of time investigating this.
I think that this is a problem which as always existed in the library. Note that the following fails
PointerHolder *pph = new PointerHolder; pph->m_ph = pph; test_ostream os(testfile, TEST_STREAM_FLAGS); test_oarchive oa(os, TEST_ARCHIVE_FLAGS); //oa << BOOST_SERIALIZATION_NVP(pph); oa << pph;
While this passes:
PointerHolder ph; pph.m_ph = ph; test_ostream os(testfile, TEST_STREAM_FLAGS); test_oarchive oa(os, TEST_ARCHIVE_FLAGS); //oa << BOOST_SERIALIZATION_NVP(pph); oa << pph;
That is, this issue is the usage of a recursive data structure at the "top" level.
Actually I believe that this could occur whenever ALL the pointers are allocated with "new" so I think your fix is too specific. On the other hand, looking at the internals, it's not obvious how to fix this. It's sort of a chicken and egg issue. That is it's hard to assign the address of an object which we are still creating.
comment:8 by , 9 years ago
Edit: after learning about BOOST_CLASS_TRACKING, the earlier comment made no sense. ;)
comment:9 by , 9 years ago
Hi all,
This looks very similar to the issue I raised a couple of days ago (#9612), although I've patched my version and it still fails.
I'll take a more detailed look over the weekend, but in the interim if you have any thoughts as to what causes my issue I'd be interested to hear them!
Thanks,
Chris
comment:10 by , 9 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
OK
I believe that I've fixed the problem. I'm still doing final testing but I expect to push the changes to the development branch before close of business today.
Robert Ramey
boy - I bet this cost a lot of effort to figure out.
Note that the test set includes a specific test for cyclic pointers which is passing.
I'm thinking that the difference between that test and your test isn't so much a cycle as a self referential object. I remember when I wrote code to address cyclical pointers many other serialization system failed on cyclical chains so I took special care to make this work. This should be apparent from all the comments.
I'm also thinking that the correct fix is to tweak the code so that it addresses the special case of a self referential object such as yours.
Of course I wrote this many years ago and this is extremely intricate code. It takes me at least a day to wrap my head around this case, make a small test case and integrate it into the test suite. Since you've already done the head wrapping part, maybe you want to make a stab at tweaking the code to address this corner case.
This is predicated on my guess that this is really just a corner case I failed to consider and not some fundamental issue which requires a major overhaul. If I'm wrong about this, feel free to let me know.
Robert Ramey