Opened 13 years ago

Closed 13 years ago

#3080 closed Bugs (fixed)

dynamic_cast returns 0 on polymorphic serialization with weak_ptr

Reported by: Takatoshi Kondo <kondo@…> Owned by: Robert Ramey
Milestone: Boost 1.40.0 Component: serialization
Version: Boost 1.39.0 Severity: Problem
Keywords: Cc: Sohail Somani

Description

Problem

In multiple inheritance case, after de-serialization, failed to dynamic_cast from a data member polymorphic serialize via 2nd base class (Left) to subclass (Bottom).

        // check
        Left *pLeft = h2.mbl_.get();
        Bottom *pBottomFromLeft = dynamic_cast<Bottom *>(pLeft);

        Right *pRight = h2.mbr_.get();
        Bottom *pBottomFromRight = dynamic_cast<Bottom *>(pRight);

        assert(pBottomFromLeft);  // Success
        assert(pBottomFromRight); // Fail. Comment out *1 and *2, it works corrctly.

My analysis

Class 'Bottom' has weak_ptr that point to itself. When weak_ptr member wp_ exclude serialization, dynamic_cast is success.(Comment out *1 and *2 lines.)

I set break point (using gdb). => is position.

class Holder {
public:
    LeftSp mbl_;  // shared_ptr<Left>
    RightSp mbr_; // shared_ptr<Right>

    Holder() {} // for serialize
    Holder(BottomSp l, BottomSp r):mbl_(l) ,mbr_(r) {} // for appli construct

    // serialize
    friend class boost::serialization::access;
    template<class Archive>
    void save(Archive &ar, const unsigned int /* file_version */) const
    {
        // polymophic serialize via shared_ptr
        ar << BOOST_SERIALIZATION_NVP(mbl_);
        // polymophic serialize via shared_ptr
        ar << BOOST_SERIALIZATION_NVP(mbr_);
    }
    template<class Archive>
    void load(Archive & ar, const unsigned int /* file_version */)
    {
        // polymophic de-serialize via shared_ptr
        ar >> BOOST_SERIALIZATION_NVP(mbl_);
        // polymophic de-serialize via shared_ptr
        ar >> BOOST_SERIALIZATION_NVP(mbr_);
=>  }
    BOOST_SERIALIZATION_SPLIT_MEMBER()
};

And, I printed below values.

In assertion failed case, _vptr.Left is equal to _vptr.Right.

(gdb) p *mbl_.px
$1 = {_vptr.Left = 0x8060ba8}
(gdb) p *mbr_.px
$1 = {_vptr.Right = 0x8060ba8}

But, success case (remove the member wp_'s serialization), _vptr.Left is not equal to _vptr.Right

(gdb) p *mbl_.px
$1 = {_vptr.Left = 0x805d1e8}
(gdb) p *mbr_.px
$1 = {_vptr.Right = 0x805d1f8}

Environment

g++ (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2 Copyright (C) 2008 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Attachments (7)

wps.cpp (5.3 KB ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
Sample program that operates completely that reproduces bug.
ClassDiagram.png (7.1 KB ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
Class structure
sp_mlt_base.cpp (2.1 KB ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
test_zmisc.cpp (2.6 KB ) - added by Robert Ramey 13 years ago.
shared_ptr_mlt_inh.patch (2.0 KB ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
patch to solve this problem
test_shared_ptr_multi_base.cpp (7.2 KB ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
TestCode
add_test.patch (452 bytes ) - added by Takatoshi Kondo <kondo@…> 13 years ago.
Add test to Jamfile.v2 in serialization

Download all attachments as: .zip

Change History (25)

by Takatoshi Kondo <kondo@…>, 13 years ago

Attachment: wps.cpp added

Sample program that operates completely that reproduces bug.

by Takatoshi Kondo <kondo@…>, 13 years ago

Attachment: ClassDiagram.png added

Class structure

comment:1 by Takatoshi Kondo <kondo@…>, 13 years ago

I analyzed this problem. And, I finally noticed.

Now, I have understood data member m_pointers in the shared_ptr_helper class keep information by the type shared_ptr<void> to achieve the restoration shared.

The behavior of wps.c (Attached file) is same as the following simple sample.

class Left {
public:
        virtual ~Left() {}
};
class Right {
public:
        virtual ~Right() {}
};

class Bottom:public Left, public Right {
public:
        virtual ~Bottom() {}
};

int main() {
        Bottom b;
        void *pv = &b;
        Right *pr2 = static_cast<Right *>(pv);
        assert(dynamic_cast<Bottom *>(pr2)); // Fail
}

The meaning of static_cast in this sample is same as the following code. https://svn.boost.org/trac/boost/browser/tags/release/Boost_1_39_0/boost/archive/shared_ptr_helper.hpp#L132

In deserializing weak_ptr, it is inserted to m_pointers as shared_ptr<void>. https://svn.boost.org/trac/boost/browser/tags/release/Boost_1_39_0/boost/archive/shared_ptr_helper.hpp#L129

Afterwards, when deserializing the shared_ptr, the following code is executed. https://svn.boost.org/trac/boost/browser/tags/release/Boost_1_39_0/boost/archive/shared_ptr_helper.hpp#L132

Then, the problem occurs because the type of shared_ptr is 'Right' the second base class of 'Bottom'.


I checked the document again.

I had overlooked the following part of the document.

http://www.boost.org/doc/libs/1_39_0/libs/serialization/doc/new_case_studies.html#archivehelper

I'm not sure this problem is same as the current archive helper limitation. However, if it is exactly correct, this ticket should not be a bug but should be a feature requests.

Could you give me your feedback?

comment:2 by Sohail Somani, 13 years ago

The sample code you post looks to be incorrect. Specifically, you can't cast from T* to void* and back to anything but T*. This will especially apply in multiple inheritance. I think to do anything further on this bug, a better test case needs to be given. Specifically one that we can compile and see failing.

comment:3 by Sohail Somani, 13 years ago

Cc: Sohail Somani added

comment:4 by Takatoshi Kondo <kondo@…>, 13 years ago

I'm sorry for being confused.

https://svn.boost.org/trac/boost/ticket/3080#comment:1 is not a sample that works correctly.

The motivation of this sample is describe behavior of shared_ptr_hepler.

56      class shared_ptr_helper {
57          typedef std::map<const void *, shared_ptr<void> > collection_type;
58          typedef collection_type::const_iterator iterator_type;
59          // list of shared_pointers create accessable by raw pointer. This
60          // is used to "match up" shared pointers loaded at different
61          // points in the archive. Note, we delay construction until
62          // it is actually used since this is by default included as
63          // a "mix-in" even if shared_ptr isn't used.
64          collection_type * m_pointers;

snip ...

111     public:
112         template<class T>
113         void reset(shared_ptr<T> & s, T * r){
114             if(NULL == r){
115                 s.reset();
116                 return;
117             }
118             // get pointer to the most derived object.  This is effectively
119             // the object identifer
120             const void * od = object_identifier(r);
121     
122             if(NULL == m_pointers)
123                 m_pointers = new collection_type;
124     
125             iterator_type it = m_pointers->find(od);
126     
127             if(it == m_pointers->end()){
128                 s.reset(r);
129                 m_pointers->insert(collection_type::value_type(od,s));
130             }
131             else{
132                 s = static_pointer_cast<T>((*it).second);
133             }
134         }

The sample that reproduces bug is the following. https://svn.boost.org/trac/boost/attachment/ticket/3080/wps.cpp

And I made more essential following sample(sp_mlt_base.cpp).

#include <cassert>
#include <fstream>
#include <boost/shared_ptr.hpp>
#include <boost/weak_ptr.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/weak_ptr.hpp>

struct Base1 {
    virtual ~Base1() {}
    // serialize
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & /*ar*/, const unsigned int /* file_version */) {}
};

BOOST_CLASS_EXPORT(Base1)

struct Base2 {
    virtual ~Base2() {}
    // serialize
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive & /*ar*/, const unsigned int /* file_version */) {}
};

BOOST_CLASS_EXPORT(Base2)

struct Sub:public Base1, public Base2 {
    virtual ~Sub() {}
    boost::weak_ptr<Sub> wp_;

    // serialize
    friend class boost::serialization::access;
    template<class Archive>
    void serialize(Archive &ar, const unsigned int /* file_version */)
    {
        ar & BOOST_SERIALIZATION_BASE_OBJECT_NVP(Base1);
        ar & BOOST_SERIALIZATION_BASE_OBJECT_NVP(Base2);
        ar & BOOST_SERIALIZATION_NVP(wp_); // *1
    }
};

BOOST_CLASS_EXPORT(Sub)

int main()
{
    {
        // serialize
        boost::shared_ptr<Sub> s(new Sub);
        boost::shared_ptr<Base2> pb2(s);
        s->wp_ = s; // set weak_ptr. If not set, deserialize success.
        std::ofstream ofs("output.xml");
        assert(ofs);
        boost::archive::xml_oarchive oa(ofs);
        oa << boost::serialization::make_nvp("Base2", pb2);
    }
    {
        // de-serialize
        std::ifstream ifs("output.xml");
        assert(ifs);
        boost::archive::xml_iarchive ia(ifs);
        boost::shared_ptr<Base2> pb2;
        ia >> boost::serialization::make_nvp("Base2", pb2);
        // check
        // pb2's vptr is broken.
        assert(dynamic_cast<Sub *>(pb2.get()));
    }
}

Please check it.

by Takatoshi Kondo <kondo@…>, 13 years ago

Attachment: sp_mlt_base.cpp added

comment:5 by Robert Ramey, 13 years ago

Sorry I haven't been able to give this the attention it deserves. I haven't forgotten it, I just haven't had time to look into it. Thanks for your continuing efforts.

Robert Ramey

by Robert Ramey, 13 years ago

Attachment: test_zmisc.cpp added

comment:6 by Robert Ramey, 13 years ago

Resolution: worksforme
Status: newclosed

I've adjust your test to give me some more information. Compiles, and runs without error on my VC 7.1 system. So one of the following:

a) My changes made it pass b) Compiler differences

Try this on you system. If it passes, figure out where the difference is.

Robert Ramey

comment:7 by Takatoshi Kondo <kondo@…>, 13 years ago

Thank you for the reply.

I can check on VC++ 9.0.21022.8 RTM and g++ (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2.

I will try to check on my environment today. Afterwards, I will comment.

comment:8 by Takatoshi Kondo <kondo@…>, 13 years ago

Resolution: worksforme
Status: closedreopened

Result of test_zmisc.cpp

On VC++ 9.0.21022.8 RTM

Compile successed.

Run failed. Access violation occured. pb2_1 is incorrect.

On g++ (Gentoo 4.3.2-r3 p1.6, pie-10.1.5) 4.3.2.

Compile successed. Run failed. Access violation occured. pb2_1 is incorrect.


Same result.

I think that these compilers are comparatively new, and major.

Could you check your sample and my sample using these compiler?

My analysis

pb2(original data)'s __vfptr point to vftable for Base2.

pb2_1(deserialized)'s __vfptr point to vftavble for Base1.

pb2_1's __vfptr should point to vftable for Base2.

Why does pb2_1's __vfptr point to vftable for Base1.

The reseason is following...

   template<class T>
    void reset(shared_ptr<T> & s, T * r){
        if(NULL == r){
            s.reset();
            return;
        }
        // get pointer to the most derived object.  This is effectively
        // the object identifer
        const void * od = object_identifier(r);

        if(NULL == m_pointers)
            m_pointers = new collection_type;

        iterator_type it = m_pointers->find(od);

        if(it == m_pointers->end()){
            s.reset(r);
            m_pointers->insert(collection_type::value_type(od,s)); // point1
        }
        else{
            s = static_pointer_cast<T>((*it).second); // point2
        }
    }

https://svn.boost.org/trac/boost/browser/tags/release/Boost_1_39_0/boost/archive/shared_ptr_helper.hpp

  1. When m_wp deserialize, program passed point1.
  2. When pb2_1 deserialize, program passed point2.Because they point to same object.

After step1, type information was lost. Because the container m_pointers type is shared_ptr<void>.

In step2, it is impossible to distinguish that (*it).second is instance of 'Sub' or instance of 'Base2'.

Currently, member function reset() considers (*it).second to shared_ptr<Base2>, based on type T.

Unfortunately the real object type is 'Sub'.

After all, writing position for the desirializing data was shifted. And pb2_1 was broken.

comment:9 by Robert Ramey, 13 years ago

Well you've certainly done your homework! Thanks for taking this time. I'll look into it.

Robert Ramey

by Takatoshi Kondo <kondo@…>, 13 years ago

Attachment: shared_ptr_mlt_inh.patch added

patch to solve this problem

comment:10 by Takatoshi Kondo <kondo@…>, 13 years ago

I made the patch to solve this problem. (shared_ptr_mlt_inh.patch )

It seems to have solved the problem.

The point that had been devised with this patch combined void_upcast and aliasing constructor of shared_ptr well.

I am doing regression test preparation, it doesn't complete yet.

There are many tests in /boost_1_39_0/libs/serialization/test.

I do not understand still how to do the regression test though I'm reading the document.

Please review my patch. And could you teach me how to build the regression test?

comment:11 by Robert Ramey, 13 years ago

I've been working on this problem and have found no way to fix it in a definitive manner. A definitive fix would require enhancements in the shared_ptr public interface which I don't expect to occur.

The problem can be addressed in the test case:

struct Sub:public Base1, public Base2 {
    int m_x;
    boost::weak_ptr<Sub> m_wp;     //fails
    //boost::weak_ptr<Base2> m_wp; //OK
    Sub(){}
    Sub(int x) :
        Base1(x),
        Base2(x),
        m_x(x) 
    {}

By using only one type of pointer to serialize the same object. This problem only occurs when using shared pointer and it's friends.

I've updated the code to trap the condition and throw an exception when it is discovered.

I've looked at your patch. First, I'm very impressed that you've clearly understood how share_ptr_helper works and what it does. And this is without any explanation and documentation!!!.

At first I was inclined to dismiss it as I had spent a lot of time only to conclude that it wasn't fixable. Looking carefully at your code - which looks very similar to my own attempts, now I'm thinking you might be right. So I'm still looking at it. The problem for me is that the shared_ptr operations in the public interface are not all that transparent to me.

As for testing, I would expect that a much simpler test could be made.

class Base {
...
};
class Derived : public Base {
...
}

smart_ptr dptr = new Derived;
smart_ptr bptr = dptr;
// now there is only one object with pointers of different types
...
ar >> dptr;
ar >> bptr; // creates problem.
            // dptr and bptr point to the same object


Anyway, I'm going to study your patch in more detail.

Robert Ramey.

PS - unfortunately, I had alread conclude that this was not fixable and had updated the manual and exception list accordingly.

comment:12 by Robert Ramey, 13 years ago

As to building a regression test, just take one of the examples in the test directory and modify it to your own test.

RR

comment:13 by Robert Ramey, 13 years ago

Damn -

Looks like your solution is exactly the correct one.

I would like to see a more exhaustive test which would test the variations on share_ptr and weak_ptr of shared objects.

Robert Ramey

comment:14 by Takatoshi Kondo <kondo@…>, 13 years ago

Thanks!

I am glad to contribute to your library.

I am looking forward to result of the exhaustive test.

by Takatoshi Kondo <kondo@…>, 13 years ago

by Takatoshi Kondo <kondo@…>, 13 years ago

Attachment: add_test.patch added

Add test to Jamfile.v2 in serialization

comment:15 by Takatoshi Kondo <kondo@…>, 13 years ago

I made a test code based on test_shared_ptr.cpp, adn ran it.

The test file name is test_shared_ptr_multi_base.cpp.

I think that this file is added to /libs/serialization/test directory.

And add_test.patch is patch for Jamfile.v2 to add the test file to automatic test process.

Before shared_ptr_mlt_inh.patch, There ware 3 failures. And test program exited on exception. Actually though there is a failure more.

I have not understood the method of doing so though it is thought that all failures should be able to be counted.

But I think it is not so big problem.

After shared_ptr_mlt_inh.patch, There is no failure.

Is this exhaustive enough?

If you don't think so, Please point out the viewpoint of the test that should be added.

comment:16 by Takatoshi Kondo <kondo@…>, 13 years ago

After that, I did the regression tests of serialization, and there is no failure detected.

comment:17 by Takatoshi Kondo <kondo@…>, 13 years ago

Hi.

The milestone of this ticket is 1.40.0. And the next release day is getting closer.

https://svn.boost.org/trac/boost/wiki/ReleaseSchedule

Will my patch merge to the next release?

I already did the tests.

Should I do anything?

comment:18 by Robert Ramey, 13 years ago

Resolution: fixed
Status: reopenedclosed

Fixed in the trunk due in large part to your efforts. Thank you.

Note: See TracTickets for help on using tickets.