Boost C++ Libraries: Ticket #5876: Serialization - tracking of non-versioned classes https://svn.boost.org/trac10/ticket/5876 <p> I quote a case from my workplace. The issue came up for std::vector&lt;int&gt;, but I use a simple struct X to reproduce the problem. See the attached file for detailed instructions. </p> <p> Opened by Yakov Galka 9/8/2011 (Today) 8:31 AM </p> <p> The following code: </p> <pre class="wiki">
struct X { ... };
X x2;
BoostLoad("SerializationBug.dat", x2);
#if FLAG
volatile int x = 0;
if (x) {
    X *x3;
    BoostLoad("this code is never executed", x3);
}
#endif
</pre><p> Produces different results depending on whether FLAG == 0 or 1, although it is clear that FLAG must not change the program's behavior, since the guarded block is never executed. </p> <p> After some research, it turns out that the handling of the tracking level has been broken in boost since at least boost 1.46.1. </p> <p> Assigned to Yakov Galka by Yakov Galka 9/8/2011 (Today) 8:31 AM Notified ######. </p> <p> Edited by Yakov Galka 9/8/2011 (Today) 8:53 AM [Revised 11:44 AM] </p> <p> It happens for objects with an implementation level of object_serializable. That is: </p> <pre class="wiki">BOOST_CLASS_IMPLEMENTATION(X, boost::serialization::object_serializable) </pre><p> For higher implementation levels, the tracking level is read from the archive. However, it must still affect the saving of objects to any archive (binary, xml, text). </p> <p> If it's not clear enough, the above code reads/writes the file correctly when FLAG == 0, but tries to load x2 as if it were tracked when FLAG == 1. </p> <p> Edited by Yakov Galka 9/8/2011 (Today) 10:38 AM I've successfully reproduced this same bug in boost 1.33.1, although there it's silent (no crash, just wrong data is read). Boost serialization is broken really hard at the low level: </p> <p> basic_iserializer::tracking() decides whether the class should be tracked or not based on the value of m_bpis. 
However, it can't decide this based on the information it has, since it's shared between objects serialized through a pointer and objects not serialized through a pointer. </p> <p> Possible Fix: make basic_iserializer::tracking return the tracking level instead of a boolean value and let the caller decide what this tracking level means. It's a lot of work, and it may break compatibility with archives serialized incorrectly in 1.33.1, which happens to be possible. We are screwed anyway. </p> <p> Edited by Yakov Galka 9/8/2011 (Today) 11:44 AM </p> <p> Revised Yakov Galka's 8:53 AM entry from 9/8/2011 </p> ybungalobill@… Thu, 08 Sep 2011 09:44:45 GMT attachment set https://svn.boost.org/trac10/ticket/5876 https://svn.boost.org/trac10/ticket/5876 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">ATest.cpp</span> </li> </ul> <p> code demonstrating the problem </p> Ticket ybungalobill@… Thu, 15 Sep 2011 15:55:55 GMT <link>https://svn.boost.org/trac10/ticket/5876#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5876#comment:1</guid> <description> <p> Hereby, I submit a patch that fixes the above bug. However, it doesn't solve the issue completely, since it still suffers from the problem that if someone serializes an object with an implementation level of class-info through a pointer, then all other instances are tracked too (even those that are not serialized through a pointer). However, it's now trivial to fix the behavior to be the intended one; it's just that it involves archive format breakage, so we must bump the library version. 
</p> </description> <category>Ticket</category> </item> <item> <author>ybungalobill@…</author> <pubDate>Thu, 15 Sep 2011 15:57:43 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/5876 https://svn.boost.org/trac10/ticket/5876 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost.serrialization.patch</span> </li> </ul> <p> proposed fix </p> Ticket ybungalobill@… Sun, 18 Sep 2011 07:54:11 GMT attachment set https://svn.boost.org/trac10/ticket/5876 https://svn.boost.org/trac10/ticket/5876 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost.serialization2.patch</span> </li> </ul> <p> I apologize, the previous fix doesn't fix the save correctly. </p> Ticket Robert Ramey Mon, 23 Jan 2012 16:56:11 GMT <link>https://svn.boost.org/trac10/ticket/5876#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5876#comment:2</guid> <description> <p> Looks like you're on to something here. I'm still looking at this. </p> <p> "suffers from the problem that if someone serializes an object with implementation level of class-info through a pointer, then all other instances are tracked too (even those that are not serialized through a pointer)" </p> <p> But this is the intended behavior! If this wasn't true, pointers to objects previously serialized as objects wouldn't be restored properly. </p> <p> Robert Ramey </p> </description> <category>Ticket</category> </item> <item> <author>ybungalobill@…</author> <pubDate>Mon, 23 Jan 2012 17:45:16 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5876#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5876#comment:3</guid> <description> <blockquote class="citation"> <p> But this is the intended behavior! If this wasn't true, pointers to objects previously serialized as objects wouldn't be restored properly. </p> </blockquote> <p> I find it counter-intuitive. 
I expect objects serialized as objects to mean 'by value' and not be tracked, while objects serialized by pointer to mean 'by reference' and be tracked. Anyway, it's your choice and I won't criticize it. </p> <p> The problem is that you can't implement the intended behavior properly for object_serializable. The above code is a minimal example that reproduces the problem, but it's not hypothetical. The real-world scenario went like this: </p> <p> A 3rd-party interprocess communication library used serialization by pointer in generic code. It happened that we passed vector&lt;int&gt; between two processes, so it serialized it through vector&lt;int&gt;*. Another, unrelated part of our codebase serialized shared_ptr&lt;vector&lt;int&gt;&gt;, which in turn is serialized through vector&lt;int&gt;*. Note that nowhere did the user (me) play with the implementation level. Each part is valid in isolation. So far so good. </p> <p> We updated from boost 1.33.1 to boost 1.47.0. It broke. Why? I don't know exactly. My hypothesis is that because the way singletons are created had changed (function statics in 1.33.1 versus class statics in 1.47.0), the linker handles them differently. </p> <p> Whatever the reason, I think that code that breaks depending on the existence of some other, unrelated code is worse than the inability to serialize an object_serializable object as an object and then track it through a pointer. 
</p> <p> Thank you,<br /> Yakov Galka </p> </description> <category>Ticket</category> </item> <item> <author>Mika Fischer <mika.fischer@…></author> <pubDate>Fri, 29 Jun 2012 13:58:07 GMT</pubDate> <title>cc set https://svn.boost.org/trac10/ticket/5876#comment:4 https://svn.boost.org/trac10/ticket/5876#comment:4 <ul> <li><strong>cc</strong> <span class="trac-author">mika.fischer@…</span> added </li> </ul> Ticket Robert Ramey Sun, 15 Jul 2012 20:44:02 GMT status changed https://svn.boost.org/trac10/ticket/5876#comment:5 https://svn.boost.org/trac10/ticket/5876#comment:5 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul> <p> "I find it counter-intuitive. I expect objects serialized as objects to mean 'by value' and not be tracked, while objects serialized by pointer to mean 'by reference' and be tracked. Anyway it's your choice and I won't criticize it." </p> <p> This was a design decision made at the very beginning as an optimization. Basically, the idea was that there was no need to include the pointer deserialization code unless the item was serialized explicitly via a pointer. In retrospect, this turns out to be questionable. It's possible for the same object to be serialized multiple times even though it's never serialized through a pointer. Actually, this needs to be re-thought in detail - which I'm not likely to do any time soon. </p> <p> One thing you might consider is explicitly setting the implementation level to include tracking so that items get tracked regardless of how they are used. 
</p> <p> Robert Ramey </p> Ticket ybungalobill@… Tue, 05 Apr 2016 10:51:57 GMT <link>https://svn.boost.org/trac10/ticket/5876#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5876#comment:6</guid> <description> <p> Hey, it has been a long time, yet I found a draft somewhere in my inbox: </p> <blockquote class="citation"> <p> Changing the implementation level of 'vector&lt;int&gt;' as you suggest would break backward and forward compatibility, due to the exact same problem. Also, it counts as a boost modification, and if we agree to resort to changing boost, then there are better alternatives: fix the core problem. We had already applied a bunch of mandatory modifications (a fix for <a class="new ticket" href="https://svn.boost.org/trac10/ticket/5499" title="#5499: Bugs: Serialization backward compatability (new)">#5499</a> and full support for saving the boost 1.33.1 archive format), so using the patch I sent is the most practical choice. There are still some circumstances that may trigger the bug (which I can't recall now), but thankfully that isn't the case in our code. We managed to read and write archives binary-identical to those of boost 1.33.1. </p> </blockquote> <p> I'm sorry to say it, but since then we have stopped using Boost.Serialization in new code and hope to obliterate it in other places one day, which is tricky due to the life cycle of the deployed systems. </p> <p> Yakov Galka </p> </description> <category>Ticket</category> </item> </channel> </rss>