Opened 12 years ago

Closed 12 years ago

#4809 closed Bugs (fixed)

deserialization with xml_iarchive fails when attributes of top level element are reordered

Reported by: Kolja Nowak <kolja@…> Owned by: Bryce Adelstein Lelbach
Milestone: To Be Determined Component: serialization
Version: Boost 1.44.0 Severity: Problem
Keywords: serialization xml attribute order Cc:

Description

Citing from W3C XML spec, section 3.1: "Note that the order of attribute specifications in a start-tag or empty-element tag is not significant."

The parser used by xml_iarchive doesn't conform to this, because it fails on XML input which is logically equivalent to the output of xml_oarchive, just because the attribute order of the top level element 'boost_serialization' was changed. In contrast, the parser seems to be robust against reordering attributes in other elements.

The following program demonstrates the issue:

#include <iostream>
#include <sstream>
#include <string>
#include <boost/archive/xml_iarchive.hpp>

void f(const std::string& xml)
{
  try
  {
    std::string str;
    boost::archive::xml_iarchive(std::istringstream(xml)) 
      >> boost::serialization::make_nvp("str", str);
    std::cout << str << std::endl;
  }
  catch(boost::archive::archive_exception& e)
  {
    std::cout << e.what() << std::endl;
  }
}

int main()
{
  // deserialize some xml, which was created using xml_oarchive

  f("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\" ?>"
    "<!DOCTYPE boost_serialization>"
    "<boost_serialization signature=\"serialization::archive\" version=\"7\">"
    "<str>deserialize successfull</str>"
    "</boost_serialization>");

  // the same xml, but attributes of <boost_serialization> reorderd

  f("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\" ?>"
    "<!DOCTYPE boost_serialization>"
    "<boost_serialization version=\"7\" signature=\"serialization::archive\">"
    "<str>deserialize successfull</str>"
    "</boost_serialization>");

  return 0;
}

The following patch to basic_xml_grammar.ipp modifies the rule 'SerializationWrapper' to accept both possible combinations:

Index: libs/serialization/src/basic_xml_grammar.ipp
===================================================================
--- libs/serialization/src/basic_xml_grammar.ipp    (revision 66354)
+++ libs/serialization/src/basic_xml_grammar.ipp    (working copy)
@@ -271,9 +271,9 @@
       =  -S
       >> "<boost_serialization"
       >> S
-      >> SignatureAttribute
-      >> S
-      >> VersionAttribute
+      >> ( (SignatureAttribute >> S >> VersionAttribute)
+         | (VersionAttribute >> S >> SignatureAttribute)
+         )
       >> !S
       >> '>'
     ;

Attachments (2)

boost-serialization-xml-attr-order-bug.cpp (1.1 KB ) - added by Kolja Nowak <kolja@…> 12 years ago.
program to reproduce the issue
boost-serialization-xml-attr-order.patch (563 bytes ) - added by Kolja Nowak <kolja@…> 12 years ago.
patch fixing the issue

Download all attachments as: .zip

Change History (7)

by Kolja Nowak <kolja@…>, 12 years ago

program to reproduce the issue

by Kolja Nowak <kolja@…>, 12 years ago

patch fixing the issue

comment:1 by Kolja Nowak <kolja@…>, 12 years ago

Component: Noneserialization
Owner: set to Robert Ramey

comment:2 by Bryce Adelstein Lelbach, 12 years ago

Owner: changed from Robert Ramey to Bryce Adelstein Lelbach

Ramey, what are your thoughts here? This adds more complexity to the grammar, and I'm having trouble imagining a situation in which this is needed. This happens with the old grammar, too, I should note.

in reply to:  2 comment:3 by Kolja Nowak <kolja@…>, 12 years ago

Replying to wash:

I'm having trouble imagining a situation in which this is needed.

We use boost::serialize to exchange messages between different processes, and sadly one of them is not written in C++ but in Adobe Flex. We wrote a counterpart to boost::serialize in Flex on top of the XML classes provided by the framework. This worked well until the recent update of Adobe Air, which changed the behavior of the XML generator, causing attributes to be ordered slightly different than before.

You may argue boost::serialize was never designed to talk to a different implementation expect itself. But even then, violating XML in the current way may lead to trouble. Imagine I have some storage engine, a database, whatever, which is able to store XML documents in a more space efficient way than plain text, for example by converting them to a binary representation. It would be perfectly fine and XML conformant for such a storage engine to reconstruct logically identical documents with changed attribute order, but even if it looks like a good idea I couldn't use it store boost::serialize XML archives.

comment:4 by Bryce Adelstein Lelbach, 12 years ago

Alright, Kolja. I'm sold. Unless Ramey disagrees, I think we should apply this against the old version of the grammar which will be in trunk soon, and I've already applied it to my local trunk.

comment:5 by Bryce Adelstein Lelbach, 12 years ago

Resolution: fixed
Status: newclosed

Applied in r66459.

Note: See TracTickets for help on using tickets.