Opened 10 years ago

Last modified 4 years ago

#8095 new Bugs

Invalid XML is produced by write_xml when there is a special character in the key

Reported by: martongabesz@… Owned by: Sebastian Redl
Milestone: To Be Determined Component: property_tree
Version: Boost 1.67.0 Severity: Problem
Keywords: xml Cc: smr@…, wolf+"https:@…

Description

I have the following property tree, dumped in INFO format:

CudbAppServiceId=1
{
    userLabel ""
    sqlAppSrvPlSchema identities-pl.sql
    CudbAppServiceId CudbAppServiceId=1
}

When I dump it in XML format, I get the following:

<?xml version="1.0" encoding="utf-8"?>
<CudbAppServiceId=1>
<userLabel/>
<sqlAppSrvPlSchema>identities-pl.sql</sqlAppSrvPlSchema>
<CudbAppServiceId>CudbAppServiceId=1</CudbAppServiceId>
</CudbAppServiceId=1>

This is not valid XML, because it contains a tag named "CudbAppServiceId=1". Any other XML parser, such as xmllint, will fail to parse it because of the equals sign in the tag name.

Is there any solution to this problem? Like somehow emitting &#61; instead of the raw '=' inside the tag and in the text?

This is possibly a problem in 1.53 as well, since property_tree was not updated.

Change History (4)

comment:1 by Sebastian Redl, 8 years ago

There is no solution that will turn your data into valid XML. Element names must follow the Name production, which cannot contain entity references.

The best I can do is fail validation when writing the XML, so that you don't end up with invalid documents.
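To illustrate the kind of check this implies, here is a minimal sketch of a test for whether a key could serve as an XML element name. It is not the property_tree implementation: the function names are hypothetical, and it covers only the ASCII part of the Name production, so non-ASCII names that XML permits are rejected by this sketch.

```cpp
#include <cstddef>
#include <string>

// ASCII subset of the XML 1.0 NameStartChar production (assumption:
// non-ASCII start characters, which XML allows, are not handled here).
inline bool is_name_start_char(unsigned char c) {
    return c == ':' || c == '_' ||
           (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

// ASCII subset of NameChar: a start character, '-', '.', or a digit.
inline bool is_name_char(unsigned char c) {
    return is_name_start_char(c) || c == '-' || c == '.' ||
           (c >= '0' && c <= '9');
}

// Hypothetical helper: true if `key` is a non-empty ASCII XML name.
inline bool is_xml_name(const std::string& key) {
    if (key.empty() ||
        !is_name_start_char(static_cast<unsigned char>(key[0]))) {
        return false;
    }
    for (std::size_t i = 1; i < key.size(); ++i) {
        if (!is_name_char(static_cast<unsigned char>(key[i]))) {
            return false;
        }
    }
    return true;
}
```

With this sketch, "CudbAppServiceId" passes, while "CudbAppServiceId=1" and the empty string are rejected.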

comment:2 by smr@…, 8 years ago

Cc: smr@… added

comment:3 by Gray Wolf <wolf+"https://svn.boost.org"@…>, 4 years ago

Version: Boost 1.52.0 → Boost 1.67.0

I hit another edge case today when converting a bunch of .info files to .xml. In my case the issue was an empty key (which is valid in INFO format):

"" value

but produces

<>value</>

in xml.

I think failing the generation with an exception would be the best (and only) solution. Sure, I can (and now do) re-read the XML after pt::write_xml() and validate it, but that feels unnecessary, since I can't imagine when invalid XML would actually be useful (not even property_tree can read it back, correct?).
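Until the writer itself fails, one possible workaround is to check the tree before writing instead of re-reading the output. The sketch below is an assumption, not part of Boost: `require_valid_keys` is a hypothetical helper, and the key predicate is supplied by the caller. It is written as a template over any tree that iterates as (key, subtree) pairs, which is how boost::property_tree::ptree iterates.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: walks `tree` depth-first and throws
// std::invalid_argument on the first key that `is_valid_key` rejects.
// `Tree` must iterate as pairs with `.first` (the key, a std::string)
// and `.second` (the subtree).
template <class Tree, class KeyPredicate>
void require_valid_keys(const Tree& tree, KeyPredicate is_valid_key,
                        const std::string& path = "") {
    for (const auto& child : tree) {
        const std::string& key = child.first;
        const std::string child_path = path + "/" + key;
        if (!is_valid_key(key)) {
            throw std::invalid_argument(
                "key cannot be written as an XML element name: '" +
                child_path + "'");
        }
        // Recurse so that nested keys are checked as well.
        require_valid_keys(child.second, is_valid_key, child_path);
    }
}
```

A call would go just before pt::write_xml(). The predicate would also need to accept the special keys `<xmlattr>`, `<xmlcomment>` and `<xmltext>`, which the XML writer interprets itself rather than emitting as element names.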

comment:4 by Gray Wolf <wolf+"https:@…>, 4 years ago

Cc: wolf+"https:@… added