Opened 15 years ago
Closed 15 years ago
#1498 closed Patches (fixed)
xml parser: iteration instead of recursion in 'content' rule
Reported by: | | Owned by: | Robert Ramey |
---|---|---|---|
Milestone: | Boost 1.36.0 | Component: | serialization |
Version: | Boost 1.34.1 | Severity: | Problem |
Keywords: | | Cc: | |
Description
The 'content' rule in basic_xml_grammar.ipp contains a recursion, which leads to stack overflows if the serialized data contains many escaped characters.
The end user of our application may serialize arbitrary binary data, and the MSVC linker limits the stack size to 1 MB by default. Deserialization of a std::string containing the Verdana.ttf font file that comes with every Windows installation fails because it requires about 2 MB of stack space. While it is possible to increase the stack size of the executable, doing so still would not ensure the deserialization of arbitrary data.
Proposed change: The 'content' rule matches the delimiter '<' or a sequence of one or more 'Reference' or 'CharData' rules followed by the delimiter '<'. This requires both 'Reference' and 'CharData' to not match an empty string, thus the 'CharDataChars' rule uses the Positive operator instead of the Kleene star. The use of 'CharData' in the rule 'UnusedAttribute' has to be adapted by prepending 'CharData' with the Optional operator.
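To illustrate the shape of the proposed rule, here is a minimal hand-written sketch in plain C++. It is not the Boost.Spirit grammar; the functions match_char_data, match_reference and match_content are hypothetical stand-ins, and the Reference rule is simplified to '&', a name or '#' number, then ';'. In this sketch the loop stops when neither alternative consumes input, which is why both sub-rules have to report "no match" rather than succeed on an empty string.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the grammar rules (not Boost.Spirit code).
// Each returns the number of characters consumed; 0 means "no match".

// CharData: one or more characters other than '&' and '<'.
std::size_t match_char_data(const std::string& s, std::size_t pos) {
    std::size_t i = pos;
    while (i < s.size() && s[i] != '&' && s[i] != '<') ++i;
    return i - pos;
}

// Reference (simplified assumption): '&' followed by letters, digits
// or '#', terminated by ';', for example "&lt;" or "&#10;".
std::size_t match_reference(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || s[pos] != '&') return 0;
    std::size_t i = pos + 1;
    while (i < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '#')) {
        ++i;
    }
    if (i == pos + 1 || i >= s.size() || s[i] != ';') return 0;
    return i - pos + 1;
}

// content = '<' | +(Reference | CharData) >> '<'
// On success, sets `end` to one past the closing '<' and returns true.
bool match_content(const std::string& s, std::size_t pos, std::size_t& end) {
    std::size_t i = pos;
    for (;;) {
        std::size_t n = match_reference(s, i);
        if (n == 0) n = match_char_data(s, i);
        if (n == 0) break;  // neither alternative consumed input
        i += n;             // every iteration makes progress
    }
    if (i >= s.size() || s[i] != '<') return false;
    end = i + 1;
    return true;
}
```

Calling match_content("abc&lt;def<", 0, end) succeeds and leaves end at 11, while an input without a closing '<' is rejected.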
Effect: Only deserialization is affected; serialized files are identical. The 'content' rule iterates over the data instead of recursing into itself, which requires less than 128 KB of stack space for the example file mentioned above.
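The difference in stack usage can be sketched with a small model in plain C++. This is an assumption-laden illustration, not the Spirit-generated parser: item_length, content_recursive and content_iterative are hypothetical names, and the depth counter records the logical nesting only, since an optimizing compiler may turn this particular tail call into a loop.

```cpp
#include <cstddef>
#include <string>

// One "item" is an escape such as "&#10;" or a run of plain characters.
// Hypothetical simplified tokenizer; returns 0 when no item starts at pos.
std::size_t item_length(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || s[pos] == '<') return 0;
    if (s[pos] == '&') {
        std::size_t semi = s.find(';', pos);
        return semi == std::string::npos ? 0 : semi - pos + 1;
    }
    std::size_t i = pos;
    while (i < s.size() && s[i] != '&' && s[i] != '<') ++i;
    return i - pos;
}

// Old shape: content = '<' | item >> content.
// Each matched item adds one level of nesting; max_depth records the peak.
bool content_recursive(const std::string& s, std::size_t pos,
                       std::size_t depth, std::size_t& max_depth) {
    if (depth > max_depth) max_depth = depth;
    if (pos < s.size() && s[pos] == '<') return true;
    std::size_t n = item_length(s, pos);
    if (n == 0) return false;
    return content_recursive(s, pos + n, depth + 1, max_depth);
}

// New shape: content = '<' | +item >> '<'.
// The items are consumed in a loop, so nesting does not grow with input.
bool content_iterative(const std::string& s, std::size_t pos,
                       std::size_t& iterations) {
    iterations = 0;
    for (std::size_t n; (n = item_length(s, pos)) != 0; pos += n) {
        ++iterations;
    }
    return pos < s.size() && s[pos] == '<';
}
```

For an input of 1000 escapes followed by '<', content_recursive started at depth 1 reaches a logical depth of 1001, while content_iterative performs 1000 loop iterations in a single call.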
A diff of basic_xml_grammar.ipp follows.
    272c272
    < CharDataChars = *(anychar_p - chset_p(L"&<"));
    ---
    > CharDataChars = +(anychar_p - chset_p(L"&<"));
    308c308
    < | (Reference | CharData) >> content
    ---
    > | +(Reference | CharData) >> L"<"
    371c371
    < >> CharData
    ---
    > >> !CharData
Change History (1)
comment:1 by , 15 years ago
Milestone: | → Boost 1.36.0 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Very good call.
I appreciate how much effort it is to track something like this down.
I'm making the change and running the tests.
Robert Ramey