id summary reporter owner description type status milestone component version severity resolution keywords cc
1273 CR+LF newlines in position_iterator slehuitouze@… Joel de Guzman "On September 13th, I sent a mail to the ""spirit-general"" mailing list describing a bug I ran into while using position_iterator; the mail is entitled ""Various newline styles and position_iterator"". I'm not sure it is useful to rewrite everything here, so I'll just come to the conclusion: ""position_iterator< file_iterator >"" has iterator category ""random_access_iterator_tag"", whereas direct pointer arithmetic is not possible on it (because it eats the LF when facing a CR+LF newline). As a consequence, one may end up with an uninitialized character when one tries to copy a range delimited by two position_iterators into a ""std::vector"". This is demonstrated by the attached C++ source code, whose output on my machine is (in part) as follows: **************BEGINNING OF OUTPUT*********************** We have read following characters in a 'vector' container from a file: #0: 65 (A) #1: 66 (B) #2: 13 (\r) #3: 87 (W) #4: 205 (unexpected character) **************END OF OUTPUT*********************** You will see while perusing the code that I have provided two versions: one dealing with a file (i.e. type ""position_iterator< file_iterator >""), one dealing with a mere character buffer (i.e. type ""position_iterator""). Both of them exhibit the bug. I also tried a variant (that can be activated by commenting out line #2) that uses a ""std::string"" instead of a ""std::vector"", and which does not exhibit the problem. I have not looked in detail, but it's probably because ""std::string"" copy is implemented by a pre-reservation followed by a loop of ""insert"" and ""push_back"", rather than a pre-allocation followed by a loop of assignment and increment (as in ""std::vector""). This approach (i.e.
using a ""std::string"" rather than a ""std::vector"") is not a practical workaround for my problem, since the problem is inside spirit itself (more precisely, at lines 246-248 of file ""spirit/tree/common.hpp"" in 1.8.3), where variable ""text"" has type ""std::vector"": ******************************************** node_val_data(IteratorT const& _first, IteratorT const& _last) : text(_first, _last), is_root_(false), parser_id_(), value_() {} ******************************************** As I said in my original mail, a quick fix is simply to change the iterator category of ""position_iterator"" to ""forward_iterator_tag"". But I think a more fundamental question should also be considered: is it normal that the stream of chars coming out of a ""position_iterator< file_iterator >"" may differ from the one coming out of a ""file_iterator""? I'm not sure of the answer... In the above-mentioned mail, I suggested a correction for the ""increment"" method (which needs an extra member variable ""_crJustSeen"") that would not change the stream; this might be the basis for a new implementation. Regards. --Serge Le Huitouze " Bugs closed To Be Determined spirit Problem fixed