Boost C++ Libraries: Ticket #2659: optional_io insertion operator assumes presence of whitespace on stream https://svn.boost.org/trac10/ticket/2659 <p> In optional_io.hpp, the insertion operator is written as: </p> <pre class="wiki">if ( in.good() ) { int d = in.get(); if ( d == ' ' ) { T x ; in &gt;&gt; x; v = x ; } else v = optional&lt;T&gt;() ; } </pre><p> However the assumption that there must be a space on the stream means something like this fails: </p> <pre class="wiki">istringstream in ("test"); optional&lt;string&gt; s; assert (s); </pre><p> It fails because we eat 'd', throw it away, and then return an empty optional object because we didn't get the space we were expecting. </p> <p> If the intent was to eat up spare whitespace, I would prepose something like this: </p> <pre class="wiki">if (in) { T x; if (in &gt;&gt; std::ws &gt;&gt; x) v = x; else v = optional&lt;T&gt;(); } return in; </pre><p> However doesn't that interfere with the intent of the skipws flag? For example: </p> <pre class="wiki">istringstream in (" test1 test2"); optional&lt;string&gt; s; in &gt;&gt; s; assert (s); in.unsetf (ios::skipws); in &gt;&gt; s; assert (!s); </pre><p> This doesn't work with "in &gt;&gt; std::skipws &gt;&gt; x" because the second assertion will fail. </p> <p> Instead you'd need to use "in &gt;&gt; x", which I believe would is the best solution. </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/2659 Trac 1.4.3 Andrew Troschinetz <ast@…> Wed, 21 Jan 2009 14:48:59 GMT <link>https://svn.boost.org/trac10/ticket/2659#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/2659#comment:1</guid> <description> <p> Sorry, just noticed my slip of the tongue. I meant to say that the problem was with the extraction operator, not the insertion operator. Hopefully that was obvious from the explanation of the problem. </p> </description> <category>Ticket</category> </item> <item> <author>Andrew Troschinetz <ast@…></author> <pubDate>Wed, 04 Feb 2009 21:15:31 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/2659#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/2659#comment:2</guid> <description> <p> I think I understand why optional_io is coded the way it is. I also think it can be improved upon. </p> <p> For example: </p> <pre class="wiki"> istringstream sin (" 1 this 5.5 is a test -3.2 a"); optional&lt;double&gt; token; while (sin &gt;&gt; token) cout &lt;&lt; token.get_value_or (numeric_limits&lt;double&gt;::quiet_NaN ()) &lt;&lt; endl; </pre><p> This will print out 1 and then exit the loop, but I think it would be nicer, and less surprising if it printed out: </p> <pre class="wiki"> 1 nan 5.5 nan nan nan -3.2 nan </pre><p> This can be accomplished by changing operator&gt;&gt;() to the following: </p> <pre class="wiki"> if (in) { std::string token; in &gt;&gt; token; if (!in) { v = optional&lt;T&gt; (); return in; } std::istringstream sin (token); T x; if (sin &gt;&gt; x) v = x; else v = optional&lt;T&gt; (); } else { v = optional&lt;T&gt; (); } return in; </pre><p> We protect the istream from its failbit getting set by first extracting the next token as a string, which is sure to succeed if there is anything at all on the stream. </p> <p> This way, we don't set the failbit on the istream just because we fail to extract a value for our optional type. I think this fits with the semantics of optional types because it's a valid state for an them to be uninitialized, unlike just about any other type. </p> <p> This change also nicely fixes the problem I initially reported. </p> </description> <category>Ticket</category> </item> <item> <author>Andrew Troschinetz <ast@…></author> <pubDate>Thu, 05 Feb 2009 14:40:20 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/2659#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/2659#comment:3</guid> <description> <p> If I'm being a nuisance by continuing to reply to this ticket please let me know. I'm learning a lot about streams from trying to figure out this problem; allow me to refine the description of the problem I'm trying to fix and the code I posted in the previous reply. </p> <p> I think any given optional type should behave as much like the type it wraps as possible. For example: </p> <pre class="wiki">// This: stringstream sin ("1.2.3.4.5"); optional&lt;double&gt; token; while (sin &gt;&gt; token) cout &lt;&lt; token &lt;&lt; endl; // Should have the same output as this: stringstream sin ("1.2.3.4.5"); double token; while (sin &gt;&gt; token) cout &lt;&lt; token &lt;&lt; endl; </pre><p> Also this should work (as of boost 1.37.0 it does not): </p> <pre class="wiki">stringstream sin ("test"); optional&lt;string&gt; token; sin &gt;&gt; token; // fails to read "test", instead we get an empty optional type </pre><p> However I think that using an optional type should offer a little more than just that (call me crazy). Since it's semantically valid to have an un-initialized optional type, I think failure to extract something off the stream should return an empty optional type <strong>without</strong> setting the failbit on the stream. So like: </p> <pre class="wiki">// outputs 1 nan 5.5 nan nan nan -3.2 nan istringstream sin (" 1 this 5.5 is a test -3.2 a"); optional&lt;double&gt; token; const double nan = numeric_limits&lt;double&gt;::quiet_NaN (); while (sin &gt;&gt; token) cout &lt;&lt; token.get_value_or (nan) &lt;&lt; " "; // outputs 1 istringstream sin (" 1 this 5.5 is a test -3.2 a"); double token; while (sin &gt;&gt; token) cout &lt;&lt; token &lt;&lt; " "; </pre><p> With that in mind, here's my preposed refinement: </p> <pre class="wiki">if (in) { T x; if (in &gt;&gt; x) { v = x; } else { if (in.fail ()) { in.clear (); std::string dump; in &gt;&gt; dump; } v = optional&lt;T&gt; (); } } else { v = optional&lt;T&gt; (); } return in; </pre> </description> <category>Ticket</category> </item> <item> <author>Andrew Troschinetz <ast@…></author> <pubDate>Thu, 05 Feb 2009 16:10:02 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/2659#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/2659#comment:4</guid> <description> <p> And once again, even better: </p> <pre class="wiki">if (in) { T x; if (in &gt;&gt; x) { v = x; } else { if (in.fail () &amp;&amp; !in.eof ()) { in.clear (); in.get (); } v = optional&lt;T&gt; (); } } else { v = optional&lt;T&gt; (); } return in; </pre><p> Instead of reading in a std::string, we should really just skip over the 1 character that gave us trouble and keep trying. Also added a check to make sure we didn't fail b/c we tried to read past eof, because if that's the case we really should fail and not try to recover. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Andrey Semashev</dc:creator> <pubDate>Sun, 12 Dec 2010 12:17:55 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/2659#comment:5 https://svn.boost.org/trac10/ticket/2659#comment:5 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">wontfix</span> </li> </ul> <p> IMO, if reading a value from stream failed, the stream should be marked with failbit and reading must be stopped. This behavior is not only valid for boost::optional, but also for other types. This, in particular, allows user to detect invalid character sequences and not accidentally read invalid data (which could be a security issue). </p> <p> Extraction operator has been changed recently (<a class="changeset" href="https://svn.boost.org/trac10/changeset/67183" title="Merged changes from trunk. Fixes #3395. Also updates swap behavior: if ...">[67183]</a>) to include more robust input checking and not to miss invalid input. </p> <p> PS: Regarding the initial test case, the input "test" is not valid for boost::optional and thus reading it must fail. The basic rule about IO is that whatever is written with insertion operator may be read with extraction operator. And insertion operator always writes either space or dash as leading character. </p> Ticket