Boost C++ Libraries: Ticket #5086: Fix for possible assertion failure in MSVC isctype.c https://svn.boost.org/trac10/ticket/5086 <p> Ticket <a class="closed ticket" href="https://svn.boost.org/trac10/ticket/4791" title="#4791: Patches: boost/token_functions.hpp: warning isspace/ispunct called with wrong ... (closed: fixed)">#4791</a> about warning C6328 was closed as fixed. But in fact, the fix (<a class="changeset" href="https://svn.boost.org/trac10/changeset/66855" title="Merge patch to release; fixes #4791">r66855</a>) only silenced the warning without addressing the real problem that the warning indicates. </p> <p> The problem is explained <a class="ext-link" href="http://msdn.microsoft.com/en-us/library/ms245348.aspx"><span class="icon">​</span>here</a>. In short, for the character classification functions in &lt;cctype&gt;: </p> <blockquote class="citation"> <p> the valid range of values for its input argument is: </p> <blockquote> <p> 0 &lt;= c &lt;= 255, plus the special value EOF. </p> </blockquote> </blockquote> <p> Otherwise, the behavior is undefined. Thus, passing a user-provided char value (which may be signed, and therefore negative) can cause UB. The existing comment just above struct traits_extension addresses the same issue. </p> <p> The static_cast&lt;int&gt; introduced by the fix changed nothing. It only spelled out what the compiler already does implicitly (integral promotion). </p> <p> Here is the patch to fix the real problem, including a test for it. 
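</p> <p> As a rough illustration only (the actual change is in the attached patch, which is not reproduced here), the usual way to stay within the valid range is to convert the char to unsigned char before calling the classification function. The helper names <code>is_space_safe</code> and <code>is_punct_safe</code> below are hypothetical and do not come from the patch. </p>

```cpp
#include <cctype>

// Hypothetical helpers, not taken from the attached patch.
// Converting to unsigned char first keeps the argument within
// 0..UCHAR_MAX, which is the range the <cctype> functions accept.
inline bool is_space_safe(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

inline bool is_punct_safe(char c)
{
    return std::ispunct(static_cast<unsigned char>(c)) != 0;
}
```

<p> With these wrappers a negative char is converted to a value in the valid range before the call, so the argument can never trip the range check that the MSVC debug runtime asserts on. 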
</p> Kazutoshi Satoda <k_satoda@…> Tue, 18 Jan 2011 18:29:09 GMT attachment set https://svn.boost.org/trac10/ticket/5086 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost_tokenizer_tolerate_negative_char.patch</span> </li> </ul> <p> svn diff for trunk <a class="changeset" href="https://svn.boost.org/trac10/changeset/68230" title="* finished ideas/units ">r68230</a> </p> Ticket Kazutoshi Satoda <k_satoda@…> Tue, 18 Jan 2011 18:46:17 GMT version changed https://svn.boost.org/trac10/ticket/5086#comment:1 <ul> <li><strong>version</strong> <span class="trac-field-old">Boost 1.45.0</span> → <span class="trac-field-new">Boost Development Trunk</span> </li> </ul> Ticket Marshall Clow Sat, 22 Jan 2011 16:37:45 GMT <link>https://svn.boost.org/trac10/ticket/5086#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:2</guid> <description> <p> Good explanation; I see the problem that needs to be fixed. </p> <p> However, it seems to me that this patch will change the behavior on non-MS systems. </p> <p> isspace ( -5 ) will become isspace ( 251 ) - is that correct? </p> </description> <category>Ticket</category> </item> <item> <author>Kazutoshi Satoda <k_satoda@…></author> <pubDate>Sat, 22 Jan 2011 17:16:46 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5086#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:3</guid> <description> <blockquote class="citation"> <p> isspace ( -5 ) will become isspace ( 251 ) - is that correct? </p> </blockquote> <p> Yes. But the former was undefined behavior. I think changing undefined behavior into any well-defined behavior is convincing enough. 
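</p> <p> The mapping from -5 to 251 can be sketched as follows. This is only an illustration under the assumption of 8-bit bytes; the helper name <code>to_ctype_arg</code> is hypothetical and is not part of the patch. </p>

```cpp
#include <climits>

// The 251 discussed above depends on 8-bit bytes, so state that
// assumption explicitly.
static_assert(CHAR_BIT == 8, "example assumes 8-bit char");

// Hypothetical helper: the value that reaches isspace() once the
// char has been converted to unsigned char. The conversion is
// modulo 2^CHAR_BIT, so it is well defined for negative inputs.
inline int to_ctype_arg(char c)
{
    return static_cast<unsigned char>(c);
}
```

<p> Under that assumption, <code>to_ctype_arg(char(-5))</code> yields 251 whether plain char is signed or unsigned, so the argument no longer depends on the platform's choice of signedness. 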
</p> </description> <category>Ticket</category> </item> <item> <dc:creator>Marshall Clow</dc:creator> <pubDate>Sat, 22 Jan 2011 17:32:50 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5086#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:4</guid> <description> <p> I don't see where that's undefined behavior (except on MS-systems) </p> <p> Looking at the C specification (since the C++ one defers back to that), all I see is: </p> <p> 7.4.1.9 The isspace function </p> <p> <strong>Synopsis: </strong> </p> <pre class="wiki">#include &lt;ctype.h&gt;
int isspace(int c);
</pre><p> <strong>Description:</strong> The isspace function tests for any character that is a standard white-space character or is one of a locale-specific set of characters for which isalnum is false. The standard white-space characters are the following: space (' '), form feed ('\f'), new-line ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v'). In the "C" locale, isspace returns true only for the standard white-space characters. </p> <p> Is there something that I'm missing? </p> </description> <category>Ticket</category> </item> <item> <author>Kazutoshi Satoda <k_satoda@…></author> <pubDate>Sat, 22 Jan 2011 17:53:54 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5086#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:5</guid> <description> <blockquote class="citation"> <p> Is there something that I'm missing? </p> </blockquote> <p> Yes, here: 7.4 Character handling &lt;ctype.h&gt; p1 </p> <blockquote> <p> The header &lt;ctype.h&gt; declares several functions useful for classifying and mapping characters. In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined. 
</p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Marshall Clow</dc:creator> <pubDate>Mon, 24 Jan 2011 15:57:16 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5086#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:6</guid> <description> <p> Thanks for the pointer. </p> <p> But I still see a problem with the patch, specifically when handling EOF. Casting EOF to an unsigned char will make it "not EOF" any more. </p> </description> <category>Ticket</category> </item> <item> <author>Kazutoshi Satoda <k_satoda@…></author> <pubDate>Thu, 27 Jan 2011 16:39:13 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5086#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5086#comment:7</guid> <description> <blockquote class="citation"> <p> But I still see a problem with the patch, specifically when handling EOF. </p> </blockquote> <p> A <code>char(EOF)</code> value in a string, which becomes equal to <code>EOF</code> after integral promotion if <code>EOF</code> is representable as char, should never be treated as end of file in the first place. </p> <p> This patch will partly cure that case, too. </p> </description> <category>Ticket</category> </item> </channel> </rss>