Boost C++ Libraries: Ticket #8261: lexical_cast<unsigned> returns unexpected result when using with split_iterator<std::wstring::iterator> https://svn.boost.org/trac10/ticket/8261 <p> The following code prints "4948.4948" instead of the expected "10.10" (these are the ASCII codes of the characters rather than their values): </p> <p> std::wstring wstr(L"10.10"); typedef boost::split_iterator&lt;std::wstring::iterator&gt; wsplit_iter_t; wsplit_iter_t wdot_iter = boost::make_split_iterator( wstr, boost::first_finder(L".")); std::cout&lt;&lt;boost::lexical_cast&lt;unsigned&gt;(*wdot_iter++)&lt;&lt;'.'&lt;&lt;boost::lexical_cast&lt;unsigned&gt;(*wdot_iter++); </p> Antony Polukhin Fri, 29 Mar 2013 09:44:38 GMT <link>https://svn.boost.org/trac10/ticket/8261#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:1</guid> <description> <p> This is a known problem, which is very hard to solve in a good, portable and generic way. </p> <p> As a temporary workaround you may use the following solution: </p> <pre class="wiki">#include &lt;iostream&gt;
#include &lt;boost/algorithm/string.hpp&gt;
#include &lt;boost/lexical_cast.hpp&gt;

int main() {
    // Iterate over a raw wchar_t array so the range uses const wchar_t* iterators.
    const wchar_t wstr[] = L"20.20";
    typedef boost::split_iterator&lt;const wchar_t*&gt; wsplit_iter_t;
    wsplit_iter_t wdot_iter = boost::make_split_iterator(
        wstr, boost::first_finder(L"."));

    std::cout &lt;&lt; boost::lexical_cast&lt;unsigned&gt;(*wdot_iter++)
              &lt;&lt; '.'
              &lt;&lt; boost::lexical_cast&lt;unsigned&gt;(*wdot_iter++);
}
</pre> </description> <category>Ticket</category> </item> <item> <author>Troy Korjuslommi <troykor@…></author> <pubDate>Sat, 01 Feb 2014 15:46:29 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8261#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:2</guid> <description> <p> The attached patch is against boost 1.55.0's lexical_cast.hpp. 
</p> <p> This patch catches iterator_range objects and calls a version of lexical_cast specialized on the correct internal char type. </p> <p> The root cause is that the templates in the default lexical_cast function do not consider iterator_range objects. It is more efficient to catch all iterator_range objects in a specialized function. </p> <p> Another issue which might be of concern is that the member variables start and finish (start_ and finish_ would be a nice change) are pointers to char types. Using an iterator abstraction would be more generic. It looks like a big change at first glance, and its merits would have to be evaluated too. </p> </description> <category>Ticket</category> </item> <item> <author>Troy Korjuslommi <troykor@…></author> <pubDate>Sat, 01 Feb 2014 15:47:48 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/8261 https://svn.boost.org/trac10/ticket/8261 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost_1_55_0_lexical_cast.patch</span> </li> </ul> <p> Patch is against boost_1_55_0 lexical_cast.hpp. </p> Ticket Antony Polukhin Sat, 01 Feb 2014 16:48:26 GMT <link>https://svn.boost.org/trac10/ticket/8261#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:3</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/8261#comment:2" title="Comment 2">Troy Korjuslommi &lt;troykor@…&gt;</a>: </p> <blockquote class="citation"> <p> The attached patch is against boost 1.55.0's lexical_cast.hpp. </p> <p> This patch catches iterator_range objects and calls a version of lexical_cast specialized on the correct internal char type. </p> <p> The root cause is that the templates in the default lexical_cast function do not consider iterator_range objects. It is more efficient to catch all iterator_range objects in a specialized function. 
</p> </blockquote> <p> This is a good attempt, but it will break one of the use cases: &amp;lexical_cast&lt;int, char*&gt;. This situation is covered by the regression test, which can be run with ./b2 in the boost/libs/conversion/tests/ folder. Breaking existing use cases is unaffordable. </p> <blockquote class="citation"> <p> Another issue which might be of concern is that the member variables start and finish (start_ and finish_ would be a nice change) are pointers to char types. Using an iterator abstraction would be more generic. It looks like a big change at first glance, and its merits would have to be evaluated too. </p> </blockquote> <p> This will increase the size of the resulting binary without actually adding new functionality. <br /> <br /> </p> <p> The issue in this ticket comes from the ability of <code>iterator_range&lt;std::wstring::iterator&gt;</code> to stream to both <code>std::ostream</code> and <code>std::wostream</code>. <code>lexical_cast</code> chooses the <code>std::ostream</code> version, on the assumption that non-wide character streaming is better optimized and used more often, leading to a smaller binary size. </p> <p> The same issue occurs with any class that is streamable to both wide and non-wide streams. </p> <p> A simple solution would be to specialize <code>detail::stream_char_common</code> (can be found in boost/lexical_cast.hpp) for <code>std::wstring::iterator/std::string::iterator/std::string::const_iterator/std::wstring::const_iterator</code> and ensure that <code>lexical_cast</code> works well with <code>std::wstring::iterator</code> and <code>iterator_range&lt;std::wstring::const_iterator&gt;</code>. </p> <p> A super cool solution would be to additionally resolve some of the cases with wide and non-wide streamable classes by introducing a <code>boost::has_right_shift&lt;std::basic_istream&lt;char&gt;, T &gt;</code> type trait that does not allow implicit conversions and type promotions for <code>T</code>. I do not know how to do it, though. 
</p> </description> <category>Ticket</category> </item> <item> <author>Troy Korjuslommi <troykor@…></author> <pubDate>Sat, 01 Feb 2014 21:28:34 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8261#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:4</guid> <description> <p> I didn't get a failure on any of the tests on 1_55_0. Must be a new test on the dev branch or something. </p> <p> My understanding of the problem is as follows. The eventual problem is that the code interprets wchar_t* as char*. The reason this happens is that the traits template says the char type for CharT is char, not wchar_t. The reason it gets it wrong is that it doesn't have any checks for iterator_range types, so it just fumbles. It catches those iterator_range types which have a parameter char or wchar_t, but iterator_range&lt;iterator/const_iterator&gt; types have one more level of embedded types, so it misses those. </p> <p> One could add those checks, but that would just add complexity without any real benefit. A specialization for iterator_range would just get added somewhere else anyway. It is much simpler and safer to catch all iterator_range types in their own lexical_cast function. If something breaks, then write more special cases. I am attaching another patch which adds another level of indirection, so specializations can be written as needed. </p> <p> I know this is a slightly different take on what you mentioned. I do see your point too with the stream types. I ended up taking a different route. Please give it some thought. 
</p> </description> <category>Ticket</category> </item> <item> <author>Troy Korjuslommi <troykor@…></author> <pubDate>Sat, 01 Feb 2014 21:30:58 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/8261 https://svn.boost.org/trac10/ticket/8261 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">boost_1_55_0_lexical_cast.2.patch</span> </li> </ul> <p> v2 of patch to handle iterator_range types. </p> Ticket Troy Korjuslommi <troykor@…> Sat, 01 Feb 2014 23:53:19 GMT <link>https://svn.boost.org/trac10/ticket/8261#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:5</guid> <description> <p> I tried your stream_char_common specialization solution. Didn't work. Probably some typo or other brain fart type thing. The template should define wchar_t as the type, but it somehow doesn't. requires_stringbuf gets set to false in lexical_cast_stream_traits, which it shouldn't if the type is char or wchar_t. </p> <p> I have to look at it again later. Maybe it'll come to me after a break. Anyway, if you feel like having a go at it, here's the one that fails (but matches). 
</p> <blockquote> <p> template &lt; class Traits, class Alloc, template&lt;class,class&gt; class Iter &gt; struct stream_char_common&lt; boost::iterator_range&lt; Iter &lt;wchar_t*, std::basic_string&lt;wchar_t,Traits,Alloc&gt; &gt; &gt; &gt; { </p> <blockquote> <p> typedef wchar_t type; </p> </blockquote> <p> }; </p> </blockquote> </description> <category>Ticket</category> </item> <item> <dc:creator>Antony Polukhin</dc:creator> <pubDate>Sun, 02 Feb 2014 07:35:14 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8261#comment:6 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:6</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/8261#comment:4" title="Comment 4">Troy Korjuslommi &lt;troykor@…&gt;</a>: </p> <blockquote class="citation"> <p> I didn't get a failure on any of the tests on 1_55_0. Must be a new test on the dev branch or something. </p> </blockquote> <p> Just out of curiosity, I downloaded 1.55. The test is in there: <code>boost_1_55_0/libs/conversion/lexical_cast_test.cpp</code>. With your patches, the test <code>void test_getting_pointer_to_function()</code> fails to compile. Breaking existing use cases is unaffordable. </p> <blockquote class="citation"> <p> My understanding of the problem is as follows. The eventual problem is that the code interprets wchar_t* as char*. The reason this happens is that the traits template says the char type for CharT is char, not wchar_t. The reason it gets it wrong is that it doesn't have any checks for iterator_range types, so it just fumbles. It catches those iterator_range types which have a parameter char or wchar_t, but iterator_range&lt;iterator/const_iterator&gt; types have one more level of embedded types, so it misses those. </p> <p> One could add those checks, but that would just add complexity without any real benefit. A specialization for iterator_range would just get added somewhere else anyway. 
It is much simpler and safer to catch all iterator_range types in their own lexical_cast function. If something breaks, then write more special cases. I am attaching another patch which adds another level of indirection, so specializations can be written as needed. </p> <p> I know this is a slightly different take on what you mentioned. I do see your point too with the stream types. I ended up taking a different route. Please give it some thought. </p> </blockquote> <p> Your approach can be used with the developer branch. The <code>try_lexical_convert</code> function does not guarantee unambiguity when taking a pointer to the function, so it can be specialized in your way. Fork the <a class="ext-link" href="https://github.com/boostorg/conversion"><span class="icon">​</span>conversion repo</a>, apply your patch, add tests to ensure that the new feature works and make sure that <strong>all</strong> tests pass. </p> <blockquote class="citation"> <p> Iter &lt;wchar_t*, std::basic_string&lt;wchar_t,Traits,Alloc&gt; &gt; </p> </blockquote> <p> This is not a portable solution (it won't work with SCARY iterators). </p> </description> <category>Ticket</category> </item> <item> <author>Troy Korjuslommi <troykor@…></author> <pubDate>Sun, 02 Feb 2014 08:43:30 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8261#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8261#comment:7</guid> <description> <p> The stream_char_common template specialization I posted was meant just for testing this special case. I made it as specific as possible, so I could test it easily. </p> <p> I didn't get the errors with either patch, so this must be an environment issue. My test machine is g++-4.4 on x86_64 (Debian). </p> <p> I will test the fork. 
</p> <p> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Antony Polukhin</dc:creator> <pubDate>Sun, 02 May 2021 17:33:39 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/8261#comment:8 https://svn.boost.org/trac10/ticket/8261#comment:8 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">duplicate</span> </li> </ul> <p> Moved to github: <a class="ext-link" href="https://github.com/boostorg/lexical_cast/issues/47"><span class="icon">​</span>https://github.com/boostorg/lexical_cast/issues/47</a> </p> Ticket