Opened 10 years ago
Closed 18 months ago
#8261 closed Bugs (duplicate)
lexical_cast<unsigned> returns unexpected result when using with split_iterator<std::wstring::iterator>
Reported by: | Owned by: | Antony Polukhin | |
---|---|---|---|
Milestone: | To Be Determined | Component: | lexical_cast |
Version: | Boost 1.52.0 | Severity: | Problem |
Keywords: | Cc: |
Description
The following code will print "4948.4948" instead of expected "10.10" (This are ascii codes of the symbols instead of values):
std::wstring wstr(L"10.10"); typedef boost::split_iterator<std::wstring::iterator> wsplit_iter_t; wsplit_iter_t wdot_iter = boost::make_split_iterator( wstr, boost::first_finder(L ".")); std::cout<<boost::lexical_cast<unsigned>(*wdot_iter++)<<'.'<<boost::lexical_cast<unsigned>(*wdot_iter++);
Attachments (2)
Change History (10)
comment:1 by , 10 years ago
follow-up: 3 comment:2 by , 9 years ago
The attached patch is against boost 1.55.0's lexical_cast.hpp.
This patch catches iterator_range objects and calls a version of lexical_cast specialized on the correct internal char type.
The root cause is that the templates in the default lexical_cast function do not consider iterator_range objects. It is more efficient to catch all iterator_range objects in a specialized function.
Another issue which might be of concern is that the member variables start and finish (start_ and finish_ would be a nice change) are pointers to char types. Using an iterator abstraction would be more generic. It looks like a big change at first peak, and its merits would have to evaluated too.
by , 9 years ago
Attachment: | boost_1_55_0_lexical_cast.patch added |
---|
Patch is against boost_1_55_0 lexical_cast.hpp.
comment:3 by , 9 years ago
Replying to Troy Korjuslommi <troykor@…>:
The attached patch is against boost 1.55.0's lexical_cast.hpp.
This patch catches iterator_range objects and calls a version of lexical_cast specialized on the correct internal char type.
The root cause is that the templates in the default lexical_cast function do not consider iterator_range objects. It is more efficient to catch all iterator_range objects in a specialized function.
This is a good attempt, however it will break one of the usecaes: &lexical_cast<int, char*>. This situation is covered by the regression test, which could be run by ./b2 in boost/libs/conversion/tests/ folder. Breaking existing use cases is unfordable.
Another issue which might be of concern is that the member variables start and finish (start_ and finish_ would be a nice change) are pointers to char types. Using an iterator abstraction would be more generic. It looks like a big change at first peak, and its merits would have to evaluated too.
This will increase the size of a resulting binary without actually adding new functionality.
The issue in this ticket comes from the ability of iterator_range<std::wstring::iterator>
to stream using std::stream
and std::wstream
. lexical_cast
chooses the std::stream
version, according to the assumption that non wide character streaming is more optimized and used more often by users, leading to smaller binary size.
Same issue occurs with any wide&non-wide streamable classes.
A simple solution would be to specialize detail::stream_char_common
(can be found in boost/lexical_cast.hpp) for std::wstring::iterator/std::string::iterator/std::6string::const_iterator/std::wstring::const_iterator
and ensure that lexical_cast
works well with std::wstring::iterator
and iterator_range<std::wstring::const_iterator>
.
A super cool solution would be to additionally resolve some of the cases with wide&non-wide streamable classes by introducing a boost::has_right_shift<std::basic_istream<char>, T >
type trait that does not allow implicit conversions and type promotions for T
. Not know how to do it thou.
follow-up: 6 comment:4 by , 9 years ago
I didn't get a failure on any of the tests on 1_55_0. Must a dev branch new test or something.
My understanding of the problem is as follows. The eventual problem is that the code interprets wchar_t* as char*. The reason this happens is that the traits template says the char type for CharT is char, not wchar_t. The reason it gets it wrong is that it doesn't have any checks for iterator_range types, so it just fumbles. It catches those iterator_range types which have a parameter char or wchar_t, but iterator_range<iterator/const_iterator> types have one more level of embedded types, so it misses those.
One could add those checks, but that would just add complexity without any real benefit. A specialization for iterator_range would just get added somewhere else anyway. It is much simpler and safer to catch all iterator_range types in their own lexical_cast function. If something breaks, then write more special cases. I am attaching another patch which adds another level of indirection, so specializations can be written as needed.
I know this is slightly different take on what you mentioned. I do see your point too with the stream types. I ended up taking a different route. Please give it some thought.
by , 9 years ago
Attachment: | boost_1_55_0_lexical_cast.2.patch added |
---|
v2 of patch to handle iterator_range types.
comment:5 by , 9 years ago
I tried your stream_char_common specialization solution. Didn't work. Probably some typo or other brain fart type thing. The template should define wchar_t as the type, but it somehow doesn't. requires_stringbuf gets set to false in lexical_cast_stream_traits, which it shouldn't if the type is char or wchar_t.
I have to look at it again later. Maybe it'll come to me after a break. Anyway, if you feel like having a go at it, here's the one that fails (but matches).
template < class Traits, class Alloc, template<class,class> class Iter > struct stream_char_common< boost::iterator_range< Iter <wchar_t*, std::basic_string<wchar_t,Traits,Alloc> > > > {
typedef wchar_t type;
};
comment:6 by , 9 years ago
Replying to Troy Korjuslommi <troykor@…>:
I didn't get a failure on any of the tests on 1_55_0. Must a dev branch new test or something.
Just out of curiosity downloaded the 1.55. Test is in there: boost_1_55_0/libs/conversion/lexical_cast_test.cpp
. With your patches test void test_getting_pointer_to_function()
fail to compile. Breaking existing use cases is unfordable.
My understanding of the problem is as follows. The eventual problem is that the code interprets wchar_t* as char*. The reason this happens is that the traits template says the char type for CharT is char, not wchar_t. The reason it gets it wrong is that it doesn't have any checks for iterator_range types, so it just fumbles. It catches those iterator_range types which have a parameter char or wchar_t, but iterator_range<iterator/const_iterator> types have one more level of embedded types, so it misses those.
One could add those checks, but that would just add complexity without any real benefit. A specialization for iterator_range would just get added somewhere else anyway. It is much simpler and safer to catch all iterator_range types in their own lexical_cast function. If something breaks, then write more special cases. I am attaching another patch which adds another level of indirection, so specializations can be written as needed.
I know this is slightly different take on what you mentioned. I do see your point too with the stream types. I ended up taking a different route. Please give it some thought.
Your approach can be used with developer branch. try_lexical_convert
function does not guarantee unambiguity while getting pointer to function, so it can be specialized in your way. Fork the conversion repo, apply your patch, add tests to ensure that new feature work and make sure that all tests pass.
Iter <wchar_t*, std::basic_string<wchar_t,Traits,Alloc> >
This is not a portable solution (it won't work with SCARY iterators).
comment:7 by , 9 years ago
The stream_char_common template function I posted was meant just for testing this special case. I made it as specific as possible, so I could test it easily.
I didn't get the errors with either patch, so this must be an environment issue. My test machine is g++-4.4 on x86_64 (Debian).
I will test the fork.
comment:8 by , 18 months ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Moved to github: https://github.com/boostorg/lexical_cast/issues/47
This is a known problem, which is very hard to solve in s good, portable and generic way.
As a temporary workaround you may use the following solution: