Opened 6 years ago
Last modified 6 years ago
#12174 new Bugs
Tokenizer delivers additional null byte - string token
Reported by: | | Owned by: | jsiek |
---|---|---|---|
Milestone: | To Be Determined | Component: | tokenizer |
Version: | Boost 1.59.0 | Severity: | Problem |
Keywords: | | Cc: | |
Description
If I run the attached program I get
TokenStartsHere:McStructuredLoanEngine::calculate() trace information:TokenEndsHere / length of string is 53
TokenStartsHere::TokenEndsHere / length of string is 1
The first token is expected, but the second is not. It seems to contain one null byte.
Attachments (1)
Change History (4)
comment:1 by , 6 years ago
I just tried with several Boost versions and I see the same issue, but in my case all the streams in your example fail when the string they return is passed directly into the container.
A workaround would be:
std::string str = tmp.str();
boost::tokenizer<boost::char_separator<char> > t(str, sep);
Then it will work.
I've tested this with VS14, gcc4.8 and gcc4.9.
I'll have a look at the implementation.
Thanks.
comment:2 by , 6 years ago
Hi, more data here:
The tokenizer holds iterators to the beginning and the end of the container. ostringstream::str() returns a temporary container (in this case a std::string) that is destroyed at the end of the expression in which it is used, so the tokenizer ends up holding iterators into a temporary object that has already been destroyed. This is undefined behaviour, and different compilers do different things. Keeping the object in scope (string str = tmp.str()) while you tokenize it fixes the issue; I would say that is the proper idiom.
Also, this is not related to the 'escaped' character \n; you can use any other valid character there and get the same results.
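To illustrate the lifetime issue described above without depending on Boost, here is a minimal sketch. IteratorTokenizer and tokenize_stream are hypothetical names made up for this example; the class is only a stand-in that, like the tokenizer under discussion, stores iterators into the container it is given instead of copying it. It is not the Boost implementation.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a tokenizer that stores only iterators
// into the container it is given (no copy of the text is made).
class IteratorTokenizer {
public:
    IteratorTokenizer(const std::string& text, char separator)
        : first_(text.begin()), last_(text.end()), separator_(separator) {}

    // Walks the stored iterator range and returns the non-empty tokens.
    std::vector<std::string> tokens() const {
        std::vector<std::string> result;
        std::string current;
        for (std::string::const_iterator it = first_; it != last_; ++it) {
            if (*it == separator_) {
                if (!current.empty()) result.push_back(current);
                current.clear();
            } else {
                current.push_back(*it);
            }
        }
        if (!current.empty()) result.push_back(current);
        return result;
    }

private:
    std::string::const_iterator first_;
    std::string::const_iterator last_;
    char separator_;
};

// Safe pattern: copy the result of str() into a named string that
// outlives the tokenizer, then tokenize that string.
std::vector<std::string> tokenize_stream(const std::ostringstream& stream,
                                         char separator) {
    const std::string text = stream.str();  // named object, valid until return
    IteratorTokenizer tokenizer(text, separator);
    return tokenizer.tokens();
    // Unsafe variant (do not do this):
    //   IteratorTokenizer t(stream.str(), separator);
    // The temporary string is destroyed at the end of that statement,
    // so any later use of t reads through dangling iterators.
}
```

Calling tokenize_stream on a stream that contains "alpha\nbeta\n" with '\n' as the separator yields the two tokens "alpha" and "beta". The only design point is the named local string: it keeps the iterators valid for as long as the tokenizer is used.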
Dam.
comment:3 by , 6 years ago
Hi Dam, thanks a lot. It's clear now what is going on, and easy to avoid by following your suggestion. Kind regards, Peter
minimal test program producing the bug