Opened 7 years ago

Closed 5 years ago

Last modified 5 years ago

#11953 closed Bugs (worksforme)

Boost 1.60: boost::archive::iterators bug?

Reported by: kreuzerkrieg@…
Owned by: Robert Ramey
Milestone: To Be Determined
Component: serialization
Version: Boost 1.60.0
Severity: Problem
Keywords:
Cc: ernest.zaslavsky@…

Description

Hi. I recently noticed that the Base64-encoded data produced by the implementation I use and by the implementation used in the Chromium project are sometimes not the same. It happened only once, but that was enough to make me worry. The implementation is quite simple:
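#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <string>
// input is assumed to be a std::string holding the bytes to encode
namespace bai = boost::archive::iterators;
using it_base64_t = bai::base64_from_binary<bai::transform_width<std::string::const_iterator, 6, 8>>;
// number of '=' characters needed to pad the output to a multiple of 4
auto writePaddChars = (3 - input.length() % 3) % 3;
std::string base64(it_base64_t(input.begin()), it_base64_t(input.end()));
base64.append(writePaddChars, '=');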

Simple and elegant.

However, today I wrote a short test that takes a chapter from 'Alice's Adventures in Wonderland' and encodes it. I also used two different online encoders to encode the same text, and both produce the same data. The code above matches that output up to character 1029, then differs from what it should be, and later returns to the same output as the online encoders. So my question is: is this a bug in archive::iterators, or is something wrong in the code above? Coliru is down, so the full code is here: http://pastebin.com/JxMsMeV2
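For cross-checking output without relying on online encoders, a small hand-written encoder can serve as a second opinion. The following is a minimal sketch, assuming the standard Base64 alphabet with '=' padding and no line breaks; base64_encode_reference is a hypothetical helper name and is not part of Boost.

```cpp
#include <cstddef>
#include <string>

// Reference Base64 encoder: standard alphabet, '=' padding, no line breaks.
// Hypothetical helper for cross-checking; it is not part of Boost.
std::string base64_encode_reference(const std::string& input)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    out.reserve((input.size() + 2) / 3 * 4);

    std::size_t i = 0;
    // Each full 3-byte group becomes four 6-bit symbols.
    for (; i + 3 <= input.size(); i += 3) {
        const unsigned b0 = static_cast<unsigned char>(input[i]);
        const unsigned b1 = static_cast<unsigned char>(input[i + 1]);
        const unsigned b2 = static_cast<unsigned char>(input[i + 2]);
        out += alphabet[b0 >> 2];
        out += alphabet[((b0 & 0x03) << 4) | (b1 >> 4)];
        out += alphabet[((b1 & 0x0F) << 2) | (b2 >> 6)];
        out += alphabet[b2 & 0x3F];
    }

    // A trailing group of 1 or 2 bytes is zero-filled and padded with '='.
    const std::size_t rest = input.size() - i;
    if (rest == 1) {
        const unsigned b0 = static_cast<unsigned char>(input[i]);
        out += alphabet[b0 >> 2];
        out += alphabet[(b0 & 0x03) << 4];
        out += "==";
    } else if (rest == 2) {
        const unsigned b0 = static_cast<unsigned char>(input[i]);
        const unsigned b1 = static_cast<unsigned char>(input[i + 1]);
        out += alphabet[b0 >> 2];
        out += alphabet[((b0 & 0x03) << 4) | (b1 >> 4)];
        out += alphabet[(b1 & 0x0F) << 2];
        out += '=';
    }
    return out;
}
```

Feeding the same std::string to this function and to the iterator-based code, then comparing the two results, checks the encoder in isolation from how the input text was read or stored.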

Attachments (1)

B64EncodingSamples.7z (8.4 KB ) - added by anonymous 7 years ago.

Change History (8)

comment:1 by Robert Ramey, 7 years ago

Hmmm - very interesting. What are the two different characters? What is the global locale setting during the two tests?

by anonymous, 7 years ago

Attachment: B64EncodingSamples.7z added

in reply to:  1 comment:2 by kreuzerkrieg, 7 years ago

Replying to ramey:

Hmmm - very interesting. What are the two different characters? What is the global locale setting during the two tests?

I've attached sample files. The diff is at position 0x400: IGhpbeKA (canonical) vs. IGhpbZdp (Boost).

Machine locale is English, US.
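Decoding the two eight-character fragments quoted above shows which input bytes each encoder was given. The following is a minimal sketch; base64_decode_fragment and to_hex are hypothetical helper names, and the decoder assumes the standard Base64 alphabet.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode a Base64 fragment (standard alphabet); stops at the first '='.
// Hypothetical helper for inspecting the fragments; it is not part of Boost.
std::string base64_decode_fragment(const std::string& text)
{
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    unsigned buffer = 0;  // accumulated bits not yet emitted
    int bits = 0;         // number of valid bits in buffer

    for (char c : text) {
        if (c == '=') break;
        const std::size_t value = alphabet.find(c);
        if (value == std::string::npos)
            throw std::invalid_argument("not a Base64 character");
        buffer = (buffer << 6) | static_cast<unsigned>(value);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out += static_cast<char>((buffer >> bits) & 0xFF);
            buffer &= (1u << bits) - 1;  // keep only the leftover bits
        }
    }
    return out;
}

// Render bytes as space-separated uppercase hex, e.g. "20 68".
std::string to_hex(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// to_hex(base64_decode_fragment("IGhpbeKA")) -> "20 68 69 6D E2 80"
// to_hex(base64_decode_fragment("IGhpbZdp")) -> "20 68 69 6D 97 69"
```

Both fragments start with the bytes for " him". The canonical one continues with E2 80, and the Boost one with 97 69. One possible reading, which nothing in this ticket confirms, is that the two encoders received differently encoded input text (E2 80 is how several UTF-8 punctuation sequences begin, and 0x97 is the em dash in Windows-1252), which would place the difference before the Base64 step.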

comment:3 by anonymous, 7 years ago

Just in case it helps, machine info:
Win7, x64
VS2015, boost obtained using nuget package, built for x64

comment:4 by Robert Ramey, 5 years ago

Resolution: worksforme
Status: new → closed

FWIW - I compiled and ran your sample program with a recent version of clang on Mac OS X. I ran it under the debugger and checked the output characters at positions 1028-1031, and they all matched the ones in your file canonic.txt. In other words, I can't reproduce the problem on my machine. The whole question of UTF-8 on Windows machines is very confusing.

comment:5 by kreuzerkrieg@…, 5 years ago

And you didn't change anything in the code in the last 15 months? What is so confusing about UTF-8 on whatever platform? It is well defined, be it Windows, Mac or Linux. In any case, I will verify it once I'm back at the office. Should I check it with the latest Boost?

comment:5 by kreuzerkrieg@…, 5 years ago

And you didn't change anything in the code in the last 15 months? What is so confusing about UTF-8 on whatever platform? It is well defined, be it Windows, Mac or Linux. In any case, I will verify it once I'm back at the office. Should I check it with the latest Boost?

comment:6 by anonymous, 5 years ago

And sorry for the duplicates.
