Boost C++ Libraries: Ticket #5629: base64 encode/decode for std::istreambuf_iterator/std::ostreambuf_iterator https://svn.boost.org/trac10/ticket/5629 <p> Compiler: MSVS 2008. The code: </p> <pre class="wiki">#include &lt;algorithm&gt;
#include &lt;fstream&gt;
#include &lt;iterator&gt;

#include "boost/archive/iterators/base64_from_binary.hpp"
#include "boost/archive/iterators/binary_from_base64.hpp"
#include "boost/archive/iterators/transform_width.hpp"

// typedefs
typedef std::istreambuf_iterator&lt;char&gt; my_istream_iterator;
typedef std::ostreambuf_iterator&lt;char&gt; my_ostream_iterator;

// binary to base64: regroup 8-bit bytes into 6-bit values, then map each value to a base64 character
typedef boost::archive::iterators::base64_from_binary&lt;
    boost::archive::iterators::transform_width&lt; my_istream_iterator, 6, 8&gt;
&gt; bin_to_base64;

// base64 to binary: map each character to its 6-bit value, then regroup into 8-bit bytes
typedef boost::archive::iterators::transform_width&lt;
    boost::archive::iterators::binary_from_base64&lt; my_istream_iterator &gt;, 8, 6
&gt; base64_to_bin;

void test()
{
    {
        // INPUT FILE!!!
        std::ifstream ifs("test.zip", std::ios_base::in|std::ios_base::binary);
        std::ofstream ofs("test.arc", std::ios_base::out|std::ios_base::binary);

        std::copy(
            bin_to_base64( my_istream_iterator(ifs &gt;&gt; std::noskipws) ),
            bin_to_base64( my_istream_iterator() ),
            my_ostream_iterator(ofs)
        );
    }

    {
        std::ifstream ifs("test.arc", std::ios_base::in|std::ios_base::binary);
        std::ofstream ofs("test.rez", std::ios_base::out|std::ios_base::binary);

        std::copy(
            base64_to_bin( my_istream_iterator(ifs &gt;&gt; std::noskipws) ),
            base64_to_bin( my_istream_iterator() ),
            my_ostream_iterator(ofs)
        );
    }
}
</pre><p> Result: 1) If the INPUT FILE is a ZIP file, the result is: </p> <blockquote> <p> a) _DEBUG_ERROR("istreambuf_iterator is not dereferencable"); <em>(this can be disabled or ignored)</em> b) The output file "test.rez" is one byte longer than the INPUT FILE </p> </blockquote> <p> 2) If the INPUT FILE is any other file (binary or text), everything works.
</p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/5629 Trac 1.4.3 nen777w@… Thu, 23 Jun 2011 13:57:47 GMT <link>https://svn.boost.org/trac10/ticket/5629#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:1</guid> <description> <p> In case it helps: the workaround code and an example of how to use it can be found here: <a class="ext-link" href="http://rsdn.ru/forum/cpp.applied/4317966.1.aspx"><span class="icon">​</span>http://rsdn.ru/forum/cpp.applied/4317966.1.aspx</a> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Sat, 20 Oct 2012 00:20:10 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:2</guid> <description> <p> Not so much a bug as a missing feature: there is no function to add or remove "=" padding. See <a class="ext-link" href="http://stackoverflow.com/questions/8033942/boost-base64-url-encode-decode"><span class="icon">​</span>http://stackoverflow.com/questions/8033942/boost-base64-url-encode-decode</a> </p> </description> <category>Ticket</category> </item> <item> <author>Ruslan Teliuk <nen777w@…></author> <pubDate>Sat, 20 Oct 2012 10:38:43 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:3</guid> <description> <p> Here is an updated solution: <a class="ext-link" href="http://rsdn.ru/forum/cpp.applied/4341293.1"><span class="icon">​</span>http://rsdn.ru/forum/cpp.applied/4341293.1</a> </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Dave Abrahams</dc:creator> <pubDate>Wed, 21 Nov 2012 19:46:55 GMT</pubDate> <title>owner changed https://svn.boost.org/trac10/ticket/5629#comment:4 https://svn.boost.org/trac10/ticket/5629#comment:4 <ul> <li><strong>owner</strong> changed from <span 
class="trac-author">Dave Abrahams</span> to <span class="trac-author">jeffrey.hellrung</span> </li> </ul> Ticket Robert Ramey Wed, 21 Nov 2012 19:52:45 GMT owner, status changed https://svn.boost.org/trac10/ticket/5629#comment:5 https://svn.boost.org/trac10/ticket/5629#comment:5 <ul> <li><strong>owner</strong> changed from <span class="trac-author">jeffrey.hellrung</span> to <span class="trac-author">Robert Ramey</span> </li> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">assigned</span> </li> </ul> Ticket Dave Abrahams Wed, 21 Nov 2012 22:44:45 GMT component changed https://svn.boost.org/trac10/ticket/5629#comment:6 https://svn.boost.org/trac10/ticket/5629#comment:6 <ul> <li><strong>component</strong> <span class="trac-field-old">iterator</span> → <span class="trac-field-new">serialization</span> </li> </ul> Ticket iGene Fri, 14 Dec 2012 14:51:07 GMT <link>https://svn.boost.org/trac10/ticket/5629#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:7</guid> <description> <p> The root cause is that decoding a sequence whose length is not a multiple of four overruns the buffer. Here is my workaround. 
</p> <pre class="wiki">
#include &lt;algorithm&gt;
#include &lt;cassert&gt;
#include &lt;cstdlib&gt;
#include &lt;cstring&gt;
#include &lt;iterator&gt;
#include &lt;sstream&gt;
#include &lt;string&gt;

struct to_base64 : public std::stringstream
{
    to_base64(const std::string&amp; str);
    to_base64(const char* begin, const char* end);
};

struct from_base64 : public std::stringstream
{
    from_base64(const std::string&amp; str);
    from_base64(const char* begin, const char* end);
};

#include &lt;boost/archive/iterators/binary_from_base64.hpp&gt;
#include &lt;boost/archive/iterators/base64_from_binary.hpp&gt;
#include &lt;boost/archive/iterators/transform_width.hpp&gt;
#include &lt;boost/archive/iterators/ostream_iterator.hpp&gt;

// slightly generalized version of the example here:
// http://stackoverflow.com/questions/7053538/how-do-i-encode-a-string-to-base64-using-only-boost
template &lt;typename TransformIterator&gt;
static void apply(const char* begin, const char* end, std::stringstream&amp; target)
{
    std::copy(TransformIterator(begin), TransformIterator(end), std::ostreambuf_iterator&lt;char&gt;(target));
}

// Converts the part whose length is a multiple of four first, then converts the
// tail separately after padding it to four characters with 'A' (which decodes to zero bits).
template &lt;typename TransformIterator&gt;
static void applyTwice(const char* begin, const char* end, std::stringstream&amp; target)
{
    long size = end - begin;
    int remainder = size % 4;
    const char* truncated = end - remainder;
    apply&lt;TransformIterator&gt;(begin, truncated, target);
    if (remainder)
    {
        /* it can never be =1 if this whole thing about dividing by four is correct */
        assert(remainder != 1);
        char padded[4] = { 'A', 'A', 'A', 'A' };
        const char* src = truncated;
        char* dest = &amp;padded[0];
        while (src != end)
            *(dest++) = *(src++);
        apply&lt;TransformIterator&gt;(&amp;padded[0], &amp;padded[sizeof(padded)], target);
        // step the write position back over the bytes that came from the 'A' padding
        std::ios::streampos pos = target.tellp();
        pos -= (4 - remainder);
        target.seekp(pos);
    }
}

using namespace boost::archive::iterators;

typedef base64_from_binary&lt;transform_width&lt;const char*, 6, 8&gt; &gt; to;

to_base64::to_base64(const char* begin, const char* end)
{
    apply&lt;to&gt;(begin, end, *this);
}

to_base64::to_base64(const std::string&amp; str)
{
    apply&lt;to&gt;(str.c_str(), str.c_str() + str.length(), *this);
}

typedef transform_width&lt;binary_from_base64&lt;const char*&gt;, 8, 6&gt; from;

from_base64::from_base64(const char* begin, const char* end)
{
    applyTwice&lt;from&gt;(begin, end, *this);
}

from_base64::from_base64(const std::string&amp; str)
{
    applyTwice&lt;from&gt;(str.c_str(), str.c_str() + str.length(), *this);
}

int main()
{
    // RAND_MAX is implementation-defined and may be as large as INT_MAX,
    // so a fixed bound is used for the buffers instead
    const size_t max_length = 100;
    size_t length = 0;
    do
    {
        // generate source bytes
        char source[max_length + 1];
        for (size_t pos = 0; pos &lt; length; ++pos)
            source[pos] = '0' + char(rand() % 32);
        source[length] = '\0';

        // convert them to base64
        to_base64 b(&amp;source[0], &amp;source[length]);
        std::string b64 = b.str();

        // and convert them back
        from_base64 result(b64.c_str(), b64.c_str() + b64.size());

        // compare as binary
        size_t size = (size_t)result.tellp();
        assert(size == length);
        char dest[max_length + 1];
        result.read(&amp;dest[0], size);
        for (size_t pos = 0; pos &lt; length; ++pos)
            assert(source[pos] == dest[pos]);

        // compare as text
        std::string asString = result.str();
        assert(!strcmp(asString.c_str(), &amp;source[0]));
    }
    while (++length &lt; max_length);
    return 0;
}
</pre> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Thu, 07 Feb 2013 06:50:25 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:8 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:8</guid> <description> <p> Already fixed? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>anonymous</dc:creator> <pubDate>Wed, 10 Apr 2013 17:01:00 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:9 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:9</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/5629#comment:8" title="Comment 8">anonymous</a>: </p> <blockquote class="citation"> <p> Already fixed? 
</p> </blockquote> <p> Not entirely. Although a fix was put into boost 1.53 so that '=' characters will not cause the decoder to crash, it still doesn't treat them as padding. The fix will cause the decoder to add nulls to the end of the decoded value, which is probably not what you want. Granted, you should be able to figure out the right thing to do given the number of '=' characters at the end of the encoded stream, but you shouldn't have to. </p> <p> Also, the encoder still won't produce a padded encoding. </p> </description> <category>Ticket</category> </item> <item> <author>prantlf@…</author> <pubDate>Sat, 07 Sep 2013 11:28:04 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:10 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:10</guid> <description> <p> The problem of padding could be solved by making the transform_width stateful. If you want to encode/decode to/from BASE64 and your input comes in chunks or as a stream, you need to save your previous state to be able to continue with the next chunk correctly. </p> <p> The transform_width currently goes "blindly" for the next item pointed to by the iterator, although there may not be enough items to finish the minimum quantum. It even ends up with a buffer overflow if your input sequence is not zero-padded. Also, this eager reading of the zero padding makes it useful only for converting a complete buffer in one pass. </p> <p> If the transform_width "knew" that a minimum quantum of units has to be available before it can produce an output unit, it would prevent the buffer overflow and allow transforming chunked input. </p> <p> I added the end-iterator and state to the transform_width (either directly or to the transformed iterator by an iterator adaptor). Reading ahead and storing the next value makes the code a little longer and not so compact. Also, it runs significantly slower than a hand-coded BASE64 encoder; I'm not sure why. 
Maybe it is the copying around of the iterators? </p> <p> Would it make sense to include a stateful transform_width in boost? </p> </description> <category>Ticket</category> </item> <item> <dc:creator>Robert Ramey</dc:creator> <pubDate>Wed, 19 Feb 2014 21:27:41 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/5629#comment:11 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/5629#comment:11</guid> <description> <p> I did spend a significant amount of time on this while (apparently) not getting it quite right. </p> <p> How about: </p> <p> a) updating the current test so that it fails </p> <p> b) suggesting a patch </p> <p> Robert Ramey </p> </description> <category>Ticket</category> </item> </channel> </rss>
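The two points raised in comments 9 and 10 (working out the payload size from the number of '=' characters, and keeping state between chunks so that padding is only emitted at the very end) can be illustrated in isolation. What follows is only a minimal sketch under stated assumptions: it is not Boost code and not the stateful transform_width described in comment 10. The names chunked_base64_encoder and decoded_length are hypothetical, the code uses only the C++11 standard library, and decoded_length assumes a padded input whose length is a multiple of four.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Boost): a Base64 encoder that keeps the
// leftover input bytes between calls, so input may arrive in chunks.
class chunked_base64_encoder {
public:
    // Feed one chunk. Every complete 3-byte group is encoded immediately;
    // an incomplete group is kept until more input arrives or finish() is called.
    void write(const std::string& chunk, std::string& out) {
        for (unsigned char byte : chunk) {
            pending_[count_++] = byte;
            if (count_ == 3) {
                emit(out, 4);
                count_ = 0;
            }
        }
    }

    // Flush the 1 or 2 leftover bytes and complete the group with '='.
    void finish(std::string& out) {
        if (count_ == 0) return;
        for (std::size_t i = count_; i < 3; ++i) pending_[i] = 0;
        emit(out, count_ + 1);        // 1 byte gives 2 characters, 2 bytes give 3
        out.append(3 - count_, '=');  // pad the group to 4 characters
        count_ = 0;
    }

private:
    // Write the first `chars` characters of the current 24-bit group.
    void emit(std::string& out, std::size_t chars) {
        assert(chars >= 2 && chars <= 4);
        static const char alphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        const unsigned long group =
            (static_cast<unsigned long>(pending_[0]) << 16) |
            (static_cast<unsigned long>(pending_[1]) << 8) |
            static_cast<unsigned long>(pending_[2]);
        for (std::size_t i = 0; i < chars; ++i)
            out.push_back(alphabet[(group >> (18 - 6 * i)) & 0x3F]);
    }

    unsigned char pending_[3] = {0, 0, 0};
    std::size_t count_ = 0;
};

// Number of payload bytes in a padded Base64 string: each 4-character group
// carries 3 bytes, minus one byte per trailing '=' (at most two).
// Assumption: input that is empty or not a multiple of four long yields 0.
inline std::size_t decoded_length(const std::string& encoded) {
    if (encoded.empty() || encoded.size() % 4 != 0) return 0;
    std::size_t padding = 0;
    while (padding < 2 && encoded[encoded.size() - 1 - padding] == '=')
        ++padding;
    return encoded.size() / 4 * 3 - padding;
}
```

Feeding "M" and then "a" as two separate chunks and calling finish() gives the same output as encoding "Ma" in one call, because the leftover byte is carried over rather than zero-filled at the chunk boundary. A decoder that writes three bytes per four-character group could use decoded_length to drop the trailing nulls mentioned in comment 9.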