Opened 6 years ago
Last modified 5 years ago
#12471 new Bugs
gzip compressor decompressor broken
Reported by: | anonymous | Owned by: | Jonathan Turkanis |
---|---|---|---|
Milestone: | To Be Determined | Component: | iostreams |
Version: | Boost 1.61.0 | Severity: | Problem |
Keywords: | Cc: |
Description
Hi guys, when I use iostreams with the gzip tools, the final decompression fails to reproduce the original data. My data is all small integers separated by commas, and the recovered data interchanges the order of commas and numbers, as if it came from a badly synchronized multithreaded program. The result is worse than useless. This is a big-file phenomenon (files up to GB size).
Looking at the compressor code (gzip.hpp), I see many potential problems between 32-bit and 64-bit builds. Currently I am looking at one assignment that seems problematic, in the following function:
template<typename Sink>
std::streamsize write(Sink& snk, const char_type* s, std::streamsize n)
{
    if (!(flags_ & f_header_done)) {
        std::streamsize amt =
            static_cast<std::streamsize>(header_.size() - offset_);
        offset_ += boost::iostreams::write(snk, header_.data() + offset_, amt);
        if (offset_ == header_.size())
            flags_ |= f_header_done;
        else
            return 0;
    }
    return base_type::write(snk, s, n);
}
offset_ is a size_t, while boost::iostreams::write() returns a streamsize, a signed 64-bit type (long long). So in 32-bit code there is a problem. streamsize occurs in many places, as does size_t, but only one of them seems to be consistently 64-bit across platforms; offset_ seems to be 32-bit in 32-bit code. I do not think this is the cause of the issues with gzip, but short of rewriting it I don't know where to start.
I would like to use and have confidence in this compression code, but so far it only gives grief...
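To make the type concern concrete, here is a minimal sketch, not taken from gzip.hpp, of the signed-to-unsigned addition behind `offset_ += boost::iostreams::write(...)`. The helper name `advance_offset` is hypothetical and exists only for illustration. The widths of the two types are platform-dependent, so the sketch checks only their signedness.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <type_traits>

// The two types involved differ in signedness on every conforming platform.
static_assert(std::is_unsigned<std::size_t>::value, "std::size_t is unsigned");
static_assert(std::is_signed<std::streamsize>::value, "std::streamsize is signed");

// Hypothetical helper mimicking `offset_ += write(...)`:
// a signed byte count is added to an unsigned offset.
std::size_t advance_offset(std::size_t offset, std::streamsize written) {
    // The conversion is well behaved only for non-negative counts
    // that fit in std::size_t; the assert documents that assumption.
    assert(written >= 0);
    return offset + static_cast<std::size_t>(written);
}
```

For example, `advance_offset(6, 4)` yields 10. Printing `sizeof(std::size_t)` and `sizeof(std::streamsize)` on the affected 32-bit build would show whether the two widths actually differ there.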
Change History (2)
comment:1 by , 6 years ago
comment:2 by , 5 years ago
While I don't like the code much, I did review most, if not all, of those cases. boost::iostreams::write cannot return a value larger than amt; amt is header_.size() - offset_, and header_.size() is a size_t, so there is no issue in this case. Well, not correctness-wise. The code is annoying and confusing, and it potentially triggers compiler warnings. So yes, a specific test case for reproducing any issue would be useful.
Thank you for reporting this issue. Would you please post a code example that produces an incorrect outcome?
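The bound argued in comment:2 can be sketched with a small stand-alone model of the header branch. This is an illustration under stated assumptions, not the Boost implementation: `ChunkSink`, `write_header_step` and `write_whole_header` are hypothetical names, and the sink is assumed to accept at least one byte per call and never to report more bytes than it was offered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <string>

// Hypothetical sink that accepts at most `chunk` bytes per call.
struct ChunkSink {
    std::string data;
    std::streamsize chunk;
    std::streamsize write(const char* s, std::streamsize n) {
        const std::streamsize k = std::min(n, chunk);
        data.append(s, static_cast<std::size_t>(k));
        return k;  // never larger than n
    }
};

// Models the header branch of the quoted write():
// returns true once the whole header has been written.
bool write_header_step(ChunkSink& snk, const std::string& header,
                       std::size_t& offset) {
    const std::streamsize amt =
        static_cast<std::streamsize>(header.size() - offset);
    const std::streamsize written = snk.write(header.data() + offset, amt);
    // The bound the comment relies on: 0 <= written <= amt.
    assert(written >= 0 && written <= amt);
    offset += static_cast<std::size_t>(written);
    return offset == header.size();
}

// Repeats the step until the header is complete; returns the call count.
int write_whole_header(ChunkSink& snk, const std::string& header) {
    assert(snk.chunk > 0);  // assumption: the sink always makes progress
    std::size_t offset = 0;
    int calls = 1;
    while (!write_header_step(snk, header, offset)) ++calls;
    return calls;
}
```

With a 10-byte header and a sink that takes 4 bytes per call, `write_whole_header` needs 3 calls and the sink ends up holding the header unchanged. Because `written` never exceeds `header.size() - offset`, the offset cannot overshoot the header size in this model.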