Opened 10 years ago

Last modified 10 years ago

#8150 new Bugs

gzip_compressor produces corrupt data when used with filtering_ostream

Reported by: matt@… Owned by: Jonathan Turkanis
Milestone: To Be Determined Component: iostreams
Version: Boost 1.53.0 Severity: Regression
Keywords: Cc:

Description

In some cases, the USE_CORRUPTING_OSTREAM path in the following program will produce corrupt gzip output (fails CRC and the de-gzipped file differs slightly from the original input). This does not occur for all data, so I will attach an input data file that triggers the error.

This bug only affects gzip_compressor when it is used with filtering_ostream. The filtering_istream path is not affected.

Bug reproduced on Mac 10.8.2, 64-bit, Apple LLVM version 4.2 (clang-425.0.24) (based on LLVM 3.2svn), zlib 1.2.5 (dylib as shipped with Mac OS X), boost 1.53.0.

Bug does not appear to be affected by compiler optimizations (-O0 and -Os tested).

#include <iostream>
#include <fstream>
#include <vector>
#include <boost/iostreams/filter/gzip.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/iostreams/device/file_descriptor.hpp>

int main(int argc, char *argv[])
{
	std::vector<char> input;
	std::vector<char> output;
	
	// Read the whole attached test file into memory.
	boost::iostreams::file_descriptor_source source("inputdata");
	boost::iostreams::copy(source, boost::iostreams::back_inserter(input));
	
// Set to 1 for the failing ostream path, 0 for the working istream path.
#define USE_CORRUPTING_OSTREAM 1
#if USE_CORRUPTING_OSTREAM
	// Push model: raw bytes are written in, compressed bytes land in output.
	boost::iostreams::filtering_ostream gzip_stream;
	gzip_stream.push(boost::iostreams::gzip_compressor());
	gzip_stream.push(boost::iostreams::back_inserter(output));
	gzip_stream.write(&input[0], input.size());
	boost::iostreams::close(gzip_stream);
#else
	// Pull model: compressed bytes are read out of a stream wrapping the input.
	boost::iostreams::filtering_istream gzip_stream;
	boost::iostreams::stream<boost::iostreams::array_source> input_array(&input[0], input.size());
	gzip_stream.push(boost::iostreams::gzip_compressor());
	gzip_stream.push(input_array);
	boost::iostreams::copy(gzip_stream, boost::iostreams::back_inserter(output));
#endif
	
	// Write the compressed result to disk for inspection with gunzip.
	boost::iostreams::file_descriptor_sink destination("inputdata.gz");
	destination.write(&output[0], output.size());
}

Attachments (1)

inputdata (78.5 KB) - added by matt@… 10 years ago.
inputdata to be used with the program in the ticket to trigger the bug


Change History (3)

by matt@…, 10 years ago

Attachment: inputdata added

inputdata to be used with the program in the ticket to trigger the bug

comment:1 by matt@…, 10 years ago

To be specific, when this inputdata file is run through the two different code paths, the only difference is that the USE_CORRUPTING_OSTREAM version is one byte shorter -- it is missing the 0xFF at byte 0x4000. Otherwise, the gzipped output is identical.
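
For anyone repeating this comparison, a minimal sketch of locating the first differing byte between the two outputs follows. The helper name first_difference is hypothetical and not part of the ticket or of Boost. It assumes both outputs have already been loaded into std::vector<char> buffers.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, written for illustration only.
// Returns the index of the first byte at which the two buffers differ.
// If one buffer is a prefix of the other (or they are equal), returns
// the length of the shorter buffer.
inline std::size_t first_difference(const std::vector<char>& a,
                                    const std::vector<char>& b)
{
	const std::ptrdiff_t n =
		static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
	// Compare only the common prefix so neither range is overrun.
	const auto mm = std::mismatch(a.begin(), a.begin() + n, b.begin());
	return static_cast<std::size_t>(mm.first - a.begin());
}
```

Calling first_difference on the output of each code path gives the offset where the two streams diverge, which can then be inspected in a hex dump.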

comment:2 by matt@…, 10 years ago

Further investigation has revealed that this is a bug in libcxx, not boost. The fix is here:

https://llvm.org/viewvc/llvm-project/libcxx/trunk/include/streambuf?annotate=165884

Basically, libcxx incorrectly passed the 0xFF byte in the data through to indirect_streambuf::overflow, where it was interpreted as EOF and rejected.
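
A minimal sketch of how a 0xFF byte can end up looking like EOF. This assumes the problem was a char reaching overflow without going through traits_type::to_int_type; the ticket does not spell out the exact mechanism, so treat that as an assumption. The helper names are hypothetical and exist only for this illustration.

```cpp
#include <cstdio>   // EOF
#include <string>   // std::char_traits

// Hypothetical helpers, written for illustration only.

// Mimics handing a raw signed char to a function that takes int_type:
// the value 0xFF is sign-extended to -1.
inline int raw_char_as_int(signed char c)
{
	return c;
}

// The conversion the standard traits provide: the char is treated as
// an unsigned value, so 0xFF becomes 255.
inline int converted_char_as_int(char c)
{
	return std::char_traits<char>::to_int_type(c);
}

// True if the int_type value compares equal to the traits' EOF marker.
inline bool looks_like_eof(int v)
{
	return std::char_traits<char>::eq_int_type(
		v, std::char_traits<char>::eof());
}
```

With these helpers, a signed char holding 0xFF is indistinguishable from EOF after raw conversion, while the same byte passed through to_int_type is not. That would explain why only the 0xFF byte was dropped from the stream.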

Please close this bug.
