Reputation: 21
I'm trying to compress some data using Boost gzip compression via filtering_streambuf. The compressed output is then written to disk. The problem is that the data is over 10GB in size, and I believe the stringstream is running out of space. Assuming I can break this data up into pieces, what's the right way to use stringstream and filtering_streambuf to compress all of it?
I've tried breaking the data into pieces, with the max chunk size set to std::string::max_size()/2, and pushing several stringstream objects onto the filtering_streambuf object, but that doesn't seem to be how filtering_streambuf works :) I've also tried copying each chunk of data by calling bio::copy() repeatedly. I've attached sample code that isn't my exact code (I don't have access to it atm), but the idea is the same except that compressed is a file stream. It's possible that something I mentioned actually works and I just have a bug in my code; if that's the case, I'll find the bug. I just need to know what's considered the correct approach for compressing a large chunk of data.
EDIT: Added the actual code I've written. For some reason, this doesn't compile because write is not a valid function? I also can't declare filtering_ostream. Maybe this version of Boost is old? The variables being written are chars.
boost::iostreams::filtering_streambuf<boost::iostreams::output> out;
out.push(boost::iostreams::gzip_compressor());
out.push(boost::iostreams::file_sink(fileName.c_str()));
out.write(&sizeof_sizet, 1);
out.write(&sizeof_int, 1);
out.write(&sizeof_double, 1);
out.write(&sizeof_Int, 1);
EDIT 2: This might be what I'm trying to achieve. Compiles but didn't test yet.
boost::iostreams::filtering_ostreambuf buf;
buf.push(boost::iostreams::gzip_compressor());
buf.push(boost::iostreams::file_sink(fileName.c_str()));
std::ostream out(&buf);
out.write(&sizeof_sizet, 1);
out.write(&sizeof_int, 1);
out.write(&sizeof_double, 1);
out.write(&sizeof_Int, 1);
Upvotes: 2
Views: 2435
Reputation: 19041
Use a filtering_stream instead of filtering_streambuf, and write directly to a file to avoid having to buffer the entire compressed result in memory until completion.
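#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <string>
int main()
{
    // Chain: data written to `out` is gzip-compressed, then written to the file.
    boost::iostreams::filtering_ostream out;
    out.push(boost::iostreams::gzip_compressor());
    out.push(boost::iostreams::file_sink("test.gz"));
    std::string test_string("FOO BAR BAZ....\n");
    // Write only the characters; size() + 1 would also emit the trailing '\0'.
    out.write(test_string.c_str(), test_string.size());
    // The stream is flushed and closed when `out` goes out of scope.
}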
I can run it, and then try to decompress the file it created:
>ls test.gz
ls: test.gz: No such file or directory
>test.exe
>ls test.gz
test.gz
>gzip -cd test.gz
FOO BAR BAZ....
Upvotes: 2