Reputation: 21
I need to handle a large amount of data in memory (without using files/fstream), and I know that the Visual Studio implementation of streambuf doesn't allow for that because it uses a 32-bit counter (https://github.com/microsoft/STL/issues/388). I thought Boost might help, but apparently it doesn't handle this properly either (or maybe I'm missing something).
#include <cstdint>
#include <vector>
#include <iostream>
#include <boost/iostreams/stream.hpp>
namespace bs = boost::iostreams;
int main()
{
    uint64_t mb1 = 1024 * 1024;
    uint64_t gb1 = 1024 * mb1;
    uint64_t mbToCopy = 2048;
    std::vector<char> iBuffer(mb1);      // 1 MB source chunk
    std::vector<char> oBuffer(4 * gb1);  // 4 GB destination buffer
    bs::stream<bs::array_sink> oStr(oBuffer.data(), oBuffer.size());
    // Write mbToCopy chunks of 1 MB each into the sink.
    for (uint64_t i = 0; i < mbToCopy; i++) {
        oStr.write(iBuffer.data(), iBuffer.size());
    }
    std::cout << oStr.tellp() << std::endl; // (1)
    oStr.seekp(0, std::ios_base::beg);
    std::cout << oStr.tellp() << std::endl; // (2)
}
This code works fine as long as mbToCopy is not bigger than 2048 and the output is:
2147483648
0
When I change mbToCopy to 2049 the output is:
2148532224
4294967296
As you can see, when I try to move back to the beginning of the stream (this is just example usage; I need to be able to reposition to any place in the stream), it places me way beyond the current size of the stream and the stream becomes unreliable. What's more, when I keep mbToCopy set to 2049 and reduce the size of oBuffer to 3 GB, oStr.seekp starts crashing.
Any idea if Boost provides other solutions that could help in my case?
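For reference, the numbers above can be put next to the 32-bit limits with a small sketch. It only illustrates the arithmetic and does not attempt to pinpoint where in the library a narrowing happens; the helper names (mib, fits_in_int32, low_32_bits, narrow_to_int32) are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Size of n mebibytes in bytes, computed in 64 bits so it cannot overflow here.
constexpr std::uint64_t mib(std::uint64_t n) { return n << 20; }

// True if the byte count still fits in a signed 32-bit counter.
constexpr bool fits_in_int32(std::uint64_t bytes) {
    return bytes <= static_cast<std::uint64_t>(
                        std::numeric_limits<std::int32_t>::max());
}

// What remains of the byte count if only the low 32 bits are kept.
constexpr std::uint32_t low_32_bits(std::uint64_t bytes) {
    return static_cast<std::uint32_t>(bytes);
}

// Narrow to a signed 32-bit value, refusing values that do not fit.
inline std::int32_t narrow_to_int32(std::uint64_t bytes) {
    assert(fits_in_int32(bytes) && "byte count does not fit in 32 bits");
    return static_cast<std::int32_t>(bytes);
}

// 2049 MB is the value printed at (1) and is already past INT32_MAX.
static_assert(mib(2049) == 2148532224ull, "2049 MB in bytes");
static_assert(!fits_in_int32(mib(2049)), "2049 MB exceeds a signed 32-bit counter");
// The value printed at (2) is exactly 2^32, i.e. the size of the 4 GB buffer,
// and its low 32 bits are all zero.
static_assert(mib(4096) == 4294967296ull, "4 GB in bytes");
static_assert(low_32_bits(mib(4096)) == 0u, "low 32 bits of 4 GB");
```

So any position or length that passes through a 32-bit signed integer somewhere along the way cannot represent offsets in this range.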
Upvotes: 1
Views: 82
Reputation: 392833
I would suggest not using streams here at all. They seem to introduce unnecessary overhead:
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iostream>
#include <iterator>
#include <vector>
// Size literals: 1_kb == 1024, 1_mb == 1024 * 1024, and so on.
static inline auto operator""_kb(unsigned long long v) { return v << 10ull; }
static inline auto operator""_mb(unsigned long long v) { return v << 20ull; }
static inline auto operator""_gb(unsigned long long v) { return v << 30ull; }
int main()
{
    std::vector<char> iBuffer(1_mb);
    std::vector<char> oBuffer(12_gb);
    auto pos = oBuffer.begin();
    // Copy 8192 chunks of 1 MB (8 GB in total) into the output buffer.
    for (size_t i = 0; i < 8192; i++) {
        assert(std::next(pos, iBuffer.size()) <= oBuffer.end());
        pos = std::copy_n(iBuffer.begin(), iBuffer.size(), pos);
    }
    // Stream-like helpers expressed in terms of the iterator position.
    auto tellp = [&] { return std::distance(oBuffer.begin(), pos); };
    auto seekp = [&](size_t from_beg) { pos = std::next(oBuffer.begin(), from_beg); };
    std::cout << tellp() << std::endl; // (1)
    seekp(0);
    std::cout << tellp() << std::endl; // (2)
}
On my system this prints, without any problem:
8589934592
0
Of course, I introduced the tellp()/seekp() helpers only to make the code as similar as possible. You could also just write:
auto const beg = oBuffer.begin();
auto pos = beg;
for (size_t i = 0; i < 8192; i++) {
    assert(std::next(pos, iBuffer.size()) <= oBuffer.end());
    pos = std::copy_n(iBuffer.begin(), iBuffer.size(), pos);
}
std::cout << (pos-beg) << std::endl; // (1)
pos = beg;
std::cout << (pos-beg) << std::endl; // (2)
With exactly the same output.
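If you want to keep a stream-like interface around this idea, the position bookkeeping can be wrapped in a small class. The following is only a sketch: BufferCursor is a hypothetical helper (not part of Boost or the standard library), and it assumes a 64-bit build, where std::size_t is wide enough for offsets beyond 4 GB.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a write cursor over a caller-owned buffer.
// The position is a std::size_t, so it is not limited to 32 bits on 64-bit builds.
class BufferCursor {
public:
    explicit BufferCursor(std::vector<char>& buffer) : buffer_(buffer), pos_(0) {}

    // Copy n bytes from src to the current position and advance.
    // Returns false (and writes nothing) if the data would not fit.
    bool write(const char* src, std::size_t n) {
        assert(pos_ <= buffer_.size()); // holds unless the caller shrank the buffer
        if (n > buffer_.size() - pos_) return false;
        std::copy_n(src, n, buffer_.data() + pos_);
        pos_ += n;
        return true;
    }

    // Current offset from the beginning of the buffer.
    std::size_t tellp() const { return pos_; }

    // Move to an absolute offset; returns false if it lies past the end.
    bool seekp(std::size_t from_beg) {
        if (from_beg > buffer_.size()) return false;
        pos_ = from_beg;
        return true;
    }

private:
    std::vector<char>& buffer_; // not owned; must outlive the cursor
    std::size_t pos_;
};
```

Usage would look like `BufferCursor cur(oBuffer); cur.write(iBuffer.data(), iBuffer.size());`, with `cur.tellp()` and `cur.seekp(0)` taking the place of the stream calls. Returning bool instead of asserting is a design choice made for this sketch, so that an out-of-range write or seek is reported rather than silently corrupting the position.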
Upvotes: 0