Reputation: 7824
I am trying to communicate a std::vector<MyClass> of varying size via MPI. MyClass contains members that are vectors that may be uninitialized or vary in size. To do that, I wrote serialize() and deserialize() functions that write and read such a std::vector<MyClass> to and from a std::string, which I then communicate via MPI.
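#include <sstream>
#include <string>
#include <vector>
class MyClass {
    ...
public: // accessed directly by serialize() and deserialize()
    int some_int_member;
    std::vector<float> some_vector_member;
};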
// Rebuilds a std::vector<MyClass> from the byte layout produced by serialize().
std::vector<MyClass> deserialize(const std::string &in) {
    std::istringstream iss(in);
    size_t total_size;
    iss.read(reinterpret_cast<char *>(&total_size), sizeof(total_size));
    std::vector<MyClass> out_vec;
    out_vec.resize(total_size);
    for (MyClass &d : out_vec) {
        size_t v_size;
        iss.read(reinterpret_cast<char *>(&d.some_int_member), sizeof(d.some_int_member));
        iss.read(reinterpret_cast<char *>(&v_size), sizeof(v_size));
        d.some_vector_member.resize(v_size);
        // data() is valid even for an empty vector, unlike &v[0]
        iss.read(reinterpret_cast<char *>(d.some_vector_member.data()), v_size * sizeof(float));
    }
    return out_vec;
}

// Layout: element count, then per element: int member, vector length, raw floats.
std::string serialize(const std::vector<MyClass> &data) {
    std::ostringstream os;
    size_t total_size = data.size();
    os.write(reinterpret_cast<const char *>(&total_size), sizeof(total_size));
    for (const MyClass &d : data) {
        size_t v_size = d.some_vector_member.size();
        os.write(reinterpret_cast<const char *>(&d.some_int_member), sizeof(d.some_int_member));
        os.write(reinterpret_cast<const char *>(&v_size), sizeof(v_size));
        os.write(reinterpret_cast<const char *>(d.some_vector_member.data()), v_size * sizeof(float));
    }
    return os.str();
}
My implementation works in principle, but sometimes (not always!) MPI processes crash at positions that I think are related to the serialization. The payload sent can be as big as hundreds of MB. I suspect that using std::string as a container is not a good choice. Are there limitations to using std::string as a container for char[] with huge binary data that I may be running into here?
(Note that I don't want to use boost::mpi along with its serialization routines, nor do I want to pull a huge library such as cereal into my project.)
Upvotes: 0
Views: 1182
Reputation: 22650
Generally, using std::string for binary data is fine, although some people might prefer std::vector<char> or, in C++17, std::vector<std::byte> (note that since C++11, strings are guaranteed to store their data contiguously). There are two significant efficiency issues in your code:
1. You hold multiple copies of all the data: the original data, the string, and the intermediate [io]stringstream.
2. You cannot pre-allocate the buffer of an ostringstream, which may lead to over-allocation and frequent reallocation.
Hence, you waste a significant amount of memory, which might contribute to bad_alloc. That said, it may be perfectly fine and you just have a memory leak somewhere. It's impossible to tell whether this is a practical issue for you without knowing the cause of the bad_alloc and without a performance analysis of your application.
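To illustrate how both issues could be avoided, here is a minimal sketch that computes the exact size first and then copies straight into a single std::vector<char>, with no stream in between. The names Item, pack and unpack are hypothetical stand-ins (Item mirrors the two members of MyClass shown in the question), and the sketch assumes all ranks share the same size_t width and byte order. It is one possible approach under those assumptions, not a drop-in replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for MyClass from the question.
struct Item {
    int some_int_member = 0;
    std::vector<float> some_vector_member;
};

// Serialize into one buffer that is allocated exactly once.
std::vector<char> pack(const std::vector<Item> &data) {
    // Pass 1: compute the exact number of bytes needed.
    std::size_t bytes = sizeof(std::size_t);
    for (const Item &d : data) {
        bytes += sizeof(d.some_int_member) + sizeof(std::size_t)
               + d.some_vector_member.size() * sizeof(float);
    }
    std::vector<char> buf(bytes);
    std::size_t pos = 0;
    // Copy n raw bytes to the current position and advance.
    auto put = [&](const void *src, std::size_t n) {
        if (n != 0) std::memcpy(buf.data() + pos, src, n);
        pos += n;
    };
    // Pass 2: write the element count, then each element.
    const std::size_t count = data.size();
    put(&count, sizeof(count));
    for (const Item &d : data) {
        const std::size_t v_size = d.some_vector_member.size();
        put(&d.some_int_member, sizeof(d.some_int_member));
        put(&v_size, sizeof(v_size));
        put(d.some_vector_member.data(), v_size * sizeof(float));
    }
    assert(pos == buf.size());  // the size computed in pass 1 must match
    return buf;
}

// Deserialize, rejecting buffers that are shorter than their own headers claim.
std::vector<Item> unpack(const std::vector<char> &buf) {
    std::size_t pos = 0;
    // Copy n raw bytes out of the buffer, throwing if they are not there.
    auto get = [&](void *dst, std::size_t n) {
        if (n > buf.size() - pos) throw std::runtime_error("truncated buffer");
        if (n != 0) std::memcpy(dst, buf.data() + pos, n);
        pos += n;
    };
    std::size_t count = 0;
    get(&count, sizeof(count));
    // Every element occupies at least its two fixed-size header fields.
    const std::size_t min_item = sizeof(int) + sizeof(std::size_t);
    if (count > (buf.size() - pos) / min_item)
        throw std::runtime_error("element count exceeds buffer size");
    std::vector<Item> out(count);
    for (Item &d : out) {
        std::size_t v_size = 0;
        get(&d.some_int_member, sizeof(d.some_int_member));
        get(&v_size, sizeof(v_size));
        // Check before resizing so a corrupt length cannot trigger a huge allocation.
        if (v_size > (buf.size() - pos) / sizeof(float))
            throw std::runtime_error("vector length exceeds buffer size");
        d.some_vector_member.resize(v_size);
        get(d.some_vector_member.data(), v_size * sizeof(float));
    }
    return out;
}
```

With this layout, the sender would pass buf.data() and buf.size() to MPI_Send with MPI_CHAR, and the receiver would learn the size first (for example via MPI_Probe and MPI_Get_count) before allocating. Since the count argument of MPI_Send is an int, buffers larger than INT_MAX bytes would need to be split into chunks. The length checks in unpack turn a corrupt or truncated message into an exception rather than an oversized allocation.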
Upvotes: 1