MarayJay

Reputation: 95

C++: write vectors to an HDF5 file without storing them in an array

I'm trying to store 4 vectors in an .h5 file. Their size varies, but all 4 always have the same size. So far, I have done it like this:

void write_file_h5(std::string filename,
            std::vector<int16_t> &x,
            std::vector<int16_t> &y,
            std::vector<int8_t> &p,
            std::vector<int32_t> &ts)
{
  struct myEvent myEvents[x.size()];
  for (int i=0; i<x.size();i++)
  {
    myEvents[i].x = x[i];
    myEvents[i].y = y[i];
    myEvents[i].p = int(p[i]);
    myEvents[i].ts = ts[i];
  }
  H5::CompType mtype(sizeof(myEvent));
  // Define the datatype to pass HDF5
  mtype.insertMember("x", HOFFSET(myEvent, x), H5::PredType::NATIVE_INT16);
  mtype.insertMember("y", HOFFSET(myEvent, y), H5::PredType::NATIVE_INT16);
  mtype.insertMember("p", HOFFSET(myEvent, p), H5::PredType::NATIVE_INT8);
  mtype.insertMember("ts", HOFFSET(myEvent, ts), H5::PredType::NATIVE_INT32);


  // preparation of a dataset and a file.
  hsize_t dim[1];
  dim[0] = sizeof(myEvents) / sizeof(myEvent);
  int rank = sizeof(dim) / sizeof(hsize_t);
  H5::DataSpace space(rank, dim);
  H5::H5File *file = new H5::H5File(filename, H5F_ACC_TRUNC);
  H5::DataSet *dataset = new H5::DataSet(file->createDataSet("myDataset", mtype, space));
  dataset->write(myEvents, mtype);
  delete dataset;
  delete file;
}

The problem with this is: As the vectors get bigger (vector size > 732428), I get a segmentation fault:

Segmentation fault (core dumped)

I assume that's because the array I'm trying to create is just too big. Does anyone have an idea how to solve this, i.e. how to write the vectors while using a minimum amount of memory? I need the vectors because I have to fill them dynamically while the program runs, so storing the values directly in an array doesn't seem to be possible.

Thanks in advance

Upvotes: 0

Views: 1169

Answers (1)

Ted Lyngmo

Reputation: 117298

This line

struct myEvent myEvents[x.size()];

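creates a variable-length array (VLA). VLAs are not part of standard C++; compilers that accept them as an extension will likely place the array on the stack, which has a limited size. A large enough array overflows the stack, which would explain the segmentation fault.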
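
To get a feel for the sizes involved, here is a rough sketch of the arithmetic. The definition of myEvent is not shown in the question, so the MyEvent struct below is an assumption based on the member types passed to insertMember, and array_bytes and exceeds_limit are hypothetical helper names. The actual stack limit depends on the platform and its configuration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout; the question does not show the real definition of myEvent.
struct MyEvent {
    std::int16_t x;
    std::int16_t y;
    std::int8_t  p;
    std::int32_t ts;
};

// Number of bytes an array of `count` events occupies (padding included,
// because sizeof(MyEvent) already accounts for it).
constexpr std::size_t array_bytes(std::size_t count) {
    return count * sizeof(MyEvent);
}

// True if an array of `count` events would not fit into `limit_bytes`.
// `limit_bytes` is whatever stack limit applies on the target system.
constexpr bool exceeds_limit(std::size_t count, std::size_t limit_bytes) {
    return array_bytes(count) > limit_bytes;
}
```

Calling array_bytes(732428) gives the footprint for the size mentioned in the question. With the assumed layout this is several megabytes, which is in the range of common default stack limits (8 MiB is a frequent default on Linux, where `ulimit -s` reports the current value in KiB).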

Replace the VLA with a std::vector<myEvent>, which allocates its storage on the heap instead:

void write_file_h5(std::string filename,
            std::vector<int16_t> &x,
            std::vector<int16_t> &y,
            std::vector<int8_t> &p,
            std::vector<int32_t> &ts)
{
  // Suggestion: check that y, p and ts have at least as many elements as x
  std::vector<myEvent> myEvents(x.size());  // replacement for the VLA

  for (std::size_t i = 0; i < x.size(); ++i) {  // size_t avoids a signed/unsigned comparison
    myEvents[i].x = x[i];
    myEvents[i].y = y[i];
    myEvents[i].p = int(p[i]);
    myEvents[i].ts = ts[i];
  }

  H5::CompType mtype(sizeof(myEvent));
  // Define the datatype to pass HDF5
  mtype.insertMember("x", HOFFSET(myEvent, x), H5::PredType::NATIVE_INT16);
  mtype.insertMember("y", HOFFSET(myEvent, y), H5::PredType::NATIVE_INT16);
  mtype.insertMember("p", HOFFSET(myEvent, p), H5::PredType::NATIVE_INT8);
  mtype.insertMember("ts", HOFFSET(myEvent, ts), H5::PredType::NATIVE_INT32);

  // preparation of a dataset and a file.
  hsize_t dim[1];
  dim[0] = myEvents.size();                   // using vector::size()
  int rank = sizeof(dim) / sizeof(hsize_t);
  H5::DataSpace space(rank, dim);

  H5::H5File *file = new H5::H5File(filename, H5F_ACC_TRUNC);
  H5::DataSet *dataset = new H5::DataSet(file->createDataSet("myDataset", mtype, space));
  dataset->write(myEvents.data(), mtype);     // use vector::data()
  delete dataset;
  delete file;
}
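
The size check suggested in the comment at the top of the function could look something like the sketch below. It is only one possible approach: pack_events is a hypothetical helper, the MyEvent definition is an assumption (the question does not show myEvent), and throwing std::invalid_argument is one choice among several for reporting the mismatch.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Assumed layout; the real myEvent is not shown in the question.
struct MyEvent {
    std::int16_t x;
    std::int16_t y;
    std::int8_t  p;
    std::int32_t ts;
};

// Hypothetical helper: validates that all four columns have the same length,
// then packs them into one heap-allocated vector of records.
std::vector<MyEvent> pack_events(const std::vector<std::int16_t>& x,
                                 const std::vector<std::int16_t>& y,
                                 const std::vector<std::int8_t>& p,
                                 const std::vector<std::int32_t>& ts)
{
    if (y.size() != x.size() || p.size() != x.size() || ts.size() != x.size()) {
        throw std::invalid_argument("x, y, p and ts must have the same size");
    }

    std::vector<MyEvent> events(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        events[i].x  = x[i];
        events[i].y  = y[i];
        events[i].p  = p[i];
        events[i].ts = ts[i];
    }
    return events;
}
```

The returned vector can then be handed to dataset->write(events.data(), mtype) in the same way as myEvents in the function above.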

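I also suggest dropping new and delete. With automatic objects, the file and dataset are released when they go out of scope, even if write throws, and the last part of the function simplifies to: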

  H5::H5File file(filename, H5F_ACC_TRUNC);
  H5::DataSet dataset(file.createDataSet("myDataset", mtype, space));
  dataset.write(myEvents.data(), mtype);

Upvotes: 1
