tflutre
tflutre

Reputation: 3546

Read gzipped files by using a STL vector of pointers to igzstreams

As input, I have a list of gzipped files. As shown here, I use gzstream to handle them. For practical reasons, I want to open each file and record each stream into a vector. It seems pretty straightforward but I don't manage to make it work. Here is the minimal code:

#include <cstdlib>

#include <iostream>
#include <vector>
using namespace std;

#include <gzstream.h>

int main (int argc, char ** argv)
{
  size_t i;
  vector<string> vInFiles;
  vector<igzstream *> vStreams;
  string line;

  // create the dummy input files
  system ("rm -f infile*.gz; for i in {1..2}; do echo \"toto\"${i} | gzip > infile${i}.gz; done");
  vInFiles.push_back ("infile1.gz");
  vInFiles.push_back ("infile2.gz");

  // open each input file
  for (i = 0; i < vInFiles.size(); ++i)
  {
    igzstream inStream;
    inStream.open (vInFiles[i].c_str());
    if (! inStream.good())
    {
      cerr << "ERROR: can't open file " << vInFiles[i] << endl;
      exit (1);
    }
    vStreams.push_back (&inStream);
  }

  // manipulate each input file
  for (i = 0; i < vInFiles.size(); ++i)
  {
    cout << "read first line of file " << vInFiles[i] << endl;
    getline (*(vStreams[i]), line);
    if (line.empty())
    {
      cerr << "empty line" << endl;
      exit (1);
    }
    cout << line << endl;
  }

  // close each input file
  for (i = 0; i < vInFiles.size(); ++i)
  {
    vStreams[i]->close();
  }
  vStreams.clear();

  return 0;
}

This code compiles properly:

$ gcc -Wall test.cpp -lstdc++ -lgzstream -lz

And although it run smoothly, it doesn't read the files properly:

$ ./a.out
read first line of file infile1.gz
empty line                

Upvotes: 0

Views: 666

Answers (3)

tflutre
tflutre

Reputation: 3546

As mentioned in the comments, I would prefer not to use Boost and I only have gcc 4.1.2. Thus, here is the solution using the free store, thanks to the suggestion of Cat Plus Plus:

  // open each input file
  for (i = 0; i < vInFiles.size(); ++i)
  {
    igzstream * pt_inStream = new igzstream;
    pt_inStream->open (vInFiles[i].c_str());
    if (! pt_inStream->good())
    {
      cerr << "ERROR: can't open file " << vInFiles[i] << endl;
      exit (1);
    }
    vStreams.push_back (pt_inStream);
  }

And:

  // close each input file                                                                                                                                                                                                                                           
  for (i = 0; i < vInFiles.size(); ++i)
  {
    vStreams[i]->close();
    delete vStreams[i];
  }

Upvotes: 0

Cat Plus Plus
Cat Plus Plus

Reputation: 129774

Your stream pointers are invalid after the iteration ends, as the automatic stream object is destroyed then. If you really need that you need to allocate them on the free store (or make igzstream movable).

// std::vector<boost::shared_ptr<igzstream>> for C++03 
std::vector<std::unique_ptr<igzstream>> vStreams;

// ...

for (size_t i = 0; i < vInFiles.size(); ++i) {
    // boost::shared_ptr<igzstream> inStream = boost::make_shared<igzstream>();
    auto inStream = std::unique_ptr<igzstream>(new igzstream);
    inStream->open(...);
    // ...
    vStreams.push_back(inStream);
}

// ...

Upvotes: 2

Nim
Nim

Reputation: 33655

This is broken; you store a vector of pointers to streams, but you initialize it with a pointer to a locally scoped automatic instance of the stream (inside the for-loop). Once each iteration of the loop completes, that instance is out of scope, and you have a pointer to some crap.

You then use that crap later, and you get crap out.

Use a smart pointer, e.g.

  std::vector<boost::shared_ptr<igzstream> > vStreams;
  // to initialize
  for (i = 0; i < vInFiles.size(); ++i)
  {
    boost::shared_ptr<igzstream> inStream(new igzstream(vInFiles[i].c_str());
    if (!inStream->good())
    {
      cerr << "ERROR: can't open file " << vInFiles[i] << endl;
      exit (1);
    }
    vStreams.push_back (inStream); // save the smart pointer
  }

Upvotes: 0

Related Questions