user52366

Reputation: 1147

Encapsulate data access using own istream_iterator

I currently have code that essentially contains these commands:

std::string line;
std::getline(ifs, line);

ifs is some std::ifstream. The code works.

I would now like to do the same with data that I decompress from another source. Basically, I read chunks into a buffer and then retrieve data from that buffer with a function that behaves a bit like fgetc. This all works, but it is an awkward mix of STL and non-STL code.
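
To make the current arrangement concrete, it looks roughly like the sketch below. The names chunk_source, chunk_reader, get_char and read_line are made up for this post, and the decompressor is replaced by a plain callback, so treat it as an illustration rather than my actual code.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the decompressor: writes up to `cap` bytes into
// `dst` and returns how many were written (0 means end of data).
using chunk_source = std::function<std::size_t(char *dst, std::size_t cap)>;

// Buffered reader with an fgetc-like interface (illustrative names only).
class chunk_reader
{
public:
    explicit chunk_reader(chunk_source src, std::size_t buf_sz = 1024)
        : source(std::move(src)), buffer(buf_sz > 0 ? buf_sz : 1) {}

    // Returns the next byte as an unsigned char value, or -1 at end of data.
    int get_char()
    {
        if (pos == filled) {
            // Buffer exhausted: ask the source for the next chunk.
            filled = source(buffer.data(), buffer.size());
            pos = 0;
            if (filled == 0)
                return -1;
        }
        return static_cast<unsigned char>(buffer[pos++]);
    }

private:
    chunk_source source;
    std::vector<char> buffer;
    std::size_t pos = 0;     // next unread position in buffer
    std::size_t filled = 0;  // number of valid bytes in buffer
};

// Hand-written replacement for std::getline on top of get_char().
// Returns false only when no characters at all could be read.
inline bool read_line(chunk_reader &in, std::string &line)
{
    line.clear();
    bool got_any = false;
    int c;
    while ((c = in.get_char()) != -1) {
        got_any = true;
        if (c == '\n')
            return true;
        line.push_back(static_cast<char>(c));
    }
    return got_any;
}
```

The part I dislike is read_line: it duplicates what std::getline already does, only because chunk_reader is not a std::istream.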

What I would like to do is encapsulate the decompression and provide an iterator so that the call to std::getline works. I guess that would be an istream_iterator? What is the minimum that this class needs to contain?

UPDATE: Based on the comments I got, and the streambuf implementation at http://www.voidcn.com/article/p-vjnlygmc-gy.html, I now have the following code, which works. As a test case, I used zlib compression. The code is not particularly clean, and I would appreciate any hints on how to improve it.

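#include <algorithm>  // std::max
#include <cstddef>    // std::size_t
#include <cstring>    // std::memmove
#include <istream>
#include <streambuf>
#include <utility>    // std::exchange
#include <vector>
#include <zlib.h>     // gzFile, gzopen, gzread, gzclose

class gzip_streambuf: public std::streambuf
{
private: 
    // Non-copyable: the get area points into this object's own buffer.
    gzip_streambuf(const gzip_streambuf &) = delete;
    gzip_streambuf &operator= (const gzip_streambuf &) = delete;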
public:
    explicit gzip_streambuf(gzFile h_, std::size_t buf_sz_ = 1024, std::size_t putback_sz_ = 2) :
        h(h_),
        putback_sz(std::max(putback_sz_, std::size_t(1))),
        // Size the buffer from the clamped putback_sz, not the raw argument.
        buffer(std::max(putback_sz, buf_sz_) + putback_sz)
    {
        // Start with an empty get area so the first read calls underflow().
        char *end = &buffer.front() + buffer.size();
        setg(end, end, end);
    }
    // True once gzread() has reported an error.
    bool error() const
    {
        return gzerr;
    }

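private:    
    // Refill the get area, keeping up to putback_sz already-read characters
    // in front of the new data so that unget()/putback() keep working.
    int_type underflow() override
    {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        char *base = &buffer.front();
        // Never keep more than was actually read; on the first call the get
        // area is empty, so nothing is kept.
        std::size_t keep = static_cast<std::size_t>(egptr() - eback());
        if (keep > putback_sz)
            keep = putback_sz;
        std::memmove(base, egptr() - keep, keep);
        char *start = base + keep;
        const int n = gzread(h, start, static_cast<unsigned>(buffer.size() - keep));
        if ( n < 0 )
            gzerr = true;                        
        if ( n <= 0)
            return traits_type::eof();
        setg(base, start, start + n);
        return traits_type::to_int_type(*gptr());
    }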

private:
    bool gzerr = false;    
    gzFile h = nullptr;
    const std::size_t putback_sz;
    std::vector<char> buffer;
};


class gzip_istream: public std::istream
{    
public:
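    // std::istream has no portable default constructor, so pass a null buffer.
    gzip_istream() : std::istream(nullptr) {}
    // close() releases both the gzFile handle and the stream buffer.
    ~gzip_istream() { close(); }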

private:    
    gzip_istream(const gzip_istream  &) = delete;
    gzip_istream &operator= (const gzip_istream &) = delete;

public:    
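    void open( const char * filename) {
        close();                                // release any previously opened file
        h = gzopen(filename, "rb");
        if ( !h ) {
            setstate(std::ios_base::failbit);   // report the failure via the stream state
            return;
        }
        // new throws std::bad_alloc on failure; it never returns null.
        gzbuf = new gzip_streambuf(h);
        set_rdbuf(gzbuf);
        clear();
    }    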
    void close() {
        if (h)
            gzclose( std::exchange(h, nullptr) );
        delete std::exchange(gzbuf, nullptr);
        set_rdbuf(nullptr);
    }        
    bool is_open() const {
        return h != nullptr;
    }    
private:
    gzFile h = nullptr;
    gzip_streambuf *gzbuf = nullptr;
};

It can be used as follows:

std::string line;
gzip_istream gin;
gin.open("file.gz");         
if (gin.good()) {    
    while (std::getline( gin, line)) 
        std::cout << line << std::endl;
    gin.close();
}

I have tried to use std::unique_ptr for the gzip_streambuf member in gzip_istream, but all attempts failed because of the lack of a copy constructor. I would like to get rid of the new/delete calls but am not sure how.
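
For reference, a std::unique_ptr member does seem to work in a reduced, zlib-free version of the same layout, so the failure is probably in how I wired it into the gzip classes rather than in std::unique_ptr itself. This is a minimal sketch under that assumption: string_source_buf and owning_istream are hypothetical names, and the stand-in buffer serves bytes from a std::string instead of a gzFile.

```cpp
#include <istream>
#include <memory>
#include <streambuf>
#include <string>
#include <utility>

// Stand-in for gzip_streambuf (hypothetical): the whole string is the get
// area, so the inherited underflow() simply reports end of file.
class string_source_buf : public std::streambuf
{
public:
    explicit string_source_buf(std::string s) : data(std::move(s))
    {
        char *b = &data[0];
        setg(b, b, b + data.size());
    }

    // Non-copyable: the get area points into this object's own string.
    string_source_buf(const string_source_buf &) = delete;
    string_source_buf &operator=(const string_source_buf &) = delete;

private:
    std::string data;
};

// Stream that owns its buffer through std::unique_ptr (no new/delete).
class owning_istream : public std::istream
{
public:
    // No buffer yet, so the stream starts out with badbit set.
    owning_istream() : std::istream(nullptr) {}

    void open(std::string contents)
    {
        buf = std::make_unique<string_source_buf>(std::move(contents));
        rdbuf(buf.get());  // public rdbuf(sb) also resets the error state
    }

    void close()
    {
        rdbuf(nullptr);    // detach first; with no buffer this sets badbit
        buf.reset();       // then destroy the buffer
    }

    bool is_open() const { return buf != nullptr; }

private:
    std::unique_ptr<string_source_buf> buf;
};
```

The unique_ptr member is destroyed before the std::istream base, and as far as I know the base class destructor does not touch the buffer, so no explicit destructor is needed here. What I have not worked out is whether the same layout carries over cleanly once the gzFile handle has to be closed as well.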

Upvotes: 1

Views: 65
