Raph Schim

Reputation: 538

reading from istringstream slower than ifstream

I'm trying to read a point cloud file (PTX). I tried two approaches:

The first one: the simplest method, reading line by line from a std::ifstream with getline(...) for as long as reads succeed.
The second one: I read the whole file into a std::istringstream, then read from that using operator >>.
Since the second method puts everything in memory, I thought reading from it would be faster, but it isn't.
On average: 45 seconds for method 1 and 49 seconds for method 2.

Here is my code:
Method 1:

std::istringstream getLine(std::ifstream& file) {
    std::string line;
    std::getline(file, line);
    return std::istringstream{ line };
}

void readPoint(std::ifstream& file, TinyPTX& tptx) {
    std::vector<PointPTX> ptxPoints(tptx.numPoints);

    size_t num_pts_to_remove = 0;
    tptx.asCol = true;
    for (int i = 0; i < tptx.numPoints; ++i) {
        float x, y, z, intens;
        uint8_t r, g, b;
        getLine(file) >> x >> y >> z >> intens >> r >> g >> b;
        PointPTX& _pptx = tptx.cloud->points[i - num_pts_to_remove];
        if (!isZero(x, 10e-4) || !isZero(y, 10e-4) || !isZero(z, 10e-4)) {
            _pptx.x = x;  _pptx.y = y; _pptx.z = z; _pptx.intensity = intens;
            _pptx.r = r;
            _pptx.g = g;
            _pptx.b = b;
        }
        else
            num_pts_to_remove++;
    }
    tptx.numPoints -= num_pts_to_remove;
    tptx.cloud->points.resize(tptx.numPoints);
}

Method 2:

bool readPoint(std::istringstream& str, TinyPTX& tptx, std::streamsize& size) {
    std::vector<PointPTX> ptxPoints(tptx.numPoints);

    size_t num_pts_to_remove = 0;
    for (int i = 0; i < tptx.numPoints; ++i) {
        float x, y, z, intens;
        int r, g, b;
        str >> x >> y >> z >> intens >> r >> g >> b;
        PointPTX& _pptx = tptx.cloud->points[i - num_pts_to_remove];
        if (!isZero(x, 10e-4) || !isZero(y, 10e-4) || !isZero(z, 10e-4)) {
            _pptx.x = x;  _pptx.y = y; _pptx.z = z; _pptx.intensity = intens;
            _pptx.r = r;
            _pptx.g = g;
            _pptx.b = b;
        }
        else
            num_pts_to_remove++;
    }
    tptx.numPoints -= num_pts_to_remove;
    tptx.cloud->points.resize(tptx.numPoints);

    int pos = str.tellg();
    std::cout << pos << " " << size;
    return pos > size - 10; // Used to know if we're at the end of the file.
}

My question is: why is the version that puts everything in memory slower than the other? Is there something I'm missing? Am I doing something wrong?

Upvotes: 0

Views: 102

Answers (2)

mgueydan

Reputation: 401

The first method is slower!

I ran your code on a short sample (20k lines) and observed the following performance:

  • first method:
    • 1004 ms total
    • 1004 ms in the readPoint method
    • 553 ms in the std::getline method (inside the readPoint method)
  • second method:
    • 101 ms total
    • 56 ms in the readPoint method
    • 54 ms in the ifstream::rdbuf method (outside of the readPoint method)

Of course, I had to write the code that reads the file and puts it in the istringstream for the second case, and the performance of the second method really depends on how you build your istringstream.
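For reference, one common way to load a whole file into a std::istringstream is sketched below. This is only an assumption about how such a stream might be built, not the code used for the timings above, and the helper name slurp is hypothetical.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the question): loads the whole file into
// memory and returns a string stream positioned at its first character.
// If the file cannot be opened, the returned stream is simply empty.
std::istringstream slurp(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    std::ostringstream buffer;
    buffer << file.rdbuf();  // one bulk copy of the file's stream buffer
    return std::istringstream(buffer.str());
}
```

Note that buffer.str() returns a copy of the data and the istringstream stores its own copy as well, so peak memory use is roughly twice the file size with this approach.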

I suspect that part of your issue lies outside the code you show here.

Why is that?

The main reason the first method is much slower is what Maxim Egorushkin explained to you here.

But even if you drop all that unnecessary wrapping, the first method is still slower, because file operations are buffered and a single call to ifstream::rdbuf is faster than multiple calls to operator >>.

Unfortunately, this second method will become much slower if, for some reason, you run short of memory.

Using the code suggested by Maxim Egorushkin, I got these results:

  • 310 ms total
  • 305 ms in the readPoint method
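Timings of this kind can be reproduced without a profiler by wrapping each call in a small timer. The sketch below is one way to do that with std::chrono; elapsedMs is a hypothetical helper, and it is not necessarily how the numbers above were obtained.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename Work>
double elapsedMs(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

It could be called as elapsedMs([&] { readPoint(file, tptx); }) for each variant. Running each variant several times on the same file helps separate parsing cost from disk caching effects.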

Regards,

Upvotes: 1

Maxim Egorushkin

Reputation: 136256

"Is there something I'm missing? Am I doing something wrong?"

It only makes sense to create an intermediate std::istringstream for each line if you would like to ignore the rest of the line.
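As an illustration of that case, the sketch below reads only the first three numbers of each line and discards any trailing columns. The function name readXYZ is hypothetical and the three-column layout is an assumption made for the example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the first three numbers of the next line and ignores whatever
// follows them on that line. Returns false at end of input or if the
// line holds fewer than three numbers.
bool readXYZ(std::istream& in, float& x, float& y, float& z) {
    std::string line;
    if (!std::getline(in, line))
        return false;
    std::istringstream fields(line);  // per-line stream: leftovers are dropped with it
    return static_cast<bool>(fields >> x >> y >> z);
}
```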

If all lines contain just these 7 values you can read them directly from std::istream& file (instead of std::ifstream& file). I.e. change:

getLine(file) >> x >> y >> z >> intens >> r >> g >> b;

to:

file >> x >> y >> z >> intens >> r >> g >> b;
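Put together, a direct-read loop might look like the sketch below. The Point struct is a hypothetical stand-in for the question's PointPTX, and readPoints is an illustrative name; taking std::istream& means the same function accepts both a std::ifstream and a std::istringstream.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for PointPTX, with only the fields parsed here.
struct Point {
    float x, y, z, intensity;
    int r, g, b;
};

// Reads up to `count` seven-value records directly from `in`,
// stopping early if a record cannot be parsed.
std::vector<Point> readPoints(std::istream& in, std::size_t count) {
    std::vector<Point> points;
    points.reserve(count);
    Point p{};
    for (std::size_t i = 0; i < count; ++i) {
        // Colour channels are read as int here: operator>> into a uint8_t
        // (typically unsigned char) extracts one character, not a number.
        if (!(in >> p.x >> p.y >> p.z >> p.intensity >> p.r >> p.g >> p.b))
            break;
        points.push_back(p);
    }
    return points;
}
```

Checking the stream state after each record also avoids looping over stale values when the file holds fewer points than its header claims.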

Upvotes: 1
