ScienceDiscoverer

Reputation: 201

Strange offset for seekg() while reading text file

I'm trying to get the last line of a file using the logic described in "Fastest way to read only last line of text file?", but I'm seeing a strange anomaly:

score.seekg(-2, ios::cur);

resets my stream to the same character, so I get an infinite loop. However, setting the offset to -3 works perfectly:

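fstream score("high_scores.txt"); // open file

if(score.is_open()) // file exists
{
    score.seekg(0, ios::end); // start one past the last byte

    char tmp = '~'; // any value other than '\n'
    while(tmp != '\n')
    {
        score.seekg(-3, ios::cur); // step back; works with -3 but loops forever with -2

        if((int)score.tellg() <= 0) // start of file is start of line
        {
            score.seekg(0);
            break;
        }
        tmp = score.get(); // reading moves the position one byte forward
        cout << tmp << "-";
    }
}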

Again, the problem is that this code works only with a seekg() offset of -3 when, theoretically, it should work with -2. Can this be explained somehow? The file contents look like this (with a newline at the end of the file):

28 Mon Jul 10 16:11:24 2017
69 Mon Jul 10 16:11:47 2017
145 Mon Jul 10 16:53:09 2017

I'm using Windows, so I now understand why I need a -3 offset from the end of the file (to get past the CR and LF bytes). But let's consider the first character from the end:

28 Mon Jul 10 16:11:24 2017

So the stream reaches the 7, extracts it, and moves on to the CR byte. If the next loop iteration then offsets it by -3, we should land on 0, not 1. But in reality I'm getting 1, and everything works fine with the -3 offset. That is the mystery for me; I can't get it out of my head.

Upvotes: 1

Views: 882

Answers (1)

G. Sliepen

Reputation: 7984

I hope this illustrates what is happening:

28 Mon Jul 10 16:11:24 2017CL  <- C = CR, L = LF
                       6543210 <- position relative to ios::end
                        | || |
                        | || * Start after seekg(0, ios::end)
                        | *|   After first seekg(-3, ios::cur)
                        |  *   After first get()
                        *      After second seekg(-3, ios::cur)

When you seek to ios::end, you move the stream position to the byte right past the end of the file. If you then seek -3, you skip back over the LF and the CR and end up on the '7'. Reading this byte moves the position one byte forward, onto the CR. Then you go three back again and end up at the '0'.
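
The byte arithmetic in the diagram can be replayed on an in-memory stream. This is only a sketch: replay_seeks is a hypothetical helper name, and std::istringstream stands in for the file because it performs no newline translation, so CR and LF are ordinary bytes. A real text-mode file stream on Windows may not behave identically.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper (not part of the question's code): replays the first
// two seek/get steps of the loop on an in-memory stream, where every byte,
// including CR and LF, is visible and counted.
std::pair<char, char> replay_seeks(const std::string& bytes)
{
    std::istringstream in(bytes);

    in.seekg(0, std::ios::end);                 // one past the last byte
    in.seekg(-3, std::ios::cur);                // back over LF and CR, onto the last visible char
    char first = static_cast<char>(in.get());   // reading advances the position by one byte

    in.seekg(-3, std::ios::cur);                // net movement since the last read: two bytes back
    char second = static_cast<char>(in.get());

    return {first, second};
}
```

Called with "28 Mon Jul 10 16:11:24 2017\r\n", the first character read is '7' and the second is '0', matching the positions marked in the diagram.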

Note that line endings in the file really are two bytes (CR and LF). It's just that when you read them in text mode, they are converted to a single '\n'. Seeking, however, uses byte offsets into the actual file. This is why people recommend that you either read the file from start to finish or open it in binary mode, which removes this dichotomy.
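
As a rough sketch of the binary-mode approach, something like the following could be used. The function name last_line and its exact behavior (stripping trailing CR/LF bytes, returning an empty string when the file cannot be opened) are assumptions made for illustration, not code from the question.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the last line of a file without its line
// ending. The file is opened in binary mode so that seek offsets and the
// bytes actually read always agree.
std::string last_line(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return "";                       // assumption: empty result on open failure

    in.seekg(0, std::ios::end);
    std::streamoff pos = in.tellg();          // file size in bytes

    // Skip any trailing CR/LF bytes at the very end of the file.
    while (pos > 0) {
        in.seekg(pos - 1, std::ios::beg);
        int c = in.get();
        if (c != '\n' && c != '\r') break;
        --pos;
    }
    const std::streamoff end = pos;           // one past the last character of the line

    // Walk backwards until the previous LF or the start of the file.
    while (pos > 0) {
        in.seekg(pos - 1, std::ios::beg);
        if (in.get() == '\n') break;
        --pos;
    }

    // Read the bytes in [pos, end) in one go.
    std::string line(static_cast<std::size_t>(end - pos), '\0');
    in.seekg(pos, std::ios::beg);
    in.read(&line[0], static_cast<std::streamsize>(line.size()));
    return line;
}
```

For the file shown in the question, last_line("high_scores.txt") would be expected to return "145 Mon Jul 10 16:53:09 2017". Seeking before every single-byte read is not fast, but it keeps the position bookkeeping explicit.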

Upvotes: 2
