nounoursnoir

Reputation: 703

FSEEK offset accepts more than what it should accept

Following the Specification:

For a text stream, either offset shall be zero, or offset shall be a value returned by an earlier successful call to the ftell function on a stream associated with the same file and whence shall be SEEK_SET.

I understand that offset must be either 0 or the return value of an ftell call, and that whence must be SEEK_SET (or 0). But I used some arbitrary integers as offsets with different SEEK_... values, and it seemed to work well.

For example, these worked:

fseek(file, 4, SEEK_CUR);
fseek(file, -1, SEEK_END);
fseek(file, 0, SEEK_CUR);

When I read the specification it seems to me that it should not work. I tried to use fseek this way many times, and it never failed. Why does it work, what point am I not getting?

Upvotes: 0

Views: 3292

Answers (5)

JeremyP

Reputation: 86651

I think the main reason why the definition of fseek is as it is in the C standard is that your logical position in a text file may not correlate to the physical number of bytes from the start of the text file.

For example, in Windows implementations, it is not uncommon to convert \r\n in the file on disk to just \n to maintain compatibility with Unix line endings. So if your file looks like this:

hello\r\nworld

i.e. two lines, and you fseek to position 6, would you expect to be on the \n or the w? If you tried to find out by using fgetc on Windows to count the characters, you'd assume you would be on the w. But fseek might simply advance to byte 6 without scanning for line endings.

Edit

And if we use the fgetc function, each character that we read increases our position of 1: the file cursor goes to the next character after the previous one was read. Is that a problem?

Yes. The problem is in the definition of "character". If you are in an environment that uses DOS conventions, using fgetc on a text stream when the next two bytes are 0x0d 0x0a advances the file position by two but only returns the 0x0a. There may be other conversions that the implementation chooses to make, like turning decomposed Unicode into precomposed Unicode or vice versa.

The wording in the C standard allows implementations to lose the one to one mapping between bytes in the file and characters returned by fgetc without having to overcomplicate fseek.

Upvotes: 0

Chris Turner

Reputation: 8142

Nothing stops you from using fseek to go beyond the current size of the file. Doing so lets you write data at that point, filling the as-yet-unwritten gap with NULs. For example, this code creates a file with 1000 NULs followed by "hello\n":

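#include <stdio.h>

int main(void)
    {
    FILE *f;

    f=fopen("test","w");
    if(f)
        {
        /* Move 1000 bytes past the start of the still-empty file. */
        if(fseek(f,1000,SEEK_SET)==0)
            {
            /* Writing here leaves a 1000-byte gap in front of "hello\n". */
            fprintf(f,"hello\n");
            }
        else
            {
            perror("fseek");
            }
        fclose(f);
        }
    else
        {
        perror("fopen");
        }
    return 0;
    }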

Upvotes: 0

Paul Ogilvie

Reputation: 25286

All your fseek calls are valid. The number you provide as the second argument is an offset, meaning it is relative to the seek type that you provide as the third parameter.

fseek(file, 4, SEEK_CUR);    // seek 4 bytes forward from current position
fseek(file, -1, SEEK_END);   // seek to 1 byte before the end of the file
fseek(file, 0, SEEK_CUR);    // does nothing.

But see also user Tu.ma's explanation that the seek positions are not accurate and/or can be meaningless if the file has been opened in text mode (especially under Windows because of carriage return/line feed translation).

Upvotes: 0

Klas Lindbäck

Reputation: 33273

When I read the specification it seems to me that it should not work.

The specification states what must work. It should be seen as the minimum requirements for someone creating a C library (i.e. the implementor of fseek et al).

Incorrect use might still work, but there is no guarantee. The result would depend on the platform.

For instance, the Linux manual page for fseek says:

The fseek() function sets the file position indicator for the stream pointed to by stream. The new position, measured in bytes, is obtained by adding offset bytes to the position specified by whence. If whence is set to SEEK_SET, SEEK_CUR, or SEEK_END, the offset is relative to the start of the file, the current position indicator, or end-of-file, respectively. A successful call to the fseek() function clears the end-of-file indicator for the stream and undoes any effects of the ungetc(3) function on the same stream.

As you can see, the things you tried will work on Linux for both text and binary streams. However, there may exist platforms where fseek won't work with SEEK_CUR or SEEK_END for text streams.

Note also that a stream could be associated with different things: a file, a keyboard, a socket, a terminal window, a device, etc.

Upvotes: 1

Tu.Ma.

Reputation: 1395

In the ftell documentation you can read

For text streams, the numerical value may not be meaningful but can still be used to restore the position to the same position later using fseek (if there are characters put back using ungetc still pending of being read, the behavior is undefined).

What you cited means that it makes sense to use fseek when you know where you want to place the file position, and you may know that because you called ftell() beforehand.

All your calls to fseek are valid. In a text file it does not make much sense to move around with fseek, because a text file is not a random-access (binary) file, but that does not mean it is wrong to use it.

For a text file, you can find here the most common functions to access it, like fscanf(), fprintf() and so on.

Upvotes: 2
