user1641774

Relationship between file descriptors, file pointers and file position indicators

I am trying to understand how a file position indicator moves after I read some bytes from a file. I have a file named "filename.dat" with a single line: "abcdefghijklmnopqrstuvwxyz" (without the quotes).

#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>


int main () {

    int fd = open("filename.dat", O_RDONLY);
    FILE* fp = fdopen(fd,"r");
    printf("ftell(fp): %ld, errno = %d\n", ftell(fp), errno);

    fseek(fp, 5, SEEK_SET); // advance 5 bytes from beginning of file
    printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);

    char buffer[100];
    int result = read(fd, buffer, 4); // read 4 bytes 
    printf("result = %d, buffer = %s, errno = %d\n", result, buffer, errno);
    printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);

    fseek(fp, 3, SEEK_CUR); // advance 3 bytes 
    printf("file position indicator: %ld, errno = %d\n", ftell(fp), errno);
    result = read(fd, buffer, 6);  // read 6 bytes 
    printf("result = %d, buffer = %s, errno = %d\n", result, buffer, errno);

    printf("file position indicator: %ld\n", ftell(fp));

    close(fd);
    return 0;
}


ftell(fp): 0, errno = 0
file position indicator: 5, errno = 0
result = 4, buffer = fghi, errno = 0
file position indicator: 5, errno = 0
file position indicator: 8, errno = 0
result = 0, buffer = fghi, errno = 0
file position indicator: 8

I do not understand why the second call to read returns no bytes from the file. Also, why does the file position indicator not move when I read from the file using read? Advancing 4 bytes instead of 3 in the second fseek did not work either. Any suggestions?

Upvotes: 1

Views: 1557

Answers (2)

NovaDenizen

Reputation: 5305

The first thing to note is that read() copies raw bytes into the buffer, while printf() expects a null-terminated string for each %s argument. You never add a null terminator, so your program could print garbage after the first 4 bytes of the buffer. You have been lucky: the uninitialized buffer happened to contain zeros (nothing guarantees this), so you haven't noticed the problem.
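One way to avoid depending on that luck is to terminate the buffer yourself after every read(). The sketch below is only an illustration: read_text is a hypothetical helper name, not something from the question's code or from any library.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper (not part of the original program): read at most
   cap - 1 bytes from fd into buf and always null-terminate the result,
   so buf is safe to print with %s. Returns read()'s result unchanged:
   the byte count, 0 at end of file, or -1 on error. */
ssize_t read_text(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    ssize_t n = read(fd, buf, cap - 1);
    buf[n > 0 ? n : 0] = '\0';   /* empty string on EOF or error */
    return n;
}
```

With this, a call such as read_text(fd, buffer, 5) reads up to 4 bytes, and at end of file the buffer becomes an empty string instead of keeping the previous contents.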

The essential problem in this program is that you're mixing high-level, buffered FILE * calls with low-level file descriptor calls, which results in unpredictable behavior. A FILE struct contains a buffer and some bookkeeping fields (such as positions within that buffer) to support more efficient and convenient access to the file behind a file descriptor.

Basically all f*() calls (fopen(), fread(), fseek(), fwrite()) expect that all I/O is going to be done by f*() calls using a FILE struct, so that the buffer and index values in the FILE struct stay valid. The low-level calls (read(), write(), open(), close(), lseek()) completely ignore the FILE struct.

I ran strace on your program. The strace utility logs all system calls made by a process. I've omitted all the uninteresting stuff up to your open() call.

Here is your open call:

open("filename.dat", O_RDONLY)          = 3

Here is where fdopen() is happening. The brk calls are evidence of memory allocation, presumably for something like malloc(sizeof(FILE)).

fcntl64(3, F_GETFL)                     = 0 (flags O_RDONLY)
brk(0)                                  = 0x83ea000
brk(0x840b000)                          = 0x840b000
fstat64(3, {st_mode=S_IFREG|0644, st_size=26, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7728000

This might be the effect of ftell() or just the last part of fdopen, I'm not sure.

_llseek(3, 0, [0], SEEK_CUR)            = 0

Here is the first printf.

write(1, "ftell(fp): 0, errno = 0\n", 24) = 24

Here is the first fseek, which has decided the easiest way to get to position 5 in the file is to just read in 5 bytes and ignore them.

_llseek(3, 0, [0], SEEK_SET)            = 0
read(3, "abcde", 5)                     = 5

Here is the second printf. Notice that there is no evidence of an ftell() call. ftell() uses the information in the FILE struct, which it assumes to be accurate, so no system call is necessary.

write(1, "file position indicator: 5, errn"..., 38) = 38

Here is your read() call. Now the operating system's file offset for the descriptor is at position 9, but the FILE struct thinks it is still at position 5.

read(3, "fghi", 4)                      = 4

The third and fourth printf calls, with ftell() still reporting position 5.

write(1, "result = 4, buffer = fghi, errno"..., 37) = 37
write(1, "file position indicator: 5, errn"..., 38) = 38

Here is the fseek(fp, 3, SEEK_CUR) call. fseek() has decided to SEEK_SET back to the beginning of the file and read the whole thing into the FILE struct's 4k buffer. Since it "knew" it was at position 5, it "knows" it must be at position 8 now. Since the file is only 26 bytes long, the OS file position is now at EOF.

_llseek(3, 0, [0], SEEK_SET)            = 0
read(3, "abcdefghijklmnopqrstuvwxyz", 4096) = 26

The fifth printf.

write(1, "file position indicator: 8, errn"..., 38) = 38

Here is your second read() call. Since the OS file position is at EOF, it reads 0 bytes. It doesn't change anything in your buffer, which is why the next printf still shows fghi.

read(3, "", 6)                          = 0

The sixth and seventh printf calls.

write(1, "result = 0, buffer = fghi, errno"..., 37) = 37
write(1, "file position indicator: 8\n", 27) = 27

Your close() call, and the process exit.

close(3)                                = 0
exit_group(0)                           = ?

Upvotes: 0

n. m. could be an AI

Reputation: 119847

Use fseek and fread, or lseek and read, but do not mix the two APIs on the same file; they will not stay in sync.

A FILE* has its own internal buffer. fseek may or may not move the internal buffer pointer only. It is not guaranteed that the real file position indicator (one that lseek is responsible for) changes, and if it does, it is not known by how much.
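As a rough illustration of the stream-only option, the sketch below does the seek, the read, and the position query entirely through the FILE*. stream_seek_read is a hypothetical helper name, and buf is assumed to have room for n + 1 bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reposition the stream with fseek(), read up to n
   bytes with fread(), null-terminate buf (which must hold n + 1 bytes),
   and store the number of bytes read in *got.
   Returns the stream position from ftell(), or -1L if the seek fails. */
long stream_seek_read(FILE *fp, long offset, int whence,
                      char *buf, size_t n, size_t *got)
{
    assert(fp != NULL && buf != NULL && got != NULL);
    *got = 0;
    buf[0] = '\0';
    if (fseek(fp, offset, whence) != 0)
        return -1L;
    *got = fread(buf, 1, n, fp);
    buf[*got] = '\0';
    return ftell(fp);   /* stream position, consistent with the reads above */
}
```

On the question's file, seeking to 5 and reading 4 bytes returns position 9 with "fghi", and then seeking 3 forward and reading 6 bytes returns position 18 with "mnopqr". Since ftell() and fread() both work from the same FILE state, the reported position follows the reads.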

Upvotes: 2
