rjs

Reputation: 848

fopen() in read-only mode and its buffer

Consider the following, albeit very messy, code in C:

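#include <stdio.h>

int main(void) {
    char buf[3] = {0}; // a new, small buffer: 2 bytes for the stream, 1 for a terminator
    FILE *fp = fopen("test.txt", "r"); // our test file, with the contents "123abc"
    if (fp == NULL) { // bail out if the file could not be opened
        perror("test.txt");
        return 1;
    }
    setvbuf(fp, buf, _IOFBF, 2); // assign our small buffer as fp's buffer, fully buffered
    int character = fgetc(fp); // get the first character (fgetc returns int, not char)...
    character = fgetc(fp);     // and the next...
    character = fgetc(fp);     // and the next... (third character, '3')
    (void)character; // the value itself is not used; we only care about the buffer
    buf[2] = '\0'; // add a terminating null for display
    fputs(buf, stderr); // write our buffer to stderr, should show up immediately
    fclose(fp);
    return 0;
}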

Compiling and running the code prints '3a' as the contents of our self-designated buffer, buf. My question is: how does this buffer get filled? Does one call to fgetc() trigger several reads until the buffer is full and then stop? We only made three calls to fgetc(), which should not have reached the 'a'. The buffer first held "12", so when another fgetc() call needs a character beyond what the buffer holds, is the buffer purged and then refilled with the next block of data, or simply overwritten? I understand buffer sizes are platform dependent, so I'm more concerned with how, in general, a stream fopen()ed in a read mode pulls characters into its buffer.

Upvotes: 0

Views: 4804

Answers (1)

Thomas Padron-McCarthy

Reputation: 27652

The buffer, and exactly how and when it is filled, is an implementation detail inside the stdio package. But the way it is likely to be implemented is that fgetc gets one character from the buffer, if there are characters available in the buffer. If the buffer is empty, it fills it by reading (in your case) two more characters from the file.

So your first fgetc will read 12 from the file and put it in the buffer, and then return '1'. Your second fgetc will not read from the file, since a character is available in the buffer, and return '2'. Your third fgetc will find that the buffer is empty, so it will read 3a from the file and put it in the buffer, and then return '3'. Therefore, when you print the content of the buffer, it will be 3a.
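To make that sequence concrete, here is a minimal sketch of the refill-on-empty logic. It is not the code of any real stdio implementation: the toy_stream type and the toy_open and toy_getc functions are hypothetical names invented for illustration, and an in-memory string stands in for the file.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <string.h>  /* strlen, memcpy */

/* Hypothetical toy stream: an in-memory "file" plus a caller-supplied buffer. */
typedef struct {
    const char *src;  /* stands in for the file contents */
    size_t src_len;   /* total length of the "file" */
    size_t src_pos;   /* how far the "file" has been read */
    char *buf;        /* the caller's buffer */
    size_t cap;       /* buffer capacity in bytes */
    size_t count;     /* valid bytes currently in buf */
    size_t next;      /* index of the next unread byte in buf */
    int refills;      /* number of block reads performed so far */
} toy_stream;

void toy_open(toy_stream *s, const char *src, char *buf, size_t cap)
{
    assert(cap > 0);  /* a zero-sized buffer could never be refilled */
    s->src = src;
    s->src_len = strlen(src);
    s->src_pos = 0;
    s->buf = buf;
    s->cap = cap;
    s->count = 0;     /* the buffer starts out empty */
    s->next = 0;
    s->refills = 0;
}

int toy_getc(toy_stream *s)
{
    if (s->next == s->count) {
        /* Buffer exhausted: read the next block, overwriting the old one. */
        size_t left = s->src_len - s->src_pos;
        size_t n = left < s->cap ? left : s->cap;
        if (n == 0)
            return EOF;  /* nothing left in the "file" */
        memcpy(s->buf, s->src + s->src_pos, n);
        s->src_pos += n;
        s->count = n;
        s->next = 0;
        s->refills++;
    }
    /* Hand out one character from the buffer, like fgetc does. */
    return (unsigned char)s->buf[s->next++];
}
```

With the source "123abc" and a capacity of 2, three calls to toy_getc return '1', '2' and '3', perform two refills, and leave 3a in the buffer, which matches what you observed. Note that the refill in this sketch simply overwrites the old contents; nothing is cleared first.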

Note that there are two levels of "reading" happening here. First you have your fgetc calls, and then, below that level, code inside the stdio package which is reading from the file. If we assume this is on a Unix or Linux system, the second type of reading is done using the system call read(2).

The lower-level reading fills the entire buffer at once, so you don't need as many calls to read as calls to fgetc. (Which is the entire point of having the buffer.)
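The difference in call counts can be sketched on a POSIX system as follows. This is an illustration under assumptions, not how stdio is actually written: count_reads is a hypothetical helper, and it assumes each read returns as many bytes as were requested while data remains, which the system does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>  /* read, ssize_t (POSIX) */

/* Hypothetical helper: consume fd one character at a time through a buffer
 * of cap bytes. Returns the number of read(2) calls made, or -1 on error.
 * *chars receives the number of characters handed out. */
long count_reads(int fd, char *buf, size_t cap, size_t *chars)
{
    long calls = 0;

    assert(cap > 0);
    *chars = 0;
    for (;;) {
        ssize_t n = read(fd, buf, cap);  /* lower level: one block per call */
        calls++;
        if (n < 0)
            return -1;                   /* read error */
        if (n == 0)
            return calls;                /* end of file */
        for (ssize_t i = 0; i < n; i++) {
            /* Upper level: an fgetc-style call would return buf[i] here. */
            (*chars)++;
        }
    }
}
```

Under that assumption, six characters read through a 2-byte buffer take three filling reads plus one final read that reports end of file, so four calls to read serve six character fetches. With a realistic buffer of several kilobytes the ratio is far more favorable.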

Upvotes: 2
