How can I get a file's size in C without using either fseek or stat?

Question

I'm doing a project for my school and I can't find out how to get the size of a file. Since I need to read a script and use it in my program, I need the size of the file to use either read or fread.

Here is what I've done to get the file size but it doesn't seem to work.

int my_size(int filedesc)
{
    int size = 1;
    int read_output = 1;
    char *buffer;

    for (size = 1; read_output != 0 ; size++) {
        buffer = malloc((size+1)*sizeof(char*));
        read_output = read(filedesc, buffer, size);
        free(buffer);
    }
    return(size);
}

The project rules don't allow stat() or fseek(), and I can't just call read or fread with an arbitrary size like 100, since the scripts given can be either small or big.

John Bollinger · Accepted Answer

If you can rely on the input to be a persistent file (i.e. residing on storage media), and on that file not being modified during your program's run, then you could pre-read it to the end to count the bytes in it, then rewind.

But outside of an academic exercise, the usual reason to forbid measuring the size via stat(), fseek(), and similar is that the input might not reside on storage media, so that

you cannot determine its size without reading it, but also
you cannot rewind it or seek within it.

The trick then is not how to determine the size in advance, but rather how to do without measuring the size in advance. There are at least two main strategies for that:

Don't rely on storing the whole contents in memory at once in the first place. Instead, operate on its contents as they are read, maintaining only enough in memory at any given time to do so.
Alternatively, adapt dynamically to the file size. There are many variations on this. For example, if you're just reading the file into a monolithic block then you can malloc() space and realloc() when you find you need more. Or you could store the contents in a linked list, allocating new list nodes as needed.
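
As a rough illustration of the second strategy, here is a minimal sketch that reads a descriptor to the end while growing a malloc()'d buffer with realloc(). The function name read_all, the 4096-byte starting capacity, and the doubling policy are assumptions made for this example, not something taken from the question or from any library.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read everything from fd into a heap buffer that grows as needed.
 * On success returns the buffer (the caller frees it) and stores the
 * byte count in *len_out; on failure returns NULL.
 * read_all is a hypothetical name chosen for this sketch.
 */
char *read_all(int fd, size_t *len_out)
{
    size_t capacity = 4096;   /* arbitrary starting size (assumption) */
    size_t length = 0;
    char *data = malloc(capacity);

    if (data == NULL)
        return NULL;

    for (;;) {
        /* Keep one spare byte for the terminating '\0'. */
        if (length + 1 >= capacity) {
            size_t new_capacity = capacity * 2;
            char *bigger = realloc(data, new_capacity);
            if (bigger == NULL) {
                free(data);
                return NULL;
            }
            data = bigger;
            capacity = new_capacity;
        }

        ssize_t n = read(fd, data + length, capacity - length - 1);
        if (n < 0) {
            if (errno == EINTR)
                continue;     /* interrupted before any data: retry */
            free(data);
            return NULL;
        }
        if (n == 0)
            break;            /* end of input */
        length += (size_t)n;
    }

    data[length] = '\0';      /* convenient when the input is a text script */
    *len_out = length;
    return data;
}
```

Because it never seeks, a function along these lines should behave the same for regular files, pipes, and terminals. Doubling the capacity keeps the number of realloc() calls small even for large inputs.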

As for the approach presented in the question, there are several issues with it. It appears to be an attempt to do as I first described -- reading the file to the end to determine its size -- but

It seems to assume that each read() will start at the beginning of the file, or perhaps that read() will fail if it cannot read the full file. Neither is the case. Each read() will start at the file's current position, and will leave the file positioned after the last byte transferred.
Because it changes the file position, your approach will require the file to be rewound after -- via lseek(), for example. But if lseek() can be used for that purpose (and note well my previous comments with respect to files in which you cannot seek), then it would provide a much cleaner approach to measuring the file's size.
You do not account for I/O errors. If one occurred then it would probably send your program into an infinite loop.
Dynamic allocation is comparatively expensive, and you're doing a whole lot of it.

If you want to implement the pre-reading strategy, then this would be a better implementation:

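#include <unistd.h>

/* Count the bytes from the current file position to end-of-file; -1 on error. */
ssize_t count_bytes(int fd) {
    ssize_t num_bytes = 0;
    char buffer[2048];
    ssize_t result;

    do {
        result = read(fd, buffer, sizeof(buffer));
        if (result < 0) {
            return -1;  // handle error: report failure instead of counting it
        }
        num_bytes += result;
    } while (result > 0);

    return num_bytes;
}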
