catalystofanger

Reputation: 71

File size lookup in C

I was wondering if there is any significant performance increase in using sys/stat.h versus fseek() and ftell()?

Upvotes: 7

Views: 3815

Answers (5)

Grumbel

Reputation: 7063

Depending on the circumstances, stat() can be hundreds of times faster than seek()/tell(). I am currently toying around with sshfs/FUSE, and getting the file size of a few thousand files with seek()/tell() takes well over a minute, while doing it with stat() takes a second. So the difference is pretty huge when working over sshfs/FUSE.

Upvotes: 0

hasvn

Reputation: 1132

If you're not sure, try it!

I just coded this test. I generated 10,000 files of 2KB each, and iterated over all of them, asking for their file size.

Results on my machine by measuring with the "time" command and doing an average of 10 runs:

  • fseek/ftell version: 0.22 secs
  • stat version: 0.06 secs

So, the winner (at least on my machine): stat!

Here's the test code:

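#include <stdio.h>
#include <sys/stat.h>

/* Change "#if 0" to "#if 1" to time the stat() version instead. */
#if 0
/* Ask the filesystem for the size by path; returns -1 on error. */
long getFileSize(const char * filename)
{
    struct stat st;
    if(stat(filename, &st) != 0)
        return -1;
    return (long)st.st_size;
}
#else
/* Open the file, seek to the end and report the offset; returns -1 on error. */
long getFileSize(const char * filename)
{
    FILE * fp=fopen(filename, "rb");
    if(!fp)
    {
        printf("ERROR on file %s\n", filename);
        return -1;   /* do not seek on a NULL stream */
    }

    long size = -1;
    if(fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
#endif

int main(void)
{
    char buf[256];
    int i;
    /* Expects files named file_0 ... file_9999, 2048 bytes each. */
    for(i=0; i<10000; ++i)
    {
        snprintf(buf, sizeof buf, "file_%d", i);
        if(getFileSize(buf)!= 2048)
            printf("WRONG!\n");
    }
    return 0;
}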

Upvotes: 8

Jonathan Leffler

Reputation: 755054

Choosing between fstat() and the fseek()/ftell() combination, there isn't going to be much difference. The single function call should be slightly quicker than the double function call, but the difference won't be great.

Choosing between stat() and the combination isn't a very fair comparison. For the combination calls, the hard work was done when the file was opened, so the inode information is readily available. The stat() call has to parse the file path and then report what it finds. It should almost always be slower - unless you recently opened the file anyway so the kernel has most of the information cached. Even so, the pathname lookup required by stat() is likely to make it slower than the combination.
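As a rough illustration of the fstat() route on a file that is already open, here is a minimal sketch. The helper name stream_size is hypothetical (not from the answer above), and it assumes a POSIX system where fileno() is available. Note that fstat() reports what the kernel knows about the file, so data still sitting in the stdio buffer of a stream opened for writing is not counted until it is flushed.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno(); must precede the includes */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: size of an already-open stream via fstat().
   Stores the size in *size and returns 0 on success, -1 on failure. */
int stream_size(FILE *fp, off_t *size)
{
    struct stat st;
    int fd;

    assert(fp != NULL && size != NULL);   /* caller must pass valid pointers */

    fd = fileno(fp);                      /* descriptor behind the stream */
    if (fd == -1 || fstat(fd, &st) != 0)
        return -1;

    *size = st.st_size;
    return 0;
}
```

Unlike the fseek()/ftell() combination, this does not move the stream's file position, so nothing has to be restored afterwards.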

Upvotes: 7

Vern

Reputation: 2413

You mainly want stat.h for querying a file's metadata, for example to tell whether a path is a regular file or a directory.

However, if you want to manipulate the file's contents, then you'll probably want to use ftell() and fseek(). That is, you're actually working on the file stream itself.

So in terms of performance, it really depends on what you need.

Hope it helps :) Cheers!

Upvotes: 0

Williham Totland

Reputation: 29039

Logically, one would assume that fseek(), when asked to seek to the end of the file, uses stat to know how far to seek, or rather, where the end of the file is.

This would make fseek slower than using the facilities directly, and it also requires you to fopen the file in the first place.

Still, any performance difference is likely to be negligible, and if you need to open the file for some reason anyway, fseek/ftell likely improves the readability of your code significantly.
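For the case where the file is open anyway, a minimal sketch of the fseek()/ftell() approach with error checking might look like the following. The helper name stream_length is hypothetical, and the sketch assumes the stream was opened in binary mode ("rb"). The C standard does not require binary streams to meaningfully support SEEK_END, although this works on POSIX systems, and the result is limited to what a long can hold.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: length of an open binary stream in bytes.
   Restores the original file position; returns -1 on any failure. */
long stream_length(FILE *fp)
{
    long saved, end;

    assert(fp != NULL);                 /* caller must pass an open stream */

    saved = ftell(fp);                  /* remember where we were */
    if (saved < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)   /* jump to the end */
        return -1;
    end = ftell(fp);                    /* offset of the end = size */
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1;                      /* could not restore the position */
    return end;
}
```

Saving and restoring the position means the helper can be called in the middle of reading without disturbing the caller.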

Upvotes: 0
