Dariusz

Reputation: 22241

FILE* and file descriptor read/write performance

What are the performance implications of using the FILE* API versus the file descriptor API for reading and writing binary data to files on a local disk? Does either approach have advantages over the other?

Is either fread() or read() better than the other in terms of performance? How do they differ in behavior, caching, and use of system resources?

Is either fwrite() or write() better than the other in terms of performance? How do they differ in behavior, caching, and use of system resources?

Upvotes: 3

Views: 2202

Answers (3)

ja_mesa

Reputation: 1969

I agree that fread/fwrite are faster than read/write, but:

1) If the file is going to be accessed randomly, fread/fwrite cannot use their buffer as effectively, and much of the time the buffering becomes a performance penalty rather than a gain.

2) If the file is shared with another process or thread, the buffered calls are not as fast and may not even be usable: you have to call fflush() every time you write to the file so the other side sees the data, and at that point the speed drops to at least that of a plain write(). Likewise, fread() cannot be relied on, because it reads ahead into its own buffer and that data may be stale; if you care about seeing updates, you have to discard what it has buffered and read again (a minimal sketch of the fflush() pattern follows below).
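To illustrate point 2, here is a minimal sketch of that pattern, assuming a hypothetical file shared.log that another process is tailing: the writer appends records with fwrite() but calls fflush() after each one, so every record costs one write() system call anyway.

#include <stdio.h>

int main(void) {
    /* hypothetical shared file; another process is assumed to be reading it */
    FILE *fp = fopen("shared.log", "a");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    for (int i = 0; i < 10; i++) {
        char record[64];
        int len = snprintf(record, sizeof record, "record %d\n", i);

        if (fwrite(record, 1, (size_t)len, fp) != (size_t)len) {
            perror("fwrite");
            break;
        }

        /* Push the buffered data to the kernel so the other process sees it.
           This costs one write() system call per record, i.e. the buffering
           no longer saves anything. */
        fflush(fp);
    }

    fclose(fp);
    return 0;
}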

So, it depends.

Upvotes: 1

Sergey L.

Reputation: 22542

read and write are system calls and are therefore unbuffered in user space: everything you submit goes straight to the kernel. The underlying file system may do its own caching, but the biggest performance cost here is the switch into kernel space on every call.

fread and fwrite are userspace library calls and are buffered by default. They group your small accesses together into larger system calls, which (in theory) makes them faster.

Try it yourself: read() from a file one byte at a time and then fread() from it one byte at a time. The latter should be a good 4000 times faster.

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/time.h>
#include <sys/resource.h>

int main() {
    struct rusage usage_start, usage_end;

    getrusage(RUSAGE_SELF, &usage_start);

    int fd = open("/dev/zero", O_RDONLY);

    int i = 0x400 * 0x400; // 1 MB

    char c;

    while (i--)
        read(fd, &c, 1);

    close(fd);

    getrusage(RUSAGE_SELF, &usage_end);

    printf("Time used by reading 1MiB: %ld user, %ld system.\n",
           (long)(((usage_end.ru_utime.tv_sec - usage_start.ru_utime.tv_sec) * 1000000)
                  + usage_end.ru_utime.tv_usec - usage_start.ru_utime.tv_usec),
           (long)(((usage_end.ru_stime.tv_sec - usage_start.ru_stime.tv_sec) * 1000000)
                  + usage_end.ru_stime.tv_usec - usage_start.ru_stime.tv_usec));

    getrusage(RUSAGE_SELF, &usage_start);

    FILE * fp = fopen("/dev/zero", "r");

    i = 0x400 * 0x400; // 1 MB

    while (i--)
        fread(&c, 1, 1, fp);

    fclose(fp);

    getrusage(RUSAGE_SELF, &usage_end);

    printf("Time used by freading 1MiB: %ld user, %ld system.\n",
           (long)(((usage_end.ru_utime.tv_sec - usage_start.ru_utime.tv_sec) * 1000000)
                  + usage_end.ru_utime.tv_usec - usage_start.ru_utime.tv_usec),
           (long)(((usage_end.ru_stime.tv_sec - usage_start.ru_stime.tv_sec) * 1000000)
                  + usage_end.ru_stime.tv_usec - usage_start.ru_stime.tv_usec));

    return 0;
}

Output on my OS X:

Time used by reading 1MiB: 103855 user, 442698 system.
Time used by freading 1MiB: 20146 user, 256 system.

The stdio functions just wrap optimisation code around the appropriate system calls.

Here is an strace of the program:

getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 0}, ...}) = 0
open("/dev/zero", O_RDONLY)             = 3

Then follows 1048576 times

read(3, "\0", 1)                        = 1

and the rest:

close(3)                                = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 200000}, ru_stime={5, 460000}, ...}) = 0

This is stdio setting up a buffer for stdout (fd 1, a pipe here because the output is redirected) when the first printf runs:

fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaae000

getrusage(RUSAGE_SELF, {ru_utime={0, 200000}, ru_stime={5, 460000}, ...}) = 0
// ...

This is part of fopen:

open("/dev/zero", O_RDONLY)             = 3
fstat(3, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 5), ...}) = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fffffffb050) = -1 ENOTTY (Inappropriate ioctl for device)
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2aaaaaaaf000

Now 256 times:

read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096

Notice that although I am reading byte by byte, the stdio library is fetching the file contents 4096 bytes (one page, its default buffer size here) at a time. The size of that buffer can be changed with setvbuf(); a sketch of that follows after the trace.

And the rest is mostly cleanup, plus the deferred write of the buffered printf output:

close(3)                                = 0
munmap(0x2aaaaaaaf000, 4096)            = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 230000}, ru_stime={5, 460000}, ...}) = 0
write(1, "Time used by reading 1MiB: 20000"..., 106Time used by reading 1MiB: 200000 user, 5460000 system.
Time used by freading 1MiB: 30000 user, 0 system.
) = 106
exit_group(0)                           = ?
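As an aside, the 4096-byte chunk size above is just the stream's default buffer. If your access pattern benefits from larger chunks, setvbuf() lets you supply your own buffer. Here is a minimal sketch (the 64 KiB size is an arbitrary choice for illustration) that makes each underlying read() fetch 64 KiB:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    FILE *fp = fopen("/dev/zero", "r");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }

    /* Install a 64 KiB buffer before the first I/O on the stream, so each
       underlying read() fetches 64 KiB instead of the default 4096 bytes. */
    size_t bufsize = 64 * 1024;
    char *buf = malloc(bufsize);
    if (buf == NULL || setvbuf(fp, buf, _IOFBF, bufsize) != 0) {
        fprintf(stderr, "setvbuf failed\n");
        fclose(fp);
        free(buf);
        return 1;
    }

    char c;
    for (int i = 0; i < 0x400 * 0x400; i++)   /* still 1 MiB, byte by byte */
        fread(&c, 1, 1, fp);

    fclose(fp);   /* must come before freeing the buffer handed to stdio */
    free(buf);
    return 0;
}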

Upvotes: 4

mah

Reputation: 39807

With regard to accessing files on disk, the answer is: it depends. The higher-level functions can have buffering enabled, which can reduce the amount of I/O actually performed, i.e. the number of read()/write() calls that get made (fread() calls read() under the hood to access the disk, etc.).

So, with buffering enabled, the high-level functions have the advantage that you will generally see better performance without having to think about what you're doing. The low-level functions have the advantage that, if you know how your application accesses its data, you can improve performance by managing your own buffering directly (a rough sketch of that follows).
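For example, here is a minimal sketch of the "manage your own buffering" idea: it consumes the same 1 MiB byte by byte as in the benchmark above, but refills a 64 KiB buffer with one read() at a time (the buffer size is an arbitrary choice for illustration):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

#define BUF_SIZE (64 * 1024)   /* arbitrary; tune to your access pattern */

int main(void) {
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    char buf[BUF_SIZE];
    long remaining = 0x400 * 0x400;   /* consume 1 MiB in total */

    while (remaining > 0) {
        /* One system call refills the whole buffer... */
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;

        /* ...and the per-byte work then happens entirely in user space. */
        for (ssize_t i = 0; i < n && remaining > 0; i++, remaining--) {
            char c = buf[i];
            (void)c;   /* placeholder for whatever you do with each byte */
        }
    }

    close(fd);
    return 0;
}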

Upvotes: 2
