There is an IO example from Advanced Programming in Unix Environment:
#include "apue.h"

#define BUFFSIZE 4096

int main(void)
{
    int n;
    char buf[BUFFSIZE];

    while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
        if (write(STDOUT_FILENO, buf, n) != n)
            err_sys("write error");

    if (n < 0)
        err_sys("read error");
    exit(0);
}
All normal UNIX system shells provide a way to open a file for reading on standard input and to create (or rewrite) a file on standard output, which allows this program to take advantage of the shell's I/O redirection facilities.
Figure 3.6 shows the results for reading a 516,581,760-byte file, using 20 different buffer sizes, with standard output redirected to /dev/null. The file system used for this test was the Linux ext4 file system with 4,096-byte blocks. (The st_blksize value is 4,096.) This accounts for the minimum in the system time occurring at the few timing measurements starting around a BUFFSIZE of 4,096. Increasing the buffer size beyond this limit has little positive effect.
How does BUFFSIZE affect the performance of reading a file? As BUFFSIZE increases up to 4096, why does the performance improve? As BUFFSIZE increases above 4096, why is there no significant further improvement?

Does the kernel buffer (not the buf of size BUFFSIZE in the program) help the performance, in relation to BUFFSIZE? When BUFFSIZE is small, does the kernel buffer help accumulate the small writes, so as to improve performance?
Upvotes: 3
Each call to read() and write() requires a system call (to communicate with the kernel), plus the time to do the actual copying of data into (or from) the kernel's memory space.
The system-call itself imposes a fixed (per-call) overhead/cost, while the cost of copying the data is of course proportional to the amount of data there is to copy.
Therefore, if you read()/write() very small buffers, the overhead of making the system call will be relatively high compared to the number of bytes of data copied; and since you'll have to make a large number of calls, the overall runtime will be longer than if you had done larger transfers.
Calling read()/write() a smaller number of times with larger buffers allows the system to amortize the overhead of the system call over a larger number of bytes per call, avoiding that inefficiency. However, at some point as sizes get larger, the system-call overhead becomes completely negligible, and at that point the program's efficiency is governed entirely by the cost of transferring the data, which is determined by the speed of the hardware. That's why you see performance leveling out as sizes get larger.
read() and write() do not accumulate small writes together, since they represent direct system calls. If you want small reads/writes to be buffered that way, the C runtime provides fread() and fwrite() wrappers that will do that for you inside your process space.
Upvotes: 4