brokenfoot

Reputation: 11629

How to compare mmap and read performance

I am trying to compare the performance of mmap() and read() for file sizes ranging from 1 KB to 1 GB (increasing by factors of 10).

The way I do it: in both cases I read the entire file sequentially and write the output to another file, measuring the elapsed time.
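For context, here is a minimal sketch of how such a measurement might be taken with clock_gettime(2); the function name and the 64 KB buffer size are mine, not from the original code:

#include <time.h>
#include <unistd.h>

/* Time a read()/write() copy between two descriptors and return
 * the elapsed wall-clock seconds. */
static double time_read_copy(int in_fd, int out_fd)
{
    char buf[65536];                       /* illustrative buffer size */
    struct timespec t0, t1;
    ssize_t n;

    clock_gettime(CLOCK_MONOTONIC, &t0);   /* monotonic: unaffected by clock steps */
    while ((n = read(in_fd, buf, sizeof(buf))) > 0)
        write(out_fd, buf, (size_t)n);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}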

Code:

For the read() code, I have:

char text[1000];
...
while ((bytes_read = read(d_ifp, text, 1000)) > 0)
{
        write(d_ofp, text, bytes_read);
}

And for the mmap() code, I have:

//char *data;
uintmax_t *data;
//int *data;
...
/* Map the whole file read-only; MAP_FAILED signals an error. */
if ((data = mmap(NULL, sbuf.st_size, PROT_READ, MAP_SHARED, fd, 0)) == MAP_FAILED)
{
        perror("mmap");
        exit(1);
}

int j = 0;
off_t i = 0;
while (i < sbuf.st_size)        /* '<' rather than '<=', so we stay inside the mapping */
{
        fprintf(ofp, "data[%d]=%ju\n", j, data[j]);
        i += sizeof(*data);
        j++;
}

The measured time for mmap() varies depending on how I declare the data pointer (char, int, uintmax_t), whereas for read() it varies with the size of the buffer text.

Output: Right now mmap is proving to be really slow, which is surprising:

[read]:   f_size:    1K B, Time:  8e-06 seconds
[read]:   f_size:   10K B, Time:  1.4e-05 seconds
[read]:   f_size:  100K B, Time:  8.3e-05 seconds
[read]:   f_size:    1M B, Time:  0.000612 seconds
[read]:   f_size:   10M B, Time:  0.009652 seconds
[read]:   f_size:  100M B, Time:  0.12094 seconds
[read]:   f_size:    1G B, Time:  6.5787 seconds

[mmap]:    f_size:    1K B, Time:  0.002922 seconds
[mmap]:    f_size:   10K B, Time:  0.004116 seconds
[mmap]:    f_size:  100K B, Time:  0.020122 seconds
[mmap]:    f_size:    1M B, Time:  0.22538 seconds
[mmap]:    f_size:   10M B, Time:  2.2079 seconds
[mmap]:    f_size:  100M B, Time:  22.691 seconds
[mmap]:    f_size:    1G B, Time:  276.36 seconds

Questions:
1. If I make the buffer size in the read() code equal to the type size in the mmap() code, will the comparison be fair?
2. What is the right way to compare the two?

Edit:

I changed the fprintf in the mmap code to write (a sketch of the changed loop follows the output below). The performance is now way better, but weirdly it decreases for larger file sizes. Is that expected?
(I am writing my data to /dev/null in both cases):

[mmap]:    f_size:    1K B, Time:  3.3e-05 seconds
[mmap]:    f_size:   10K B, Time:  2e-06 seconds
[mmap]:    f_size:  100K B, Time:  2e-06 seconds
[mmap]:    f_size:    1M B, Time:  4e-06 seconds
[mmap]:    f_size:   10M B, Time:  3e-06 seconds
[mmap]:    f_size:  100M B, Time:  2e-06 seconds
[mmap]:    f_size:    1G B, Time:  2e-06 seconds
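For reference, the changed loop presumably looks something like this (a sketch; using d_ofp as the output descriptor is my assumption):

/* One write(2) of the whole mapped region replaces the
 * per-element fprintf() loop. */
if (write(d_ofp, data, sbuf.st_size) != sbuf.st_size)
    perror("write");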

Upvotes: 0

Views: 373

Answers (1)

cnicutar

Reputation: 182639

This is somewhat speculative because I probably haven't thought of all the implications:


In the first case the majority of the time is taken by:

  • Overhead of performing read(2) system calls, and many of them: with the 1000-byte buffer above, the 1 GB file needs over a million read() calls
    • Copying the data from the file into memory accessible by the process. The time is dominated by actually reading from the device (spinning the HDD or whatever)
  • Performing write(2) system calls that cost nothing beyond syscall overhead (see below)

In the second case the majority of the time is taken by:

  • Overhead of performing a single system call (one mmap). This does not actually read anything; the kernel just checks that you have permission and pretends to "map" the data.
    • Once you start doing something with that memory, like reading or writing it, the kernel actually maps it in, which takes time (page faults); a sketch of forcing this follows the list
  • Overhead of performing write(2) system calls. I assume you perform fewer write calls for a larger file
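To make the page-fault cost visible even when the destination is /dev/null, you could touch every page of the mapping yourself before writing it out; a sketch, reusing data and sbuf from the question:

#include <unistd.h>

/* Touch one byte per page so the kernel must fault every page in,
 * i.e. actually read the file. The volatile sink keeps the compiler
 * from optimizing the loop away. */
volatile unsigned char sink = 0;
const unsigned char *p = (const unsigned char *)data;
long page_size = sysconf(_SC_PAGESIZE);
off_t off;

for (off = 0; off < sbuf.st_size; off += page_size)
    sink ^= p[off];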

In Linux writing to /dev/null is implemented like this:

static ssize_t write_null(struct file *file, const char __user *buf,
    size_t count, loff_t *ppos)
{
    return count;
}

Which roughly means: "just tell the process we did it". So the mmap'd memory is never touched and the file is never read; each write only incurs the cost of a system call. The fewer writes you do, the less time you waste in system calls that don't do anything anyway.

In conclusion, in both cases the writes are cheap, no-op calls. But in the first case the reads actually cost something, because the data must really be pulled from the file.


What about the printf case?

In that case you were actively touching the mmap'd memory, forcing the kernel to stop pretending and actually read the data from the file. In addition, you also printed it, which, depending on the buffering stdio was using, triggered system calls from time to time. If you were writing to the screen, this was especially costly, since stdout is line-buffered by default.
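As an aside, that buffering is controllable; a sketch, calling setvbuf(3) before the first I/O on ofp:

#include <stdio.h>

/* A 1 MiB fully buffered stream: fprintf() then issues far fewer
 * underlying write() system calls than with line buffering. */
if (setvbuf(ofp, NULL, _IOFBF, 1 << 20) != 0)
    fprintf(stderr, "setvbuf failed\n");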

Upvotes: 3
