Reputation: 193
I am using rg + perf to compare mmap performance against pread, using the minor page fault
count as a performance indicator. Here are the results:
perf stat -e major-faults,minor-faults rg -j1 -F 123 a-big-file --mmap
0 major-faults
509 minor-faults
0.002241400 seconds time elapsed
0.000000000 seconds user
0.002221000 seconds sys
perf stat -e major-faults,minor-faults rg -j1 -F 123 a-big-file --no-mmap
0 major-faults
396 minor-faults
0.002911774 seconds time elapsed
0.002890000 seconds user
0.000000000 seconds sys
Performance counter stats for rg -j1 -F 123 empty_file --mmap:
0 major-faults
393 minor-faults
0.001652534 seconds time elapsed
0.000000000 seconds user
0.001648000 seconds sys
It seems that using mmap causes more page faults. Does anyone know how to do deep tracing on Linux, so that the code incurring the minor page faults can be shown? Currently my suspicion is munmap.
Upvotes: 0
Views: 115
Reputation: 365517
Using mmap to read a big file normally involves soft (minor) page faults when you first touch the mmaped region, unless you use MAP_POPULATE (which also waits for I/O if it wasn't already hot in pagecache, so most programs don't want that.)
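To see the first-touch faults directly, here is a minimal sketch (not what rg does internally) that maps a file, reads one byte per page, and reports how many minor faults the process took while touching, as seen by getrusage. The helper names scan_mapped and write_test_file are made up for illustration, and it assumes Linux and a Python whose mmap module exposes MAP_POPULATE (otherwise the flag falls back to 0 and both runs are lazy). The counter only starts after mmap returns, so any work done inside MAP_POPULATE itself is deliberately not included, and the interpreter's own allocations add a little noise.

```python
import mmap
import os
import resource
import tempfile


def minor_faults():
    """Minor (soft) page faults taken so far by this process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def scan_mapped(path, extra_flags=0):
    """Map path read-only, read one byte per page, return (faults, checksum)."""
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        mm = mmap.mmap(f.fileno(), size,
                       flags=mmap.MAP_PRIVATE | extra_flags,
                       prot=mmap.PROT_READ)
    try:
        before = minor_faults()      # start counting after mmap() has returned
        checksum = 0
        for off in range(0, size, mmap.PAGESIZE):
            checksum += mm[off]      # first touch of each page
        return minor_faults() - before, checksum
    finally:
        mm.close()


def write_test_file(n_pages):
    """Hypothetical helper: create a temp file of n_pages pages filled with 0x01."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x01" * (n_pages * mmap.PAGESIZE))
    return path


if __name__ == "__main__":
    path = write_test_file(4096)
    try:
        populate = getattr(mmap, "MAP_POPULATE", 0)  # 0 if not exposed
        print("lazy mapping:", scan_mapped(path)[0], "minor faults while touching")
        print("MAP_POPULATE:", scan_mapped(path, populate)[0], "minor faults while touching")
    finally:
        os.unlink(path)
```

The exact numbers depend on the kernel, the filesystem and fault-around settings, so treat the output as a rough indication rather than a benchmark.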
Fault-around (wiring neighbouring pages into the page tables when one faults) makes the fault cost usually not too bad. The kernel should notice a sequential read pattern in the page faults and wire up multiple pages instead of just the one that faulted.
madvise(MADV_SEQUENTIAL) might help with that, or maybe only with I/O from disk. Doing MADV_POPULATE_READ from another thread might be a good idea; IDK, I haven't tested. MAP_POPULATE on the initial mmap has the downside that mmap doesn't return until it's done, so you can't even get started reading the file and can't overlap computation with I/O. But getting the kernel working on checking the pages and wiring them into your page table in parallel with whatever you're doing could be helpful.
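As a rough illustration of the MADV_SEQUENTIAL hint (untested for whether it actually reduces faults), the sketch below maps a file, advises the kernel that it will be read front to back, and then counts non-overlapping occurrences of a literal, loosely like rg -F. The function name count_literal is hypothetical, and it assumes Python 3.8+ on Linux, where mmap objects have a madvise method. It does not cover the MADV_POPULATE_READ-from-another-thread idea.

```python
import mmap
import os


def count_literal(path, needle):
    """Count non-overlapping occurrences of needle (bytes) in a mapped file."""
    if not needle:
        raise ValueError("needle must be non-empty")
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # a zero-length file cannot be mapped
        with mmap.mmap(f.fileno(), size,
                       flags=mmap.MAP_PRIVATE,
                       prot=mmap.PROT_READ) as mm:
            # Hint that access will be sequential; skipped if not exposed.
            advice = getattr(mmap, "MADV_SEQUENTIAL", None)
            if advice is not None and hasattr(mm, "madvise"):
                mm.madvise(advice)
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + len(needle))
            return count
```

Running it under perf stat -e minor-faults with and without the madvise call would be one way to check whether the hint changes anything for a file that is already in pagecache.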
perf record -e page-faults should be able to record the faulting instructions. Your program doesn't trigger any major page faults (the kind that have to sleep for I/O), since your big file isn't so big that the kernel can't keep it hot in the pagecache, so the only page-faults will be the minor ones.
(I didn't test this, and IDK what the default sample granularity is for the page-faults event; IDK if it would record a sample for every page fault by default.)
Upvotes: 0