Reputation: 8692
mmap can be used to share read-only memory between processes, reducing the memory footprint:
1. P1 mmaps a file, uses the mapped memory -> data gets loaded into RAM
2. P2 mmaps the same file, uses the mapped memory -> OS re-uses the same memory
But how about this:
1. P1 mmaps a file, loads it into memory, then exits.
2. P2 mmaps the same file, accesses the memory that is still hot from P1's access.
Is the data loaded again from disk? Is the OS smart enough to re-use the virtual memory even if "mmap count" dropped to zero temporarily?
Does the behaviour differ between operating systems? (I'm mostly interested in Linux/OS X)
EDIT: In case the OS is not smart enough -- would it help if there is one "background process" keeping the file mmaped, so it never leaves the address space of at least one process?
I am of course interested in performance when I mmap and munmap the same file successively and rapidly, possibly (but not necessarily) within the same process.
EDIT2: I see answers describing completely irrelevant points at great length. To reiterate the point -- can I rely on Linux/OS X to not re-load data that already resides in memory, from previous page hits within mmaped memory segments, even though the particular region is no longer mmaped by any process?
Upvotes: 3
Views: 1805
Reputation: 7348
The second process will likely find the data loaded by the first process still in the buffer cache, so in most cases the data will not be read from disk again. But since the buffer cache is a cache, there is no guarantee that the pages don't get evicted in between.
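One way to observe this rather than guess is to map the file and ask the kernel which pages are already resident. The following is a minimal sketch using mincore(2); the helper name resident_pages is made up for illustration, and the low-bit test assumes the Linux/OS X convention that bit 0 of each vector entry means "in core". Newer Linux kernels restrict what mincore reveals for files the caller cannot write, so treat the number as an estimate.

```c
#define _DEFAULT_SOURCE   /* exposes mincore() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: counts how many pages of `path` are resident in
 * memory right now. Returns -1 on error; *total receives the page count. */
long resident_pages(const char *path, size_t *total)
{
    assert(path != NULL && total != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {  /* empty files cannot be mapped */
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;

    /* Mapping alone does not touch any page of the file. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    if (map == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    unsigned char *vec = malloc(npages);
    long resident = -1;
    /* The cast papers over the differing vec types on Linux and OS X. */
    if (vec != NULL && mincore(map, len, (void *)vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;  /* low bit set: page is in memory */
    }

    free(vec);
    munmap(map, len);
    *total = npages;
    return resident;
}
```

Calling resident_pages on the file after P1 has exited, and again after some unrelated memory-hungry work, gives a rough picture of how long the pages survive on a given machine.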
You could start a third process and use mmap(2) and mlock(2) to pin the pages in RAM. But this will probably cause more trouble than it is worth.
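For reference, the mmap(2) plus mlock(2) combination could look roughly like the sketch below. The function name pin_file and its out-parameters are invented for this example. Note that mlock can fail when the locked-memory limit (RLIMIT_MEMLOCK) is too low, so the sketch reports that instead of treating it as fatal.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps `path` read-only and tries to lock it in RAM.
 * Returns the mapping (NULL on error). *locked is 1 if mlock succeeded,
 * 0 if the file is only mapped. The caller releases it with munmap(). */
void *pin_file(const char *path, size_t *len_out, int *locked)
{
    assert(path != NULL && len_out != NULL && locked != NULL);
    *locked = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return NULL;
    }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {  /* empty files cannot be mapped */
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    if (map == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    /* mlock faults the pages in and keeps them resident until they are
     * unlocked, unmapped, or the process exits. */
    if (mlock(map, len) == 0)
        *locked = 1;
    else
        perror("mlock");  /* often a too-small RLIMIT_MEMLOCK */

    *len_out = len;
    return map;
}
```

The pin only lasts as long as the process holding the mapping stays alive, which is part of why this approach tends to be more trouble than it is worth.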
Linux replaced the traditional UNIX buffer cache with a page cache, but the principle is still the same. The Mac OS X equivalent is called the Unified Buffer Cache (UBC).
Upvotes: 3
Reputation: 22261
The presence or absence of the contents of a file in memory is much less coupled to mmap system calls than you think. When you mmap a file, it doesn't necessarily load it into memory. When you munmap it (or if the process exits), it doesn't necessarily discard the pages.
There are many different things that could trigger the contents of a file to be loaded into memory: mapping it, reading it normally, executing it, attempting to access memory that is mapped to the file. Similarly, there are different things that could cause the file's contents to be removed from memory, mostly related to the OS deciding it wants the memory for something more important.
In the two scenarios from your question, consider inserting a step between steps 1 and 2:
Step 1.5: the mmaped file is evicted from memory to make room.
In this case the file's contents will probably have to be reloaded into memory if they are mapped and used again in step 2.
versus:
Step 1.5: the pages of the mmaped file hang around in memory.
In this case the file's contents don't need to be reloaded in step 2.
In terms of what happens to the contents of your file, your two scenarios aren't much different. It's something like this step 1.5 that would make a much more important difference.
As for a background process that constantly accesses the file in order to keep it in memory (for example, by scanning the file and then sleeping for a short amount of time in a loop), this would of course make the file very likely to remain in memory. But you're probably better off just letting the OS make its own decision about when to evict the file and when not to evict it.
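To make the "scan, then sleep" idea concrete, here is a rough sketch of such a loop. The name keep_warm and its parameters are hypothetical, and a real background process would loop indefinitely instead of for a fixed number of rounds. It reads one byte per page, which is enough to fault a page in or mark it recently used.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps `path` and reads one byte from every page,
 * `rounds` times, sleeping `pause_s` seconds between passes.
 * Returns the number of pages touched per pass, or -1 on error. */
long keep_warm(const char *path, int rounds, unsigned pause_s)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {  /* empty files cannot be mapped */
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    if (map == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    /* volatile keeps the compiler from dropping the otherwise unused reads */
    const volatile unsigned char *p = map;
    long touched = 0;
    for (int r = 0; r < rounds; r++) {
        touched = 0;
        for (size_t off = 0; off < len; off += page) {
            (void)p[off];  /* faults the page in if it was evicted */
            touched++;
        }
        if (r + 1 < rounds)
            sleep(pause_s);
    }

    munmap(map, len);
    return touched;
}
```

Even with such a loop running, pages can still be evicted between passes under memory pressure; the loop only makes that less likely and reloads them on the next pass.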
Upvotes: 8