Reputation: 641
I read about memory mapping at the following links:
https://en.wikipedia.org/wiki/Memory-mapped_file
http://en.wikipedia.org/wiki/Memory-mapped_I/O
It is used for mapping files and devices, and the articles say that mapping a file has the advantage of being faster than reading directly from disk/flash. But why, when the same amount of time is taken to copy the data from disk/flash into virtual memory?
I am also unable to find the advantage of using memory-mapped I/O. What is the benefit compared to a direct read?
The memory-mapping region occupies a particular area of virtual memory, above the heap and below the stack. Since we can control the heap and stack sizes for a process, how can I control (i.e. increase/decrease) the memory-mapping region in virtual memory?
Upvotes: 1
Views: 1241
Reputation: 532
But why, when the same amount of time is taken to copy the data from disk/flash into virtual memory?
Traditional file I/O using the read() system call involves two copies of the data:
+ one copy from the disk into the buffer cache
+ one copy from the buffer cache into the user buffer provided by the read() call

With mmap(), the data is copied from disk only into the page cache (mmap()'s version of the buffer cache), and those pages are mapped directly into the user's address space, so the second copy disappears.
But there are pros and cons to both the mmap() and read()/write() models, and most modern OSes have unified the page cache and buffer cache into one cache. So it is not clear how much of a performance advantage remains between the two.
Also, every read() is a system call. So, in addition to the above, we have buffered reads, i.e. fread(), where libc caches some of the file's data in a stream buffer. This amortizes the syscall cost over more data, but fetching fresh data still requires a syscall. For an mmap()ed area, the cost of accessing fresh data is a page fault instead of a syscall transition. The I/O cost, of course, remains the same, as you hint.
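A minimal sketch of the two approaches in C (assuming a POSIX system; the file name "data.bin" is illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);   /* hypothetical input file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* read(): each call is a syscall; data is copied from the
           page/buffer cache into our private buffer. */
        char buf[4096];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            printf("read(): first byte = 0x%02x\n", (unsigned char)buf[0]);

        /* mmap(): the page cache pages are mapped into our address
           space; the first access to each page costs a page fault
           instead of a syscall, and no second copy is made. */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        printf("mmap(): first byte = 0x%02x\n", (unsigned char)p[0]);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }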
Upvotes: 1
Reputation: 68588
Firstly, understand that the heap and stack themselves consist of memory maps. Each userland process has a table of memory maps, and this is how it interacts with the kernel memory manager. Memory maps have different configurations and features; consult the mmap(2) man page for a list of settings.
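On Linux you can inspect that table directly; a minimal sketch (Linux-specific, since it assumes /proc is available):

    #include <stdio.h>

    /* Print this process's memory map table; the [heap] and [stack]
       entries appear alongside file-backed and anonymous mappings. */
    int main(void)
    {
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f) { perror("fopen"); return 1; }

        char line[512];
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);

        fclose(f);
        return 0;
    }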
With a file-backed mmap, the kernel manages caching of the file in units of the page size (typically 4096 bytes) in a highly optimized way.
If you are going to read a file sequentially, then a memory map doesn't have any advantage.
If you are going to read the file with random access, then a memory-mapped file is usually more performant, as the kernel applies a caching strategy for you automatically, with CPU support on most platforms.
Roughly speaking, the file is divided into page-sized blocks (4096 bytes each). When a byte is read from the mapping (i.e. through a pointer into the mapped region), the CPU calculates which page it falls into and consults its page table to see whether that page is resident in physical memory. If it is not, a page fault occurs and the whole page (all 4096 bytes) is loaded into physical memory from the file. The access then completes as an ordinary physical memory access.
By loading the file on demand in this fashion, for many patterns of access it will be much faster.
It is also a lot more convenient to access the file as a memory region (using pointer arithmetic) than through the file interface (seek and read).
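For example, reading a 4-byte value at an arbitrary offset both ways (a sketch; the file name and offset are illustrative, and the file is assumed to be at least offset + 4 bytes long):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);   /* illustrative file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        off_t offset = 12345;                  /* illustrative offset */

        /* File interface: seek to the position, then read. */
        uint32_t a;
        if (lseek(fd, offset, SEEK_SET) < 0) { perror("lseek"); return 1; }
        if (read(fd, &a, sizeof a) != (ssize_t)sizeof a) { perror("read"); return 1; }

        /* Memory map: plain pointer arithmetic into the mapping. */
        char *base = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }
        uint32_t b;
        memcpy(&b, base + offset, sizeof b);   /* memcpy avoids alignment issues */

        printf("seek/read: %u, mmap: %u\n", a, b);
        munmap(base, st.st_size);
        close(fd);
        return 0;
    }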
Upvotes: 2