Reputation: 14563
I'm contemplating a system that would let me memory map files and transparently do type conversion on the data they contain. It seems possible to catch memory accesses by mmapping a second memory region, making it protected, and then catching the segfault when a new page is accessed. This would let me handle the on-read type conversion I need.
However, to be read/write compatible, I'd need some way to catch when the OS is paging part of the memory back to disk so I could do the type conversion the other way before it's written.
Is there any capability for hooking the paging system in this way?
Upvotes: 2
Views: 1552
Reputation: 39298
Using a memory map and a SIGSEGV handler is a bit problematic. First, mprotect() is not async-signal-safe, meaning that calling mprotect() in a signal handler is not guaranteed to work. Second, synchronizing the necessary structures between the signal handler and more than one thread is quite complex (although possible using the GCC __sync and/or __atomic built-ins), because you cannot use the standard locking primitives in signal handlers. Fortunately, you can simply return from the signal handler: the kernel does not skip the offending instruction, so the same signal gets raised again immediately afterwards.
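To make the mechanism concrete, here is a minimal single-threaded sketch, assuming Linux, of a SIGSEGV handler that opens up the faulting page and returns so the instruction is retried. The names region, on_segv and count_page_faults are hypothetical, and the handler calls mprotect() despite it not being on the POSIX async-signal-safe list, which is exactly the caveat described above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *region;                        /* base of the protected map */
static size_t region_len;                   /* its length in bytes */
static size_t page_size;
static volatile sig_atomic_t fault_count;   /* faults handled so far */

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region;
    if (addr < base || addr >= base + region_len)
        _exit(1);                           /* a genuine crash outside our map */
    void *page = (void *)(addr & ~(uintptr_t)(page_size - 1));
    /* Assumption: mprotect() behaves here even though POSIX does not
       list it as async-signal-safe. */
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        _exit(2);
    fault_count++;
    /* Returning makes the kernel retry the faulting instruction. */
}

/* Touch `pages` inaccessible pages; return how many faults were handled. */
int count_page_faults(size_t pages)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    page_size = (size_t)ps;
    region_len = pages * page_size;
    region = mmap(NULL, region_len, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    fault_count = 0;
    for (size_t i = 0; i < pages; i++) {
        volatile char *p = region + i * page_size;
        p[0] = 'x';                         /* first touch of the page faults */
        p[1] = 'y';                         /* page is now open, no fault */
    }

    sigaction(SIGSEGV, &old, NULL);         /* restore the previous handler */
    munmap(region, region_len);
    return (int)fault_count;
}
```

Calling count_page_faults(3) should take one fault per page. A real implementation would fill the page with converted data before returning, which is where the thread-safety problems start.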
I did write a small program to test an anonymous private unreserved memory map, using read() and write() to update the map. The problem is that other threads may access the map while the signal handler is updating it.
I think it might work if you use a temporary file for the currently active region, with an extra page before and after to hold partial records when the records cross page boundaries.
The actual data file would be represented by a private anonymous unreserved inaccessible map (PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE | MAP_NORESERVE). A SIGSEGV signal handler catches accesses to that map. A page-aligned region of that map is unmapped and mapped from the temporary file (MAP_SHARED | MAP_FIXED | MAP_NORESERVE). The trick is that the temporary file can additionally be mapped (MAP_SHARED | MAP_NORESERVE) to another memory region, and the signal handler can simply unmap the temporary file within the map to stop other threads from accessing the region during conversion; the data is still available to your library functions in the other memory region (to be read from and written to the actual data file using read() and write()). MAP_SHARED means the exact same pages (from the page cache) are used, and MAP_NORESERVE means the kernel does not reserve swap or RAM for them.
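The double mapping can be sketched as follows, assuming Linux and a writable /tmp. The function name window_aliases_side_view and the temporary file name are hypothetical. One deliberate variation: instead of a plain munmap(), the sketch closes the window by mapping an inaccessible anonymous page over it with MAP_FIXED, so no hole is left in the reserved range for an unrelated mmap() to land in.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if data written through the side view shows up in the window
   and survives closing the window, 0 if not, -1 on setup failure. */
int window_aliases_side_view(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;

    char path[] = "/tmp/window-XXXXXX";     /* hypothetical temporary file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                           /* file lives on until fd closes */
    if (ftruncate(fd, (off_t)page) != 0) {
        close(fd);
        return -1;
    }

    /* Inaccessible reservation standing in for the actual data file. */
    char *reserved = mmap(NULL, 4 * page, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    /* Side view of the temporary file, used only by library code. */
    char *side = mmap(NULL, page, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_NORESERVE, fd, 0);
    if (reserved == MAP_FAILED || side == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Open a window: page 1 of the reservation now shows the temp file. */
    char *window = mmap(reserved + page, page, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_FIXED | MAP_NORESERVE, fd, 0);
    if (window == MAP_FAILED) {
        close(fd);
        return -1;
    }

    strcpy(side, "converted");              /* write through the side view */
    int visible = strcmp(window, "converted") == 0;

    /* Close the window again; other threads would fault here from now on. */
    void *closed = mmap(window, page, PROT_NONE,
                        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_NORESERVE,
                        -1, 0);
    int kept = closed != MAP_FAILED && strcmp(side, "converted") == 0;

    munmap(side, page);
    munmap(reserved, 4 * page);
    close(fd);
    return visible && kept;
}
```

The sketch only demonstrates the aliasing between the two views; it leaves out the signal handler, the extra pages for records that straddle page boundaries, and the read()/write() traffic to the actual data file.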
This approach should work well with respect to threads and locking, but it still suffers from mmap(), munmap(), and mremap() not being async-signal-safe. However, it should be reliable if you keep a global variable that is only ever accessed atomically, and have the signal handler return immediately whenever that variable shows that application or library code is modifying the structures and/or maps.
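A minimal sketch of such a guard, using C11 <stdatomic.h> in place of the GCC built-ins mentioned earlier. All names here (maps_busy, maps_lock, maps_unlock, handler_try_enter) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Lock-free atomics are the only kind that are safe in a signal handler. */
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "atomic_int must always be lock-free for use in a handler");

static atomic_int maps_busy;    /* 0 = free, 1 = maps are being modified */

/* Application/library code: spin until the guard is owned. */
void maps_lock(void)
{
    int expected = 0;
    while (!atomic_compare_exchange_weak(&maps_busy, &expected, 1))
        expected = 0;           /* a failed exchange overwrites expected */
}

void maps_unlock(void)
{
    atomic_store(&maps_busy, 0);
}

/* First call in the SIGSEGV handler. If it returns false, the handler
   returns at once; the faulting instruction is retried and the signal is
   raised again, which acts as a crude spin-wait. */
bool handler_try_enter(void)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&maps_busy, &expected, 1);
}
```

A handler that gets true from handler_try_enter() calls maps_unlock() once it has finished remapping. Note one consequence of this design: code that holds the guard must never touch the protected map itself, since its own fault would be retried forever.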
Upvotes: 2
Reputation: 215183
What you want is not possible, and reflects a fundamental misunderstanding of mmap. The event of file-backed maps being written back to disk is not relevant, because until this happens, any attempt to read the file will (and must, to conform to POSIX) be served from the modified in-memory copy of the page, not the outdated contents on disk. In other words, the writing back of modified pages to disk is completely transparent to applications, and assuming you never lose power or reboot, it is entirely possible that the modified page is never written back to disk.
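The coherence described here can be observed with a short experiment, sketched below on the assumption of a system with a unified page cache (such as Linux) and a writable /tmp. The function name read_sees_mapped_write and the temporary file name are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if pread() observes a store made through a MAP_SHARED mapping
   without any msync(), 0 if it sees stale data, -1 on setup failure. */
int read_sees_mapped_write(void)
{
    char path[] = "/tmp/coherent-XXXXXX";   /* hypothetical temporary file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "old!", 4) != 4) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, "new!", 4);                 /* modify in memory, no msync() */

    char buf[4];
    ssize_t n = pread(fd, buf, sizeof buf, 0);

    munmap(map, 4);
    close(fd);
    return n == 4 && memcmp(buf, "new!", 4) == 0;
}
```

The read goes through the same cached page the mapping modified, so there is no moment at which an application could observe a write-back and convert the data on its way to disk.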
Your design just doesn't work. You'll have to do something different if you want this kind of behavior.
Upvotes: 3