Pavan Manjunath
Pavan Manjunath

Reputation: 28605

How to find if the underlying Linux Kernel supports Copy on Write?

For example, I am working on an ancient kernel and want to know whether it really implements Copy on Write. Is there a way ( preferably programattically in C ) to find out?

Upvotes: 0

Views: 564

Answers (2)

Marco Bonelli
Marco Bonelli

Reputation: 69512

I casually stumbled upon this rather old question, and I see that other people already pointed out that it does not make much sense to "detect CoW" since Linux already implies CoW.

However I find this question pretty interesting, and while technically one should not be able to detect this kind of kernel mechanism which should be completely transparent to userspace processes, there actually are architecture specific ways (i.e. side-channels) that can be exploited to determine whether Copy on Write happens or not.

On x86 processors that support Restricted Transactional Memory, you can leverage the fact that memory transactions are aborted when an exception such as a page fault occurs. Given a valid address, this information can be used to detect if a page is resident in memory or not (similarly to the use of minicore(2)), or even to detect Copy on Write.

Here's a working example. Note: check that your processor supports RTM by looking at /proc/cpuinfo for the rtm flag, and compile using GCC without optimizations and with the -mrtm flag.

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <immintrin.h>

/* Use x86 transactional memory to detect a page fault when trying to write
 * at the specified address, assuming it's a valid address.
 */
static int page_dirty(void *page) {
    unsigned char *p = page;

    if (_xbegin() == _XBEGIN_STARTED) {
        *p = 0;
        _xend();

        /* Transaction successfully ended => no context switch happened to
         * copy page into virtual memory of the process => page was dirty.
         */
        return 1;
    } else {
        /* Transaction aborted => page fault happened and context was switched
         * to copy page into virtual memory of the process => page wasn't dirty.
         */
        return 0;
    }

    /* Should not happen! */
    return -1;
}

int main(void) {
    unsigned char *addr;

    addr = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap failed");
        return 1;
    }

    // Write to trigger initial page fault and actually reserve memory
    *addr = 123;

    fprintf(stderr, "Initial state : %d\n", page_dirty(addr));
    fputs("----- fork -----\n", stderr);

    if (fork()) {
        fprintf(stderr, "Parent before : %d\n", page_dirty(addr));

        // Read (should NOT trigger Copy on Write)
        *addr;

        fprintf(stderr, "Parent after R: %d\n", page_dirty(addr));

        // Write (should trigger Copy on Write)
        *addr = 123;

        fprintf(stderr, "Parent after W: %d\n", page_dirty(addr));
    } else {
        fprintf(stderr, "Child before  : %d\n", page_dirty(addr));

        // Read (should NOT trigger Copy on Write)
        *addr;

        fprintf(stderr, "Child after R : %d\n", page_dirty(addr));

        // Write (should trigger Copy on Write)
        *addr = 123;

        fprintf(stderr, "Child after W : %d\n", page_dirty(addr));
    }

    return 0;
}

Output on my machine:

Initial state : 1
----- fork -----
Parent before : 0
Parent after R: 0
Parent after W: 1
Child before  : 0
Child after R : 0
Child after W : 1

As you can see, writing to pages marked as CoW (in this case after fork), causes the transaction to fail because a page fault exception is triggered and causes a transaction abort. The changes are reverted by hardware before the transaction is aborted. After writing to the page, trying to do the same thing again results in the transaction correctly terminating and the function returning 1.

Of course, this should not really be used seriously, but merely be taken as a fun and interesting exercise. Since RTM transactions are aborted for any kind of exception and also for context switch, false negatives are possible (for example if the process is preempted by the kernel right in the middle of the transaction). Keeping the transaction code really short (in the above case just a branch and an assignment *p = 0) is essential. Multiple tests could also be made to avoid false negatives.

Upvotes: 0

Blagovest Buyukliev
Blagovest Buyukliev

Reputation: 43558

No, there isn't a reliable programmatic way to find that out from within a userland process.

The idea behind COW is that it should be fully transparent to the user code. Your code touches the individual pages, a page fault is invoked, the kernel copies the corresponding page and your process is resumed as if nothing had happened.

Upvotes: 1

Related Questions