frp

Reputation: 1139

Why does this code cause page faults?

I allocate 512 MB of memory and then modify every 4096th byte. This should cause one minor page fault per modification, and that is what I actually get. But when I repeat the same loop, it again causes a minor page fault for each write. My question is, why?

Output looks like this:

[image: screenshot of the program's output]

However, if I remove the calls to ps from my program, the second loop takes much less time, about 0.04 s. Why?

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <sstream>
#include <unistd.h>
using namespace std;

const int sz = 512 * 1024 * 1024;

int main()
{
    char * data = (char *)malloc(sz);
    if (data == 0) {
        // Bail out instead of writing through a null pointer below.
        cout << "alloc fail\n";
        return 1;
    }
    stringstream cmd;
    cmd << "ps -o min_flt,maj_flt " << getpid();
    system(cmd.str().c_str());
    cout << "start\n";
    clock_t start = clock();
    for (int i=0; i<sz; i += 4096)
        data[i] = 1;
    clock_t end = clock();
    double time1 = double(end-start) / CLOCKS_PER_SEC;

    system(cmd.str().c_str());

    start = clock();
    for (int i = 0; i < sz; i += 4096)
        data[i] = 1;
    end = clock();
    double time2 = double(end-start) / CLOCKS_PER_SEC;

    system(cmd.str().c_str());

    cout << time1 << " " << time2 << endl;
}

Upvotes: 0

Views: 1334

Answers (1)

rici

Reputation: 241901

The library function system does roughly the following:

  1. Fork, creating a new process with a duplicate of the memory image.
  2. In the child, exec /bin/sh passing it the command-line option -c and the indicated command, replacing the memory image of the new process with a newly-loaded /bin/sh.
  3. In the parent (the original process), wait for the child to finish, and return its status code.
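The three steps above can be sketched as follows. This is only an illustration, not the C library's actual implementation: `run_shell_command` is a hypothetical name, and the real `system` additionally adjusts signal handling (SIGINT, SIGQUIT, SIGCHLD), which is left out here.

```cpp
#include <cerrno>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical, simplified stand-in for system(); signal handling is omitted.
// Returns the child's wait status, or -1 if fork or waitpid failed.
int run_shell_command(const char *command)
{
    pid_t pid = fork();                       // step 1: duplicate the process
    if (pid < 0)
        return -1;

    if (pid == 0) {
        // step 2: in the child, replace the memory image with /bin/sh -c command
        execl("/bin/sh", "sh", "-c", command, static_cast<char *>(nullptr));
        _exit(127);                           // reached only if exec failed
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {    // step 3: parent waits for the child
        if (errno != EINTR)
            return -1;
    }
    return status;
}
```

The returned value is a wait status, so the exit code is extracted with WIFEXITED and WEXITSTATUS, just as with system.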

When a process calls fork, the kernel avoids copying the entire memory image. Instead it duplicates only the page tables and marks all writable pages as "copy-on-write", which requires making them read-only so that writes can be detected.
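A small sketch of the visible effect of copy-on-write: the child writes to a variable it inherited from the parent, the write faults, the child gets its own copy of the page, and the parent's value is untouched. The function name `parent_value_after_child_write` and the variable are made up for this illustration.

```cpp
#include <sys/wait.h>
#include <unistd.h>

// Returns the parent's copy of the value after a forked child has overwritten
// its own copy, or -1 on error.
int parent_value_after_child_write()
{
    static volatile char value = 'P';   // lives in a private, writable page

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 'C';                    // write to a copy-on-write page in the child
        _exit(value == 'C' ? 0 : 1);    // the child sees its own modification
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;                       // the parent's page was never modified
}
```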

The subsequent exec unshares the pages, but it probably does not undo the read-only setting in the parent's page tables. Subsequent writes to those pages will therefore still trigger a minor page fault, although the handler has nothing to copy since the page is no longer shared.
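One way to check this without running ps (and therefore without an extra fork) is to read the process's own fault counter with getrusage. The sketch below touches a buffer once, optionally forks a child that exits immediately, and reports how many minor faults a second pass takes. The helper names are hypothetical, and the counts you see will depend on the kernel, the allocator, and whether transparent huge pages are in use.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

// Minor page faults taken so far by this process, or -1 on error.
long minor_faults()
{
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_minflt;
}

// Write one byte per page; returns the number of pages written.
std::size_t touch_pages(volatile char *data, std::size_t size, std::size_t page)
{
    std::size_t touched = 0;
    for (std::size_t i = 0; i < size; i += page) {
        data[i] = 1;
        ++touched;
    }
    return touched;
}

// Fork a child that exits at once and wait for it; true on success.
bool fork_and_reap()
{
    pid_t pid = fork();
    if (pid < 0)
        return false;
    if (pid == 0)
        _exit(0);
    int status = 0;
    return waitpid(pid, &status, 0) == pid;
}

// Minor faults taken by a second pass over an already-touched buffer,
// with an optional fork between the two passes. Returns -1 on error.
long second_pass_faults(std::size_t size, bool fork_between)
{
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    char *data = static_cast<char *>(std::malloc(size));
    if (data == nullptr)
        return -1;

    touch_pages(data, size, page);            // first pass maps the pages
    if (fork_between && !fork_and_reap()) {
        std::free(data);
        return -1;
    }

    long before = minor_faults();
    touch_pages(data, size, page);            // second pass: the one being measured
    long after = minor_faults();

    std::free(data);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Comparing second_pass_faults(size, false) with second_pass_faults(size, true) for a buffer of a few tens of megabytes should show whether the fork alone is enough to make the second loop fault again on a given system.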

It is not actually guaranteed that the exec will happen before the second write loop starts. Your machine most likely has more than one core, so it is quite possible for both processes to be simultaneously active. Since the exec might take a while to set up, it is quite possible that some of the write loop will happen even before the exec, which is to say before there was any clue that the copy-on-write was unnecessary.

Upvotes: 3
