Schrodinger ZHU

Is there a correct way to do PID/TID caching?

Consider the following code:

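#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <stdio.h>

int main(void) {
    pid_t tid = gettid();
    printf("tid: %d\n", tid);
    fflush(stdout); /* keep buffered output from being duplicated in the child */
    /* Raw fork that bypasses the libc wrapper; SYS_fork takes no arguments. */
    long ret = syscall(SYS_fork);
    printf("tid: %d (fork returned %ld)\n", gettid(), ret);
    return 0;
}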

With a libc that caches the TID, the child created by the raw syscall can print the parent's TID instead of its own. The difference is due to caching. A libc implementation may cache such values to avoid the overhead of a syscall. The fork/clone wrappers are responsible for updating the cached values, but a user may bypass those wrappers, leading to unexpected results.
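
To make the failure mode concrete, here is a minimal sketch of the kind of cache I mean. It is not taken from any real libc: naive_getpid, wrapped_fork and raw_fork_child_sees_stale_pid are hypothetical names, and the raw SYS_clone call assumes an architecture such as x86-64 or aarch64 where the flags come first.

```cpp
#include <cassert>
#include <csignal>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical libc-internal cache; 0 means "not cached yet".
static pid_t cached_pid = 0;

pid_t naive_getpid() {
  if (cached_pid == 0)
    cached_pid = static_cast<pid_t>(syscall(SYS_getpid));
  return cached_pid;
}

// Hypothetical wrapper: the only code path that knows to reset the cache.
pid_t wrapped_fork() {
  long ret = syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0);
  if (ret == 0)
    cached_pid = 0; // child: drop the value inherited from the parent
  return static_cast<pid_t>(ret);
}

// Returns true if a child created by a raw syscall reads the parent's PID
// from the cache, i.e. the cache went stale.
bool raw_fork_child_sees_stale_pid() {
  pid_t parent = naive_getpid(); // prime the cache in the parent
  assert(parent == static_cast<pid_t>(syscall(SYS_getpid)));
  long ret = syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0); // bypasses wrapped_fork
  if (ret == 0)
    _exit(naive_getpid() == parent ? 0 : 1); // exit 0 means "stale"
  if (ret < 0)
    return false;
  int status = 0;
  if (waitpid(static_cast<pid_t>(ret), &status, 0) < 0)
    return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A child created through wrapped_fork would re-query the kernel on its next naive_getpid call, while the raw path keeps returning the inherited value.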

The manpage of getpid has a special section talking about the history:

From glibc 2.3.4 up to and including glibc 2.24, the glibc wrapper function for getpid() cached PIDs, with the goal of avoiding additional system calls when a process calls getpid() repeatedly. Normally this caching was invisible, but its correct operation relied on support in the wrapper functions for fork(2), vfork(2), and clone(2): if an application bypassed the glibc wrappers for these system calls by using syscall(2), then a call to getpid() in the child would return the wrong value (to be precise: it would return the PID of the parent process). In addition, there were cases where getpid() could return the wrong value even when invoking clone(2) via the glibc wrapper function. (For a discussion of one such case, see BUGS in clone(2).) Furthermore, the complexity of the caching code had been the source of a few bugs within glibc over the years.

I wonder if there is still a way to do the caching properly. Given that MADV_WIPEONFORK will instruct the kernel to wipe certain pages on fork, can we create the following structure:

struct ProcessLocalStorage {
  OnceFlag once; /* zero represents uninitialized */
  pid_t pid;
  pid_t getpid() {
    /* capture this so the lambda can write the member */
    callonce(&this->once, [this] { pid = syscall(SYS_getpid); });
    return pid;
  }
} *pls = mmap(...); /* private anonymous mapping, hinted by MADV_WIPEONFORK */

pid_t gettid() {
  static thread_local pid_t pid = 0;
  static thread_local pid_t tid = 0;
  pid_t real_pid = pls->getpid();
  if (pid != real_pid) {
      pid = real_pid;
      tid = syscall(SYS_gettid);
  }
  return tid;
}
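
A more complete sketch of the same idea, assuming Linux 4.14 or later for MADV_WIPEONFORK. It swaps the OnceFlag/callonce pair for a relaxed std::atomic, on the assumption that racing threads would all store the same PID anyway. make_pls, pls_instance, cached_gettid and child_sees_own_tid_after_raw_fork are hypothetical names, and the raw SYS_clone call assumes an architecture such as x86-64 or aarch64 where the flags come first.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <new>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Lives in a page that the kernel zero-fills in the child after fork.
// Assumption: a zero-filled page reads back as pid == 0, which holds for a
// lock-free std::atomic<int> on mainstream ABIs.
struct ProcessLocalStorage {
  std::atomic<pid_t> pid; // 0 means "not cached yet"; real PIDs are > 0

  pid_t getpid() {
    pid_t p = pid.load(std::memory_order_relaxed);
    if (p == 0) {
      // Racing threads all store the same value, so no once-flag is used.
      p = static_cast<pid_t>(syscall(SYS_getpid));
      pid.store(p, std::memory_order_relaxed);
    }
    return p;
  }
};

static ProcessLocalStorage *make_pls() {
  size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  // MADV_WIPEONFORK only applies to private anonymous mappings.
  void *mem = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mem == MAP_FAILED || madvise(mem, page, MADV_WIPEONFORK) != 0)
    abort(); // sketch only: no fallback for kernels without the flag
  return new (mem) ProcessLocalStorage{};
}

static ProcessLocalStorage *pls_instance() {
  static ProcessLocalStorage *pls = make_pls(); // created on first use
  return pls;
}

pid_t cached_gettid() {
  static thread_local pid_t owner_pid = 0; // PID the cached TID belongs to
  static thread_local pid_t tid = 0;
  pid_t real_pid = pls_instance()->getpid();
  if (owner_pid != real_pid) { // first call in this thread, or after a fork
    owner_pid = real_pid;
    tid = static_cast<pid_t>(syscall(SYS_gettid));
  }
  return tid;
}

// Forks with a raw syscall (bypassing libc) and reports whether the child
// gets its own TID back from the cache.
bool child_sees_own_tid_after_raw_fork() {
  pid_t parent_tid = cached_gettid(); // prime the cache in the parent
  assert(parent_tid == static_cast<pid_t>(syscall(SYS_gettid)));
  long ret = syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0);
  if (ret == 0)
    _exit(cached_gettid() == static_cast<pid_t>(syscall(SYS_gettid)) ? 0 : 1);
  if (ret < 0)
    return false;
  int status = 0;
  if (waitpid(static_cast<pid_t>(ret), &status, 0) < 0)
    return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Note that the wipe only happens when the address space is copied. A child that shares memory with its parent (vfork, or clone with CLONE_VM) would still see the parent's cached PID, so this sketch does not cover that case.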

One problem I can think of is that a user may invoke SYS_clone directly without setting up TLS properly, but I think we can say that users should not expect libc to work in such scenarios.
