Consider the following code:
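#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    pid_t tid = gettid();
    printf("tid: %d\n", tid);
    /* Flush first so the child does not inherit buffered output. */
    fflush(stdout);
    /* Bypass the libc fork() wrapper with a raw system call. */
    syscall(SYS_fork);
    /* Runs in both the parent and the child. */
    printf("tid: %d\n", gettid());
    return 0;
}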
With glibc (>=2.25), the above will print three lines with two different values. The difference is due to caching: a libc implementation may cache such values to avoid the overhead of a system call. The fork/clone wrappers are in charge of updating the cached values, but a user may bypass those wrappers, leading to unexpected results.
The man page of getpid has a special section on the history of this caching:
From glibc 2.3.4 up to and including glibc 2.24, the glibc wrapper function for getpid() cached PIDs, with the goal of avoiding additional system calls when a process calls getpid() repeatedly. Normally this caching was invisible, but its correct operation relied on support in the wrapper functions for fork(2), vfork(2), and clone(2): if an application bypassed the glibc wrappers for these system calls by using syscall(2), then a call to getpid() in the child would return the wrong value (to be precise: it would return the PID of the parent process). In addition, there were cases where getpid() could return the wrong value even when invoking clone(2) via the glibc wrapper function. (For a discussion of one such case, see BUGS in clone(2).) Furthermore, the complexity of the caching code had been the source of a few bugs within glibc over the years.
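To make the failure mode from the man page concrete, here is a minimal sketch of a naive cache going stale. The names naive_getpid, raw_fork and child_sees_stale_pid are made up for illustration and are not glibc internals; the cache is just a static variable, so it is copied into the child along with the rest of the address space.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical libc-internal cache; 0 means "not filled yet". */
static pid_t naive_cache;

static pid_t naive_getpid(void) {
    if (naive_cache == 0)
        naive_cache = (pid_t)syscall(SYS_getpid);
    assert(naive_cache > 0); /* getpid cannot fail */
    return naive_cache;
}

/* Fork without going through the libc wrapper. */
static pid_t raw_fork(void) {
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    /* Architectures without a fork syscall: clone with only SIGCHLD set. */
    return (pid_t)syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0);
#endif
}

/* Returns 1 if the child saw a stale PID, 0 if not, -1 on error. */
static int child_sees_stale_pid(void) {
    (void)naive_getpid(); /* fill the cache in the parent */
    pid_t child = raw_fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: the cache still holds the parent's PID. */
        int stale = naive_getpid() != (pid_t)syscall(SYS_getpid);
        _exit(stale ? 1 : 0);
    }
    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling child_sees_stale_pid() from main should return 1, because nothing resets the cache when the fork bypasses the wrapper.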
I wonder if there is still a way to do the caching properly. Given that MADV_WIPEONFORK instructs the kernel to present the marked pages as zero-filled in the child after a fork, can we create the following structure:
struct ProcessLocalStorage {
    OnceFlag once; /* zero represents uninitialized */
    pid_t pid;
    pid_t getpid() {
        /* The lambda has to capture `this` to write the member. */
        callonce(&this->once, [this] { pid = syscall(SYS_getpid); });
        return pid;
    }
} *pls = (ProcessLocalStorage *)mmap(...); /* hinted by MADV_WIPEONFORK */

pid_t gettid() {
    static thread_local pid_t pid = 0;
    static thread_local pid_t tid = 0;
    pid_t real_pid = pls->getpid();
    /* True on the first call in a thread and after the process has forked. */
    if (pid != real_pid) {
        pid = real_pid;
        tid = syscall(SYS_gettid);
    }
    return tid;
}
One problem I can think of is that a user may call SYS_clone on their own without setting up TLS properly, but I think we can say that users should not assume libc works in such scenarios.
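For reference, the process-level part of the MADV_WIPEONFORK idea can be sketched in plain C roughly as follows. This is only a sketch under assumptions: pid_cache_init, cached_getpid, raw_fork and child_sees_fresh_pid are hypothetical names, a relaxed atomic stands in for the once flag (a racing first call just stores the same value twice), and the fallback definition of MADV_WIPEONFORK assumes the Linux value 18 for older headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18 /* Linux value, for headers that lack it */
#endif

/* Points into a page the kernel zero-fills in the child after fork. */
static _Atomic pid_t *pid_cache;

static int pid_cache_init(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    /* MADV_WIPEONFORK only applies to private anonymous mappings. */
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (madvise(p, (size_t)page, MADV_WIPEONFORK) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }
    pid_cache = p;
    return 0;
}

static pid_t cached_getpid(void) {
    assert(pid_cache != NULL);
    pid_t pid = atomic_load_explicit(pid_cache, memory_order_relaxed);
    if (pid == 0) { /* first call, or the page was wiped by a fork */
        pid = (pid_t)syscall(SYS_getpid);
        atomic_store_explicit(pid_cache, pid, memory_order_relaxed);
    }
    return pid;
}

/* Fork without going through the libc wrapper. */
static pid_t raw_fork(void) {
#ifdef SYS_fork
    return (pid_t)syscall(SYS_fork);
#else
    return (pid_t)syscall(SYS_clone, SIGCHLD, 0, 0, 0, 0);
#endif
}

/* Returns 1 if the child saw its own PID, 0 if stale, -1 on error. */
static int child_sees_fresh_pid(void) {
    (void)cached_getpid(); /* fill the cache in the parent */
    pid_t child = raw_fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(cached_getpid() == (pid_t)syscall(SYS_getpid) ? 1 : 0);
    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After a successful pid_cache_init(), child_sees_fresh_pid() should return 1 on a kernel that supports MADV_WIPEONFORK (Linux 4.14 or later), even though the fork bypasses the wrapper. As far as I can tell, this only covers forks that copy the address space: a child created with CLONE_VM (or vfork) shares the page with its parent, so nothing is wiped there.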