Reputation: 2180
I am spawning a child process using fork and exec, and using two pipes to provide input to and receive output from that process.
It works just fine most of the time, but when I use something like OpenMP to test how it performs in concurrent environments, it sometimes hangs in the read syscall, or sometimes in waitpid.
When I straced the child process, I found that it is also blocked on the read syscall. That is weird, because I only wait on reading in the parent process after I've provided all of my input and closed the write end of the pipe.
I tried to create an MCVE, but it is sort of long and I don't know how to make it any shorter. I removed most of the error checking code for the sake of simplicity.
Note that there are no globals in my code, and I'm not trying to read/write the same file descriptors from multiple threads.
I can't think of what could go wrong, so hopefully you can spot what I'm doing wrong. Here it is:
#include <string.h>
#include <assert.h>
#include <unistd.h>
#include <limits.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
size_t
min(size_t first, size_t second)
{
if(first < second)
{
return first;
}
return second;
}
struct RDI_Buffer
{
char* data;
size_t size;
};
typedef struct RDI_Buffer RDI_Buffer;
RDI_Buffer
rdi_buffer_init()
{
RDI_Buffer b = {0};
return b;
}
RDI_Buffer
rdi_buffer_new(size_t size)
{
RDI_Buffer b;
b.data = malloc(size);
b.size = size;
return b;
}
void
rdi_buffer_free(RDI_Buffer b)
{
if(!b.data)
{
return;
}
free(b.data);
}
RDI_Buffer
rdi_buffer_resize(RDI_Buffer b, size_t new_size)
{
if(!b.data)
{
return rdi_buffer_new(new_size);
}
char* new_data = realloc(b.data, new_size);
if(new_data)
{
b.size = new_size;
b.data = new_data;
return b;
}
RDI_Buffer output = rdi_buffer_new(new_size);
memcpy(output.data, b.data, output.size);
rdi_buffer_free(b);
return output;
}
RDI_Buffer
rdi_buffer_null_terminate(RDI_Buffer b)
{
b = rdi_buffer_resize(b, b.size + 1);
b.data[b.size - 1] = '\0';
return b;
}
static RDI_Buffer
rw_from_fd(int w_fd, int r_fd, RDI_Buffer input)
{
const size_t CHUNK_SIZE = 4096;
assert(input.size <= CHUNK_SIZE);
write(w_fd, input.data, input.size);
close(w_fd);
RDI_Buffer output = rdi_buffer_new(CHUNK_SIZE);
/* record how many bytes the child actually sent back, so the
   comparison in main() does not read uninitialized memory */
ssize_t n = read(r_fd, output.data, CHUNK_SIZE);
output.size = (n > 0) ? (size_t)n : 0;
close(r_fd);
return output;
}
int main()
{
#pragma omp parallel for
for(size_t i = 0; i < 100; i++)
{
char* thing =
"Hello this is a sort of long text so that we can test how "
"well this works. It should go with cat and be printed.";
RDI_Buffer input_buffer;
input_buffer.data = thing;
input_buffer.size = strlen(thing);
int main_to_sub[2];
int sub_to_main[2];
pipe(main_to_sub);
pipe(sub_to_main);
int pid = fork();
if(pid == 0)
{
dup2(main_to_sub[0], STDIN_FILENO);
dup2(sub_to_main[1], STDOUT_FILENO);
close(main_to_sub[1]);
close(main_to_sub[0]);
close(sub_to_main[1]);
close(sub_to_main[0]);
char* argv[] = {"cat", NULL};
execvp("cat", argv);
exit(1);
}
close(main_to_sub[0]);
close(sub_to_main[1]);
RDI_Buffer output =
rw_from_fd(main_to_sub[1], sub_to_main[0], input_buffer);
int status = 0;
waitpid(pid, &status, 0);
printf("%d\n", status);
output = rdi_buffer_null_terminate(output);
if(strcmp(output.data, thing) == 0)
{
printf("good\n");
}
else
{
printf("bad\n");
}
rdi_buffer_free(output);
}
}
Make sure you compile and link with -fopenmp, like so: gcc main.c -fopenmp
Upvotes: 1
Views: 1219
Reputation: 2180
The reason for the issue turns out to be the open file descriptors that are inherited by the child processes, as Jonathan Leffler and Mevet explained in their answers. Please read their answers first, then come back to mine if you still don't understand or don't know what to do.
I will share my explanation in a way I would have understood right away, along with my code solution to the issue.
Consider the following scenario: Process A opens a pipe (which is two file descriptors).
Process A spawns Process B to communicate with it over the pipe. However, it also creates Process C, which inherits the pipe (both file descriptors).
Now Process B will continuously call read(2) on the pipe, which is a blocking system call (it will wait until someone writes to the pipe).
Process A finishes writing and closes its end of the pipe. Normally that would cause the read(2) in Process B to return 0 (end of file) and the program would exit (that's what we want).
However, in our case, since Process C still holds an open write end of the pipe, the read(2) in Process B will not return; it will block waiting for a write from the open write end in Process C.
Things will still be okay if Process C simply exits.
The real deadlock arises in a different scenario, where B and C each hold a write end of the other's pipe (as explained in Mevet's answer). Each of them will be waiting for the other to close its end of the pipe, which will never happen, causing the deadlock.
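To see the first scenario in isolation, here is a minimal standalone demo I put together (it is an illustration, not part of my original program). The reader blocks for about five seconds even though the parent closes its write end immediately, because a sibling process still holds an inherited write end:
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
int main(void)
{
    int p[2];
    pipe(p);
    pid_t c = fork();               /* Process C: inherits both ends */
    if(c == 0)
    {
        sleep(5);                   /* holds its copy of the write end for 5s */
        _exit(0);                   /* only now is the last write end closed */
    }
    pid_t b = fork();               /* Process B: the reader */
    if(b == 0)
    {
        close(p[1]);                /* B only reads */
        char buf[16];
        /* blocks for ~5 seconds: EOF arrives only when C exits, even
           though A (the parent) closes its write end right away */
        ssize_t n = read(p[0], buf, sizeof(buf));
        printf("read returned %zd\n", n);
        _exit(0);
    }
    close(p[0]);                    /* A closes both of its ends... */
    close(p[1]);                    /* ...immediately */
    waitpid(c, NULL, 0);
    waitpid(b, NULL, 0);
    return 0;
}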
My solution was to close all open file descriptors I don't need right after the fork(2):
int pid = fork();
if(pid == 0)
{
int exceptions[2] = {main_to_sub[0], sub_to_main[1]};
close_all_descriptors(exceptions);
dup2(main_to_sub[0], STDIN_FILENO);
dup2(sub_to_main[1], STDOUT_FILENO);
close(main_to_sub[0]);
close(sub_to_main[1]);
char* argv[] = {"cat", NULL};
execvp("cat", argv);
exit(1);
}
Here's the implementation of close_all_descriptors:
#include <fcntl.h>
#include <errno.h>
static int
is_within(int fd, int arr[2])
{
for(int i = 0; i < 2; i++)
{
if(fd == arr[i])
{
return 1;
}
}
return 0;
}
static int
fd_is_valid(int fd)
{
return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}
static void
close_all_descriptors(int exceptions[2])
{
// getdtablesize returns the size of the descriptor table, so valid
// descriptors run from 0 to getdtablesize() - 1. It's 1024 on my system.
const int max_fd = getdtablesize();
// starting at 3 because I don't want to close stdin/out/err;
// let dup2(2) take care of those
for (int fd = 3; fd < max_fd; fd++)
{
if(fd_is_valid(fd) && !is_within(fd, exceptions))
{
close(fd);
}
}
}
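For completeness: on Linux there is also an alternative I did not use here, which is to create the pipes with the O_CLOEXEC flag via pipe2(2). Every marked descriptor is then closed automatically across execvp(2) in any child, including children forked at the same time by other threads, so nothing needs to be swept up by hand. dup2(2) clears the close-on-exec flag on the duplicate, so the child's stdin/stdout still survive the exec. A sketch (the spawn_cat name is made up):
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
/* Returns the child's pid; *w_fd writes to the child, *r_fd reads from it.
   pipe2(..., O_CLOEXEC) marks all four descriptors close-on-exec, so a
   child forked concurrently by another thread drops them at its exec. */
static int
spawn_cat(int* w_fd, int* r_fd)
{
    int main_to_sub[2];
    int sub_to_main[2];
    pipe2(main_to_sub, O_CLOEXEC);
    pipe2(sub_to_main, O_CLOEXEC);
    int pid = fork();
    if(pid == 0)
    {
        /* dup2 clears close-on-exec on the duplicates, so fds 0 and 1
           stay open across execvp */
        dup2(main_to_sub[0], STDIN_FILENO);
        dup2(sub_to_main[1], STDOUT_FILENO);
        char* argv[] = {"cat", NULL};
        execvp("cat", argv);
        exit(1);
    }
    /* the parent still closes its unused ends as before */
    close(main_to_sub[0]);
    close(sub_to_main[1]);
    *w_fd = main_to_sub[1];
    *r_fd = sub_to_main[0];
    return pid;
}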
Upvotes: 0
Reputation: 10445
When your main program is hung, run lsof in a separate session. I think you will see something like:
....
cat 5323 steve txt REG 252,0 52080 6553613 /bin/cat
cat 5323 steve mem REG 252,0 1868984 17302005 /lib/x86_64-linux-gnu/libc-2.23.so
cat 5323 steve mem REG 252,0 162632 17301981 /lib/x86_64-linux-gnu/ld-2.23.so
cat 5323 steve mem REG 252,0 1668976 12849924 /usr/lib/locale/locale-archive
cat 5323 steve 0r FIFO 0,10 0t0 32079 pipe
cat 5323 steve 1w FIFO 0,10 0t0 32080 pipe
cat 5323 steve 2u CHR 136,0 0t0 3 /dev/pts/0
cat 5323 steve 3r FIFO 0,10 0t0 32889 pipe
cat 5323 steve 4w FIFO 0,10 0t0 32889 pipe
cat 5323 steve 6r FIFO 0,10 0t0 32890 pipe
cat 5323 steve 7r FIFO 0,10 0t0 34359 pipe
cat 5323 steve 8w FIFO 0,10 0t0 32890 pipe
cat 5323 steve 10r FIFO 0,10 0t0 22504 pipe
cat 5323 steve 15w FIFO 0,10 0t0 22504 pipe
cat 5323 steve 16r FIFO 0,10 0t0 22505 pipe
cat 5323 steve 31w FIFO 0,10 0t0 22505 pipe
cat 5323 steve 35r FIFO 0,10 0t0 17257 pipe
cat 5323 steve 47r FIFO 0,10 0t0 31304 pipe
cat 5323 steve 49r FIFO 0,10 0t0 30264 pipe
which raises the question: where are all those pipes coming from? Your main loop is no longer a single loop; it is a set of unsynchronized parallel loops. Look at the boilerplate below:
void *tdispatch(void *p) {
int to[2], from[2];
pipe(to);
pipe(from);
if (fork() == 0) {
...
} else {
...
pthread_exit(0);
}
}
...
for (int i = 0; i < NCPU; i++) {
pthread_create(..., tdispatch, ...);
}
for (int i = 0; i < NCPU; i++) {
pthread_join(...);
}
Multiple instances of tdispatch can interleave the pipe(to), pipe(from), and fork() calls; thus fds are leaking into these forked processes. I say leaking because the forked process has no idea that they are there.
A pipe continues to respond to read() system calls while it either has buffered data or there is at least one write file descriptor open to it.
Suppose process 5 has its normal two ends of its two pipes open, pointing to pipe #10 and pipe #11, and process 6 has pipe #12 and pipe #13. But, owing to the leaking above, process 5 also has the write end of pipe #12, and process 6 has the write end of pipe #10. Processes 5 and 6 will never exit, because they are keeping each other's read pipes open.
The solution is pretty much what people were saying earlier: threads and forks are a tricky combination. You would have to serialize your pipe, fork, and initial-close bits in order to make it work.
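As a rough sketch of that serialization (my illustration only; the names are invented), a single mutex around the pipe creation, the fork, and the parent-side closes guarantees that no other thread can fork while this thread's pipe fds are dangling:
#include <pthread.h>
#include <unistd.h>
/* one lock around pipe+fork+close: children can no longer inherit
   another thread's half-closed pipe ends in an overlapping order,
   so the circular wait described above cannot form */
static pthread_mutex_t spawn_lock = PTHREAD_MUTEX_INITIALIZER;
static pid_t
spawn_cat_serialized(int* w_fd, int* r_fd)
{
    int to[2], from[2];
    pthread_mutex_lock(&spawn_lock);
    pipe(to);
    pipe(from);
    pid_t pid = fork();
    if(pid == 0)
    {
        dup2(to[0], STDIN_FILENO);
        dup2(from[1], STDOUT_FILENO);
        close(to[0]);   close(to[1]);
        close(from[0]); close(from[1]);
        char* argv[] = {"cat", NULL};
        execvp("cat", argv);
        _exit(1);
    }
    /* close the child's ends BEFORE unlocking, so the next thread's
       fork() cannot inherit them */
    close(to[0]);
    close(from[1]);
    pthread_mutex_unlock(&spawn_lock);
    *w_fd = to[1];
    *r_fd = from[0];
    return pid;
}
Later children can still inherit a parent's kept ends, which can delay EOF until those children exec and exit, but the circular wait between two children can no longer form.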
Upvotes: 2
Reputation: 755044
Converting comments into an answer.
You could be running out of file descriptors. With parallelism, 100 iterations of a loop that creates 4 file descriptors on each iteration could run into trouble if the limit is about 256 descriptors. Yes, you close some of them fast, but fast enough? That's not clear. And indeterminacy of scheduling would easily account for the varying behaviour.
The way I understand OpenMP, it goes into the loop body n times at a time, where n is the number of threads (am I wrong?). So at any single time I should never have more than n*2 file descriptors, which on my machine should be around 24.
It's probably n*4 file descriptors, but there may be limits on the parallelism. I'm not sufficiently familiar with OpenMP to comment authoritatively on that. Are there pragmas other than the for loop one that should be set? It isn't clear to me that running the code shown introduced parallelism on a Mac when the code is compiled with Clang, which doesn't complain about the #pragma, unlike GCC 9.1.0, which does warn about the unknown pragma under my default compilation options.
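As an aside, if you want to check the actual limit on your system, the POSIX getrlimit() call reports it (a small standalone sketch):
#include <stdio.h>
#include <sys/resource.h>
int main(void)
{
    struct rlimit rl;
    /* RLIMIT_NOFILE is the per-process limit on open file descriptors */
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0)
        printf("soft limit: %llu, hard limit: %llu\n",
               (unsigned long long)rl.rlim_cur,
               (unsigned long long)rl.rlim_max);
    return 0;
}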
However, with forks and execs as well as threads, life gets tricky. File descriptors that should be closed may not be closed, because file descriptors are a process-level resource: thread 1 may create file descriptors that thread 2 doesn't know about, but which it nevertheless shares. Then, when thread 2 forks, the file descriptors created by thread 1 aren't closed, preventing the cat from detecting EOF properly, and so on.
One way to verify this would be to use a function like this:
#include <sys/stat.h>
static void dump_descriptors(int max_fd)
{
struct stat sb;
for (int fd = 0; fd <= max_fd; fd++)
putchar((fstat(fd, &sb) == 0) ? 'o' : '-');
putchar('\n');
fflush(stdout);
}
and in the child code, call it with a suitable number (64, perhaps; there could be a case for using a number as large as 404). Although it is tempting to use flockfile(stdout) and funlockfile(stdout) in the function, it is pointless if the function is only called in the child process, because the child is single-threaded and therefore there won't be any interference from other threads in the process. However, it is all too feasible that different processes might interfere with each other's output.
If you're going to use dump_descriptors() from the parent process's threads, then add flockfile(stdout); before the loop and funlockfile(stdout); after the fflush() call. I'm not sure how much that will interfere with the problem; it enforces single-threading through that function, because none of the other threads can write to stdout while one thread has it locked.
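For example, it could be dropped into the child branch of the question's code like this (a sketch; it is called before the dup2() and close() calls, so all seven inherited descriptors show up):
if(pid == 0)
{
    dump_descriptors(64);   /* prints a map of the fds this child inherited */
    dup2(main_to_sub[0], STDIN_FILENO);
    dup2(sub_to_main[1], STDOUT_FILENO);
    close(main_to_sub[1]);
    close(main_to_sub[0]);
    close(sub_to_main[1]);
    close(sub_to_main[0]);
    char* argv[] = {"cat", NULL};
    execvp("cat", argv);
    exit(1);
}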
However, when I tested it with a slightly modified version of the code, which output the PID before the 'good' and 'bad' lines and before the dump_descriptors() output, I never saw any interleaving of operations. I got output like:
14128: ooooooo----------------------------------------------------------
14128: good
14129: ooooooo----------------------------------------------------------
14129: good
14130: ooooooo----------------------------------------------------------
14130: good
…
14225: ooooooo----------------------------------------------------------
14225: good
14226: ooooooo----------------------------------------------------------
14226: good
14227: ooooooo----------------------------------------------------------
14227: good
which strongly suggests there was no parallelism in the code. And when there is no parallelism, you won't see the trouble. Each time, there are 4 descriptors for the pipes, and the code carefully closes them.
Consider redirecting the descriptor map to a file (or one file per child) in your scenario, where you may actually be getting serious parallelism.
Note that mixing threads with fork() is inherently difficult (as John Bollinger noted); you normally use one or the other mechanism, not both.
Upvotes: 0