Reputation: 153
I am writing a program for inter-process communication, but I encountered a problem where the write operation blocks the process, even though there is enough space in the pipe.
I am working on a remote host whose pipe buffer size is 8192 bytes, which I determined with the following program:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int fd[2];
    pipe(fd);
    printf("Pipe size: %d\n", fcntl(fd[1], F_GETPIPE_SZ));
    close(fd[1]);
    close(fd[0]);
    return 0;
}
In the example below, I create 16 processes, each with its own pipe. Each process then writes 512 B to the pipes of its children, and the children read these messages. The root is labeled 0, and the children of process k are numbered 2k+1 and 2k+2. Finally, each process sends one message to every pipe.
Therefore, 16 * 512 B = 8192 B is written to the root's pipe, and (16 + 1) * 512 B = 8192 B + 512 B to every other pipe; but the one extra message is read back out, so everything should fit in the pipe.
MRE (This example doesn't do anything useful; it only illustrates my problem):
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

#define NO_OF_PROCESSES 16
#define NO_OF_MESSAGES 1
#define ROOT 0

#define ERROR_CHECK(result) \
    do { \
        if ((result) == -1) { \
            fprintf(stderr, "Error at line %d\n", __LINE__); \
            exit(1); \
        } \
    } while (0)

#define NOT_PARTIAL(result) \
    do { \
        if ((result) != 512) { \
            fprintf(stderr, "Error at line %d\n", __LINE__); \
            exit(1); \
        } \
    } while (0)

void close_pipes(int fd[NO_OF_PROCESSES][2]) {
    for (int i = 0; i < NO_OF_PROCESSES; i++) {
        ERROR_CHECK(close(fd[i][0]));
        ERROR_CHECK(close(fd[i][1]));
    }
}
void child_code(int fd[NO_OF_PROCESSES][2], int child_id) {
    void* message = malloc(512);
    if (message == NULL)
        exit(EXIT_FAILURE);
    memset(message, 0, 512);

    int l = 2 * child_id + 1;
    int r = 2 * child_id + 2;

    // Every process sends NO_OF_MESSAGES messages to each of its children.
    if (child_id == ROOT || l < NO_OF_PROCESSES) { // Root or any other parent.
        if (child_id != ROOT)
            for (int i = 0; i < NO_OF_MESSAGES; i++)
                NOT_PARTIAL(read(fd[child_id][0], message, 512));
        if (l < NO_OF_PROCESSES)
            for (int i = 0; i < NO_OF_MESSAGES; i++)
                NOT_PARTIAL(write(fd[l][1], message, 512));
        if (r < NO_OF_PROCESSES)
            for (int i = 0; i < NO_OF_MESSAGES; i++)
                NOT_PARTIAL(write(fd[r][1], message, 512));
    }
    else { // Leaf.
        for (int i = 0; i < NO_OF_MESSAGES; i++)
            NOT_PARTIAL(read(fd[child_id][0], message, 512));
    }
    printf("Ok, process %d\n", child_id);

    // Each process sends one message to every pipe (including its own).
    for (int i = 0; i < NO_OF_PROCESSES; i++) {
        int pipe_size = 0;
        ioctl(fd[i][1], FIONREAD, &pipe_size);
        printf("Check_1, process %d, there are %d bytes in the pipe, iteration %d\n", child_id, pipe_size, i);
        NOT_PARTIAL(write(fd[i][1], message, 512));
        printf("Check_2, process %d\n", child_id);
        fflush(stdout);
    }
    free(message);
    printf("Finished, process %d\n", child_id);
}
int main() {
    // Each child has its own pipe.
    int fd[NO_OF_PROCESSES][2];
    for (int i = 0; i < NO_OF_PROCESSES; i++) {
        ERROR_CHECK(pipe(fd[i]));
    }
    // Creating children processes.
    for (int i = 0; i < NO_OF_PROCESSES; i++) {
        int fork_result = fork();
        ERROR_CHECK(fork_result);
        if (fork_result == 0) { // Child process.
            child_code(fd, i);
            close_pipes(fd);
            return 0;
        }
    }
    close_pipes(fd);
    // Waiting for all children to finish.
    for (int i = 0; i < NO_OF_PROCESSES; i++) {
        ERROR_CHECK(wait(NULL));
    }
    return 0;
}
Currently, the result is that the program does not terminate because some processes hang.
Last lines of the output:
Ok, process 12
Check_1, process 2, there are 7168 bytes in the pipe, iteration 15
Check_2, process 2
Check_1, process 12, there are 7680 bytes in the pipe, iteration 0
Finished, process 2
Check_2, process 12
Check_1, process 12, there are 7680 bytes in the pipe, iteration 1
As you can see, the "Check_2, process 12" for iteration 1 is missing: the process hangs in write(), even though "Ok" appears 16 times in the full output, which in theory means that all the messages 'from the tree' have already been read.
The program works for 15 or fewer processes, because then at most 8192 B goes into each pipe. Likewise, the code works on systems where the pipe has a larger capacity.
Where am I making a mistake? Why does the process hang? If my code works for you, you probably have a larger pipe buffer.
Recently I (rather clumsily) asked a similar question. I am adding a new post instead of editing the old one because the entire content would change and the existing answers would no longer make sense. I hope this post is better.
Thanks a lot.
Upvotes: 4
Views: 395
Reputation: 12668
You say:
I am writing a program for inter-process communication, but I encountered a problem where the write operation blocks the process, even though there is enough space in the pipe.
Well, if you want to learn about inter-process communication, you have made several mistakes:

- A pipe, from the pipe(2) system call, gives you two file descriptors, but one is only for reading (the one at index 0) and one is only for writing (the one at index 1).

- All unused descriptors must be close(2)d. If you don't, you will run into trouble, because some children waiting for data in a read(2) will remain in that read until all the associated write descriptors are closed. And this will never happen, because the process blocked in the read(2) still has the writing side of the pipe open itself.

- The parent (the main() routine) will then wait forever in one of the wait(2) calls for the blocked process. It doesn't suffice to have closed the pipes in the parent (even if you close them all); they also have to be closed (the writing side) in the children, to ensure you will not block reading a pipe in a child.

- Don't use void * if you are reading chars. Better, allocate the memory statically, if possible. So

      void* message = malloc(512);
      if (message == NULL)
          exit(EXIT_FAILURE);

  works better if it is written as:

      char message[512]; /* Better if you define a constant BUFFER_SIZE,
                          * instead of flooding your code with 512 everywhere.
                          */

- Create each pipe() just before the corresponding fork():

      for (int i = 0; i < NO_OF_PROCESSES; i++) {
          int fd[2];
          ERROR_CHECK(pipe(fd));
          int fork_result = fork();
          ERROR_CHECK(fork_result);
          if (fork_result == 0) { // Child process.
              close(fd[0]); /* we are not going to read, just write */
              child_code(fd, i);
              close(fd[1]);
              return 0;
          }
          close(fd[1]); /* we will never write on this pipe */
      }

This way, nothing will block.
Next time, devise a simpler process structure, with one root process and only two children: the problem has the same structure, and when debugging it is simpler to find the offending line. Also, identifying which process emits each message makes it easier to see what is happening. You don't need to overfill a pipe to get blocked (you cannot overflow a pipe; you just block until somebody reads from it); you need to control all the file descriptors in play.
By the way, have you tried a simple Unix pipe first, with one process sending and the other receiving?
Remember that a pipe gives you a unidirectional channel; even though you receive two descriptors, the writing side and the reading side, they cannot be used as a bidirectional channel. For bidirectional (full-duplex) communication you need a socket (e.g. a Unix socket). With a socket you receive a single file descriptor, not two, and you can both write and read on that same descriptor. Don't confuse a pipe with a socket.
Upvotes: 1
Reputation: 153
@JohnBollinger is right. The issue disappeared when I closed unnecessary descriptors straight away:
if (fork_result == 0) { // Child process.
    // Close all read ends but ours.
    for (int j = 0; j < NO_OF_PROCESSES; j++) {
        if (j != i) {
            ERROR_CHECK(close(fd[j][0]));
        }
    }
    child_code(fd, i);
    // Close our read end, and all write ends.
    ERROR_CHECK(close(fd[i][0]));
    for (int j = 0; j < NO_OF_PROCESSES; j++) {
        ERROR_CHECK(close(fd[j][1]));
    }
    return 0;
}
Note that the last process receives EPIPE (as it should), so it won't hang. I am still not sure why the original version hangs, but this solves my problem.
Just in case, I managed to reproduce my problem in the Programiz online C compiler, where I set the pipe buffer size to 8192 right away (this needs #define _GNU_SOURCE and #include <fcntl.h> for F_SETPIPE_SZ):

// Each child has its own pipe.
int fd[NO_OF_PROCESSES][2];
for (int i = 0; i < NO_OF_PROCESSES; i++) {
    ERROR_CHECK(pipe(fd[i]));
    ERROR_CHECK(fcntl(fd[i][1], F_SETPIPE_SZ, 8192));
}
Upvotes: 1
Reputation: 180361
Where am I making a mistake?
Inasmuch as your program's output seems to show that there is enough space in the pipe buffer to accommodate the data that the last process is trying to write, but the write nevertheless hangs, there are only a few reasonable explanations:

- Your system has some additional limitation for which you have not accounted. For example, a limit on the aggregate data buffered in all your pipes at any given time.
- Your system has a bug that your program manages to trigger.
You haven't provided any system details, so we cannot be any more specific. Nevertheless, I note that even after adjusting for the pipe buffer size on my system (65536 bytes), I cannot reproduce your program's hang. Thus, I do think that the behavior you observe is system-specific.
Nevertheless, I can answer the question at a high level: your mistake is in writing data to pipes when you have no expectation that it will be read. Pipes are a data transfer mechanism, not a data storage mechanism. It is incumbent on you as programmer to ensure that, to the extent that it is within your control, data you write to your pipes will also be consumed from them.
Addendum
As a secondary, more pragmatic matter, leaving pipe ends open past the point where they are needed can cause a variety of issues. In this light, it's good that the parent process closes its copies of all the pipe ends as soon as it finishes forking the children. But the children ought, at startup, to close the read ends of all the pipes other than their own, and to close the write end of their own pipe. I expect that if they did this, the program would not hang; instead, some of the writes would fail with EPIPE, as they should.
Upvotes: 1