Closing unused pipe file descriptors

I am trying to understand the underlying reason why unused pipe file descriptors need to be closed. I understand why the reading side must close the write descriptor. However, I can't demonstrate in practice why the writing side must close the read descriptor. I am trying to reproduce the following behaviour:

When a process tries to write to a pipe for which no process has an open read descriptor, the kernel sends the SIGPIPE signal to the writing process. By default, this signal kills a process.

Source: The Linux Programming Interface, Michael Kerrisk

write(), on error, -1 is returned, and errno is set appropriately. EPIPE fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.)

Source: man pages.

To do that, I close the read descriptor before fork(). Nevertheless, I can neither catch SIGPIPE nor get write() to fail so that perror() prints an error.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/param.h>
#include <signal.h>
#define BUFSIZE 100

char const * errMsgPipe = "signal handled SIGPIPE\n";
int errMsgPipeLen;

void handler(int x) {
    write(2, errMsgPipe, errMsgPipeLen);
}

int main(void) {
    errMsgPipeLen = strlen(errMsgPipe);
    char bufin[BUFSIZE] = "empty";
    char bufout[] = "hello soner";
    int bytesin;
    pid_t childpid;
    int fd[2];

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_flags = 0;
    sigfillset(&sa.sa_mask);
    sa.sa_handler = handler;
    sigaction(SIGPIPE, &sa, 0);

    if (pipe(fd) == -1) {
        perror("Failed to create the pipe");
        return 1;
    }
    bytesin = strlen(bufin);
    childpid = fork();
    if (childpid == -1) {
        perror("Failed to fork");
        return 1;
    }

    close(fd[0]);

    if (childpid) {
        if (write(fd[1], bufout, strlen(bufout)+1) < 0) {
            perror("write");
        }
    }
    else
        bytesin = read(fd[0], bufin, BUFSIZE);
    fprintf(stderr, "[%ld]:my bufin is {%.*s}, my bufout is {%s}\n",
            (long)getpid(), bytesin, bufin, bufout);
    return 0;
}

Output:

[22686]:my bufin is {empty}, my bufout is {hello soner}
[22687]:my bufin is {empty}, my bufout is {hello soner}

Expected output:

[22686]:my bufin is {empty}, my bufout is {hello soner}
signal handled SIGPIPE or similar stuff

Upvotes: 1

Views: 3151

Answers (2)

Jonathan Leffler

Reputation: 753705

Independent demonstration of why closing read end of a pipe matters

Here is a scenario where closing the read end of a pipe matters:

seq 65536 | sed 10q

If the process that launches seq does not close the read end of the pipe, then seq will fill the pipe buffer (it would like to write 382,110 bytes, but the pipe buffer isn't that big) and block. Because there is still a process with the read end of the pipe open (seq itself), seq will not get SIGPIPE or a write error once sed has exited, so it will never complete.

Consider this code. The program runs seq 65536 | sed 10q, but depending on whether it is invoked with any arguments or not, it does or does not close the read end of the pipe to the seq program. When it is run without arguments, the seq program never gets SIGPIPE or a write error on its standard output because there is a process with the read end of the pipe open — that process is seq itself.

#include "stderr.h"
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    err_setarg0(argv[0]);
    int fd[2];
    int pid1;
    int pid2;

    if (pipe(fd) != 0)
        err_syserr("failed to pipe: ");
    if ((pid1 = fork()) < 0)
        err_syserr("failed to fork 1: ");
    else if (pid1 == 0)
    {
        char *sed[] = { "sed", "10q", 0 };
        if (dup2(fd[0], STDIN_FILENO) < 0)
            err_syserr("failed to dup2 read end of pipe to standard input: ");
        close(fd[0]);
        close(fd[1]);
        execvp(sed[0], sed);
        err_syserr("failed to exec %s: ", sed[0]);
    }
    else if ((pid2 = fork()) < 0)
        err_syserr("failed to fork 2: ");
    else if (pid2 == 0)
    {
        char *seq[] = { "seq", "65536", 0 };
        if (dup2(fd[1], STDOUT_FILENO) < 0)
            err_syserr("failed to dup2 write end of pipe to standard output: ");
        close(fd[1]);
        if (argc > 1)
            close(fd[0]);
        execvp(seq[0], seq);
        err_syserr("failed to exec %s: ", seq[0]);
    }
    else
    {
        int corpse;
        int status;
        close(fd[0]);
        close(fd[1]);
        printf("read end of pipe is%s closed for seq\n", (argc > 1) ? "" : " not");
        printf("shell process is PID %d\n", (int)getpid());
        printf("sed launched as PID %d\n", pid1);
        printf("seq launched as PID %d\n", pid2);
        while ((corpse = wait(&status)) > 0)
            printf("%d exited with status 0x%.4X\n", corpse, status);
        printf("shell process is exiting\n");
    }
}

The library code is available in my SOQ (Stack Overflow Questions) repository on GitHub as files stderr.c and stderr.h in the src/libsoq sub-directory.

Here's a pair of sample runs (the program was called fork29):

$ fork29
read end of pipe is not closed for seq
shell process is PID 90937
sed launched as PID 90938
seq launched as PID 90939
1
2
3
4
5
6
7
8
9
10
90938 exited with status 0x0000
^C
$ fork29 close
read end of pipe is closed for seq
shell process is PID 90940
sed launched as PID 90941
seq launched as PID 90942
1
2
3
4
5
6
7
8
9
10
90941 exited with status 0x0000
90942 exited with status 0x000D
shell process is exiting
$

Note that the exit status of seq in the second example indicates that it died from signal 13, SIGPIPE.

Question about the solution above

(1) How are we sure that here seq executes before sed? How is there no race?

The two programs (seq and sed) execute concurrently. sed cannot read anything until seq has produced it. seq might fill the pipe before sed reads anything, or it might only fill it after sed has quit.

(2) Why do we close both fd[0] and fd[1] in sed? Why not only fd[1]? Similar for seq.

Rule of thumb: If you dup2() one end of a pipe to standard input or standard output, close both of the original file descriptors returned by pipe() as soon as possible. In particular, you should close them before using any of the exec*() family of functions.

The rule also applies if you duplicate the descriptors with either dup() or fcntl() with F_DUPFD.

The code for sed follows the Rule of Thumb. The code for seq only does so conditionally, so you can see what happens when you don't follow the Rule of Thumb.

Independent demonstration of why closing write end of a pipe matters

Here is a scenario where closing the write end of a pipe matters:

ls -l | sort

If the process that launches sort does not close the write end of the pipe, then sort itself could still write to the pipe, so it will never see EOF on the pipe, and therefore it will never complete.

Consider this code. The program runs ls -l | sort, but depending on whether it is invoked with any arguments or not, it does or does not close the write end of the pipe to the sort program. When it is run without arguments, the sort program never sees EOF on its standard input because there is a process with the write end of the pipe open, and that process is sort itself.

#include "stderr.h"
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    err_setarg0(argv[0]);
    int fd[2];
    int pid1;
    int pid2;

    if (pipe(fd) != 0)
        err_syserr("failed to pipe: ");
    if ((pid1 = fork()) < 0)
        err_syserr("failed to fork 1: ");
    else if (pid1 == 0)
    {
        char *sort[] = { "sort", 0 };
        if (dup2(fd[0], STDIN_FILENO) < 0)
            err_syserr("failed to dup2 read end of pipe to standard input: ");
        close(fd[0]);
        if (argc > 1)
            close(fd[1]);
        execvp(sort[0], sort);
        err_syserr("failed to exec %s: ", sort[0]);
    }
    else if ((pid2 = fork()) < 0)
        err_syserr("failed to fork 2: ");
    else if (pid2 == 0)
    {
        char *ls[] = { "ls", "-l", 0 };
        if (dup2(fd[1], STDOUT_FILENO) < 0)
            err_syserr("failed to dup2 write end of pipe to standard output: ");
        close(fd[1]);
        close(fd[0]);
        execvp(ls[0], ls);
        err_syserr("failed to exec %s: ", ls[0]);
    }
    else
    {
        int corpse;
        int status;
        close(fd[0]);
        close(fd[1]);
        printf("write end of pipe is%s closed for sort\n", (argc > 1) ? "" : " not");
        printf("shell process is PID %d\n", (int)getpid());
        printf("sort launched as PID %d\n", pid1);
        printf("ls   launched as PID %d\n", pid2);
        while ((corpse = wait(&status)) > 0)
            printf("%d exited with status 0x%.4X\n", corpse, status);
        printf("shell process is exiting\n");
    }
}

Here's a pair of sample runs (the program was called fork13):

$ fork13
write end of pipe is not closed for sort
shell process is PID 90737
sort launched as PID 90738
ls   launched as PID 90739
90739 exited with status 0x0000
^C
$ fork13 close
write end of pipe is closed for sort
shell process is PID 90741
sort launched as PID 90742
ls   launched as PID 90743
90743 exited with status 0x0000
-rw-r--r--  1 jleffler  staff   1583 Jun 23 14:20 fork13.c
-rwxr-xr-x  1 jleffler  staff  22216 Jun 23 14:20 fork13
drwxr-xr-x  3 jleffler  staff     96 Jun 23 14:06 fork13.dSYM
total 56
90742 exited with status 0x0000
shell process is exiting
$

(3) Why do we need to close both fd[0] and fd[1] in their parent?

The parent process isn't actively using the pipe it created. It must close it fully, otherwise the other programs won't end. Try it — I did (unintentionally) and the programs didn't behave as I intended (expected). It took me a couple of seconds to realize what I'd not done!

Adaptation of code from answer by OP

snr posted an 'answer' attempting to demonstrate signal handling and what happens when the read end of the pipe is closed (or not). Here's an adaptation of that code into a program that can be controlled with command line options, where permutations of the options can yield different and useful results. The -b and -a options allow you to close the read end of the pipe before or after the fork (or not close it at all). The -h and -i options allow you to handle SIGPIPE with the signal handler or ignore it (or use the default handling, which terminates the process). And the -d option allows you to delay the parent by 1 second before it attempts to write.

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include "stderr.h"

#define BUFSIZE 100

static char const *errMsgPipe = "signal handled SIGPIPE\n";
static int errMsgPipeLen;

static void handler(int x)
{
    if (x == SIGPIPE)
        write(2, errMsgPipe, errMsgPipeLen);
}

static inline void print_bool(const char *tag, bool value)
{
    printf("  %5s: %s\n", (value) ? "true" : "false", tag);
}

int main(int argc, char **argv)
{
    err_setarg0(argv[0]);

    bool sig_ignore = false;
    bool sig_handle = false;
    bool after_fork = false;
    bool before_fork = false;
    bool parent_doze = false;
    static const char usestr[] = "[-abdhi]";

    int opt;
    while ((opt = getopt(argc, argv, "abdhi")) != -1)
    {
        switch (opt)
        {
        case 'a':
            after_fork = true;
            break;
        case 'b':
            before_fork = true;
            break;
        case 'd':
            parent_doze = true;
            break;
        case 'h':
            sig_handle = true;
            break;
        case 'i':
            sig_ignore = true;
            break;
        default:
            err_usage(usestr);
        }
    }

    if (optind != argc)
        err_usage(usestr);

    /* Both these happen naturally - but should be explicit when printing configuration */
    if (sig_handle && sig_ignore)
        sig_ignore = false;
    if (before_fork && after_fork)
        after_fork = false;

    printf("Configuration:\n");
    print_bool("Close read fd before fork", before_fork);
    print_bool("Close read fd after  fork", after_fork);
    print_bool("SIGPIPE handled", sig_handle);
    print_bool("SIGPIPE ignored", sig_ignore);
    print_bool("Parent doze", parent_doze);

    err_setlogopts(ERR_PID);

    errMsgPipeLen = strlen(errMsgPipe);
    char bufin[BUFSIZE] = "empty";
    char bufout[] = "hello soner";
    int bytesin;
    pid_t childpid;
    int fd[2];

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_flags = 0;
    sigfillset(&sa.sa_mask);
    sa.sa_handler = SIG_DFL;
    if (sig_ignore)
        sa.sa_handler = SIG_IGN;
    if (sig_handle)
        sa.sa_handler = handler;
    if (sigaction(SIGPIPE, &sa, 0) != 0)
        err_syserr("sigaction(SIGPIPE) failed: ");

    printf("Parent: %d\n", (int)getpid());

    if (pipe(fd) == -1)
        err_syserr("pipe failed: ");

    if (before_fork)
        close(fd[0]);

    int val = -999;
    bytesin = strlen(bufin);
    childpid = fork();
    if (childpid == -1)
        err_syserr("fork failed: ");

    if (after_fork)
        close(fd[0]);

    if (childpid)
    {
        if (parent_doze)
            sleep(1);
        val = write(fd[1], bufout, strlen(bufout) + 1);
        if (val < 0)
            err_syserr("write to pipe failed: ");
        err_remark("Parent wrote %d bytes to pipe\n", val);
    }
    else
    {
        bytesin = read(fd[0], bufin, BUFSIZE);
        if (bytesin < 0)
            err_syserr("read from pipe failed: ");
        err_remark("Child read %d bytes from pipe\n", bytesin);
    }

    fprintf(stderr, "[%ld]:my bufin is {%.*s}, my bufout is {%s}\n",
            (long)getpid(), bytesin, bufin, bufout);

    return 0;
}

It can be difficult (non-obvious, at any rate) to track what happens to the parent process. Bash generates an exit status of 128 + signal number when a child dies from a signal. On this machine, SIGPIPE is 13, so an exit status of 141 indicates death from SIGPIPE.

Example runs:

$ pipe71; echo $?
Configuration:
  false: Close read fd before fork
  false: Close read fd after  fork
  false: SIGPIPE handled
  false: SIGPIPE ignored
  false: Parent doze
Parent: 97984
pipe71: pid=97984: Parent wrote 12 bytes to pipe
[97984]:my bufin is {empty}, my bufout is {hello soner}
pipe71: pid=97985: Child read 12 bytes from pipe
[97985]:my bufin is {hello soner}, my bufout is {hello soner}
0
$ pipe71 -b; echo $?
Configuration:
   true: Close read fd before fork
  false: Close read fd after  fork
  false: SIGPIPE handled
  false: SIGPIPE ignored
  false: Parent doze
Parent: 97987
pipe71: pid=97988: read from pipe failed: error (9) Bad file descriptor
141
$ pipe71 -a; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
  false: SIGPIPE handled
  false: SIGPIPE ignored
  false: Parent doze
Parent: 98000
pipe71: pid=98000: Parent wrote 12 bytes to pipe
[98000]:my bufin is {empty}, my bufout is {hello soner}
0
pipe71: pid=98001: read from pipe failed: error (9) Bad file descriptor
$ pipe71 -a -d; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
  false: SIGPIPE handled
  false: SIGPIPE ignored
   true: Parent doze
Parent: 98004
pipe71: pid=98005: read from pipe failed: error (9) Bad file descriptor
141
$ pipe71 -h -a -d; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
   true: SIGPIPE handled
  false: SIGPIPE ignored
   true: Parent doze
Parent: 98007
pipe71: pid=98008: read from pipe failed: error (9) Bad file descriptor
signal handled SIGPIPE
pipe71: pid=98007: write to pipe failed: error (32) Broken pipe
1
$ pipe71 -h -a; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
   true: SIGPIPE handled
  false: SIGPIPE ignored
  false: Parent doze
Parent: 98009
pipe71: pid=98009: Parent wrote 12 bytes to pipe
[98009]:my bufin is {empty}, my bufout is {hello soner}
pipe71: pid=98010: read from pipe failed: error (9) Bad file descriptor
0
$ pipe71 -i -a; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
  false: SIGPIPE handled
   true: SIGPIPE ignored
  false: Parent doze
Parent: 98013
pipe71: pid=98013: Parent wrote 12 bytes to pipe
[98013]:my bufin is {empty}, my bufout is {hello soner}
0
pipe71: pid=98014: read from pipe failed: error (9) Bad file descriptor
$ pipe71 -d -i -a; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
  false: SIGPIPE handled
   true: SIGPIPE ignored
   true: Parent doze
Parent: 98015
pipe71: pid=98016: read from pipe failed: error (9) Bad file descriptor
pipe71: pid=98015: write to pipe failed: error (32) Broken pipe
1
$ pipe71 -i -a; echo $?
Configuration:
  false: Close read fd before fork
   true: Close read fd after  fork
  false: SIGPIPE handled
   true: SIGPIPE ignored
  false: Parent doze
Parent: 98020
pipe71: pid=98020: Parent wrote 12 bytes to pipe
[98020]:my bufin is {empty}, my bufout is {hello soner}
0
pipe71: pid=98021: read from pipe failed: error (9) Bad file descriptor
$

On my machine (a MacBook Pro running macOS High Sierra 10.13.5, with GCC 8.1.0), if I do not delay the parent, the parent consistently writes to the pipe before the child gets around to closing the file descriptor. That is not, however, guaranteed behaviour. It would be possible to add another option (e.g. -n for child_nap) to make the child nap for a second.

Code is available on GitHub

The code for the programs shown above is available in my SOQ (Stack Overflow Questions) repository on GitHub as files fork13.c, fork29.c, and pipe71.c in the src/so-5100-4470 sub-directory.

Upvotes: 2

My problem was the placement of close(fd[0]); I explain the reason in comments in the code. Now I get the expected error.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/param.h>
#include <signal.h>
#include <errno.h>
#define BUFSIZE 100

char const * errMsgPipe = "signal handled SIGPIPE\n";
int errMsgPipeLen;

void handler(int x) {
    write(2, errMsgPipe, errMsgPipeLen);
}

int main(void) {
    errMsgPipeLen = strlen(errMsgPipe);
    char bufin[BUFSIZE] = "empty";
    char bufout[] = "hello soner";
    int bytesin;
    pid_t childpid;
    int fd[2];

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_flags = 0;
    sigfillset(&sa.sa_mask);
    sa.sa_handler = handler;
    sigaction(SIGPIPE, &sa, 0);

    if (pipe(fd) == -1) {
        perror("Failed to create the pipe");
        return 1;
    }

    close(fd[0]); // <-- closed before fork() so that no process has an open read descriptor


    int val = -999;
    bytesin = strlen(bufin);
    childpid = fork();
    if (childpid == -1) {
        perror("Failed to fork");
        return 1;
    }


/*
 * close(fd[0]); <---- if the call were here instead, we might not get the
 *                      expected error and signal, because the parent can
 *                      reach its write(fd[1], ...) call before the child has
 *                      executed this close(fd[0]). At that moment the child
 *                      still holds an open read descriptor for the pipe.
 */

// sleep(1);     <---- with close(fd[0]) placed as above, calling sleep() here demonstrates this



    if (childpid) {

       val = write(fd[1], bufout, strlen(bufout)+1);
       if (val < 0) {
           perror("writing process error");
       }

    }
    else {
        bytesin = read(fd[0], bufin, BUFSIZE);
    }
    fprintf(stderr, "[%ld]:my bufin is {%.*s}, my bufout is {%s}\n",
            (long)getpid(), bytesin, bufin, bufout);
    return 0;
}

Output:

signal handled SIGPIPE
writing process error: Broken pipe
[27289]:my bufin is {empty}, my bufout is {hello soner}
[27290]:my bufin is {empty}, my bufout is {hello soner}

This also confirms that when the parent's write operation fails, the child's bufin still contains "empty".

Upvotes: 0
