user7511700

Reputation: 3

Understand dup and dup2

I am trying to learn about redirection in Linux and I found this code which creates a child process and redirects the input/output. But I'm not able to understand what dup and dup2 are doing in this case. I know that dup uses the lowest-numbered unused descriptor for the new descriptor. Can anyone explain how this code helps in multiple redirection?

int run_child(char *progname, char *argv[], int child_stdin, int child_stdout, int child_stderr)
{
    int child;

    if ((child = fork()))
    {
        return child;
    }

    if (child_stdout == STDIN_FILENO)
    {
        child_stdout = dup(child_stdout);
        RC_CHECK(child_stdout >= 0);
    }

    while (child_stderr == STDIN_FILENO || child_stderr == STDOUT_FILENO)
    {
        child_stderr = dup(child_stderr);
        RC_CHECK(child_stderr >= 0);
    }

    child_stdin = dup2(child_stdin, STDIN_FILENO);
    RC_CHECK(child_stdin == STDIN_FILENO);
    child_stdout = dup2(child_stdout, STDOUT_FILENO);
    RC_CHECK(child_stdout == STDOUT_FILENO);
    child_stderr = dup2(child_stderr, STDERR_FILENO);
    RC_CHECK(child_stderr == STDERR_FILENO);

    execvp(progname, argv);
}

Upvotes: 0

Views: 1108

Answers (1)

Jonathan Leffler

Reputation: 754060

I believe that the code is trying, but sometimes failing, to account for the situation where:

  • The parent process has closed some or all of the three standard streams (standard input, standard output and standard error), so some or all of file descriptors 0, 1, and 2 were closed.
  • The parent process has opened three file descriptors to be used as standard input, standard output and standard error by the child.

There are a couple of sub-scenarios:

  • child_stdout got assigned to 0 (STDIN_FILENO). It is conceivable that child_stderr got assigned to 1 (STDOUT_FILENO).
  • child_stderr got assigned to either 0 or 1 — presumably meaning child_stdout did not get assigned to either of these.
  • Neither child_stdout nor child_stderr got assigned to 0, 1 or 2.

The code forks; the parent code returns immediately — that's all clean. It returns -1 on failure and a PID on success.

For sub-scenario 1, the first condition runs dup(), which reassigns child_stdout to the lowest-numbered file descriptor above 0 that is not currently open. That might be 1 or 2 or some larger number. It throws away the information about what child_stdout originally was.
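To see the "lowest-numbered unused descriptor" rule in isolation, here is a minimal sketch. The helper name dup_picks_lowest_free() is made up for illustration, and it assumes /dev/null can be opened and that nothing else in the process opens or closes descriptors at the same time.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if dup() hands back the lowest free descriptor. */
int dup_picks_lowest_free(void)
{
    /* open() also returns the lowest free descriptor, so everything below a is in use. */
    int a = open("/dev/null", O_RDONLY);
    if (a < 0)
        return 0;

    /* a is taken, so the copy must land on a higher number. */
    int b = dup(a);
    if (b < 0)
    {
        close(a);
        return 0;
    }
    assert(b > a);

    /* Free a again; it is now the lowest unused descriptor... */
    close(a);

    /* ...so the next dup() returns exactly that number. */
    int c = dup(b);
    int ok = (c == a);

    close(b);
    if (c >= 0)
        close(c);
    return ok;
}
```

This is the same mechanism that moves child_stdout off descriptor 0 in the question's code: 0 is still open when dup() runs, so the copy has to land on some other number.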

The loop then tries to ensure that child_stderr is not either 0 or 1, again throwing away information about what it originally was.

The next sequence of three calls ensures that the file descriptor currently in child_stdin is duplicated to STDIN_FILENO, that child_stdout is duplicated to STDOUT_FILENO, and that child_stderr is duplicated to STDERR_FILENO. Since dup2() does nothing when the source and target descriptors are the same, and otherwise closes the target descriptor first if it is open, this ends up with plausible connections.
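The two dup2() behaviours that matter here can be checked with a pipe. This is only a sketch: dup2_redirects_to_pipe() is a made-up name, and a scratch descriptor stands in for standard output so the demonstration does not disturb the real one.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if data written to the dup2() target arrives in the pipe. */
int dup2_redirects_to_pipe(void)
{
    int pfd[2];
    if (pipe(pfd) != 0)
        return 0;

    /* Scratch descriptor playing the role of "standard output"; initially a copy of the read end. */
    int target = dup(pfd[0]);
    if (target < 0)
        return 0;
    assert(target != pfd[0] && target != pfd[1]);

    /* Same source and target: dup2() changes nothing and returns the descriptor. */
    if (dup2(target, target) != target)
        return 0;

    /* Different descriptors: target is closed, then becomes a copy of the write end. */
    if (dup2(pfd[1], target) != target)
        return 0;

    /* The source stays open after dup2(); it has to be closed explicitly. */
    close(pfd[1]);

    const char msg[] = "hello";
    char buf[sizeof(msg)] = { 0 };
    ssize_t written = write(target, msg, sizeof(msg));
    ssize_t got = read(pfd[0], buf, sizeof(buf));

    close(target);
    close(pfd[0]);
    return written == (ssize_t)sizeof(msg) && got == written && strcmp(buf, msg) == 0;
}
```

The explicit close(pfd[1]) is the step the question's code leaves out for its three parameters.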

However, in the normal case, where the input parameters are (for sake of example) 3, 5, and 7, the code does not ensure that those are closed. This is a bug.

The code then goes on to execute a command using execvp(). It does not handle the scenario where that fails; it simply drops off the end of the function, causing undefined behaviour since the function is supposed to return a value. If any function in the exec*() family of functions returns, it has failed. The code should probably report an error message and exit, maybe using exit() or maybe using a 'fast exit' (_exit(), _Exit() or something similar).
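As a rough sketch of handling a failed exec, the child below exits with a shell-style status instead of falling off the end of the function. The helper name spawn_and_wait() and the 126/127 mapping are my assumptions (they follow the usual shell convention), not something taken from the question's code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] and return its exit status, or -1 on error. */
int spawn_and_wait(char *argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        execvp(argv[0], argv);
        /* Only reached if execvp() failed. Shell convention:
         * 127 = command not found, 126 = found but could not be executed. */
        _exit(errno == ENOENT ? 127 : 126);
    }

    /* Parent: collect the child and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Using _exit() rather than exit() in the child avoids flushing any stdio buffers inherited from the parent a second time.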

How to fix?

I think that the code should keep the original values for the child file descriptors on hand, and be ready to close them after the dup2() sequence if they're out of the range 0, 1, 2. Other than that, the code probably handles most scenarios.

Note that reporting errors after the redirection of standard error is fraught. The code below simply writes to what is currently standard error, which is about the best it can do unless you fall back on using syslog or some similar system.

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define RC_CHECK(test) assert((test) != 0)

int run_child(char *progname, char *argv[], int child_stdin, int child_stdout, int child_stderr);

int run_child(char *progname, char *argv[], int child_stdin, int child_stdout, int child_stderr)
{
    int child;

    if ((child = fork()))
    {
        return child;
    }

    int fd[3] = { child_stdin, child_stdout, child_stderr };

    if (child_stdout == STDIN_FILENO)
    {
        child_stdout = dup(child_stdout);
        RC_CHECK(child_stdout >= 0);
    }

    while (child_stderr == STDIN_FILENO || child_stderr == STDOUT_FILENO)
    {
        child_stderr = dup(child_stderr);
        RC_CHECK(child_stderr >= 0);
    }

    child_stdin = dup2(child_stdin, STDIN_FILENO);
    RC_CHECK(child_stdin == STDIN_FILENO);
    child_stdout = dup2(child_stdout, STDOUT_FILENO);
    RC_CHECK(child_stdout == STDOUT_FILENO);
    child_stderr = dup2(child_stderr, STDERR_FILENO);
    RC_CHECK(child_stderr == STDERR_FILENO);

    for (int i = 0; i < 3; i++)
    {
        if (fd[i] != STDIN_FILENO && fd[i] != STDOUT_FILENO && fd[i] != STDERR_FILENO)
            close(fd[i]);
    }

    execvp(progname, argv);

    /* Or: fprintf(stderr, "Failed to execute program %s\n", progname); */
    char *msg[] = { "Failed to execute program ", progname, "\n" };
    enum { NUM_MSG = sizeof(msg) / sizeof(msg[0]) };
    for (int i = 0; i < NUM_MSG; i++)
        write(2, msg[i], strlen(msg[i]));

    exit(1);  /* Or an alternative status such as 126 or 127 based on errno */
}

That isn't simple code; your mind gets blown tracing through the possibilities. (While writing my analysis, I removed various special cases that, on further checking, turned out not to be special.) I'm not convinced that the pre-checks fixed with dup() are worthwhile; the writer of run_child() could simply stipulate that all three of file descriptors 0, 1 and 2 are open so that all of the child file descriptor parameters are greater than 2 and simply get on with the I/O redirections. It would still be necessary to close the file descriptors passed to the function, of course.

Upvotes: 2
