How can _do_fork() return two different PIDs (one for the parent process and one for the child process)

Question

I was looking at the _do_fork() function, trying to understand how fork() returns the child PID to the parent process and 0 to the child process.

I think that nr contains the PID of the child process (that will be returned to the caller process), but I can't see how it is able to return 0 to the child process.

The answer to "How does fork() know when to return 0?" says that the return value is passed on the stack created for the new process, but (besides not really understanding it) I can't find that in the code.

So, where is the return value of 0 set for the child process?

The code of the _do_fork() function is copied below:

long _do_fork(unsigned long clone_flags,
          unsigned long stack_start,
          unsigned long stack_size,
          int __user *parent_tidptr,
          int __user *child_tidptr,
          unsigned long tls)
{
    struct task_struct *p;
    int trace = 0;
    long nr;

    /*
     * Determine whether and which event to report to ptracer.  When
     * called from kernel_thread or CLONE_UNTRACED is explicitly
     * requested, no event is reported; otherwise, report if the event
     * for the type of forking is enabled.
     */
    if (!(clone_flags & CLONE_UNTRACED)) {
        if (clone_flags & CLONE_VFORK)
            trace = PTRACE_EVENT_VFORK;
        else if ((clone_flags & CSIGNAL) != SIGCHLD)
            trace = PTRACE_EVENT_CLONE;
        else
            trace = PTRACE_EVENT_FORK;

        if (likely(!ptrace_event_enabled(current, trace)))
            trace = 0;
    }

    p = copy_process(clone_flags, stack_start, stack_size,
             child_tidptr, NULL, trace, tls, NUMA_NO_NODE);
    add_latent_entropy();
    /*
     * Do this prior waking up the new thread - the thread pointer
     * might get invalid after that point, if the thread exits quickly.
     */
    if (!IS_ERR(p)) {
        struct completion vfork;
        struct pid *pid;

        trace_sched_process_fork(current, p);

        pid = get_task_pid(p, PIDTYPE_PID);
        nr = pid_vnr(pid);

        if (clone_flags & CLONE_PARENT_SETTID)
            put_user(nr, parent_tidptr);

        if (clone_flags & CLONE_VFORK) {
            p->vfork_done = &vfork;
            init_completion(&vfork);
            get_task_struct(p);
        }

        wake_up_new_task(p);

        /* forking complete and child started to run, tell ptracer */
        if (unlikely(trace))
            ptrace_event_pid(trace, pid);

        if (clone_flags & CLONE_VFORK) {
            if (!wait_for_vfork_done(p, &vfork))
                ptrace_event_pid(PTRACE_EVENT_VFORK_DONE, pid);
        }

        put_pid(pid);
    } else {
        nr = PTR_ERR(p);
    }
    return nr;
}

Ajay Brahmakshatriya · Accepted Answer

You have correctly identified how the new process ID is returned to the parent: through return nr. But you will never see a return 0 anywhere in this function, because this code runs in the context of the parent. The newly created process never executes it.
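To see the two return values from user space, here is a minimal sketch using the standard POSIX fork() and waitpid() calls. The helper name fork_returns_twice and the exit code 42 are made up for illustration. The child only reaches its branch if fork() returned 0 there, and the parent confirms that by checking the child's exit status.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fork() gave 0 to the child and the
 * child's PID to the parent, 0 if not, and -1 if fork() itself failed. */
int fork_returns_twice(void)
{
    pid_t ret = fork();

    if (ret < 0)
        return -1;              /* fork failed, no child was created */

    if (ret == 0) {
        /* Child: fork() returned 0 here. Report that via the exit code. */
        _exit(42);
    }

    /* Parent: fork() returned the child's PID, which is always positive. */
    assert(ret > 0);

    int status = 0;
    if (waitpid(ret, &status, 0) != ret)
        return 0;

    /* The child can only have exited with 42 from the ret == 0 branch. */
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

Both branches run the same code after fork(); only the value left in the return register differs between the two processes.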

Now let us examine the _do_fork function.

...
}
p = copy_process(clone_flags, stack_start, stack_size,
         child_tidptr, NULL, trace, tls, NUMA_NO_NODE);
add_latent_entropy();
...

This is where all the magic happens. When you call copy_process, it internally calls copy_thread, which is target-specific code. This function is responsible for copying the thread-related data structures, including the saved register state the child will start with.

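Now say the target is X86_64, whose calling convention returns values in the %rax register. copy_thread then writes 0 into the child's saved copy of %rax and arranges for the child to begin executing at ret_from_fork when it is first scheduled. When the child finally returns to user space, its registers are restored from that saved state, so it sees fork() return 0.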

On other platforms the ABI might require the return value to go on the stack. In that case 0 is placed on the stack. copy_thread is target specific but copy_process is not.
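The idea can be sketched as a small user-space model. This is not kernel code: the types model_regs and model_task, the functions model_copy_thread and model_do_fork, and the placeholder address MODEL_RET_FROM_FORK are all invented for illustration, and the real copy_thread differs between kernel versions and architectures. The sketch only shows the key point, that the child gets a copy of the parent's saved registers with the return-value register overwritten with 0, while the parent's return value comes from the ordinary return path.

```c
#include <assert.h>

/* Simplified stand-in for the saved user-space registers of a task. */
struct model_regs {
    unsigned long ax;   /* return-value register (%rax on x86_64) */
    unsigned long ip;   /* user-space instruction pointer */
    unsigned long sp;   /* user-space stack pointer */
};

/* Simplified stand-in for a task: saved registers plus the kernel
 * address where the task resumes when it is first scheduled. */
struct model_task {
    struct model_regs user_regs;
    unsigned long kernel_resume;
};

/* Placeholder value standing in for the address of ret_from_fork. */
#define MODEL_RET_FROM_FORK 0x1000UL

/* Model of the architecture-specific step: clone the register state
 * and make the child see 0 as the system call's return value. */
void model_copy_thread(const struct model_task *parent,
                       struct model_task *child,
                       unsigned long child_sp)
{
    child->user_regs = parent->user_regs;  /* same user ip as the parent */
    child->user_regs.ax = 0;               /* child's fork() result */
    if (child_sp)
        child->user_regs.sp = child_sp;    /* optional new stack (clone) */
    child->kernel_resume = MODEL_RET_FROM_FORK;
}

/* Model of the generic step: the parent simply returns the child's PID. */
long model_do_fork(const struct model_task *parent,
                   struct model_task *child,
                   long child_pid)
{
    model_copy_thread(parent, child, 0);
    return child_pid;
}
```

In this model, the parent gets child_pid from the normal function return, while the child never "returns" from model_do_fork at all: it simply starts running with ax already set to 0.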

This is the implementation of copy_thread for X86_64. You can see around line number 160 the sp registers being set. And at line 182 you can see the ax field of the child's saved registers (which holds %rax on X86_64) being set to 0.

I hope this clears some confusion.
