Tim

Reputation: 99428

How is backgrounding a process implemented in terms of Linux system calls?

How is backgrounding a process (for example, in Bash) implemented in terms of Linux system calls?


I am asking because I don't understand why the Bash manual says

asynchronous commands are invoked in a subshell environment,

(if I am correct, "asynchronous commands" means running the commands in the background), while, by using strace, I found that the parent shell process first calls clone() to create a subshell which is a copy of itself, and then the subshell calls execve() to replace itself with the command to run in the background.

This is just like running a foreground process. I don't see how the command is invoked in a subshell. If I am correct, invoking a command in a subshell means the subshell invokes clone() to create a sub-subshell, and then the sub-subshell invokes execve() to replace itself with the command to run in the background. But in reality, the subshell doesn't call clone().

For example,

In Ubuntu, I run date in an interactive bash shell whose pid is 6913, and at the same time, trace the bash shell from another interactive bash shell by strace.

When running date, the output of tracing the first shell 6913 in the second shell is:

$ sudo strace -f -e trace=process -p 6913
[sudo] password for t: 
Process 6913 attached
clone(Process 12918 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 12918
[pid  6913] wait4(-1,  <unfinished ...>
[pid 12918] execve("/bin/date", ["date"], [/* 66 vars */]) = 0
[pid 12918] arch_prctl(ARCH_SET_FS, 0x7ff00c632740) = 0
[pid 12918] exit_group(0)               = ?
[pid 12918] +++ exited with 0 +++
<... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WSTOPPED|WCONTINUED, NULL) = 12918
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12918, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, 0x7ffea6781518, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)

When running date &, the output of tracing the first shell 6913 in the second shell is:

$ sudo strace -f -e trace=process -p 6913
Process 6913 attached
clone(Process 12931 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f457c05ca10) = 12931
[pid 12931] execve("/bin/date", ["date"], [/* 66 vars */]) = 0
[pid 12931] arch_prctl(ARCH_SET_FS, 0x7f530c5ee740) = 0
[pid 12931] exit_group(0)               = ?
[pid 12931] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=12931, si_status=0, si_utime=0, si_stime=0} ---
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 12931
wait4(-1, 0x7ffea6780718, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)

Upvotes: 4

Views: 1370

Answers (2)

that other guy

Reputation: 123470

the subshell doesn't call clone()

This is an explicit optimization. bash realizes that there is no point in doing another costly fork if the subshell's only job is to execute a single external command synchronously. From execute_cmd.c:

/* If this is a simple command, tell execute_disk_command that it
     might be able to get away without forking and simply exec.
     This means things like ( sleep 10 ) will only cause one fork.
     If we're timing the command or inverting its return value, however,
     we cannot do this optimization. */

The only real syscall difference between date and date & is therefore that the latter isn't followed by a wait.
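As a rough illustration of that single difference, here is a minimal Python sketch (not Bash's actual code). The function name spawn and its background flag are made up for this example; os.fork, os.execvp and os.waitpid are thin wrappers over the corresponding system calls.

```python
import os


def spawn(argv, background=False):
    """Fork once and exec argv in the child (hypothetical helper).

    Foreground: the parent blocks in waitpid() until the child exits.
    Background: the parent returns immediately and reaps the child later.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the command.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    if background:
        return pid, None
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(spawn(["date"]))                    # like: date
    bg_pid, _ = spawn(["date"], background=True)  # like: date &
    os.waitpid(bg_pid, 0)                     # reap it later, as a shell would
```

Both paths perform exactly one fork and one exec; only the blocking wait is skipped in the background case.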

The more interesting case that the bash manual alludes to is when you run non-external commands. The difference between let i++ and let i++ & is that the former is evaluated by bash itself, while the latter is evaluated by a subshell which then exits, and therefore has no effect on i in the parent shell.

Upvotes: 1

Kaz

Reputation: 58578

The text "asynchronous commands are invoked in a subshell environment" doesn't specifically refer to backgrounding, because backgrounding is an interactive concept from "POSIX job control", whereas asynchronous commands occur both interactively and non-interactively.

"Invoked in a subshell environment" is just shell terminology which means that a fork takes place, and these commands run in a child process which cannot modify the parent's variables or other state:

$ VAR=value &
[1] 15479
[1]+  Done                    VAR=value
$ echo $VAR
$

Because the variable assignment is run in a subshell, it has no effect in the parent shell.
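The same isolation can be reproduced outside the shell. The following is a small Python sketch, assuming a dictionary stands in for the shell's variable table; the function name assign_in_child is hypothetical.

```python
import os


def assign_in_child(state):
    """Fork, assign VAR in the child only, and return what the parent sees."""
    pid = os.fork()
    if pid == 0:
        # Child: this writes to the child's private copy of the memory.
        state["VAR"] = "value"
        os._exit(0)
    # Parent: wait for the child, then look at our own copy.
    os.waitpid(pid, 0)
    return state.get("VAR")


if __name__ == "__main__":
    print(assign_in_child({}))
```

After fork() the two processes no longer share memory, so the assignment dies with the child, just as VAR=value & leaves VAR unset in the parent shell.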

How backgrounding works in terms of system calls is that POSIX job control revolves around a set of processes which are organized into process groups, which belong to a session, which is attached to a controlling terminal. Only one group at a time in a session is the foreground process group.

The reason the design revolves around groups rather than individual processes is because jobs consist of multiple processes when piping is used. For instance, sort -u file | grep foo creates a process group with two processes.
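A sketch of how several children can end up in one process group, written in Python for brevity. This is an illustration under assumptions, not what any particular shell does line for line: make_job is a made-up name, and the children just sleep in pause() instead of running sort or grep.

```python
import os
import signal


def make_job(n=2):
    """Fork n children into one new process group; return (pgid, pids)."""
    pgid = 0
    pids = []
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            try:
                # pgid == 0 in the first child means "start a group with my pid".
                os.setpgid(0, pgid)
                signal.pause()  # stand-in for the real command
            finally:
                os._exit(0)
        if pgid == 0:
            pgid = pid  # the first child becomes the group leader
        # The parent sets the group too, so it is in place whichever side runs first.
        os.setpgid(pid, pgid)
        pids.append(pid)
    return pgid, pids


if __name__ == "__main__":
    pgid, pids = make_job()
    print(pgid, [os.getpgid(p) for p in pids])
    os.killpg(pgid, signal.SIGKILL)  # signal the whole job at once
    for p in pids:
        os.waitpid(p, 0)
```

Calling setpgid() in both the parent and the child is a common way to avoid a race over which of them runs first after fork().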

The shell itself is also in a process group. When the shell is prompting you for input, that process group is in the foreground. When the shell executes a job in the foreground, it makes the job's process group the terminal's foreground group using the job control system calls (setpgid() and tcsetpgrp()), which leaves the shell's own group in the background.

When you send a Ctrl-Z to the TTY, it generates a SIGTSTP signal to every process in the foreground process group. The shell detects this change in child status (via waitpid or some such) and then shuffles that group to the background, putting itself to the foreground again, to receive TTY input.
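How a parent notices that a child stopped or resumed can be sketched as follows. This Python example makes two assumptions worth stating: it sends SIGSTOP with kill() as a stand-in for the terminal-generated SIGTSTP (so it works without a controlling terminal), and stop_and_resume is a made-up function name.

```python
import os
import signal


def stop_and_resume():
    """Stop, continue, and kill a child, recording what waitpid() reports."""
    pid = os.fork()
    if pid == 0:
        try:
            while True:
                signal.pause()  # idle child standing in for a real job
        finally:
            os._exit(0)

    events = []

    # Stand-in for Ctrl-Z: stop the child, then observe it with WUNTRACED.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    events.append("stopped" if os.WIFSTOPPED(status) else "other")

    # Resuming a stopped job is done with SIGCONT; WCONTINUED reports it.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, os.WCONTINUED)
    events.append("continued" if os.WIFCONTINUED(status) else "other")

    # Clean up the child and reap it.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    events.append("killed" if os.WIFSIGNALED(status) else "other")
    return events


if __name__ == "__main__":
    print(stop_and_resume())
```

The WSTOPPED and WCONTINUED flags in the wait4() calls of the strace output in the question serve the same purpose: they ask the kernel to report stops and resumes, not just exits.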

When you're talking to the shell, all jobs are in the background, whether they are running or not: the shell is the foreground process group, so all the other jobs are not. With the bg command, you are just changing the running state of a suspended background job. Ctrl-Z sent a SIGTSTP which suspended it, and the shell moved it to the background. bg will resume it, allowing it to execute. However, if a background job is resumed and tries to obtain input from the TTY, it gets a SIGTTIN signal and is suspended again:

$ cat &
[1] 12620
$ bg
[1]+ cat &
$     # hit Enter 
[1]+  Stopped                 cat
$ bg
[1]+ cat &
$     # hit Enter
[1]+  Stopped                 cat

cat wants to read from the tty, so when we put it in the background, it gets a SIGTTIN from the kernel's TTY subsystem which stops it. The shell detects this signal-driven change in status similarly to Ctrl-Z/SIGTSTP and prints a message telling us that the job stopped. Every time we bg it (resume it) the same thing happens. The shell dispatches cat (probably with a SIGCONT), and cat immediately resumes trying to get input from the TTY which responds with "you're not a member of the foreground process group, bad kitty: SIGTTIN for you". This whole time, cat never leaves the background.

Upvotes: 1
