Lone Learner

Reputation: 20628

POSIX compliant way to find out if a process with a certain PID is alive

I learnt from https://serverfault.com/q/366474 that the following code is a POSIX-compliant way of testing whether a process with PID = $pid is alive. It uses the kill -0 command.

# First code sample
pid=100

if kill -0 "$pid" 2> /dev/null
then
    echo PID "$pid" is alive.
else
    echo PID "$pid" not found.
fi

Another approach I learnt is to use the ps -p command.

# Second code sample
pid=100

if ps -p "$pid" > /dev/null
then
    echo PID "$pid" is alive.
else
    echo PID "$pid" not found.
fi

I have been trying to figure out whether the first code sample, which uses the kill -0 command, is indeed POSIX-compliant. The closest support I found for it is in the 'EXIT STATUS' and 'RATIONALE' sections of http://pubs.opengroup.org/onlinepubs/9699919799/utilities/kill.html. Emphasis has been added by me.

EXIT STATUS

The following exit values shall be returned:

0 At least one matching process was found for each pid operand, and the specified signal was successfully processed for at least one matching process.

>0 An error occurred.

RATIONALE

...

...

An early proposal invented the name SIGNULL as a signal_name for signal 0 (used by the System Interfaces volume of POSIX.1-2008 to test for the existence of a process without sending it a signal). Since the signal_name 0 can be used in this case unambiguously, SIGNULL has been removed.

But I have been unable to find a mention of this in the System Interfaces volume of POSIX.1-2008.

For the second code sample, I found nothing at http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ps.html that guarantees an exit status greater than zero when the ps -p "$pid" command finds no process with PID = $pid.

Here are my questions.

  1. Is the first code sample really a POSIX-compliant way to find out if a process with PID = $pid is alive?
  2. Can you please point me to the appropriate page and section in the System Interfaces volume of POSIX.1-2008 where the behaviour of signal 0 is documented?
  3. Is the second code sample a POSIX-compliant way to find out if a process with PID = $pid is alive?

Upvotes: 1

Views: 659

Answers (1)

Ignacio Vazquez-Abrams

Reputation: 798686

From the kill(1p) man page:

SYNOPSIS
   kill -s signal_name pid ...

   kill -l [exit_status]

   kill [-signal_name] pid ...

   kill [-signal_number] pid ...

 ...

   -signal_number

          Specify a non-negative decimal integer, signal_number,
          representing the signal to be used instead of SIGTERM, as the
          sig argument in the effective call to kill(). The correspondence
          between integer values and the sig value used is shown in the
          following table.

   The effects of specifying any signal_number other than those listed  in
   the table are undefined.

                          signal_number   sig Value
                          0               0
                           ...

 ...

SEE ALSO
   Shell Command Language, ps, wait(), the System Interfaces volume of
   IEEE Std 1003.1-2001, kill(), the Base Definitions volume of
   IEEE Std 1003.1-2001, <signal.h>

From the kill(3p) man page:

SYNOPSIS
   #include <signal.h>

   int kill(pid_t pid, int sig);

DESCRIPTION
   The kill() function shall send a signal to a process or a group of
   processes specified by pid. The signal to be sent is specified by sig
   and is either one from the list given in <signal.h> or 0. If sig is 0
   (the null signal), error checking is performed but no signal is
   actually sent. The null signal can be used to check the validity of pid.

 ...

SEE ALSO
   getpid(), raise(), setsid(), sigaction(), sigqueue(), the Base
   Definitions volume of IEEE Std 1003.1-2001, <signal.h>, <sys/types.h>
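The null-signal check described in the kill() excerpt can be wrapped in a small shell function. This is only a minimal sketch: pid_alive is a hypothetical helper name, not something POSIX defines. Note that the "error checking" performed by kill() includes a permission check, so a failure can also mean that the process exists but belongs to another user (EPERM), and the exit status of the kill utility alone does not tell the two cases apart.

```shell
#!/bin/sh
# pid_alive is a hypothetical helper name, not part of POSIX.
# Returns 0 if the null signal (0) could be "sent" to the given PID,
# non-zero otherwise.
# Caveat: a non-zero status does not prove the process is gone. kill()
# also fails with EPERM when the process exists but the caller is not
# permitted to signal it.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# Usage: check the current shell, which is certainly running.
if pid_alive "$$"
then
    echo "PID $$ is alive."
else
    echo "PID $$ not found or not signalable."
fi
```

For processes owned by the same user, the function behaves like the first code sample in the question.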

EDIT:

  1. Yes. POSIX specifies both that signal 0 is passed through unchanged from the kill utility to the kill() function, and what signal 0 does.

  2. It's in the first paragraph of the description of the kill() function.

  3. POSIX specifies that a non-zero exit status for ps means that "an error occurred", but does not specify that finding no process matching the given parameters is an error. Hence the behavior of the second piece of code should be considered system-specific.
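
Regarding point 3, one way to avoid relying on the exit status of ps is to test its output instead. The following is a sketch under stated assumptions rather than a guaranteed recipe: pid_listed is a hypothetical helper name, and it assumes a ps that implements the POSIX -p and -o options, where "-o pid=" selects the pid column with an empty header so that no header line is written.

```shell
#!/bin/sh
# pid_listed is a hypothetical helper name, not part of POSIX.
# It ignores the exit status of ps and only tests whether ps printed
# anything for the given PID. With "-o pid=" the header is empty, so the
# output is empty when no process matches.
pid_listed() {
    [ -n "$(ps -p "$1" -o pid= 2>/dev/null)" ]
}

# Usage: check the current shell.
if pid_listed "$$"
then
    echo "PID $$ is listed by ps."
else
    echo "PID $$ is not listed by ps."
fi
```

Unlike kill -0, this does not depend on permission to signal the target process, but it does depend on the local ps implementing these options.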

Upvotes: 1
