Paul

Reputation: 1167

Killing entire process tree in ruby

So I'm working on a background task and ended up having to spawn a child process (using a binary provided by another team). At some point I want to stop that process if it times out.

Seems straightforward enough.

require 'open3'
require 'timeout'

def run!(command, timeout)
  Timeout.timeout(timeout) do
    stdin, stdout, wait_thr = Open3.popen2e(command)
    @pid = wait_thr.pid

    # ... boring and irrelevant...
  end
rescue Timeout::Error
  # Ask the child to stop, then reap it so it does not linger as a zombie
  Process.kill 'TERM', @pid
  Process.wait @pid

  raise
end

Now I also quite like prepending my command with a call to time. Nice logs and all. That makes the command look something like this (on macOS):

gtime -f 'Time spent %E memory used %M' some/binary --with parameters

And so my process tree becomes something like this

ruby (my background job)
 \__ gtime
      \__ some/binary

Of course now when I kill the child process, only gtime is killed, the binary lives on.

  1. If I had control over the executable I'm using as the immediate child, I could probably handle TERM and kill its immediate child processes. But it's time/gtime, so I obviously don't. Maybe there's some arcane parameter?
  2. The docs mention process groups but of course the children share the same process group as my parent process. Is there a way to spawn a process in a new process group thus making the "kill the entire process group" option viable?

I also could probably parse the ps output, build a process tree and traverse it killing processes one-by-one, but it seems a bit overkill (sorry). Is there something really basic I'm missing here?

Upvotes: 2

Views: 916

Answers (2)

Stefan

Reputation: 114248

You can start the process in a new process group by passing pgroup: true (see the docs for Process.spawn for the available options):

stdin, stdout, wait_thr = Open3.popen2e(command, pgroup: true)

The whole process group can then be killed via its process group ID by prepending the signal with a minus sign. As the docs for Process.kill put it:

"If signal is negative (or starts with a minus sign), kills process groups instead of processes."

pgid = Process.getpgid(wait_thr.pid)
Process.kill '-TERM', pgid
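
Putting the two pieces together with the timeout from the question, a minimal sketch might look like the following. The method name run_in_group is made up for illustration, and the sketch assumes the caller only needs the combined stdout/stderr output as a string; it is not the asker's actual code.

```ruby
require 'open3'
require 'timeout'

# Hypothetical helper (not from the question): runs `command` in its own
# process group and signals the whole group if it exceeds `timeout` seconds.
# Returns the combined stdout/stderr output on success.
def run_in_group(command, timeout)
  stdin, out, wait_thr = Open3.popen2e(command, pgroup: true)
  stdin.close # assumption: the command needs no input

  # Look up the group ID right away, while the child is certainly alive
  pgid = Process.getpgid(wait_thr.pid)

  begin
    Timeout.timeout(timeout) { out.read }
  rescue Timeout::Error
    # Negative signal: deliver TERM to every process in the group
    Process.kill('-TERM', pgid)
    raise
  ensure
    wait_thr.join # wait for the direct child to be reaped
    out.close
  end
end
```

Note that TERM can be caught or ignored by any process in the group, so a stubborn tree may need a follow-up '-KILL' after a grace period.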

Upvotes: 8

Casper

Reputation: 34338

Instead of parsing the output of ps, you might want to look at pgrep. Doing a bit of detective work and discovering some code, we find this convenient function:

# get child pids ordered by youngest descendants first
def child_processes(pid)
  pids = `pgrep -P #{pid}`.split("\n").map(&:to_i)
  pids.flat_map { |p| child_processes(p) } + pids
end

So let's say we have a process tree like this (example from my machine):

 9088 pts/3    Sl+    1:13  \_ ruby smtserver.rb
 9092 pts/3    Sl     0:41      \_ /usr/local/bin/chromedriver --port=9516
 9101 pts/3    Sl    10:36          \_ /usr/lib/chromium-browser/chromium-browser
 9111 pts/3    S      0:00              \_ /usr/lib/chromium-browser/chromium-bro
 9113 pts/3    S      0:00              |   \_ /usr/lib/chromium-browser/chromium
 9154 pts/3    Sl    14:06              |       \_ /usr/lib/chromium-browser/chro
 9187 pts/3    Sl     0:07              |       \_ /usr/lib/chromium-browser/chro
 9135 pts/3    Sl     2:08              \_ /usr/lib/chromium-browser/chromium-bro
 9312 pts/3    Sl     4:44              \_ /usr/lib/chromium-browser/chromium-bro

Now doing child_processes(9092), we get this:

[9154, 9187, 9113, 9111, 9135, 9312, 9101]

And then you have enough information to individually kill the whole tree if needed.

In the case of gtime, it's enough to simply kill your some/binary, and gtime will then exit by itself. Assuming some/binary is not creating more children, something like this should solve your problem:

rescue Timeout::Error
  Process.kill 'TERM', child_processes(@pid).last
  Process.wait @pid
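
If some/binary does create more children, the same traversal can be used to signal the whole tree. The following is only a sketch: descendants, kill_tree and PGREP_LOOKUP are names invented here, and the lookup is passed in as a lambda purely so the traversal can be tried without pgrep.

```ruby
# Hypothetical default lookup: direct children of a pid, via pgrep as above
PGREP_LOOKUP = ->(pid) { `pgrep -P #{pid}`.split("\n").map(&:to_i) }

# Same traversal as child_processes: deepest descendants come first
def descendants(pid, lookup = PGREP_LOOKUP)
  kids = lookup.call(pid)
  kids.flat_map { |k| descendants(k, lookup) } + kids
end

# Signals every descendant and then the root itself.
# Returns the pids that could actually be signalled.
def kill_tree(pid, signal = 'TERM', lookup = PGREP_LOOKUP)
  (descendants(pid, lookup) + [pid]).select do |p|
    begin
      Process.kill(signal, p)
      true
    rescue Errno::ESRCH
      false # process exited between the lookup and the kill
    end
  end
end
```

Keep in mind that the list of pids is a snapshot: a process that forks after the lookup will be missed, which is one reason the process group approach is more robust.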

Upvotes: 1
