Kevin Bedell

Reputation: 13414

Find out if a resque job is still running and kill it if it's stuck

I have an application that uses resque to run some long-running jobs. Sometimes they take 8 hours or more to complete.

In situations where the job fails, is there a way to monitor resque itself to see if the job is running? I know I can update the job's status in a database table (or in redis itself), but I want to know if the job is still running so I can kill it if necessary.

The specific things I need to do are:

Upvotes: 2

Views: 3541

Answers (2)

Shai

Reputation: 1529

The god solution can end up killing workers that aren't actually stuck or misbehaving. I started addressing this problem with a different approach: you register a handler that does whatever you want (kill the worker, send an email, trigger a pager alert, etc.) whenever a resque problem comes up.

If a job doesn't get processed during a certain timeframe (either because resque is stuck, the queue has an insane backlog, or resque just isn't running at all), the handler will get invoked. Feel free to poke at the README for more details.
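The idea in the paragraph above can be sketched roughly as a heartbeat watchdog. This is only an illustration of the concept, not the resque_stuck_queue API: the class name `HeartbeatWatchdog` and its methods `beat` and `check` are hypothetical, and the sketch assumes that a small periodic job calls `beat` whenever it is actually processed while some other timer calls `check`.

```ruby
# Hypothetical illustration only; this is NOT the resque_stuck_queue API.
class HeartbeatWatchdog
  # timeout:    seconds of silence tolerated before the handler fires
  # started_at: reference time used until the first heartbeat arrives
  def initialize(timeout:, started_at: Time.now, &handler)
    @timeout   = timeout
    @last_beat = started_at
    @handler   = handler
  end

  # Called by a tiny periodic job each time it really gets processed.
  def beat(now = Time.now)
    @last_beat = now
  end

  # Called on a timer from a separate thread or process.
  # Invokes the handler with the seconds of silence and returns true
  # when no heartbeat arrived within the timeout; otherwise returns false.
  def check(now = Time.now)
    silence = now - @last_beat
    return false if silence < @timeout

    @handler.call(silence)
    true
  end
end
```

The handler block is where you would kill a worker, send an email, or page someone. Because the check only looks at the time since the last heartbeat, it fires the same way whether resque is stuck, backlogged, or not running at all.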

https://github.com/shaiguitar/resque_stuck_queue#readme

Upvotes: 1

Sergio Tulentsev

Reputation: 230521

The Resque GitHub repository has this hidden gem, a god script that does exactly this: it watches your workers and kills stale ones.

https://github.com/resque/resque/blob/master/examples/god/stale.god

# This will ride alongside god and kill any rogue stale worker
# processes. Their sacrifice is for the greater good.

WORKER_TIMEOUT = 60 * 10 # 10 minutes

Thread.new do
  loop do
    begin
      `ps -e -o pid,command | grep [r]esque`.split("\n").each do |line|
        parts   = line.split(' ')
        next if parts[-2] != "at"
        started = parts[-1].to_i
        elapsed = Time.now - Time.at(started)

        if elapsed >= WORKER_TIMEOUT
          ::Process.kill('USR1', parts[0].to_i)
        end
      end
    rescue
      # don't die because of stupid exceptions
      nil
    end

    sleep 30
  end
end
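If you want to try the selection logic without sending any signals, it can be pulled out into a pure function, roughly like the sketch below. The helper name `stale_worker_pids` is hypothetical. Like the `parts[-2] != "at"` check in the script above, it assumes a worker's process title ends in `at <epoch seconds>`. The sample `ps` lines in the usage note are made up for illustration.

```ruby
# Hypothetical helper, not part of Resque: given the text output of
# `ps -e -o pid,command`, return the PIDs of processes whose title ends
# in "at <epoch seconds>" and whose timestamp is at least `timeout`
# seconds older than `now`.
def stale_worker_pids(ps_output, now: Time.now, timeout: 60 * 10)
  ps_output.each_line.filter_map do |line|
    parts = line.split
    next unless parts.length >= 3 && parts[-2] == "at"

    # Integer(..., exception: false) returns nil for malformed fields
    # instead of raising, so odd lines are simply skipped.
    pid     = Integer(parts[0], 10, exception: false)
    started = Integer(parts[-1], 10, exception: false)
    next unless pid && started

    pid if now.to_i - started >= timeout
  end
end
```

A possible way to use it, with made-up process titles:

```ruby
sample = <<~PS
  101 resque-1.25.2: Forked 202 at 1000
  303 resque-1.25.2: Forked 404 at 1500
  505 resque-1.25.2: Waiting for default
PS

stale_worker_pids(sample, now: Time.at(1700), timeout: 600).each do |pid|
  puts "would send USR1 to #{pid}"  # swap in ::Process.kill('USR1', pid) once verified
end
```

Keeping the parsing separate from `Process.kill` lets you check which PIDs would be targeted before anything is signalled.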

Upvotes: 3
