le_me

Reputation: 3419

How do I ensure a process is running, even if it kills itself? (it needs to be restarted then)

I'm using Linux. I want a process (an IRC bot) to run every time I start the computer. But I have a problem: the network is bad and the bot disconnects often, so I have to restart it manually a few times a day. How do I automate that?

Additional information: The bot creates a pid file called bot.pid. The bot reconnects by itself, but only a few times. The network is so bad that the bot sometimes kills itself because it gets no response.

What I do currently (aka my approach ;) ): I have a cron job executing startbot.rb every 5 minutes. (The script itself is in the same directory as the bot.)

The script:

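#!/usr/bin/ruby
require 'fileutils'

# Both paths resolve against the current working directory, so the cron job
# has to change into the bot's directory before running this script.
PID_FILE = File.expand_path('tmp/bot.pid')
BOT = File.expand_path('./bot.rb')

def start_bot
  puts "Starting the bot!"
  Kernel.exec(BOT) # replaces this process with the bot
end

if File.exist?(PID_FILE) # File.exists? was removed in Ruby 3.2
  pid = File.read(PID_FILE).to_i # to_i ignores the trailing newline
  begin
    # A pid of 0 would address our own process group, so treat it as stale.
    raise Errno::ESRCH unless pid > 0
    Process.kill(0, pid) # signal 0 only checks that the process exists
    puts "Bot up and running!"
  rescue Errno::EPERM
    puts "Bot up and running!" # exists, but is owned by another user
  rescue Errno::ESRCH
    puts "Removing abandoned pid file"
    FileUtils.rm(PID_FILE)
    start_bot
  end
else
  start_bot
end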

What this does: It checks whether the pid file exists. If it does, it checks whether the bot is running by sending signal 0 to that pid (the equivalent of kill -s 0 BOT_PID). It starts the bot if either of the two checks fails.

My approach seems to be quite dirty, so how do I do it better?
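One detail about the liveness check described above: Process.kill(0, pid) returns 1 when the process exists, but it does not return a different value when the process is gone. It raises Errno::ESRCH instead, and it raises Errno::EPERM when the process exists but belongs to another user. Below is a minimal sketch of a helper that separates those cases. The name process_alive? is hypothetical and not part of the original script.

```ruby
# Hypothetical helper (not in the original script): reports whether a
# process with the given pid currently exists.
def process_alive?(pid)
  # A pid of 0 or less addresses process groups, not a single process.
  return false unless pid.is_a?(Integer) && pid > 0

  Process.kill(0, pid) # signal 0 checks existence without sending a signal
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true  # the process exists, but is owned by another user
end
```

A pid can be reused by an unrelated process after the bot dies, so a check like this only says that some process has that pid, not that it is the bot.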

Upvotes: 3

Views: 779

Answers (2)

bhelm

Reputation: 715

Restarting an application is a bad workaround, not a solution.

I recommend reviewing the documentation of your bot: look for an option that configures after how many failed retries it exits, or that disables this behavior completely. If the bot is open source, you can also review its source code and modify the retry code. Try to find a clean solution.

Edit: nowadays, if your system uses systemd instead of init, create a service file /etc/systemd/system/bot.service for your bot like this:
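If you do end up modifying the retry code, one common pattern is to retry indefinitely with a growing, capped delay instead of giving up after a fixed number of attempts. The following is only a sketch of that idea: connect_with_backoff, its parameters, and the block standing in for the bot's connect step are all hypothetical, since the bot's actual code is not shown here.

```ruby
# Hypothetical sketch: keep retrying a connection attempt with a capped,
# doubling delay instead of exiting after a fixed number of failures.
# The block stands in for whatever the bot does to connect.
def connect_with_backoff(base_delay: 1, max_delay: 300,
                         sleeper: Kernel.method(:sleep))
  delay = base_delay
  begin
    yield
  rescue StandardError => e
    warn "connect failed (#{e.class}: #{e.message}), retrying in #{delay}s"
    sleeper.call(delay)
    delay = [delay * 2, max_delay].min # double the wait, up to the cap
    retry
  end
end
```

The sleeper parameter exists only so the delay can be replaced in tests. In real use you would call connect_with_backoff { ... } with the bot's own connect logic inside the block.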

[Unit]
Description=bot service
After=network.target

[Service]
Type=simple
Restart=always
RestartSec=10
ExecStart=/usr/bin/ruby /path/to/bot.rb

[Install]
WantedBy=multi-user.target

This will restart the bot 10 seconds after it exits. Also consider using the User= directive so the bot doesn't run with root privileges if they are not needed, or use the user instance of systemd. See the systemd documentation for more information on the options. For the user systemd service, see this question, the Arch Linux wiki and the official documentation.

Then enable it with systemctl enable bot (so the bot starts at boot) and start it with systemctl start bot.

Alternatively, I would create a shell script that runs the bot in a loop. Make sure bot.rb does not fork into the background:

#!/bin/bash
# Run the bot in the foreground and start it again whenever it exits.
for (( ; ; ))
do
     ./bot.rb
done

You can run that script with nohup ./startscript.sh & so it does not terminate when you close the console.
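Since the bot is written in Ruby, the same restart loop can also be sketched in Ruby, with a pause between runs so that a bot which fails immediately does not spin in a tight loop. This is an illustration only: supervise, delay and max_restarts are hypothetical names, and the 10 second default simply mirrors RestartSec=10 from the unit file above.

```ruby
# Hypothetical sketch: run a command in the foreground and start it again
# whenever it exits, pausing between runs to avoid a tight crash loop.
# command is an array such as ["ruby", "/path/to/bot.rb"].
# max_restarts: nil means restart forever; it is there mainly for testing.
def supervise(command, delay: 10, max_restarts: nil)
  runs = 0
  loop do
    pid = Process.spawn(*command)
    Process.wait(pid) # blocks until the child exits
    runs += 1
    warn "#{command.first} exited with status #{$?.exitstatus.inspect}"
    break if max_restarts && runs > max_restarts
    sleep delay
  end
  runs # total number of times the command was started
end

# Usage, assuming the bot stays in the foreground:
# supervise(["ruby", "/path/to/bot.rb"])
```

As with the shell loop, this only works if bot.rb does not fork into the background, because Process.wait would then return as soon as the parent exits.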

Upvotes: 6

chipmunk

Reputation: 970

There is a tool called daemontools, which was created to supervise and manage UNIX services.

You can learn how to use it from this link.

Upvotes: 4
