Reputation: 51
On a Debian 8 server I have monit 5.9-1 set up and monitoring several services. I plan on monitoring atop 1.26-2, which I can do simply with the following config:
check process atop with pidfile /var/run/atop.pid
  group system
  group atop
  start program = "/usr/sbin/service atop start"
  stop program = "/usr/sbin/service atop stop"
This works fine. However, I have occasionally noticed the following entries in /var/log/messages:
traps: atop[8810] trap divide error ip:40780a sp:7ffdf663cdc8 error:0 in atop[400000+26000]
When this happens, atop does not create the daily log file /var/log/atop/atop_$( date '+%Y%m%d' ), so attempting to run atop -r 20160127 -b 15:00 results in the output:
/var/log/atop/atop_20160127 - open raw file: No such file or directory
I have been trying to get monit to check for the presence of the log file, and restart atop if it is missing, by changing the above config to:
date=$( date '+%Y%m%d' )
check process atop with pidfile /var/run/atop.pid
  group atop
  start program = "/usr/sbin/service atop start"
  stop program = "/usr/sbin/service atop stop"
  depend on atop_log
check file atop_log with path /var/log/atop/atop_$date
  group atop
Monit does not complain, but it does not expand the variable either.
Does anyone know whether this is possible and, if so, how to do it?
Upvotes: 1
Views: 1398
Reputation: 596
The solution I've found is to have monit invoke a bash script that checks for the presence of the file.
Create the file /etc/monit/scripts/atop-log-check.sh with the contents:
#!/bin/bash
# Exit 0 if today's atop raw log exists, 1 otherwise.
if [ -f "/var/log/atop/atop_$( date '+%Y%m%d' )" ]; then
    exit 0
else
    exit 1
fi
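To try the check by hand before wiring it into monit, the same test can be wrapped in a function that takes the log directory as an argument. This is only a sketch: the function name atop_log_exists and its directory argument are made up for illustration and are not part of atop or monit.

```shell
#!/bin/bash
# Hypothetical helper: succeed if today's atop raw log exists in the given
# directory (defaults to /var/log/atop, the path used in the script above).
atop_log_exists() {
    local dir="${1:-/var/log/atop}"
    [ -f "${dir}/atop_$(date '+%Y%m%d')" ]
}

# Report the result without exiting the current shell.
if atop_log_exists; then
    echo "atop log present"
else
    echo "atop log missing"
fi
```

Pointing the function at a scratch directory makes it possible to see both outcomes without touching the real logs.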
Set the script's mode to 500 (chmod 500 /etc/monit/scripts/atop-log-check.sh), then update the atop monit config to:
check process atop with pidfile /var/run/atop.pid
  group atop
  start program = "/usr/sbin/service atop start"
  stop program = "/usr/sbin/service atop stop"
check file atop-log-check path /etc/monit/scripts/atop-log-check.sh
  group atop
  if changed checksum then alert
  if failed permission 500 then alert
  if failed uid root then alert
check program atop_log path /etc/monit/scripts/atop-log-check.sh
  group atop
  if status != 0 then restart
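Before reloading monit, it may help to confirm that the script already satisfies the permission and owner tests in the config, so the first cycle does not raise alerts. The following is a rough sketch: the function name script_is_locked_down and its arguments are hypothetical, and it assumes GNU stat as shipped with Debian.

```shell
#!/bin/bash
# Hypothetical helper: check that a file has the expected mode and owner,
# mirroring the "permission 500" and "uid root" tests in the monit config.
# Relies on GNU stat's -c option.
script_is_locked_down() {
    local path="$1" want_mode="${2:-500}" want_owner="${3:-root}"
    [ "$(stat -c '%a' "$path")" = "$want_mode" ] &&
        [ "$(stat -c '%U' "$path")" = "$want_owner" ]
}

script=/etc/monit/scripts/atop-log-check.sh
if [ -e "$script" ] && script_is_locked_down "$script"; then
    echo "$script: mode and owner OK"
else
    echo "$script: missing, or wrong mode/owner"
fi
```

Once that passes, monit -t checks the control file syntax and monit reload makes monit re-read it.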
Upvotes: 2