Alasdair

Reputation: 51

M/Monit config file variable expansion

On a Debian 8 server I have monit 5.9-1 set up and monitoring several services. I also want to monitor atop 1.26-2, which I can do with the following config:

check process atop with pidfile /var/run/atop.pid
    group system
    group atop
    start program = "/usr/sbin/service atop start"
    stop program  = "/usr/sbin/service atop stop"

This works fine. However, I have occasionally noticed the following entry in /var/log/messages:

traps: atop[8810] trap divide error ip:40780a sp:7ffdf663cdc8 error:0 in atop[400000+26000]

When this happens, atop does not create the daily log file /var/log/atop/atop_$( date '+%Y%m%d' ), so running atop -r 20160127 -b 15:00 produces the output:

/var/log/atop/atop_20160127 - open raw file: No such file or directory

I have been trying to get monit to check for the log file and restart atop if it is missing, by changing the above config to:

date=$( date '+%Y%m%d' )
check process atop with pidfile /var/run/atop.pid
    group atop
    start program = "/usr/sbin/service atop start"
    stop program  = "/usr/sbin/service atop stop"
    depend on atop_log

check file atop_log with path /var/log/atop/atop_$date
    group atop

Monit does not complain, but it does not expand the variable either.

Does anyone know whether this is possible and, if so, how to do it?

Upvotes: 1

Views: 1398

Answers (1)

Alasdair

Reputation: 596

The solution I found is to have monit invoke a bash script that checks whether the file exists.

Create the file /etc/monit/scripts/atop-log-check.sh with the contents:

#!/bin/bash
if [ -f "/var/log/atop/atop_$( date '+%Y%m%d' )" ]; then
    exit 0
else
    exit 1
fi
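
To try the check without touching /var/log/atop, the same test can be wrapped in a function that takes the log directory as an argument. This is only a sketch: the function name atop_log_exists and its directory parameter are illustrative and are not part of atop or monit.

```shell
#!/bin/bash
# Sketch only: atop_log_exists and its directory argument are hypothetical
# names introduced for illustration.
atop_log_exists() {
    # Default to the directory used in the script above.
    log_dir="${1:-/var/log/atop}"
    # The daily raw log is named atop_YYYYMMDD.
    [ -f "${log_dir}/atop_$(date '+%Y%m%d')" ]
}

# Exercise the check against a throwaway directory.
tmp_dir="$(mktemp -d)"
if atop_log_exists "$tmp_dir"; then echo "log present"; else echo "log missing"; fi
touch "${tmp_dir}/atop_$(date '+%Y%m%d')"
if atop_log_exists "$tmp_dir"; then echo "log present"; else echo "log missing"; fi
rm -rf "$tmp_dir"
```

The first call should report the log as missing and the second as present, which mirrors the exit codes 1 and 0 that monit sees from the script.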

Set its mode to 500 (chmod 500), then update the atop monit config to:

check process atop with pidfile /var/run/atop.pid
    group atop
    start program = "/usr/sbin/service atop start"
    stop program  = "/usr/sbin/service atop stop"

check file atop-log-check path /etc/monit/scripts/atop-log-check.sh
    group atop
    if changed checksum then alert
    if failed permission 500 then alert
    if failed uid root then alert

check program atop_log path /etc/monit/scripts/atop-log-check.sh
    group atop
    if status != 0 then restart
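
The permission and uid checks above expect the script to have mode 500 and to be owned by root. A minimal sketch of setting and confirming that, assuming GNU coreutils stat as shipped on Debian; the helper name lock_down_script is hypothetical, and the demo runs on a temporary file rather than the real script:

```shell
#!/bin/bash
# Sketch only: lock_down_script is a hypothetical helper name.
lock_down_script() {
    # Changing the owner to root only works when running as root.
    if [ "$(id -u)" -eq 0 ]; then
        chown root:root "$1"
    fi
    chmod 500 "$1"
    # Print the octal mode and owner so they can be compared with the monit checks.
    stat -c '%a %U' "$1"
}

# Demo on a temporary file.
demo_file="$(mktemp)"
lock_down_script "$demo_file"
rm -f "$demo_file"
```

On the server this would be run as root with /etc/monit/scripts/atop-log-check.sh as the argument.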

Upvotes: 2
