Reputation: 45
We are monitoring our production environments using Zabbix 2.4. New instances are provisioned with Ansible, which also sets up a Zabbix agent. We need hosts to be removed from the server once they have been terminated, so that we only receive messages about running instances becoming unavailable.
To do this I wrote a Python script that takes a Zabbix host name as an argument, calls awscli to check whether that host is on the list of running instances, and deletes the host if it is not on the "not terminated" list.
I put the script in /usr/bin/delete_host.py and configured an action to call it when an "Agent not available" trigger fires. This is what the Operation tab looks like: link
And here is the Action Log: link
I've tried a couple of ways of writing the command and also placed the script in the ExternalScripts directory. I turned on debug logging for the server, but nothing in the log mentioned an error. In fact, it only showed messages saying that the command was being executed and everything was OK, yet the host is still there. When I copy the command from the Action Log and execute it manually, everything works fine.
At this point I am out of ideas for troubleshooting this further. I disabled SELinux and added the zabbix user to the sudoers file with NOPASSWD. I can't find anything in any logs. Is it even possible to execute non-messaging scripts with Zabbix?
Upvotes: 1
Views: 17946
Reputation: 4153
The script does not have to be in the ExternalScripts directory; that is only required for items of type "external check". The operation screenshot you linked to uses the relative path delete_host.py, and that is almost guaranteed not to work. Your action log screenshot shows a few entries prefixed with /usr/bin/, which is better.
At least for testing, make sure to specify the full path to everything, including the python binary, for example /full/path/to/python /full/path/to/delete_host.py.
You also had a few entries that redirected all output to a file in /tmp/, but you didn't mention what got logged there. Please use that approach and check for potential error messages as well.
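If shell redirection turns out to be awkward inside the action command, the same information can be captured from within the script. What follows is only a rough sketch, not your actual script: the helper names describe_environment and run_logged are made up for illustration, and the paths in the usage comments are placeholders. It records the environment the script really runs under and the exit code and output of each external command.

```python
import logging
import os
import subprocess
import sys


def describe_environment():
    """Collect the details that often differ between a login shell and a daemon."""
    return {
        "uid": os.getuid(),
        "cwd": os.getcwd(),
        "python": sys.executable,
        "PATH": os.environ.get("PATH", ""),
        "HOME": os.environ.get("HOME", ""),
    }


def run_logged(cmd, log):
    """Run cmd (a list that starts with an absolute path).

    Logs and returns (exit_code, combined_output).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages with normal output
            universal_newlines=True,
        )
    except OSError as exc:  # e.g. the binary is missing or not executable
        log.error("could not start %s: %s", cmd[0], exc)
        return 127, str(exc)
    log.info("%s exited with %d: %s", cmd[0], proc.returncode, proc.stdout.strip())
    return proc.returncode, proc.stdout


# Possible use at the top of the script (file name and aws path are placeholders):
#   logging.basicConfig(filename="/tmp/delete_host.log", level=logging.INFO)
#   log = logging.getLogger("delete_host")
#   log.info("environment: %s", describe_environment())
#   code, output = run_logged(["/full/path/to/aws", "--version"], log)
```

Comparing the logged environment from a manual run with the one from a Zabbix-triggered run should show whether the user, PATH, or HOME differs between the two.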
Upvotes: 0
Reputation: 1618
Try to write the script so that it prints "OK" or 0 if it ran properly, and an error message or error code if it fails. Run the script using an active Zabbix agent item on the Zabbix server host (use the system.run item key). That way you can create a trigger that raises an alert if the script fails to run.
You can also just schedule it using a different tool such as Rundeck.
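To illustrate the "print OK or an error" idea, here is a minimal sketch. The names report and delete_host_placeholder are hypothetical, and the placeholder stands in for whatever the real script does. The point is only that every run ends with a single status line and a matching exit code that an item can collect.

```python
import sys


def report(task):
    """Run task() and return (exit_code, one_line_status) for a monitoring item."""
    try:
        task()
    except Exception as exc:  # report every failure instead of dying silently
        return 1, "ERROR: {0}: {1}".format(type(exc).__name__, exc)
    return 0, "OK"


def delete_host_placeholder():
    """Hypothetical stand-in for the real clean-up logic."""
    pass


# Possible use at the bottom of the script:
#   code, status = report(delete_host_placeholder)
#   print(status)
#   sys.exit(code)
```

A system.run item would then store the printed line as its value, and a trigger can fire whenever that value is something other than "OK".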
Upvotes: 0