I am using a Logstash + Elasticsearch stack to aggregate logs from a few interrelated apps.
I am trying to get Monit to alert whenever the word 'ERROR' is returned as part of an Elasticsearch REST query from Monit, but the 'content' regex check does not seem to be working for me. (I am sending email and SMS alerts from Monit via M/Monit.)
I know my Monit and M/Monit instances are configured properly because I can get alerts for server pings and file checksum changes, etc. just fine.
My Monit Elasticsearch HTTP query looks like this:
    check host elasticsearch_error with address 12.34.56.789
        if failed
            url http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-2d
            and content = "ERROR"
        then alert
BTW, %20 escapes a space and %3A escapes ':'.
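For anyone checking the escapes by hand, here is a minimal sketch using Python's standard urllib.parse. The variable names are only illustrative. It shows how the query string maps to its percent-encoded form, and that the URL in the check above leaves '>' unescaped.

```python
from urllib.parse import quote, unquote

# Query string as a human would type it (taken from the question).
raw_query = "severity:ERROR AND timestamp:>now-2d"

# safe="" asks quote() to escape every reserved character,
# including ':' (%3A), space (%20) and '>' (%3E).
encoded = quote(raw_query, safe="")
print(encoded)

# The query portion used in the Monit check; '>' was left unescaped there.
from_question = "severity%3AERROR%20AND%20timestamp%3A>now-2d"

# Decoding it gives back the same human-readable query.
print(unquote(from_question))
```

Decoding a URL this way is a quick sanity check that the escaped form still says what you intended.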
My Logstash data only has error log entries that are between one and two days old, i.e., when I run
http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-2d
in the browser, I see errors (with the word 'ERROR') in the response body, but when I run
http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-1d
I do not. (Note the one-day difference.) This is expected behavior. Note: my response body is a JSON with the "ERROR" string in a child element a few levels down. I don't know if this affects how Monit processes the regex.
When I run the check as above I see
'elasticsearch_error' failed protocol test [HTTP] at
INET[12.34.56.789:9200/_search
q=severity%3AERROR%20AND%20timestamp%3A>now-2d]
via TCP -- HTTP error: Regular expression doesn't match:
regexec() failed to match
in the log. Good. Content == "ERROR" is true. I can alert from this (even though I find the "Connection failed" message in the Monit browser dashboard a little irritating... it should be something like "Regex failure").
The Problem
When I 'monit reload' and run the check with
url http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-1d
I STILL get the "regexec() failed to match" error as above. Note that this query returns no "ERROR" string in the response body, so Content == "ERROR" is false. Why does this check fail? Any light shed on this issue will be appreciated!
The Answer
Turns out this problem is about URL encoding for the Elasticsearch query.
I used url http://12.34.56.789:9200/_search?q=severity:ERROR&timestamp:>now-36d in the check to get Monit to make a request that looks like 12.34.56.789:9200/_search?q=severity:ERROR&timestamp:%3Enow-36d. Note the change in encoding. This seems to work.
The actual URL used by Monit can be seen by starting Monit in debug mode using monit -vI.
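Once you have the URL from the debug output, it can help to compare it with the one in the config after decoding both. The sketch below is only an illustration: decoded_query is a hypothetical helper, and the "observed" URL is an assumed example of what a debug line might contain, not real Monit output.

```python
from urllib.parse import urlsplit, unquote

def decoded_query(url: str) -> str:
    """Return the query part of a URL with percent-escapes decoded."""
    return unquote(urlsplit(url).query)

# URL as written in the Monit check from the question.
configured = ("http://12.34.56.789:9200/_search"
              "?q=severity%3AERROR%20AND%20timestamp%3A>now-1d")

# Hypothetical URL copied from debug output, with '>' escaped as %3E.
observed = ("http://12.34.56.789:9200/_search"
            "?q=severity%3AERROR%20AND%20timestamp%3A%3Enow-1d")

print(decoded_query(configured))

# True when both URLs carry the same logical query after decoding.
print(decoded_query(configured) == decoded_query(observed))
```

If the two decoded queries differ, the request Monit sends is not the one you tested in the browser.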
Side Question
The 'content' object seems to respect '=' and '==' and '!='. '=' is referenced in the documentation, but a lot of third-party examples use '=='. What is the most correct use?
Side Question Answer
The helpful folks on the M/Monit team advise that "=" is an alias for "==" in the Monit configuration file.
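To make the operator behaviour concrete, here is a rough Python model of the content test. It assumes, based on the "Regular expression doesn't match" log message quoted in the question, that with "=" or "==" the test is reported as failed when the pattern does not match, and that "!=" inverts this. content_test_failed is a hypothetical name, and Python's re module is not the POSIX regex engine Monit uses, so treat this as an approximation rather than Monit's actual implementation.

```python
import re

def content_test_failed(body: str, pattern: str, operator: str = "=") -> bool:
    """Model of a content test: True means the test counts as failed."""
    matched = re.search(pattern, body) is not None
    if operator in ("=", "=="):   # "=" is treated as an alias for "=="
        return not matched        # assumed: failure when the pattern is absent
    if operator == "!=":
        return matched            # assumed: failure when the pattern is present
    raise ValueError(f"unsupported operator: {operator!r}")

with_error = '{"hits": [{"severity": "ERROR"}]}'
without_error = '{"hits": []}'

print(content_test_failed(with_error, "ERROR", "="))      # pattern present
print(content_test_failed(without_error, "ERROR", "=="))  # pattern absent
print(content_test_failed(with_error, "ERROR", "!="))     # inverted test
```

Under this reading, a check meant to alert when "ERROR" appears in the response would use "!=", but that is worth confirming against the Monit documentation for your version.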