Peter Bloom

Monit HTTP response content regex behavior

I am using a Logstash + Elasticsearch stack to aggregate logs from a few interrelated apps.

I am trying to get Monit to alert whenever the word 'ERROR' is returned as part of an Elasticsearch REST query from Monit, but the 'content' regex check does not seem to be working for me. (I am sending email and SMS alerts from Monit via M/Monit.)

I know my Monit and M/Monit instances are configured properly because I can get alerts for server pings and file checksum changes, etc. just fine.

My Monit Elasticsearch HTTP query looks like this:

check host elasticsearch_error with address 12.34.56.789
    if failed 
      url http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-2d 
      and content = "ERROR" 
    then alert

Note: %20 is the percent-encoding of a space, and %3A is the percent-encoding of ':'.
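
For reference, these escapes can be reproduced with Python's standard library. This is only a sketch for checking the encoding by hand. The variable names are illustrative and Monit itself is not involved. Note that '>' also has an escape (%3E), which the URL above leaves unencoded.

```python
from urllib.parse import quote

# The query string as it would be typed unescaped.
query = "severity:ERROR AND timestamp:>now-2d"

# safe="" forces every reserved character (':', ' ', '>') to be escaped.
encoded = quote(query, safe="")
print(encoded)
```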

My Logstash data only contains error log entries that are between one and two days old; i.e., when I open

http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-2d

in the browser, I see errors (with the word 'ERROR') in the response body, but when I run

http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-1d

I do not. (Note the one-day difference.) This is expected behavior. Note: the response body is JSON, with the "ERROR" string in a child element a few levels down. I don't know whether this affects how Monit processes the regex.
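
On the nested-JSON point: the log output further down mentions regexec(), which suggests the pattern is applied to the raw response text rather than to parsed JSON, so nesting depth should not matter. A minimal sketch of that idea, using Python's re module as a stand-in for Monit's matcher and a made-up response body (the field names are hypothetical, not real Elasticsearch output):

```python
import json
import re

# Hypothetical response bodies; the structure and field names are invented
# for illustration and are not copied from a real Elasticsearch reply.
body = json.dumps({"hits": {"hits": [{"_source": {"severity": "ERROR"}}]}})
empty_body = json.dumps({"hits": {"hits": []}})

def content_matches(text: str, pattern: str) -> bool:
    """Return True if the pattern occurs anywhere in the raw text."""
    return re.search(pattern, text) is not None

print(content_matches(body, "ERROR"))
print(content_matches(empty_body, "ERROR"))
```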

When I run the check as above I see

'elasticsearch_error' failed protocol test [HTTP] at 
INET[12.34.56.789:9200/_search
q=severity%3AERROR%20AND%20timestamp%3A>now-2d] 
via TCP -- HTTP error: Regular expression doesn't match:
regexec() failed to match

in the log. Good: content == "ERROR" is true, and I can alert from this (although I find the "Connection failed" message in the Monit browser dashboard a little irritating; it should say something like "Regex failure").

The Problem

When I 'monit reload' and run the check with

url http://12.34.56.789:9200/_search?q=severity%3AERROR%20AND%20timestamp%3A>now-1d

I STILL get the "regexec() failed to match" error shown above. Note that this response body contains no "ERROR" string, so content == "ERROR" is false. Why does this check fail? Any light shed on this issue will be appreciated!

The Answer

It turns out the problem was the URL encoding of the Elasticsearch query.

In the check I used url http://12.34.56.789:9200/_search?q=severity:ERROR&timestamp:>now-36d, which makes Monit send a request that looks like 12.34.56.789:9200/_search?q=severity:ERROR&timestamp:%3Enow-36d. Note the change in encoding: the URL in the check is written unescaped, and the '>' appears as %3E in the request Monit sends. This seems to work.

The actual URL used by Monit can be seen by starting Monit in debug mode with monit -vI.
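
One thing worth checking about the working URL: in a query string, '&' separates parameters, so it is not equivalent to the ' AND ' used earlier. A small sketch with Python's standard library, using the request string quoted above, shows how a generic URL parser splits it. How Elasticsearch treats the second parameter is not verified here.

```python
from urllib.parse import parse_qsl, unquote

# Query part of the request quoted in the answer above.
sent = "q=severity:ERROR&timestamp:%3Enow-36d"

# %3E decodes back to '>'.
print(unquote(sent))

# '&' splits the string into two parameters; only the first belongs to q.
params = parse_qsl(sent, keep_blank_values=True)
print(params)
```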

Side Question

The 'content' test seems to accept '=', '==' and '!='. '=' is the form referenced in the documentation, but a lot of third-party examples use '=='. Which is the preferred form?

Side Question Answer

The helpful folks on the M/Monit team advise that "=" is an alias for "==" in the Monit configuration file.


Answers (1)

Peter Bloom

I added the solution I found to my question above.
