Eddie

Reputation: 33

Including additional measurements with the Telegraf plugin inputs.logparser using "grok" patterns (or regex)

I am using the Telegraf plugin [[inputs.logparser]] to collect the access_log data from Apache for a web page I am running locally.

Using the ["%{COMBINED_LOG_FORMAT}"] pattern, I am able to retrieve the default measurements provided by the access_log, including http_version, request, resp_bytes, etc.

I have amended the LogFormat in httpd.conf to append the response time to each access_log record by adding %D at the end. This has worked: the value appears in the access_log after the change.

However, I have so far been unable to get Telegraf to pick up this new measurement with inputs.logparser. I am using a Grafana dashboard with InfluxDB to monitor this data, and the response time has not yet appeared as an additional measurement.

So far I have attempted the following:

The first [[inputs.logparser]] section remains the same throughout my attempts and is always present and active. I assume this is needed in order to obtain the default measurements?

######## default logparser using COMBINED to obtain default access_log measurements ######
# Stream and parse log file(s).
[[inputs.logparser]]
  files = ["/var/log/httpd/access_log"]
  from_beginning = true

  ## Parse logstash-style "grok" patterns:
  [inputs.logparser.grok]

    patterns = ["%{COMBINED_LOG_FORMAT}"]
    measurement = "apache_access_log"
    custom_patterns = '''
    '''

Attempt 1 at matching the response time appended to access_log:

############# Grok/RegEx for matching response time ######################
# Stream and parse log file(s).
[[inputs.logparser]]
  ## Log files to parse.
  files = ["/var/log/httpd/access_log"]
  from_beginning = true

  ## Parse logstash-style "grok" patterns:
  [inputs.logparser.grok]
    patterns = ["%{METRICS_INCLUDE_RESPONSE}"]

    measurement = "apache_access_log"
    custom_patterns = '''
    METRICS_INCLUDE_RESPONSE [%{NUMBER:resp}]
    '''

For my second attempt, I tried a plain regular expression:

############# Grok/RegEx for matching response time ######################
# Stream and parse log file(s).
[[inputs.logparser]]
  ## Log files to parse.
  files = ["/var/log/httpd/access_log"]
  from_beginning = true

  ## Parse logstash-style "grok" patterns:
  [inputs.logparser.grok]
    patterns = ["%{METRICS_INCLUDE_RESPONSE}"]   
    measurement = "apache_access_log"
    custom_patterns = '''
    METRICS_INCLUDE_RESPONSE [%([0-9]{1,3})]
    '''

After both of these attempts, the default measurements are still recorded and grabbed fine by Telegraf, but the response time does not appear as an additional measurement.

I believe the issue is the syntax of my custom grok pattern: it is not matching as I intended because I am not telling it to pull the correct information. But I am unsure.

I have provided an example of the access_log output below. All details are picked up by Telegraf without issue under COMBINED_LOG_FORMAT, except for the number at the end, which is the response time.

10.30.20.32 - - [09/Jan/2020:11:08:14 +0000] "POST /404.php HTTP/1.1" 200 252 "http://10.30.10.77/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 600
10.30.20.32 - - [09/Jan/2020:11:08:15 +0000] "POST /boop.html HTTP/1.1" 200 76 "http://10.30.10.77/404.php" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36" 472

Upvotes: 1

Views: 1452

Answers (1)

Derrick Paul

Reputation: 51

You are essentially extending a predefined pattern, so the pattern should be written like this (assuming the response time value is enclosed in square brackets in the log):

######## default logparser using COMBINED to obtain default access_log measurements ######
# Stream and parse log file(s).
[[inputs.logparser]]
  files = ["/var/log/httpd/access_log"]
  from_beginning = true

  ## Parse logstash-style "grok" patterns:
  [inputs.logparser.grok]
    patterns = ["%{COMBINED_LOG_FORMAT} \\[%{NUMBER:responseTime:float}\\]"]
    measurement = "apache_access_log"
    custom_patterns = '''
    '''

You will get the response time value in a field named 'responseTime', stored as a float.
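
Note that the sample lines in the question end with a bare number (for example `" 600`) rather than a bracketed one, so the bracketed pattern above may not match them. Here is a minimal sketch of a variant for that case, assuming the LogFormat ends with a single space followed by %D. The field name responseTime is carried over from above, and the int modifier is my own choice, since Apache's %D reports the time taken in whole microseconds:

```toml
# Sketch only: assumes each access_log line ends with " <number>" (no brackets),
# as in the sample lines from the question.
[[inputs.logparser]]
  files = ["/var/log/httpd/access_log"]
  from_beginning = true

  [inputs.logparser.grok]
    # Extend the predefined pattern with a trailing number.
    # No escaping is needed here because there are no literal brackets to match.
    patterns = ["%{COMBINED_LOG_FORMAT} %{NUMBER:responseTime:int}"]
    measurement = "apache_access_log"
```

This also suggests why the attempts in the question did not match: in a grok pattern, as in a regular expression, unescaped square brackets define a character class rather than literal bracket characters.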

Upvotes: 2
