Ashik Mohammed

Reputation: 1109

Nginx grok pattern for logstash

Following is my Nginx log format

log_format timed_combined '$http_x_forwarded_for - $remote_user [$time_local] ' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent" ' '$request_time $upstream_response_time $pipe';

Following is Nginx log entry(for reference)

- - test.user [26/May/2017:21:54:26 +0000] "POST /elasticsearch/_msearch HTTP/1.1" 200 263 "https://myserver.com/app/kibana" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36" 0.020 0.008 .

Following is the logstash grok pattern

NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip} - - \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}

Error found in logstash log

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"26/May/2017:19:28:14 -0400\" is malformed at \"/May/2017:19:28:14 -0400\"

Issue: Nginx logs are not getting grokked.
Requirement: The timestamp should be parsed into its own field.

What's wrong with my configuration, and how can I fix this error?

Upvotes: 3

Views: 15787

Answers (2)

Anton

Reputation: 720

Here are the grok patterns for the NGINX access.log and error.log files.

filter {

############################# NGINX ##############################
  if [event][module] == "nginx" {

########## access.log ##########
    if [fileset][name] == "access" {
      grok {
        match => { "message" => ["%{IPORHOST:ip} - %{DATA:user_name} \[%{HTTPDATE:time}\] \"%{WORD:http_method} %{DATA:url} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code} %{NUMBER:body_sent_bytes} \"%{DATA:referrer}\" \"%{DATA:agent}\""] }
        remove_field => "message"
      }
      date {
        match => ["time", "dd/MMM/YYYY:HH:mm:ss Z"]
        target => "@timestamp"
        remove_field => "time"
      }
      useragent {
        source => "agent"
        target => "user_agent"
        remove_field => "agent"
      }
      geoip {
        source => "ip"
        target => "geoip"
      }
    }

########## error.log ##########
    else if [fileset][name] == "error" {
      grok {
        match => { "message" => ["%{DATA:time} \[%{DATA:log_level}\] %{NUMBER:pid}#%{NUMBER:tid}: (\*%{NUMBER:connection_id} )?%{GREEDYDATA:messageTmp}"] }
        remove_field => "message"
      }
      date {
        match => ["time", "YYYY/MM/dd HH:mm:ss"]
        target => "@timestamp"
        remove_field => "time"
      }

      mutate {
        rename => {"messageTmp" => "message"}
      }
    }

    mutate {
      remove_field => "[event]"
    }

    mutate {
      add_field => {"serviceName" => "nginx"}
    }
  }
}
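Note that the [event][module] and [fileset][name] conditionals above assume the logs are shipped by Filebeat with its nginx module enabled, since that module is what populates those fields. A minimal sketch of the module configuration (the log paths are assumptions for a typical install):

```yaml
# modules.d/nginx.yml -- Filebeat's nginx module tags each event with
# event.module == "nginx" and fileset.name == "access" / "error",
# which the Logstash conditionals above rely on.
- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]  # assumed path
  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]   # assumed path
```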

Also for Tomcat: https://gist.github.com/petrov9/4740c61459a5dcedcef2f27c7c2900fd

Upvotes: 2

breml

Reputation: 31

The log line you provided does not match the default NGINXACCESS grok pattern because of two differences:

  1. The pattern expects an IP address or hostname as the first element, but your log line starts with a dash (-), presumably because $http_x_forwarded_for was empty.
  2. The third element in your log line is a username, but the grok pattern expects a literal dash (-) in that position.

So there are two ways to resolve this:

  1. Make sure your log lines match the default pattern, or
  2. Change the grok pattern to something like this:

NGINXACCESS - - %{USERNAME:username} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time} %{NUMBER:upstream_time}
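Even once grok matches, the mapper_parsing_exception from Elasticsearch may persist, because the raw HTTPDATE string (e.g. "26/May/2017:21:54:26 +0000") is being indexed into a field whose mapping cannot parse that format. Converting it with a date filter before indexing should resolve this; a minimal sketch using the timestamp field captured by the pattern above:

```
filter {
  date {
    # Parse the HTTPDATE string captured by grok into a proper date
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    target => "@timestamp"          # write the parsed date to @timestamp
    remove_field => ["timestamp"]   # drop the raw string once parsed
  }
}
```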

I suggest using the Grok Debugger to develop and debug grok patterns. It allows you to create and test them incrementally.

Upvotes: 2
