Anoop

Reputation: 903

Unable to extract fields from a log line containing a mix of JSON and non-JSON data using grok in Logstash

I am running a couple of Spring Boot applications in Docker containers. Since I don't want to log to files, I am instead logging to the console and then using logspout to forward the logs to Logstash. I am using logstash-logback-encoder to log all logs from the application in JSON format.

Apart from these, there are also some logs (console outputs) which are made by the docker container before starting the Spring Boot application. These are not in JSON format.

To both of these, Logspout appends metadata (container name, container id, etc) before sending to Logstash. Below are my example logs in both formats.

  1. Direct from the container (non-JSON)

<14>1 2016-12-01T12:58:20Z 903c18d47759 com-test-myapp 31635 - - Setting active profile to test

  2. Application logs (in JSON format)

<14>1 2016-12-01T13:08:13Z 903c18d47759 com-test-myapp 31635 - - {"@timestamp":"2016-12-01T13:08:13.651+00:00","@version":1,"message":"Some log message goes here","logger_name":"com.test.myapp.MyClass","thread_name":"http-nio-8080-exec-1","level":"DEBUG","level_value":10000,"HOSTNAME":"903c18d47759"}

Below is my Logstash grok configuration.

input {
  tcp {
    port => 5000
    type => "logspout-syslog-tcp"
  }
}
filter {
  if [type] == "logspout-syslog-tcp" {
    grok {
      match => {
        "message" => [
          "<%{NUMBER:syslogPriority}>1 %{TIMESTAMP_ISO8601:eventTimestamp} %{BASE16NUM:containerId} %{DATA:containerName} %{NUMBER:containerPort} - - %{DATA:jsonLog}",
          "<%{NUMBER:syslogPriority}>1 %{TIMESTAMP_ISO8601:eventTimestamp} %{BASE16NUM:containerId} %{DATA:containerName} %{NUMBER:containerPort} - - %{DATA:regularLog}"
        ]
      }
    }

    json {
      source => "jsonLog"
      target => "parsedJson"
      remove_field => ["jsonLog"]
    }

    mutate {
      add_field => {
        "level" => "%{[parsedJson][level]}"
        "thread" => "%{[parsedJson][thread_name]}"
        "logger" => "%{[parsedJson][logger_name]}"
        "message" => ["%{[parsedJson][message]}"]
      }
    }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout { codec => rubydebug }
}

Based on this, I was hoping to have each field in the JSON available as a filter in Elasticsearch/Kibana. But I am not able to get the values of those fields; the event shows up in Kibana as in the attached screenshot (not included here).

I am not sure what I am missing here. How should I go about extracting the fields from JSON? Also, is the grok filter correct for handling both JSON and non-JSON logs?

Thanks, Anoop

Upvotes: 2

Views: 999

Answers (1)

baudsp

Reputation: 4100

The problem is with the %{DATA:jsonLog} part. The DATA pattern is defined as .*?, which is non-greedy: at the end of a grok expression it matches as little as possible, i.e. nothing, so it never captures anything and the jsonLog field is never created. You'll need to use the GREEDYDATA pattern (.*) instead.
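
For example, replacing DATA with GREEDYDATA in the payload capture keeps the rest of the pattern from your question unchanged (a minimal sketch, untested):

grok {
  match => {
    "message" => "<%{NUMBER:syslogPriority}>1 %{TIMESTAMP_ISO8601:eventTimestamp} %{BASE16NUM:containerId} %{DATA:containerName} %{NUMBER:containerPort} - - %{GREEDYDATA:jsonLog}"
  }
}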

See http://grokconstructor.appspot.com/do/match#result to test your patterns.
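
Note also that your two patterns are identical apart from the name of the last field, so the first one will match every line and regularLog will never be set. A common workaround, sketched here with an illustrative field name (logPayload, not anything from your config), is to capture the payload once and only run the json filter when the payload looks like a JSON object:

grok {
  match => {
    "message" => "<%{NUMBER:syslogPriority}>1 %{TIMESTAMP_ISO8601:eventTimestamp} %{BASE16NUM:containerId} %{DATA:containerName} %{NUMBER:containerPort} - - %{GREEDYDATA:logPayload}"
  }
}
# Only parse the payload as JSON when it starts with an opening brace
if [logPayload] =~ /^\{/ {
  json {
    source => "logPayload"
    target => "parsedJson"
    remove_field => ["logPayload"]
  }
}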

Upvotes: 2
