cesex

Reputation: 11

Parsing log data through grok filter (logstash)

I'm pretty new to ELK, and I'm trying to parse my logs through Logstash. The logs are sent by Filebeat.

The logs look like this:

2019.12.02 16:21:54.330536 [ 1 ] {} <Information> Application: starting up
2020.03.21 13:14:54.941405 [ 28 ] {xxx23xx-xxx23xx-4f0e-a3c6-rge3gu1} <Debug> executeQuery: (from [::ffff:192.0.0.0]:9999) blahblahblah
2020.03.21 13:14:54.941469 [ 28 ] {xxx23xx-xxx23xx-4f0e-a3c6-rge3gu0} <Error> executeQuery: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 1

My default logstash configuration is:

input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  }
}

From my log example, I want to extract the following fields: timestamp, code, pipelineId, logLevel, program, message.

But I have several problems with my grok pattern. First, the timestamp in the logs is quite different from a classic timestamp. How can I get it recognized? I also have problems with the {} part, which can be empty or not. Can you give me some advice on what the correct grok pattern should be?

Also, in Kibana I see A LOT of information, such as hostname, OS details, agent details, source, etc. I've read that these fields are ES metadata, so it's not possible to remove them. I find it's a lot of information though; is there any way to "hide" these fields?

Upvotes: 1

Views: 1017

Answers (1)

little_pinecone

Reputation: 66

Grok pattern

Below is the pattern I constructed for your example logs, tested in the Grok Debugger:

[screenshot: Grok Debugger showing the pattern matched against the sample log lines]

Is this the result you're looking for?

Logstash config

# logstash.conf
…
filter {
    grok {
        # Look up custom pattern definitions (CUSTOM_DATE, see below) in ./patterns
        patterns_dir => ["./patterns"]
        match => {
            "message" => "%{CUSTOM_DATE:timestamp}\s\[\s%{BASE10NUM:code}\s\]\s\{%{GREEDYDATA:pipeline_id}\}\s\<%{GREEDYDATA:log_level}\>\s%{GREEDYDATA:program_message}"
        }
    }
}
…
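
For reference, running this pattern over the second sample line from the question should yield roughly these fields (a sketch of the Grok Debugger output; the field names come from the match above):

{
  "timestamp":       "2020.03.21 13:14:54.941405",
  "code":            "28",
  "pipeline_id":     "xxx23xx-xxx23xx-4f0e-a3c6-rge3gu1",
  "log_level":       "Debug",
  "program_message": "executeQuery: (from [::ffff:192.0.0.0]:9999) blahblahblah"
}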

Custom pattern

As you can see, I told grok to look for my custom patterns in the patterns directory, which I put in the same location as my logstash.conf file. In this directory I created a custom.txt file with the following content:

# patterns/custom.txt
CUSTOM_DATE (?>\d\d){1,2}\.(?:0?[1-9]|1[0-2])\.(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])\s(?!<[0-9])(?:2[0123]|[01]?[0-9]):(?:[0-5][0-9])(?::(?:(?:[0-5][0-9]|60)(?:[:.,][0-9]+)?))(?![0-9])

I didn't write this long pattern on my own. I started with this line:

CUSTOM_DATE %{YEAR}\.%{MONTHNUM}\.%{MONTHDAY}\s%{TIME}

Then I replaced every predefined pattern with its corresponding regular expression (one by one, directly in the Grok Debugger). You can use %{YEAR}\.%{MONTHNUM}\.%{MONTHDAY}\s%{TIME} in your application as well, but the Grok Debugger interface will then print every part separately.
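
Note that grok only extracts the timestamp as a string. To get it recognized as the actual event date (the first thing asked in the question), you can pass the extracted field through Logstash's date filter. A minimal sketch, assuming the default @timestamp target is what you want; Logstash stores timestamps with millisecond precision, so the last three fractional digits are truncated:

# logstash.conf
…
filter {
    grok {
        …
    }
    date {
        # Joda-style format matching e.g. "2020.03.21 13:14:54.941405"
        match => ["timestamp", "yyyy.MM.dd HH:mm:ss.SSSSSS"]
        target => "@timestamp"
    }
}
…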

Do you want to remove empty fields?

I don't know what you want to do when the pipeline_id field is empty. If you want to remove it completely, you can try adding the following lines to your config:

# logstash.conf
…
filter {
    grok {
        …
    }
    if [pipeline_id] == "" {
        mutate {
            remove_field => ["pipeline_id"]
        }
    }
}
…
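
As for the "hiding" part of the question: most of those extra fields (hostname, agent details, input, etc.) are added by Filebeat rather than Elasticsearch, so they can be dropped in the same filter block. A hedged sketch, assuming the usual Filebeat field names; check the exact names in your own events in Kibana before removing anything:

# logstash.conf
…
filter {
    …
    mutate {
        # Typical Filebeat-added fields; keep any you actually use.
        remove_field => ["agent", "ecs", "input", "log", "host"]
    }
}
…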


Upvotes: 0
