Roy
Roy

Reputation: 11

Grok Pattern not working in Logstash

After parsing logs I am find there are some new lines at the end of the message

Sample message

ts:2016-04-26 05-02-16-018 CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx|pli:null|msg: xxxxxxxxxxxxxxxxxxxxxxxxxxx|pl: xxxxxxxxxxxxxxxxxxxxxxxxxxx

TAKE 1 xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx

I am using the regex pattern below as suggested below as answers

ts:(?(([0-9]+)-)+ ([0-9]+-)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\smsg:%{GREEDYDATA:msg}(\|pl:(?(.|\r|\n)))

But unfortunately it is not working properly when the last part of the log is not present

ts:2016-04-26 05-02-16-018 CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx

What should be the correct pattern?

-------------------Previous Question --------------------------------------

I am trying to parse log line such as this one.

ts:2016-04-26 05-02-16-018 CDT|ll:TRACE|tid:10000.140|scf:xxxxxxxxxxxxxxxxxxxxxxxxxxx.pc|mn:null|fn:xxxxxxxxxxxxxxxxxxxxxxxxxxx|ln:749|auid:xxxxxxxxxxxxxxxxxxxxxxxxxxx|eid:xxx.xxx.xxx.xxx-58261618-1-1461664935955-139|cid:900009865|ml:null|mid:-99|uip:xxx.xxx.xxx.xxx|hip:xxx.xxx.xxx.xxx|pli:null|msg: xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxx

Below is my logstash filter

filter {
    grok {
        match => ["mesage", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{WORD:tid}\|scf:%{WORD:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{WORD:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{WORD:mid}\|uip:%{WORD:uip}\|hip:%{WORD:hip}\|pli:%{WORD:pli}\|msg:%{WORD:msg}"]
    }
    date {
        match => ["ts","yyyy-MM-dd HH-mm-ss-SSS ZZZ"]
        target => "@timestamp"
    }
}

I am getting "_grokparsefailure"

Upvotes: 0

Views: 1149

Answers (2)

baudsp
baudsp

Reputation: 4110

I have tested the configuration from @HAL, there was a few things to change:

  • In the grok filter mesage => message
  • In the date filter ts => date so the date parsing is on the right field
  • The CDT is a time zone name, it is captured by z in the date syntax.

So the right configuration would look like this :

filter{
    grok {
        match => ["message", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}"]
    }
    date {
        match => ["date","yyyy-MM-dd HH-mm-ss-SSS z"]
        target => "@timestamp"
    }
}

Upvotes: 1

HAL
HAL

Reputation: 2071

Tried to parse your input via grokdebug with your expression but it failed to read out any fields. Managed to get it to work by changing the expression to:

ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}

I also think that you need to change the name of the column that logstash shall parse from mesage to message.

Also, the date parsing pattern should match the format of the date in the input. There is no timezone identity (ZZZ) in your input data (at least not in the example).

Something like this should work better (not tested though):

filter {
    grok {
        match => ["mesage", "ts:(?<date>(([0-9]+)-*)+ ([0-9]+-*)+ [A-Z]+)\|ll:%{WORD:ll}\|tid:%{NUMBER:tid}\|scf:%{DATA:scf}\|mn:%{WORD:mn}\|fn:%{WORD:fn}\|ln:%{WORD:ln}\|auid:%{WORD:auid}\|eid:%{DATA:eid}\|cid:%{WORD:cid}\|ml:%{WORD:ml}\|mid:%{NUMBER:mid}\|uip:%{DATA:uip}\|hip:%{DATA:hip}\|pli:%{WORD:pli}\|\s*msg:%{GREEDYDATA:msg}"]
    }
    date {
        match => ["ts","yyyy-MM-dd HH-mm-ss-SSS"]
        target => "@timestamp"
    }

}

Upvotes: 0

Related Questions