Indrid

Reputation: 1192

Logstash grok custom pattern yields no fields

Newbie alert!

I have the following grok filter:

filter {

        grok {
                match   =>  [ "message","%{DATESTAMP:timestamp}" ]
                match   =>  [ "message", "(?<number_after_timestamp>[0-9]{8}\s\w+)"]
                match   =>  [ "message", "(?<error_or_debug>(ERROR|DEBUG))"]
                match   =>  [ "message", "(?<first_part>ORB\.thread\.pool.*(?=\s{2}))" ]
                match   =>  [ "message", "(?<exception_class_name>(?<=\<Exception class name\=\s).*?\>)" ]
                match   =>  [ "message", "(?<exception_message>(?<=\<Exception message\=).*?(?=\>))" ]

        }   


}

Individually, every one of those patterns matches exactly the chunk of text I need when tested in the grok debugger, where the capture name is used as the field name and emitted just fine. However, when I run this filter in Logstash over the same log events I used in the debugger, none of the fields or their values are emitted.

For example the exception class name pattern yields:

{
    "exception_class_name": [
    [
      "com.ultatica.bd.exceptions.TTFException>"
    ]
   ]
}

But when run against the same data from the Logstash command line: not a sausage!

Would really appreciate any help.

Thanks

The log file looks like this:

[30/09/14 23:07:15:195 BST] 00000043 SystemOut O ERROR 32109 Tue Sep 30 23:07:15 BST 2014 ORB.thread.pool : 2 webuser com.ultra.bd.services.UltraticoCustomerService.processRequest API getPerson  <Exception class name= com.Ultratico.bd.exceptions.UCOException> <Exception message= e05CX432182S> <UCOException Error = 32109>

[30/09/14 23:07:15:200 BST] 00000043 SystemOut O ERROR 32109 Tue Sep 30 23:07:15 BST 2014 ORB.thread.pool : 2 webuser com.Ultratico.ecrm.framework.sessionHandler.UltraticoSessionHandler.execute  <Exception class name= com.Ultratico.bd.exceptions.UCOException> <Exception message= e05CX432182S> <UCOException Error = 32109>

Upvotes: 1

Views: 8532

Answers (2)

Andrew Corkery

Reputation: 1024

Looking at the input, I would parse it with a single match expression that covers the whole line. In Logstash I use multiple match patterns only as a way of parsing different types of log entries, where a given entry matches one pattern and not the others, e.g.:

  1. match pattern 1 => NO
  2. match pattern 2 => YES
  3. match pattern 3 => NO

So for your example I would do something like:

filter {
    grok {      
        break_on_match => false
        match => [ "message", "%{SYSLOG5424SD:timestamp} %{NUMBER:number_after_timestamp} (?<forget1>.*) (?<error_or_debug>ERROR|DEBUG) %{NUMBER:process_id} (?<timestamp_2>.{7} \d{2} \d{2}:\d{2}:\d{2} \w{3} \d{4}) %{JAVACLASS:origin} : (?<first_part>.*) %{JAVACLASS:exception_class_name} (?<exception_message>.*)" ]
        match => [ "message", "..some other pattern you want to extract.." ]
        match => [ "message", "..some other pattern you want to extract.." ]            
    }
}

It could be tidied up, but you get the gist.

Upvotes: 4

Alcanzar

Reputation: 17155

The syntax for grok is match => [ "field", "pattern1", "pattern2", "pattern3",...,"patternN"]. Multiple match settings inside one grok block won't work because Logstash loads them into a hash, so only the last one is used.
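
As a rough sketch of the single-setting form, two of the question's patterns could be listed under one match. This uses the hash notation shown in the grok filter documentation rather than the array notation above, on the assumption that your Logstash version accepts it. The patterns are copied from the question unchanged.

filter {
    grok {
        # Try every pattern in the list instead of stopping at the first hit.
        break_on_match => false
        # One match setting: the field name mapped to a list of patterns.
        match => {
            "message" => [
                "(?<error_or_debug>(ERROR|DEBUG))",
                "(?<exception_message>(?<=\<Exception message\=).*?(?=\>))"
            ]
        }
    }
}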

To do it the way you are showing, you would need to create multiple grok blocks, each with break_on_match => false. It would be better to use the first form and write full patterns that match whole lines, so that you avoid the inevitable _grokparsefailure tags.
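
The multiple-block alternative might look like the sketch below, again using two patterns copied from the question. Each block runs independently, so a line that fails one block's pattern is tagged _grokparsefailure by that block even if the other block matches.

filter {
    # First block: extracts only the log level.
    grok {
        break_on_match => false
        match => [ "message", "(?<error_or_debug>(ERROR|DEBUG))" ]
    }
    # Second block: extracts only the exception message.
    grok {
        break_on_match => false
        match => [ "message", "(?<exception_message>(?<=\<Exception message\=).*?(?=\>))" ]
    }
}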

Upvotes: 3
