Reputation: 2076
I have log entries that look like this...
2014-02-25 00:00:03,936 INFO - something happened...bla bla bla
2014-02-25 00:00:03,952 INFO - ***Request Completed*** [ 78.002] mS [http://cloud.mydomain.local/schedule/search?param=45]
2014-02-25 00:00:04,233 INFO - something else happened...bla bla bla
I have a grok filter that correctly parses the lines...
    grok {
      match => [ "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}" ]
    }
I'd like to parse additional data out of 'body' if 'body' begins with "***Request Completed***", namely 'elapsedms' and 'uri'. How can I do this?
Elsewhere it was suggested that I add another message entry to the grok filter like this...
    grok {
      match => [
        "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- \*\*\*Request Completed\*\*\* \[%{SPACE}%{NUMBER:elapsedms}\] mS \[%{URI:uri}\]",
        "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}"
      ]
    }
...this works, but for timing lines the value of 'body' does NOT get set. Ideally, 'body' would always contain the last part of the entry and, if and only if the entry is a timing line, elapsedms and uri would be parsed out as well.
Any ideas how I can do this?
Is there a way to parse a field, so that I could attempt to parse 'body' into elapsedms/uri and simply continue if that fails? Or is there a way to nest field matches in the grok expression?
Thoughts?
Edit: Rather than making sure 'body' is always set, could I just create body from 'elapsedms' and 'uri' if 'elapsedms' is set?
Upvotes: 4
Views: 5808
Reputation: 113
Here is a better way that is known to work in Logstash 1.5.3:
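    # First pass: split every line into logdate, severity and body.
    grok {
      match => [
        "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}"
      ]
    }
    # if body is set (which should always be true, but it's good to check anyway)
    if [body] {
      # Second pass: parse the timing fields out of body, not message.
      grok {
        # true is the default; shown here only for illustration
        break_on_match => true
        match => [
          "body", "\*\*\*Request Completed\*\*\* \[%{SPACE}%{NUMBER:elapsedms}\] mS \[%{URI:uri}\]"
        ]
      }
    }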
This way, every record will have a body field, but only the lines that contain "***Request Completed***" will have elapsedms and uri fields. You can continue this logic with sub-sub fields and sub-sub-sub fields as far down into the weeds as you like.
I also included the "break_on_match" option in case that is helpful. You can set it to either true or false.
The key is to use the body field (or whichever field you're parsing) as the match source rather than message.
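One side effect worth knowing about: when a second grok runs on every event that has a body, non-timing lines fail that match and pick up the default _grokparsefailure tag. A possible refinement, offered as a sketch rather than a tested configuration, is to guard the second grok with a regex conditional so that only timing lines reach it. The tag name _timingparsefailure is made up for this example, and the :float suffix (which asks grok to store the number as a float instead of a string) is an optional extra, not something the original snippet does.

```conf
filter {
  # Stage 1: split every line into logdate, severity and body.
  grok {
    match => [
      "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}"
    ]
  }

  # Stage 2: only events whose body starts with the marker are parsed further,
  # so ordinary lines are never tagged as grok failures.
  if [body] =~ /^\*\*\*Request Completed\*\*\*/ {
    grok {
      match => [
        "body", "^\*\*\*Request Completed\*\*\* \[%{SPACE}%{NUMBER:elapsedms:float}\] mS \[%{URI:uri}\]"
      ]
      # Hypothetical tag name: marks timing lines whose layout was unexpected.
      tag_on_failure => [ "_timingparsefailure" ]
    }
  }
}
```

With this layout, body is present on every event, elapsedms and uri appear only on timing lines, and a failure tag on an event means a line carried the marker but did not have the expected shape.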
Upvotes: 1
Reputation: 653
I believe you need to use the break_on_match option within grok and set it to false: http://logstash.net/docs/1.4.2/filters/grok#break_on_match
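A rough sketch of what that could look like follows. It assumes a Logstash version in which break_on_match => false causes every pattern listed for a field to be tried; that is not verified here for 1.4.2, so treat it as something to test rather than a confirmed fix. The second pattern deliberately captures only the timing fields, because capturing logdate or severity a second time may turn those fields into arrays.

```conf
grok {
  # Keep trying the remaining patterns after the first one matches.
  break_on_match => false
  match => [
    # Generic pattern: sets logdate, severity and body on every line.
    "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}",
    # Timing pattern: adds elapsedms and uri when the line contains them.
    "message", "\*\*\*Request Completed\*\*\* \[%{SPACE}%{NUMBER:elapsedms}\] mS \[%{URI:uri}\]"
  ]
}
```

If this behaves as assumed, no conditional or mutate step is needed, since body comes from the first pattern and the timing fields from the second.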
Upvotes: 0
Reputation: 2076
This works. Is there a better way?
    grok {
      match => [
        "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- \*\*\*Request Completed\*\*\* \[%{SPACE}%{NUMBER:elapsedms}\] mS \[%{URI:uri}\]",
        "message", "%{TIMESTAMP_ISO8601:logdate} %{WORD:severity}%{SPACE}- %{GREEDYDATA:body}"
      ]
    }
    # if body is NOT set (timing line) make one
    if ![body] {
      mutate {
        # rebuild body from the fields captured by the first pattern
        add_field => [ "body", "***Request Completed*** [%{elapsedms}] mS [%{uri}]" ]
      }
    }
Upvotes: 3