Reputation: 4882
I have a log file produced by a Spring application. It contains three formats. The first two are single-line formats: if a line contains the keyword app-info, the message was printed by our own developers; if not, it was printed by the Spring framework. We may want to treat developers' messages differently from the Spring framework's. The third format is a multiline stack trace.
Here is an example of our own format:
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89
The line above contains the keyword app-info, so it was printed by our own developers.
2018-04-27 10:42:23 [RMI TCP Connection(10)-127.0.0.1] - INFO - org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring FrameworkServlet 'dispatcherServlet'
The line above does not contain the keyword app-info, so it was printed by the Spring framework.
In my grok filter, the first pattern is for messages printed by the Spring framework, the second is for developers' messages, and the third is for multiline stack traces. I want the first regex to state explicitly that the Spring framework pattern must not contain the keyword app-info, so that such a line fails to parse there and falls through to the second pattern, which is the developers' own format. I tried the following in a regex tool, but I got a compile error. My regex is as follows:
(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[^((?app-info).)*\s\.\w\-\'\:\d\[\]\/]+)
In the grok filter, I follow the instructions from this link:
filter {
  grok {
    match => [ "message", "PATTERN1", "PATTERN2" , "PATTERN3" ]
  }
}
My current Logstash configuration is as follows; its first pattern does not explicitly exclude app-info:
filter {
  grok {
    match => [
      "message",
      '(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[\s\.\w\-\'\:\d\[\]\/^[app-info]]+)',
      '(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s(?<appinfo>app-info)\s-\s(?<systemmsg>[\w\d\:\{\}\,\-\(\)\s\"]+)',
      '(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\w\-\d]+)\]\s-\s(?<loglevel>[\w]+)\s\-\s(?<appinfo>app-info)\s-\s(?<params>params):(?<jsonstr>[\"\w\d\,\:\.\{\}]+)\s(?<exceptionname>[\w\d\.]+Exception):\s(?<exceptiondetail>[\w\d\.]+)\n\t(?<extralines>at[\s\w\.\d\~\?\n\t\(\)\_\[\]\/\:\-]+)\n\d'
    ]
  }
}
With the Logstash configuration above, when handling
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89
The first pattern (the Spring framework pattern) already matches, so the line never reaches the second pattern, which is our own developers' format. The line is parsed successfully as follows:
{
  "timestamp": [
    [
      "2018-04-27 10:42:49"
    ]
  ],
  "threadname": [
    [
      "http-nio-8088-exec-1"
    ]
  ],
  "loglevel": [
    [
      "INFO"
    ]
  ],
  "systemmsg": [
    [
      "app-info - injectip ip 192.168.16.89\n\n"
    ]
  ]
}
Any hints on how I could make the first pattern state explicitly that systemmsg must not contain the keyword "app-info"?
My goal is that if the keyword app-info is absent, pattern 1 handles the log line; if it is present, pattern 2 handles it.
With the following log line, which does not contain the keyword app-info (so pattern 1 should match),
2018-04-27 10:42:23 [RMI TCP Connection(10)-127.0.0.1] - INFO - org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring FrameworkServlet 'dispatcherServlet'
I get no match with the first pattern modified according to your suggestion, which is not what I want:
(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[^(?:(?!app\-info).)*\s\.\w\-\'\:\d\[\]\/]+)
See demo. My goal is to extract the timestamp, thread name, log level and system message, but the first pattern does not give me the expected result. The tool says there is no match.
If I remove ^(?:(?!app-info).)*, the log line above (without the keyword app-info) parses correctly. See demo. But now the pattern also matches log lines that contain the keyword app-info, which is not expected, because for those I want to extract timestamp, threadname, loglevel, app-info (as its own field or group), and then systemmsg. I expect the first pattern to fail so that the second pattern handles the line. The demo shows that the pattern also matches a log line with the keyword app-info, and systemmsg absorbs app-info into its value, which is not expected.
So I want pattern 1 to handle log lines without the keyword app-info and pattern 2 to handle log lines with it. In other words, pattern 1 should fail to parse whenever the line contains the keyword app-info.
Upvotes: 2
Views: 3617
Reputation: 1053
I used GREEDYDATA for this. Suppose you have the following log line:
Redirect Controller: successful redirection for click data: {a:123, b:345}
and you want to capture everything up to "data"; then use GREEDYDATA as follows:
%{GREEDYDATA}data:%{SPACE}%{rest of the pattern}
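As a rough illustration, the stock grok patterns define GREEDYDATA as `.*` and SPACE as `\s*`, so the expression above can be approximated with a plain regex. The sketch below uses Python's re module rather than Logstash itself, and the group names `prefix` and `rest` are hypothetical stand-ins for the unspecified rest of the pattern.

```python
import re

# GREEDYDATA is `.*` and SPACE is `\s*` in the standard grok pattern set.
# The group names are made up for this sketch; Python writes (?P<name>...)
# where grok writes (?<name>...).
PATTERN = re.compile(r"(?P<prefix>.*)data:\s*(?P<rest>.*)")

line = "Redirect Controller: successful redirection for click data: {a:123, b:345}"
m = PATTERN.match(line)

prefix = m.group("prefix")  # everything before "data:"
rest = m.group("rest")      # everything after "data:" and optional spaces
print(repr(prefix))
print(repr(rest))
```

Because `.*` is greedy, a line containing "data:" more than once would be split at the last occurrence.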
Upvotes: 0
Reputation: 18743
My goal is to let pattern 1 handle log lines without the keyword app-info. If app-info is present, the first pattern should fail to parse, so that the second pattern can handle the line.
You can use the following as your first pattern:
(?<data>^(?!.*app-info).*)%{LOGLEVEL:log}%{DATA:other_data}%{IP:ip}$
What it does is reject the log line if app-info appears in it at any position, so that grok moves on to the second pattern.
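A small sketch of why the bracketed attempts in the question behave differently from a lookahead placed outside brackets. It uses Python's re module as a stand-in, which is an assumption: Logstash uses a different regex engine, so the exact error text there will differ. Anything written inside [...] is read as a set of single characters, so a lookahead written there is not treated as a lookahead.

```python
import re

# The systemmsg character classes from the two attempts in the question.
first_attempt = r"[^((?app-info).)*\s\.\w\-\'\:\d\[\]\/]+"
second_attempt = r"[^(?:(?!app\-info).)*\s\.\w\-\'\:\d\[\]\/]+"


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Inside [...] the text "p-i" is read as a character range, and it is reversed.
first_ok = compiles(first_attempt)

# With the hyphen escaped the class compiles, but it is a negated set that
# excludes every word character, so it cannot match text such as "org.apache".
second_ok = compiles(second_attempt)
matches_word = re.match(second_attempt, "org.apache") is not None

# A lookahead outside any brackets is a real assertion on the whole line.
guard = re.compile(r"^(?!.*app-info).*$")
with_keyword = "2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89"
without_keyword = "2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO injectip ip 192.168.16.89"

print(first_ok, second_ok, matches_word)
print(guard.match(with_keyword) is None, guard.match(without_keyword) is not None)
```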
Log without app-info:
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO injectip ip 192.168.16.89
You can filter it as per your requirements.
OUTPUT
{
  "data": [
    [
      "2018-04-27 10:42:49 [http-nio-8088-exec-1] - "
    ]
  ],
  "log": [
    [
      "INFO"
    ]
  ],
  "other_data": [
    [
      " injectip ip "
    ]
  ],
  "ip": [
    [
      "192.168.16.89"
    ]
  ]
}
Now a log line with app-info:
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO app-info injectip ip 192.168.16.89
OUTPUT
No Matches
Please test it here
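To connect this back to the two sample lines in the question, here is a minimal sketch of the fall-through behaviour, written in Python as a stand-in for grok (which by default also tries its patterns in order and stops at the first match). The patterns are simplified versions of the asker's pattern 1 and pattern 2, not the exact originals; the names `HEAD`, `SPRING`, `APP` and `parse` are hypothetical.

```python
import re

# Python spells named groups (?P<name>...); grok uses (?<name>...).
# Shared prefix: timestamp, [thread name], log level, then " - ".
HEAD = (
    r"(?P<timestamp>[\d\-\s:]+)\s\[(?P<threadname>[^\]]+)\]"
    r"\s-\s(?P<loglevel>\w+)\s+-\s+"
)

# Pattern 1: the leading lookahead rejects any line that contains app-info.
SPRING = re.compile(r"^(?!.*app-info)" + HEAD + r"(?P<systemmsg>.*)$")
# Pattern 2: requires app-info as its own field.
APP = re.compile("^" + HEAD + r"(?P<appinfo>app-info)\s-\s(?P<systemmsg>.*)$")


def parse(line):
    """Try the patterns in order and return the first one that matches."""
    for name, pattern in (("spring", SPRING), ("app", APP)):
        m = pattern.match(line)
        if m:
            return name, m.groupdict()
    return None, {}


app_line = "2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89"
spring_line = (
    "2018-04-27 10:42:23 [RMI TCP Connection(10)-127.0.0.1] - INFO - "
    "org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - "
    "Initializing Spring FrameworkServlet 'dispatcherServlet'"
)

app_kind, app_fields = parse(app_line)
spring_kind, spring_fields = parse(spring_line)
print(app_kind, app_fields)
print(spring_kind, spring_fields)
```

Putting the lookahead at the very start of pattern 1 means a line with app-info is rejected before any field is captured, so systemmsg can no longer absorb the keyword.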
EDIT 2:
If you make PATTERN1 equal to (?<data>^(?!.*app-info).*), you will get:
{
  "data": [
    [
      "2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO injectip ip 192.168.16.89"
    ]
  ]
}
You can then add a second grok filter for the data field as follows:
grok {
  match => {"data" => "DEFINE PATTERN HERE"}
}
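The two-step idea can be sketched outside Logstash as follows. This is only an illustration in Python: `STAGE2` is a hypothetical pattern standing in for "DEFINE PATTERN HERE", and `two_stage` is a made-up helper name.

```python
import re

# Stage 1: capture the whole line as "data" only when app-info is absent.
STAGE1 = re.compile(r"(?P<data>^(?!.*app-info).*)")

# Stage 2: hypothetical pattern applied to the captured "data" field.
STAGE2 = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<threadname>[^\]]+)\] - (?P<loglevel>[A-Z]+) (?P<systemmsg>.*)"
)


def two_stage(line):
    """Return the parsed fields, or None if either stage does not match."""
    first = STAGE1.match(line)
    if first is None:
        return None  # the line contains app-info; leave it to another pattern
    second = STAGE2.match(first.group("data"))
    return second.groupdict() if second else None


plain = two_stage("2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO injectip ip 192.168.16.89")
tagged = two_stage("2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO app-info injectip ip 192.168.16.89")
print(plain)
print(tagged)
```

In Logstash the second grok block, reading from the data field, plays the role of stage 2 here.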
Upvotes: 1