Reputation: 33
I have this log message:
"sid-cmascioieiow89322&New*Sou,th%20Skvn%20and%20ir&o,n%20Age,Mozilla/5.0 (Linux; Android 6.0; CHM-U01 Build/HonorCHM-U01) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36"
And the pattern:
"(?<p1>[^&]*)&(?<p2>[^,]*),%{GREEDYDATA:User_Agent}"
The problem is that p2 can contain zero, one, or more than one comma. I want p2 to match up to the last comma before User_Agent, because User_Agent itself sometimes contains commas.
This is the grok debugger link: https://grokdebug.herokuapp.com/
Current output:
{
"p1": [
"sid-cmascioieiow89322"
],
"p2": [
"New*Sou"
],
"User_Agent": [
"th%20Skvn%20and%20ir&o,n%20Age,Mozilla/5.0 (Linux; Android 6.0; CHM-U01 Build/HonorCHM-U01) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36"
]
}
I want output like this:
{
"p1": [
"sid-cmascioieiow89322"
],
"p2": [
"New*Sou,th%20Skvn%20and%20ir&o,n%20Age"
],
"User_Agent": [
"Mozilla/5.0 (Linux; Android 6.0; CHM-U01 Build/HonorCHM-U01) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.98 Mobile Safari/537.36"
]
}
Thank you for your help.
Upvotes: 3
Views: 2336
Reputation: 626738
The part of the string that you want to capture into p2 contains no whitespace. Thus, instead of the [^,]* pattern, which matches zero or more characters other than a comma, you may use \S*, which matches zero or more non-whitespace characters, as many as possible. As a result, \S*, matches up to the last comma in the streak of non-whitespace characters.
(?<p1>[^&]*)&(?<p2>\S*),%{GREEDYDATA:User_Agent}
             ^^^^^^^^^^
This is how this regex matches your log data:
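The match can also be checked outside the grok debugger. The following is a minimal sketch in Python, not grok itself: it assumes that Python's re module behaves like grok's regex engine for these simple constructs, it spells named groups as (?P<name>...), and it approximates %{GREEDYDATA:User_Agent} with a named group around .* (the names ORIGINAL, FIXED, LOG and parse are made up for this illustration).

```python
import re

# Python writes named groups as (?P<name>...).
# grok's %{GREEDYDATA:User_Agent} is approximated by a named group around .*
ORIGINAL = re.compile(r"(?P<p1>[^&]*)&(?P<p2>[^,]*),(?P<User_Agent>.*)")
FIXED = re.compile(r"(?P<p1>[^&]*)&(?P<p2>\S*),(?P<User_Agent>.*)")

# The sample log line from the question, split only for readability.
LOG = (
    "sid-cmascioieiow89322&New*Sou,th%20Skvn%20and%20ir&o,n%20Age,"
    "Mozilla/5.0 (Linux; Android 6.0; CHM-U01 Build/HonorCHM-U01) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/52.0.2743.98 Mobile Safari/537.36"
)


def parse(pattern, line):
    """Return the named captures as a dict, or None if the line does not match."""
    m = pattern.match(line)
    return m.groupdict() if m else None


original = parse(ORIGINAL, LOG)
fixed = parse(FIXED, LOG)

print("original p2:", original["p2"])   # stops at the first comma
print("fixed p2:   ", fixed["p2"])      # runs to the last comma before the first space
print("User_Agent: ", fixed["User_Agent"])
```

Note that this relies on the sample's shape: the first whitespace in the line appears inside the User_Agent value, and the first token of User_Agent (here Mozilla/5.0) contains no comma. If either assumption fails for other log lines, p2 would end at a different comma.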
Upvotes: 1