Reputation: 3175
I am trying to parse some logs which return responses in a key-value format. I only want the value of the last key-value pair (Rs: {".."}). The information I want is enclosed inside the curly braces.
What I have done is to use regex to match anything inside the curly braces like this:
import re
log = '2016-10-13 17:04:50 - info - uri:"GET x/y/z" ip:1.1.1.1 Rs:{"data": "blah blah"}'
text = re.compile(r"Rs:\{(.*)\}").search(log).group(1)
print (text)
>>> "data": "blah blah"
# Desired results
>>> {"data": "blah blah"}
However there are some issues doing it this way:
I also want the opening and closing curly braces to be included.
This method doesn't work if there are other opening ("{") or closing ("}") curly braces before or inside the Rs value.
Is there a better way to do this?
Upvotes: 1
Views: 242
Reputation: 6281
The first part is easy: just move the capturing parens out a little bit, so that they include the braces. Use this as your regex:
r"Rs:(\{.*\})"
The other problem is more complicated. If you want the rest of the line (starting at {), then
r'Rs:(\{.*)\Z'
would get you what you want.
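As a minimal sketch of that second pattern, the snippet below runs it against a made-up log line whose Rs value contains nested braces. The log line and the variable names are assumptions for illustration, and the approach only makes sense if Rs really is the last field on the line.

```python
import re

# Hypothetical log line: the Rs value contains nested braces.
log = '2016-10-13 17:04:50 - info - uri:"GET x/y/z" ip:1.1.1.1 Rs:{"data": {"id": 1}}'

# Capture everything from the "{" after "Rs:" up to the end of the string.
match = re.search(r'Rs:(\{.*)\Z', log)
rs_value = match.group(1) if match else None

print(rs_value)  # {"data": {"id": 1}}
```

Because the capture runs to the end of the string, inner braces do not matter, but anything that follows the Rs value on the same line would be captured as well.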
Upvotes: 1
Reputation: 626932
It seems that you need two things: adjust the first capturing group's boundaries to include the curly braces, and replace the greedy .* with a lazy version or, as in the code below, a negated character class (in case there are multiple values in the string). I also recommend checking whether there is a match first if you are using re.search, or just use re.findall:
import re
log = '2016-10-13 17:04:50 - info - uri:"GET x/y/z" ip:1.1.1.1 Rs:{"data": "blah blah"}'
text = re.compile(r"Rs:({[^}]*})").search(log)
if text:
    print(text.group(1))
# or
print(re.findall(r"Rs:({[^}]*})", log))
See the Python demo online
Pattern details:
Rs: - the literal text Rs followed by a :
({[^}]*}) - Group 1, capturing:
  { - a literal {
  [^}]* - 0+ chars other than } (a negated character class)
  } - a literal }
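To make the difference between the greedy, lazy, and negated-class forms concrete, here is a small sketch. The log lines are invented for illustration and are not taken from the question.

```python
import re

# Hypothetical log line with a second brace-delimited field after Rs.
log = 'ip:1.1.1.1 Rs:{"a": 1} extra:{"b": 2}'

greedy = re.search(r"Rs:(\{.*\})", log).group(1)      # runs to the last "}"
lazy = re.search(r"Rs:(\{.*?\})", log).group(1)       # stops at the first "}"
negated = re.search(r"Rs:(\{[^}]*\})", log).group(1)  # also stops at the first "}"

print(greedy)   # {"a": 1} extra:{"b": 2}
print(lazy)     # {"a": 1}
print(negated)  # {"a": 1}

# Nested braces: the lazy and negated forms both cut the value short.
nested = 'Rs:{"data": {"id": 1}}'
truncated = re.search(r"Rs:(\{[^}]*\})", nested).group(1)
print(truncated)  # {"data": {"id": 1}
```

The last result shows the limit of this approach: if the Rs value can contain nested braces, neither the lazy nor the negated pattern returns a balanced value, so a different strategy would be needed for such input.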
Upvotes: 0