Reputation: 18790
My log entries have the following structure:
a b c|
For example:
Mozilla 5.0 white|
should be matched and extracted into something like
a: Mozilla, b: 5.0, c: white
But some entries in my log look like this:
iOS|
which should be interpreted as
a: iOS, b: null, c: null
I am using Python 3's re module and matching with named groups, (?P<name>...).
Is there any way to achieve this?
Upvotes: 0
Views: 220
Reputation: 2316
>>> import re
>>> m = re.match(r"(?P<a>[^\s]+)(\s+(?P<b>[^\s]+))?(\s+(?P<c>[^\s]+))?\s*\|", "iOS|")
>>> m.groups()
('iOS', None, None, None, None)
>>> m.groupdict()
{'c': None, 'a': 'iOS', 'b': None}
>>> m = re.match(r"(?P<a>[^\s]+)(\s+(?P<b>[^\s]+))?(\s+(?P<c>[^\s]+))?\s*\|", "Mozilla 5.0 white|")
>>> m.groups()
('Mozilla', ' 5.0', '5.0', ' white', 'white')
>>> m.groupdict()
{'c': 'white', 'a': 'Mozilla', 'b': '5.0'}
UPDATE:
I noticed that the previous version included the spaces in the returned groups. I had folded the \s+ into the (?P<...>...) groups to save a couple of bytes, but it had that side effect. I have fixed that and also made the pattern tolerant of spaces before the final '|'.
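For reference, here is a minimal sketch of the same idea using non-capturing groups, (?:...), around the optional parts, so that groups() returns only the three named fields and never the leading whitespace. The names LOG_RE and parse_entry are made up for this example, and \S is used as shorthand for [^\s].

```python
import re

# LOG_RE and parse_entry are hypothetical names used only for this sketch.
LOG_RE = re.compile(
    r"(?P<a>\S+)"            # first field is required
    r"(?:\s+(?P<b>\S+))?"    # optional second field; the separator is not captured
    r"(?:\s+(?P<c>\S+))?"    # optional third field
    r"\s*\|"                 # optional trailing spaces, then the closing pipe
)

def parse_entry(line):
    """Return a dict with keys a, b, c (missing fields are None), or None if no match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

print(parse_entry("Mozilla 5.0 white|"))
print(parse_entry("iOS|"))
```

Compiling the pattern once is just a convenience if many log lines are parsed; re.match with the pattern string behaves the same way.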
Upvotes: 2
Reputation: 107347
You can put your group names in a list, like the following:
>>> pattern = ['a', 'b', 'c']
Then use re.findall() to find all the relevant parts, and use zip and dict to build the corresponding dictionary:
>>> s = "IOS|"
>>> import re
>>> dict(zip(pattern, re.findall(r'([^\s]+)?\s?([^\s]+)?\s?([^\s]+)?\|', s)[0]))
{'a': 'IOS', 'c': '', 'b': ''}
>>>
>>> s = "Mozilla 5.0 white|"
>>>
>>> dict(zip(pattern, re.findall(r'([^\s]+)?\s?([^\s]+)?\s?([^\s]+)?\|', s)[0]))
{'a': 'Mozilla', 'c': 'white', 'b': '5.0'}
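Note that re.findall() reports optional groups that did not participate in the match as empty strings, not None. If you want None for the missing fields, as in the question, a small wrapper along these lines might do. This is only a sketch: FIELDS, PATTERN and parse_with_findall are hypothetical names, and \S stands in for [^\s].

```python
import re

# FIELDS, PATTERN and parse_with_findall are hypothetical names for this sketch.
FIELDS = ["a", "b", "c"]
PATTERN = re.compile(r"(\S+)?\s?(\S+)?\s?(\S+)?\|")

def parse_with_findall(line):
    """Return a dict of the three fields, with None for missing ones, or None if no match."""
    matches = PATTERN.findall(line)
    if not matches:
        return None
    # findall() yields '' for groups that did not match, so map '' to None
    return {name: (value or None) for name, value in zip(FIELDS, matches[0])}

print(parse_with_findall("IOS|"))
print(parse_with_findall("Mozilla 5.0 white|"))
```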
Upvotes: 2