Reputation: 1111
I am trying to parse the following multi-line output with a regex:
>>> a = """
... Feature 101
... Learning: Yes
... --------------
... Feature 102
... Learning: No
... """
I get only one match. Shouldn't it return both, since I used re.MULTILINE|re.DOTALL?
>>> import re
>>> re.findall('.*Feature\s*(\d+).*Learning\s*:\s*(\w+).*', a, re.MULTILINE|re.DOTALL)
[('102', 'No')]
Appreciate the help!
Upvotes: 0
Views: 38
Reputation: 124648
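The problem is the greedy .* (all 3 of them in the regex). With re.DOTALL, . also matches newlines, so the leading .* consumes the whole text and backtracks only as far as the last "Feature", which leaves a single match covering everything.
If you make them all non-greedy by appending a ? (change them to .*?), you'll get all the results you expected: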
>>> re.findall(r'.*?Feature\s*(\d+).*?Learning\s*:\s*(\w+).*?', a, re.MULTILINE|re.DOTALL)
[('101', 'Yes'), ('102', 'No')]
Also, it's always good to use raw strings with r'...' for regular expressions, so that backslash sequences such as \s and \d reach the regex engine unchanged.
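To see the two behaviours side by side, here is a minimal sketch using the same input. The variable names greedy and lazy are only illustrative. Dropping re.MULTILINE and the outer wildcards in the second pattern is my simplification, not something the fix requires: re.MULTILINE only changes how ^ and $ behave, and neither appears in the pattern, while re.findall already scans the whole string for matches.

```python
import re

a = """
Feature 101
Learning: Yes
--------------
Feature 102
Learning: No
"""

# Greedy: with DOTALL the leading .* runs to the end of the text and
# backtracks only as far as the last "Feature", so one match covers everything.
greedy = re.findall(r'.*Feature\s*(\d+).*Learning\s*:\s*(\w+).*', a, re.DOTALL)

# Non-greedy: .*? expands only as far as needed to reach the next "Learning".
# The outer wildcards are omitted because findall searches the whole string.
lazy = re.findall(r'Feature\s*(\d+).*?Learning\s*:\s*(\w+)', a, re.DOTALL)

print(greedy)  # [('102', 'No')]
print(lazy)    # [('101', 'Yes'), ('102', 'No')]
```

The inner .*? still needs re.DOTALL here, because "Feature" and "Learning" sit on different lines.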
Upvotes: 2