Reputation: 155
I have been trying to find matches where there may be optional words in the string that need to be ignored if they are present.
The code I tried is:
import re

# Sample input: three phrases that should all be found.
text = '''
topping consensus estimates
topping analysis' consensus estimate
topping estimate
'''

for m in re.finditer(r'(?P<p3c>topping\s+(?:\w+\s(?!estimate)){0,2}(estimate))', text):
    print(m.group())
print('done')
I want to get all three cases found in the string, but I only get the last one. I want to skip up to two words between topping and estimate, but I cannot guarantee that they will be analysis and consensus. I tried (?:\w+\s(?!estimate)){0,2} to skip up to two words, but it is not working for some reason.
Upvotes: 2
Views: 108
Reputation: 271575
You don't need to get "topping estimate" as the result. What you really want is to check whether each line starts with topping, followed by two or fewer words, then estimate or estimates.
This regex will help you:
^topping(\s\S+){0,2}\sestimates?\s*$
Match this against each line, or against the whole multi-line string if you turn on the m (multiline) flag, which is re.MULTILINE in Python. It will tell you whether the string satisfies the requirement.
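As a minimal sketch of the per-line approach, the regex above can be applied to each line of the question's sample input. The names PATTERN, text, and matches are illustrative, and the fourth sample line (three intervening words) is a hypothetical addition to show a rejected case.

```python
import re

# The pattern from the answer above, compiled once for reuse.
PATTERN = re.compile(r"^topping(\s\S+){0,2}\sestimates?\s*$")

# Sample lines from the question, plus one hypothetical line with three
# words between "topping" and "estimate", which should be rejected.
text = """
topping consensus estimates
topping analysis' consensus estimate
topping estimate
topping the latest consensus estimate
"""

# Test each line on its own, so \s cannot run across a line break.
matches = [line for line in text.splitlines() if PATTERN.match(line)]

for line in matches:
    print(line)
```

Checking line by line sidesteps one subtlety of the multiline variant: \s also matches newline characters, so a match against the whole string can pick up a trailing line break.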
Upvotes: 4