Reputation: 10675
I want to write a regex that matches a word that might not be present in the text. I read here that I should try something like this:
import re
line = "a little boy went to the small garden and ate an apple"
res = re.findall("a (little|big) (boy|girl) went to the (?=.*\bsmall\b) garden and ate a(n?)",line)
print res
but the output of this is
[]
which is also the output if I set line to be
a little boy went to the garden and ate an apple
How do I allow for a word that may or may not appear in my text, and capture it if it exists?
Upvotes: 0
Views: 584
Reputation: 815
First, you need to match not only the word "small" but also the space after it (or before it). So you could make the whole thing optional with a regex like this: (small )?
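To see why the space matters, here is a minimal sketch in Python 3 syntax. The shortened test strings and variable names are assumptions made for illustration. A lookahead is zero-width, so it consumes nothing, and the pattern from the question then still requires a space directly before "garden":

```python
import re

# Shortened test strings (assumed for illustration).
with_word = 'went to the small garden'
without_word = 'went to the garden'

# The lookahead consumes no characters: after "the " the pattern still
# demands " garden", i.e. two consecutive spaces, so nothing matches.
lookahead = r'went to the (?=.*\bsmall\b) garden'
print(re.findall(lookahead, with_word))     # []
print(re.findall(lookahead, without_word))  # []

# An optional group that consumes the word and its trailing space.
optional = r'went to the (small )?garden'
print(re.findall(optional, with_word))      # ['small ']
print(re.findall(optional, without_word))   # ['']
```

Note that the captured text here still includes the trailing space.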
On the other hand, you want to capture only the word, not the space. To keep the space out of the captured text, make the outer group non-capturing and put a capturing group around the word itself: (?:(small) )?
Full example:
import re
lines = [
    'a little boy went to the small garden and ate an apple',
    'a little boy went to the garden and ate an apple'
]
for line in lines:
    # (?:(small) )? optionally matches "small " but captures only "small"
    res = re.findall(r'a (little|big) (boy|girl) went to the (?:(small) )?garden and ate a(n?)', line)
    print(res)
Output:
[('little', 'boy', 'small', 'n')]
[('little', 'boy', '', 'n')]
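As the second result shows, re.findall reports an optional group that did not take part in the match as an empty string. If you would rather get None in that case, one option is re.search with a named group. This is only a sketch in Python 3 syntax: the group name size and the helper garden_size are hypothetical names introduced here, not part of the pattern above.

```python
import re

# Same pattern as above, with the optional word in a named group.
# The name "size" is an assumption made for this example.
PATTERN = re.compile(
    r'a (little|big) (boy|girl) went to the (?:(?P<size>small) )?garden and ate a(n?)'
)

def garden_size(line):
    """Return the optional word if present, else None (also None if nothing matches)."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # group() returns None for a group that did not participate in the match.
    return m.group('size')

print(garden_size('a little boy went to the small garden and ate an apple'))  # small
print(garden_size('a little boy went to the garden and ate an apple'))        # None
```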
Upvotes: 2