Yotam

Reputation: 10675

Python regex: match a word that may or may not be present

I want to write a regex that matches a word that might not be present in the text. I read here that I should try something like this:

import re

line = "a little boy went to the small garden and ate an apple"


res = re.findall("a (little|big) (boy|girl) went to the (?=.*\bsmall\b) garden and ate a(n?)",line)

print(res)

but the output of this is

[]

which is also the output if I set line to be

a little boy went to the garden and ate an apple

How do I allow for a word that may or may not appear in my text, and capture it if it does?

Upvotes: 0

Views: 584

Answers (1)

vasi1y

Reputation: 815

First, you need to match not only the word "small" but also the space after it (or before it), so the optional part becomes (small )?. That group would capture the trailing space as well, though, and you want to capture only the word. To keep the space out of the capture, wrap the word and the space in a non-capturing group and capture just the word inside it: (?:(small) )?
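
To see the difference between the two patterns, here is a minimal sketch. The shortened sample line and the variable names are made up for illustration only.

```python
import re

# Shortened sample text, assumed for illustration
line = "went to the small garden"

# Capturing group around word + space: the captured text includes the space
with_space = re.findall(r"the (small )?garden", line)

# Non-capturing group around word + space, capturing group around the word only
word_only = re.findall(r"the (?:(small) )?garden", line)

print(with_space)  # ['small ']
print(word_only)   # ['small']
```

When the word is absent, both patterns still match and the group comes back as an empty string.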

Full example:

import re

lines = [
    'a little boy went to the small garden and ate an apple',
    'a little boy went to the garden and ate an apple'
]

for line in lines:
    # (?:(small) )? makes "small " optional; only the word itself is captured
    res = re.findall(r'a (little|big) (boy|girl) went to the (?:(small) )?garden and ate a(n?)', line)
    print(res)

Output:

[('little', 'boy', 'small', 'n')]
[('little', 'boy', '', 'n')]
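
As for why the pattern in the question returns an empty list, my reading is this: a lookahead such as (?=.*\bsmall\b) is zero-width, so it only checks that "small" appears somewhere ahead and consumes nothing. The regex then still expects " garden" immediately after "the ", which fails when "small" sits in between. In addition, without the r prefix Python turns "\b" into a backspace character rather than a word boundary. A rough sketch, with the pattern shortened and the variable names chosen for illustration:

```python
import re

line = "a little boy went to the small garden and ate an apple"

# Lookahead version (raw string, so \b really is a word boundary):
# the lookahead succeeds, but " garden" cannot match at "small garden"
lookahead = re.findall(r"to the (?=.*\bsmall\b) garden", line)

# Optional-group version: "small " is consumed when present
optional = re.findall(r"to the (?:(small) )?garden", line)

print(lookahead)       # []
print(optional)        # ['small']
print("\b" == "\x08")  # True: in a non-raw string, \b is a backspace
```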

Upvotes: 2
