Pat

Reputation: 155

python regex skipping optional words not working

I have been trying to find matches where there may be optional words in the string that need to be ignored if they are present.

The code I tried is:

    import re
    # named `text` rather than `str` so the builtin is not shadowed
    text = '''
         topping consensus estimates 
         topping analysis' consensus estimate
         topping estimate
    '''
    for m in re.finditer(r'(?P<p3c>topping\s+(?:\w+\s(?!estimate)){0,2}(estimate))', text):
        print(m.group())
    print('done')

I want to find all three cases in the string, but I only get the last one. I want to skip up to two words between topping and estimate, but I cannot guarantee that they will be analysis and consensus. I tried (?:\w+\s(?!estimate)){0,2} to skip up to two words, but it is not working for some reason.

Upvotes: 2

Views: 108

Answers (1)

Sweeper

Reputation: 271575

You don't need to get "topping estimate" itself as the result. What you really want is to check whether each line starts with topping, followed by at most two words, then estimate or estimates.

This regex will help you:

    ^topping(\s\S+){0,2}\sestimates?\s*$

Match this against each line separately, or against the whole multi-line string if you turn on the multiline flag (re.M in Python). It will tell you whether each line satisfies the requirement.
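As a rough sketch of the line-by-line approach, here is how the check might look with the sample text from the question. The helper name `matching_lines` is made up for illustration, and stripping each line first is an assumption on my part: the sample lines are indented, so `^topping` would not match them otherwise.

```python
import re

# Sample text from the question.
text = '''
     topping consensus estimates
     topping analysis' consensus estimate
     topping estimate
'''

# The regex from this answer, compiled once.
pattern = re.compile(r"^topping(\s\S+){0,2}\sestimates?\s*$")


def matching_lines(block):
    """Return the stripped lines of `block` that satisfy the pattern.

    Hypothetical helper for illustration only.
    """
    results = []
    for line in block.splitlines():
        # Assumption: leading/trailing whitespace is not significant.
        candidate = line.strip()
        if pattern.match(candidate):
            results.append(candidate)
    return results


for line in matching_lines(text):
    print(line)
print('done')
```

Because `\S+` matches any run of non-whitespace, a word like analysis' (with the apostrophe) is skipped just like a plain word. If you use re.M on the whole string instead, keep in mind that the indentation in front of each line will stop `^topping` from matching unless you allow for it in the pattern.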

Upvotes: 4
