Aaron

Reputation: 2393

Match two words in arbitrary order using regex

I have spent some time learning regular expressions, but I still don't understand how the following trick works to match two words in either order.

import re
reobj = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$',re.MULTILINE)

string = '''
John and Peter
Peter and John
James and Peter and John
'''
re.findall(reobj,string)

result

[('John', 'Peter'), ('John', 'Peter'), ('John', 'Peter')]

(https://www.regex101.com/r/qW4rF4/1)

I know the (?=.* ) part is called a positive lookahead, but how does it work in this situation?

Any explanation?

Upvotes: 6

Views: 1384

Answers (1)

vks

Reputation: 67998

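The regex does not really match the words in some arbitrary order; it never checks order at all. A positive lookahead only makes an assertion. You have two lookaheads, and they are independent of each other: each one asserts, starting from the beginning of the line, that one word appears somewhere ahead, and neither consumes any characters. The words in your result come from the capture groups inside the lookaheads. The only part that consumes text is the final .*, which takes the whole line once both assertions have passed. So your regex works like this: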

1) (?=.*?(John)) : the line must contain John. Just an assertion; it does not consume anything.

2) (?=.*?(Peter)) : the line must contain Peter. Just an assertion; it does not consume anything.

3) .* : consumes the whole line, but only if both assertions have passed.
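To see the "does not consume anything" part in practice, here is a small sketch. The sample line and the variable names (`line`, `lookahead_only`, `full`) are just for illustration and are not from the question.

```python
import re

line = "Peter and John"

# A lookahead on its own succeeds, but the overall match is empty:
# the regex engine is still at index 0 afterwards.
lookahead_only = re.match(r'(?=.*?(John))', line)
print(repr(lookahead_only.group(0)))  # ''
print(lookahead_only.end())           # 0
print(lookahead_only.group(1))        # John

# In the full pattern both lookaheads therefore start from the same
# position (the start of the line), and only the final .* consumes text.
full = re.match(r'^(?=.*?(John))(?=.*?(Peter)).*$', line)
print(full.span())    # (0, 14)
print(full.groups())  # ('John', 'Peter')
```

Because the engine goes back to the starting position after each lookahead, the second lookahead can find Peter even though it appears before John in the line.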

So the order does not matter here. What is important is that both assertions pass.
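A quick way to check this is to run the same pattern over lines where one of the words is missing. This is only a sketch; the test lines and the names `pattern`, `text` and `matched_lines` are made up for the example.

```python
import re

pattern = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$', re.MULTILINE)

# Two lines contain both names (in different orders), two lines lack one.
text = """John and Peter
Peter and John
James and John
Peter alone"""

# Without re.DOTALL the dot does not cross a newline, so each lookahead
# only looks within the current line.
matched_lines = [m.group(0) for m in pattern.finditer(text)]
print(matched_lines)  # ['John and Peter', 'Peter and John']
```

The lines with only one of the two names are skipped because one lookahead fails there, while both orderings of John and Peter are accepted.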

Upvotes: 2
