Reputation: 2393
I have spent some time learning regular expressions, but I still don't understand how the following trick works to match two words in either order.
import re
reobj = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$',re.MULTILINE)
string = '''
John and Peter
Peter and John
James and Peter and John
'''
re.findall(reobj,string)
result
[('John', 'Peter'), ('John', 'Peter'), ('John', 'Peter')]
( https://www.regex101.com/r/qW4rF4/1)
I know the (?=.* ) part is called a positive lookahead, but how does it work in this situation?
Any explanation?
Upvotes: 6
Views: 1384
Reputation: 67998
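It does not really match the words in an arbitrary order. The actual matching is done by .*, which consumes anything that comes its way. A positive lookahead only makes an assertion. You have two lookaheads, and they are independent of each other: each one asserts the presence of one word, and both start checking from the same position. The capture groups (John) and (Peter) inside the lookaheads still record what they matched, which is why findall always returns ('John', 'Peter') no matter which name comes first in the line. So your regex works like this: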
1) (?=.*?(John)) === The line must contain John. Just an assertion; it does not consume anything.
2) (?=.*?(Peter)) === The line must contain Peter. Just an assertion; it does not consume anything.
3) .* === Consume the rest of the line if both assertions have passed.
So the order does not matter here. What is important is that both assertions pass.
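The steps above can be checked with a small sketch. It runs the two lookaheads on their own, without the trailing .*, to show that they match at position 0 without consuming anything, and then runs the full pattern on single lines. The names lookaheads_only, full_pattern, assertion_match and full_match are made up for this illustration; the patterns are the ones from the question, applied one line at a time so re.MULTILINE is not needed.

```python
import re

# Only the two assertions from the question, without the consuming .* part.
lookaheads_only = re.compile(r'^(?=.*?(John))(?=.*?(Peter))')
# The full pattern from the question, used here on one line at a time.
full_pattern = re.compile(r'^(?=.*?(John))(?=.*?(Peter)).*$')

line = 'Peter and John'

# Both lookaheads pass, but the match is zero-width: nothing was consumed.
assertion_match = lookaheads_only.match(line)
print(assertion_match.span())    # (0, 0)
# The groups are numbered by their position in the pattern, not in the line.
print(assertion_match.groups())  # ('John', 'Peter')

# With .* added, the whole line is consumed once the assertions pass.
full_match = full_pattern.match(line)
print(full_match.group(0))       # Peter and John

# If one assertion fails (no John here), there is no match at all.
print(full_pattern.match('James and Peter'))  # None
```

Because the groups are numbered by where they appear in the pattern, the tuple is always ('John', 'Peter'), even for a line where Peter comes first.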
Upvotes: 2