Reputation: 105
The following code works as I expected, but I have one question:
import re

names_email = "Harry Rogers [email protected]"
# Raw string so \w and \s reach the regex engine unchanged
name_match = re.compile(r"([\w\s]*)(\s)([\w.]*@[\w.]*)")
name = name_match.search(names_email)
print(name.group(3))
print(name.group(1))
[email protected]
Harry Rogers
But why doesn't ([\w\s]*), being greedy, match all of the whitespace that follows Harry Rogers? Why does the engine instead look for the best possible match for ([\w\s]*)(\s) together?
Upvotes: 0
Views: 76
Reputation: 48761
"But why is ([\w\s]*) not matching up to Harry Rogers, being greedy?"
The first capturing group doesn't include all four spaces after Rogers because the second group, (\s), must still match one whitespace character after the first group is satisfied. [\w\s]* first consumes everything up to the @ character, then backtracks one character at a time until (\s) can match the space right before the h in harri. That leaves the first capturing group holding Harry Rogers followed by three space characters.
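To make the backtracking result visible, here is a minimal sketch. The address in the post is obscured, so "harri@example.com" is a hypothetical stand-in, and the input is assumed to have four spaces after Rogers, as described above. repr() and span() show exactly where each group ends.

```python
import re

# Hypothetical input: the real address is not shown in the post, so
# "harri@example.com" is a stand-in; four spaces follow "Rogers".
names_email = "Harry Rogers    harri@example.com"

pattern = re.compile(r"([\w\s]*)(\s)([\w.]*@[\w.]*)")
match = pattern.search(names_email)

# repr() makes the trailing whitespace in each group visible.
print(repr(match.group(1)))  # 'Harry Rogers   '  (three trailing spaces)
print(repr(match.group(2)))  # ' '                (the fourth space)
print(repr(match.group(3)))  # 'harri@example.com'

# Character offsets of each group within the input string.
print(match.span(1), match.span(2), match.span(3))  # (0, 15) (15, 16) (16, 33)
```

Group 1 gives back characters only until the rest of the pattern can match, so it keeps three of the four spaces and hands a single one to (\s).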
Upvotes: 1
Reputation: 2255
It's because (\s) matches exactly one whitespace character. If you want group(1) to match only "Harry Rogers" without the trailing spaces, make the first group lazy and let the second group take all of the whitespace. The code should look like this:
import re

names_email = "Harry Rogers [email protected]"
# Lazy first group; the second group takes all the separating whitespace
name_match = re.compile(r"([\w\s]*?)(\s+)([\w.]*@[\w.]*)")
name = name_match.search(names_email)
print(name.groups())
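For reference, a small sketch of what the lazy pattern captures. The address is again a hypothetical stand-in ("harri@example.com"), and four spaces after Rogers are assumed, since the original input is obscured in the post.

```python
import re

# Hypothetical stand-in address; the real one is not shown in the post.
names_email = "Harry Rogers    harri@example.com"

# Lazy first group, then one or more whitespace characters in group 2.
lazy = re.compile(r"([\w\s]*?)(\s+)([\w.]*@[\w.]*)")
name, gap, email = lazy.search(names_email).groups()

print(repr(name))  # 'Harry Rogers'
print(len(gap))    # 4
print(email)       # harri@example.com
```

The lazy quantifier grows group 1 only as far as needed, so all four spaces end up in the second group instead of being split between the two.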
Upvotes: 0