Reputation: 107
I'm having trouble extracting the last names from a list of player names.
import re

list = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']
for item in list:
    print(item)
    print(re.findall(r'(\s(.*))', item))
But the output looks like this:
Cristiano Ronaldo
[(' Ronaldo', 'Ronaldo')]
L. Messi
[(' Messi', 'Messi')]
M. Neuer
[(' Neuer', 'Neuer')]
L. Suarez
[(' Suarez', 'Suarez')]
De Gea
[(' Gea', 'Gea')]
Z. Ibrahimovic
[(' Ibrahimovic', 'Ibrahimovic')]
G. Bale
[(' Bale', 'Bale')]
J. Boateng
[(' Boateng', 'Boateng')]
R. Lewandowski
[(' Lewandowski', 'Lewandowski')]
I am curious as to why the last names were returned twice; I only want to get back the last names once.
Can any of you kind folks help? Thank you!
Upvotes: 0
Views: 699
Reputation: 91375
\S matches any character that is not whitespace.
import re

list = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']
for item in list:
    print(item)
    print(re.findall(r'\S+$', item))  # match one or more non-whitespace characters before the end of the string
Output:
Cristiano Ronaldo
['Ronaldo']
L. Messi
['Messi']
M. Neuer
['Neuer']
L. Suarez
['Suarez']
De Gea
['Gea']
Z. Ibrahimovic
['Ibrahimovic']
G. Bale
['Bale']
J. Boateng
['Boateng']
R. Lewandowski
['Lewandowski']
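If a plain string is more convenient than a one-element list, the same pattern can be used with re.search. This is only a sketch: the helper name last_token is made up for illustration, and it assumes the name has no trailing whitespace (otherwise \S+$ finds nothing and the helper returns None).

```python
import re

def last_token(name):
    """Return the final run of non-whitespace characters, or None if there is none."""
    match = re.search(r'\S+$', name)
    return match.group(0) if match else None

names = ['Cristiano Ronaldo', 'L. Messi', 'De Gea']
print([last_token(n) for n in names])  # ['Ronaldo', 'Messi', 'Gea']
print(last_token('Neuer '))            # None: the trailing space prevents a match
```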
Upvotes: 1
Reputation: 2436
You create two capture groups with the two pairs of parentheses. Remove the outer pair and you will get only the last name:
import re

list = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']
for item in list:
    print(item)
    print(re.findall(r'\s(.*)', item))
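To see why the number of groups matters, here is a small comparison of what re.findall returns for zero, one, and two capture groups. The variable names are just for illustration.

```python
import re

name = 'Cristiano Ronaldo'

# No capture group: findall returns the whole match, including the space.
no_group = re.findall(r'\s.*', name)
# One capture group: findall returns only that group's text.
one_group = re.findall(r'\s(.*)', name)
# Two capture groups: findall returns one tuple per match, one entry per group.
two_groups = re.findall(r'(\s(.*))', name)

print(no_group)    # [' Ronaldo']
print(one_group)   # ['Ronaldo']
print(two_groups)  # [(' Ronaldo', 'Ronaldo')]
```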
Upvotes: 3
Reputation: 4426
Check this out: https://regex101.com/r/CGrruO/1
You can see that your regex returns two groups for each match.
You added an extra set of parentheses, so each match yields two captured groups: one with the leading space and one without.
Changing the pattern to \s(.*) should work.
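If the outer parentheses are wanted for grouping only, one option is to make them non-capturing with (?:...). The sketch below also uses re.search to show the two groups of the original pattern separately; the variable names are illustrative.

```python
import re

name = 'L. Messi'

# (?:...) groups without capturing, so only the inner (.*) is returned.
non_capturing = re.findall(r'(?:\s(.*))', name)
print(non_capturing)  # ['Messi']

# With the original pattern, the match object exposes each group by number.
match = re.search(r'(\s(.*))', name)
outer, inner = match.group(1), match.group(2)
print(repr(outer), repr(inner))  # ' Messi' 'Messi'
```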
Upvotes: 0
Reputation: 82755
Use str.split() with negative indexing.
Example:
lst = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']
for item in lst:
    print(item)
    print(item.split()[-1])
Output (showing only the extracted last names):
Ronaldo
Messi
Neuer
Suarez
Gea
Ibrahimovic
Bale
Boateng
Lewandowski
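A variation on the same idea, as a sketch: str.rsplit with maxsplit=1 splits only once from the right, and a small guard avoids the IndexError that item.split()[-1] raises on an empty string. The helper name last_word and the extra test strings are assumptions added for illustration.

```python
def last_word(name):
    """Return the last whitespace-separated word, or '' if the string has none."""
    parts = name.rsplit(maxsplit=1)  # split at most once, starting from the right
    return parts[-1] if parts else ''

lst = ['Cristiano Ronaldo', 'De Gea', 'Neuer ', '']
print([last_word(n) for n in lst])  # ['Ronaldo', 'Gea', 'Neuer', '']
```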
Upvotes: 3