Reputation: 37
From each row of a pandas Series, I want to extract the last occurrence of the word "User" together with the number that follows it. Everything else can be discarded. How would you do this? Thanks!
Here's an example of the Series:
0 1 - Unassigned, 2 - User 397335
1 1 - Unassigned, 2 - User 525767, 3 - Unassigned
2 1 - Unassigned
3 1 - Unassigned
4 1 - Unassigned
...
163678 1 - Unassigned
163679 1 - Unassigned, 2 - User 347991, 3 - Unassigned
163680 1 - Unassigned
163681 1 - Unassigned
163682 1 - Unassigned, 2 - User 663455, 3 - Unassigned
Upvotes: 1
Views: 733
Reputation: 120419
Use str.findall, then take the last match in each row with .str[-1]:
>>> df['A'].str.findall(r'User \d+').str[-1]
0 User 397335
1 User 525767
2 NaN
3 NaN
4 NaN
163678 NaN
163679 User 347991
163680 NaN
163681 NaN
163682 User 663455
Name: A, dtype: object
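For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The Series `s` and its rows are made up for illustration (the last row is added to show a row with two "User" entries), and the `str.extract` variant is an alternative I'm suggesting, not part of the approach above. It relies on a greedy `.*` prefix so that the capture group lands on the last occurrence.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; the last row is an
# added case with two "User" entries to check that the last one wins.
s = pd.Series([
    "1 - Unassigned, 2 - User 397335",
    "1 - Unassigned, 2 - User 525767, 3 - Unassigned",
    "1 - Unassigned",
    "1 - User 111, 2 - User 222",
])

# Approach from the answer: collect every match per row, keep the last one.
# Rows with no match yield an empty list, and .str[-1] turns that into NaN.
last_user = s.str.findall(r"User \d+").str[-1]

# Alternative sketch: the greedy ".*" consumes as much as possible, so the
# capture group matches the last "User <number>" in the row (NaN if none).
last_user_alt = s.str.extract(r".*(User \d+)", expand=False)

print(last_user)
print(last_user_alt)
```

Both variants are case-sensitive. If the text may contain lowercase "user", passing `flags=re.IGNORECASE` (after `import re`) to `str.findall` or `str.extract` should cover that.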
Upvotes: 2