Reputation: 115
my df look like this:
as you can see the User starts with 'ff' and it could be in access column or any other column rather than user column.
i want to create a new column in this df called "UserID" where whenever the is 'ff' in all the columns copy this value to my new column "UserId"
i have been using this method which is working fine but i have to repeat this line in all the columns:
hist.loc[hist.User.str.startswith("ff",na=False),'UserId']=hist['User'].str[2:]
is there any other method i can use to loop over all rows at once?
thanks
Upvotes: 1
Views: 441
Reputation: 16747
NumPy + Pandas solution below.
In case of ambiguity (several ff
-strings in a row) leftmost occurance is taken. In case of absence (no ff
-string in a row) NaN
value is used.
import pandas as pd, numpy as np
df = pd.DataFrame({
'user': ['fftest', 'fwadmin', 'fshelpdesk3', 'no', 'ffone'],
'access': ['fwadmin', 'ffuser2', 'fwadmin', 'user', 'fftwo'],
'station': ['fshelpdesk', 'fshelpdesk2', 'ffuser3', 'here', 'three'],
})
sv = df.values.astype(np.str)
ix = np.argwhere(np.char.startswith(sv, 'ff'))[::-1].T
df.loc[ix[0], 'UserID'] = pd.Series(sv[(ix[0], ix[1])]).str[2:].values
print(df)
Output:
user access station UserID
0 fftest fwadmin fshelpdesk test
1 fwadmin ffuser2 fshelpdesk2 user2
2 fshelpdesk3 fwadmin ffuser3 user3
3 no user here NaN
4 ffone fftwo three one
Upvotes: 0
Reputation: 31
Hey here is my attempt at solving the problem, hope it helps.
d = df[df.apply(lambda x: x.str.startswith('ff'))]
df['user_id'] = d['user'].fillna(d['access'].fillna(d['station']))
Result
user access station user_id
0 fftest fwadmin fshelpdesk fftest
1 fwadmin ffuser2 fshelpdesk2 ffuser2
2 fshelpdesk3 fwadmin ffuser3 ffuser3
Upvotes: 0
Reputation: 1391
If you are cool with picking only the first occurence:
df['UserID'] = df.apply(lambda x: x[x.str.startswith('ff')][:1], axis=1)
Upvotes: 1