Reputation: 149
Hi, I have Excel data with multiple columns, and I need to find a specific word and return it in a new column. The table looks like this:
ID col0 col1 col2 col3 col4 col5
1 jack a/h t/m w/n y/h 56
2 sam z/n b/w null null 93
3 john b/i y/d p/d null 33
I want to look for 'b' in columns col1, col2, col3, and col4 and create a new column called "b" that holds the value of the cell where 'b' was found.
The result would look like this:
ID col0 col1 col2 col3 col4 col5 b
1 jack a/h t/m w/n y/h 56 -
2 sam z/n b/w null null 93 b/w
3 john b/i y/d p/d null 33 b/i
I need an efficient way to do it. I tried to use where like this:
df1 = df[['col1', 'col2', 'col3', 'col4']]
df1['b']==[x for x in df1.values[0] if any(b for b in lst if b in str(x))]
I got this from this answer: https://stackoverflow.com/a/50250103/3105140
However, it is not working for me, since I have null values and rows where the condition does not match.
Upvotes: 1
Views: 446
Reputation: 28644
In a bid to avoid selecting the columns, I used melt:
M = (df.copy()
.melt(id_vars='ID')
.loc[lambda x:x['value'].astype('str').str.contains('b')]
.drop('variable',axis=1))
pd.merge(df,M,how='left',on='ID').rename({'value':'b'},axis=1)
ID col0 col1 col2 col3 col4 col5 b
0 1 jack a/h t/m w/n y/h 56 NaN
1 2 sam z/n b/w NaN NaN 93 b/w
2 3 john b/i y/d p/d NaN 33 b/i
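As a hedged variant of the melt approach above: melting every non-ID column also searches col0 and col5, and a row with more than one match would be duplicated by the merge. The sketch below assumes the sample frame from the question (rebuilt here, with NaN standing in for "null"), restricts the melt to the four searched columns via value_vars, and keeps only the first match per ID. The names cols, matches, and result are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; NaN stands in for the "null" cells.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col0": ["jack", "sam", "john"],
    "col1": ["a/h", "z/n", "b/i"],
    "col2": ["t/m", "b/w", "y/d"],
    "col3": ["w/n", np.nan, "p/d"],
    "col4": ["y/h", np.nan, np.nan],
    "col5": [56, 93, 33],
})

cols = ["col1", "col2", "col3", "col4"]  # columns to search (assumed from the question)

matches = (
    df.melt(id_vars="ID", value_vars=cols)   # long format, searched columns only
      .dropna(subset=["value"])              # ignore null cells
      .loc[lambda x: x["value"].str.contains("b", regex=False)]
      .drop_duplicates("ID")                 # keep the first match per ID
      .set_index("ID")["value"]
)

# Map the matches back by ID; rows without a match stay NaN.
result = df.assign(b=df["ID"].map(matches))
print(result)
```

Because melt stacks the columns one after another, "first match" here means the leftmost matching column for each ID.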
Upvotes: 0
Reputation: 30920
I would use DataFrame.stack with a callable:
cols = ['col1', 'col2', 'col3', 'col4']
df['b']=(df[cols].stack()
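.loc[lambda x: x.str.contains('b', na=False)]  # na=False treats null cells as non-matches
.reset_index(level=1, drop=True)  # drop the column-name level so the result aligns on the row index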
#.fillna('-') #for the expected output
)
Output
ID col0 col1 col2 col3 col4 col5 b
0 1 jack a/h t/m w/n y/h 56 NaN
1 2 sam z/n b/w NaN NaN 93 b/w
2 3 john b/i y/d p/d NaN 33 b/i
Upvotes: 3
Reputation: 75080
Here is a way using stack and str.contains with df.where:
cols = ['col1', 'col2', 'col3', 'col4']
df['b'] = (df[cols].where(df[cols].stack().str.contains('b', na=False)
.unstack(fill_value=False)).ffill(axis=1).iloc[:,-1])
print(df)
ID col0 col1 col2 col3 col4 col5 b
0 1 jack a/h t/m w/n y/h 56 NaN
1 2 sam z/n b/w NaN NaN 93 b/w
2 3 john b/i y/d p/d NaN 33 b/i
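For completeness, here is a minimal self-contained sketch of the same masking idea without the stack/unstack round trip. It assumes the sample frame from the question (rebuilt here, with NaN for "null"), builds the boolean mask column by column, back-fills along the row so the first match lands in the leftmost column, and fills unmatched rows with '-' to reproduce the expected output in the question. The name mask is illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question; NaN stands in for the "null" cells.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col0": ["jack", "sam", "john"],
    "col1": ["a/h", "z/n", "b/i"],
    "col2": ["t/m", "b/w", "y/d"],
    "col3": ["w/n", np.nan, "p/d"],
    "col4": ["y/h", np.nan, np.nan],
    "col5": [56, 93, 33],
})

cols = ["col1", "col2", "col3", "col4"]

# True where a cell contains 'b'; na=False makes null cells count as non-matches.
mask = df[cols].apply(lambda s: s.str.contains("b", regex=False, na=False))

# Keep only matching cells, pull the first match per row to the left, then take that column.
df["b"] = df[cols].where(mask).bfill(axis=1).iloc[:, 0].fillna("-")
print(df)
```

Note that bfill(axis=1).iloc[:, 0] returns the first match in a row, whereas ffill(axis=1).iloc[:, -1] returns the last one; the two only differ when a row has more than one match.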
Upvotes: 3