Reputation: 217
I have a dataframe that looks like this:
ID Column1 Column2 Column3
1 cats dog bird
2 dog elephant tiger
3 leopard monkey cat
I'd like to create a new column that indicates whether "cat" is present anywhere in that row, even as part of a longer string, so that the dataframe looks like this:
ID Column1 Column2 Column3 Column4
1 cats dog bird Yes
2 dog elephant tiger No
3 leopard monkey cat Yes
I would like to do this without assessing each column individually, because in the real data set there are a lot of columns.
Upvotes: 1
Views: 589
Reputation: 39800
The following should do the trick when a cell has to equal 'cat' exactly:
df['Column4'] = np.where((df.astype(object) == 'cat').any(axis=1), 'Yes', 'No')
Working example:
>>> import pandas as pd
>>> import numpy as np
>>> d = {'ID': [1, 2, 3], 'Column1': ['cat', 'dog', 'leopard'], 'Column2': ['dog', 'elephant', 'monkey'], 'Column3': ['bird', 'tiger', 'cat']}
>>> df = pd.DataFrame(data=d)
>>> df
Column1 Column2 Column3 ID
0 cat dog bird 1
1 dog elephant tiger 2
2 leopard monkey cat 3
>>> df['Column4'] = np.where((df.astype(object) == 'cat').any(axis=1), 'Yes', 'No')
>>> df
Column1 Column2 Column3 ID Column4
0 cat dog bird 1 Yes
1 dog elephant tiger 2 No
2 leopard monkey cat 3 Yes
EDIT: To check whether any column contains a particular substring (so that 'cats' also matches 'cat'), you can use the following, which yields True/False rather than Yes/No:
df['Column4'] = df.apply(lambda r: r.str.contains('cat', case=False, na=False).any(), axis=1)
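For completeness, here is a rough sketch of the substring check that produces the Yes/No labels from the question. It runs `str.contains` once per column instead of once per row, which tends to scale better on wide frames. It assumes that every column except `ID` holds text; the names `text_cols` and `has_cat` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (note 'cats' in the first row)
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'Column1': ['cats', 'dog', 'leopard'],
    'Column2': ['dog', 'elephant', 'monkey'],
    'Column3': ['bird', 'tiger', 'cat'],
})

# Assumption: every column except ID contains strings
text_cols = df.columns.drop('ID')

# Case-insensitive substring test per column, then "any match in the row"
has_cat = df[text_cols].apply(
    lambda col: col.str.contains('cat', case=False, na=False)
).any(axis=1)

# Map the boolean mask to the labels requested in the question
df['Column4'] = np.where(has_cat, 'Yes', 'No')
print(df)
```

With the question's data, row 1 matches through 'cats' and row 3 through 'cat', so only row 2 is labelled No.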
Upvotes: 3