I have a dataframe and want to check whether it contains certain data. If any value in the dataframe column Names contains 'store', 'pharmacy', or 'str1', I want that row written to one file and every other row to another.
Is there a way to search for these strings in the dataframe? I wrote the lines below, each with a for loop inside, back when I had only one string, key = 'store'; now key is a list, as shown below:
key = ['store', 'pharmacy', 'str1'] #this is what I want to use now
#key = 'store' #this is what I had before
indiceName = [key in value for value in df['Names']]
subsetName = df[indiceName]
indiceStr = [key not in row for row in df['Names']]
subsetStr = df[indiceStr]
Output looks like:
[False, False, True, True]
[True, True, False, False]
I want to keep it as a one-line for loop. Is that possible? Something like:
indiceName = [key[i] in value for value in df['Names']]
indiceStr = [key[i] not in row for row in df['Names']]
Upvotes: 1
Views: 479
Reputation: 23146
Use str.contains:
df1 = df[df["Names"].str.lower().str.contains("|".join(key))]
df2 = df[~df["Names"].str.lower().str.contains("|".join(key))]
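For reference, here is a self-contained sketch of the same idea that can be run as is. The sample dataframe is made up for illustration, since the question does not show the real data. Two things differ from the lines above and are my own assumptions about what is wanted: each key is passed through re.escape so it is matched literally (str.contains treats the pattern as a regex by default), and case=False replaces the explicit .str.lower(). The last two lines show the one-line for loop form asked about in the question, using any() to test every key against each value.

```python
import re

import pandas as pd

# Hypothetical sample data; the original post does not show the dataframe.
df = pd.DataFrame(
    {"Names": ["Corner Bakery", "City Bank", "Main Store", "Pharmacy Plus"]}
)
key = ["store", "pharmacy", "str1"]

# Escape each key so it is matched literally, then join with "|" (regex OR).
pattern = "|".join(re.escape(k) for k in key)

# case=False ignores case; na=False counts missing names as non-matches.
mask = df["Names"].str.contains(pattern, case=False, na=False)

subsetName = df[mask]   # rows whose name contains any key
subsetStr = df[~mask]   # all remaining rows

# Equivalent one-line list comprehensions, closer to the code in the question.
indiceName = [any(k in value.lower() for k in key) for value in df["Names"]]
indiceStr = [not hit for hit in indiceName]
```

The boolean lists can be used exactly as before, for example df[indiceName] and df[indiceStr]. Note that the list comprehension version assumes every value in Names is a string; a missing value would raise an error on .lower(), whereas the str.contains version handles it through na=False.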
Upvotes: 1