user17841968

Filter a Dataframe by iterating through a list of strings

I have a dataframe, and I want to check whether it contains certain data. If a value in the column Names contains store, pharmacy or str1, I want that row written to one file, and every other row written to another.

I want to know if there is a way to search for these strings in the dataframe. I have this code, which uses a one-line for loop, from when I had only a single string, key = 'store'. Now key is a list, as seen below:

key = ['store', 'pharmacy', 'str1'] #this is what I want to use now
#key = 'store' #this is what I had before

indiceName = [key in value for value in df['Names']]
subsetName = df[indiceName]
indiceStr  = [key not in row for row in df['Names']] 
subsetStr  = df[indiceStr]

The output looks like:

[False, False, True, True]
[True, True, False, False]

I want to keep it as a one-line for loop. Is that possible? Something like:

indiceName = [key[i] in value for value in df['Names']]
indiceStr  = [key[i] not in row for row in df['Names']]

Upvotes: 1

Views: 479

Answers (1)

not_speshal

Reputation: 23146

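Use str.contains with the keys joined by "|" into a single regex pattern, so a row matches if it contains any of them: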

df1 = df[df["Names"].str.lower().str.contains("|".join(key))]
df2 = df[~df["Names"].str.lower().str.contains("|".join(key))]
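For reference, here is a minimal self-contained sketch of the same idea. The sample DataFrame is hypothetical, since the question does not show the real data. It also escapes each key with re.escape (an addition not in the answer above) so that any regex metacharacters in a key are matched literally, and it builds the mask once so both subsets come from the same test. The last line shows one way to keep the one-line loop from the question by wrapping the membership test in any().

```python
import re

import pandas as pd

# Hypothetical sample data; the original post does not show the DataFrame.
df = pd.DataFrame(
    {"Names": ["Corner Store", "City Pharmacy", "Bakery", "STR1 Depot", "Library"]}
)
key = ["store", "pharmacy", "str1"]

# Escape each key so characters such as "." or "+" are matched literally.
pattern = "|".join(re.escape(k) for k in key)

# Build the boolean mask once and reuse it for both subsets.
mask = df["Names"].str.lower().str.contains(pattern, regex=True)
df1 = df[mask]    # rows whose name contains any of the keys
df2 = df[~mask]   # all remaining rows

# One-line loop in the style of the question: True if any key is in the value.
indiceName = [any(k in value.lower() for k in key) for value in df["Names"]]
```

If Names can contain missing values, passing na=False to str.contains should treat them as non-matches instead of leaving NaN in the mask.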

Upvotes: 1
