Miranda

Reputation: 21

Pandas - using isin to return if column contains any values in a list, rather than all

Sorry for a somewhat basic question, I'm pretty new to Python / pandas.

I'm trying to create a column from my database that returns True or False as to whether another column contains any (not all) string from a list of strings. Currently my code looks like this:

keywords_list = ["foo", "bar", ...]  # etc.

df['relevant'] = df['Description'].isin(keywords_list)

I know that my 'Description' column contains some of the values in the list, but it returns False for every row. I've looked at similar Stack Overflow questions (see below), and they all say to do what I am doing. But the pandas documentation (also below) seems to say that isin only works if it contains all the values in the list. Is there a function I can use that will return whether the column includes any of the values in the list? Please help!

Filter out rows based on list of strings in Pandas
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html

Upvotes: 2

Views: 4551

Answers (2)

Vaishali

Reputation: 38415

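isin compares each cell's entire value against the list, so it will not find a keyword inside a longer string. You may have to separate the words using split and then use isin: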

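import pandas as pd

df = pd.DataFrame({'Description': ['foo bar blah', 'new foo', 'newfoo', 'bar']})
keywords_list = ["foo", "bar"]

# One column per word, test each word against the list, then reduce across each row
df['Description'].str.split(expand=True).isin(keywords_list).any(axis=1)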

0     True
1     True
2    False
3     True
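
To make the difference concrete, here is a minimal sketch using the same sample data. It shows what isin does on the unsplit column and one alternative way to test the split words. The names whole_cell, keywords and any_word are made up for illustration, and the set-based check is just another way to express the same idea, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'Description': ['foo bar blah', 'new foo', 'newfoo', 'bar']})
keywords_list = ["foo", "bar"]

# isin tests whether the whole cell equals one of the list entries,
# so only the row whose value is exactly 'bar' matches.
whole_cell = df['Description'].isin(keywords_list)

# Alternative to expand=True: split each description into a list of words
# and check whether it shares at least one word with the keyword set.
keywords = set(keywords_list)
any_word = df['Description'].str.split().apply(
    lambda words: not keywords.isdisjoint(words)
)

print(whole_cell.tolist())
print(any_word.tolist())
```

This version assumes the column has no missing values; a NaN in 'Description' would need to be handled before the apply step.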

Upvotes: 3

piRSquared

Reputation: 294278

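Use pandas.Series.str.contains, joining the keywords with | so they form a single regular expression that matches any of them. Note that this is a substring match, so 'newfoo' counts as containing 'foo'.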

df['Description'].str.contains('|'.join(keywords_list))
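
Since the joined string is treated as a regular expression, keywords containing characters such as . or + may need escaping. The sketch below is one possible way to handle that, reusing the sample data from the other answer. The variable names (pattern, word_pattern, substring_match, word_match) are hypothetical, and the \b word-boundary variant is an optional extra for when whole-word matches are wanted.

```python
import re

import pandas as pd

df = pd.DataFrame({'Description': ['foo bar blah', 'new foo', 'newfoo', 'bar']})
keywords_list = ["foo", "bar"]

# Escape each keyword so regex metacharacters are matched literally.
pattern = '|'.join(re.escape(k) for k in keywords_list)

# Substring match: True wherever any keyword appears anywhere in the text.
# na=False makes missing descriptions count as "no match".
substring_match = df['Description'].str.contains(pattern, na=False)

# Whole-word match: wrap the alternation in word boundaries
# (a non-capturing group keeps the alternation together).
word_pattern = r'\b(?:' + pattern + r')\b'
word_match = df['Description'].str.contains(word_pattern, na=False)

print(substring_match.tolist())
print(word_match.tolist())
```

With the word boundaries, 'newfoo' no longer matches, which lines up with the result of the split-based approach.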

Upvotes: 4
