hydradon

Reputation: 1446

Python pandas dataframe: check if all values of one column are in another list

I have a Pandas dataframe:

id         attr
1          val1
2          val1||val2
3          val1||val3
4          val3

and a list special_val = ['val1', 'val2', 'val4']
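For anyone who wants to reproduce this, here is a minimal sketch of the setup. It assumes attr is a plain string column with || as the separator and that the frame has the default integer index.

```python
import pandas as pd

# Sample frame from the question; attr is assumed to hold plain strings
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "attr": ["val1", "val1||val2", "val1||val3", "val3"],
})

# Values that every piece of attr must belong to
special_val = ["val1", "val2", "val4"]
```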

I want to filter the dataframe to keep only the rows where ALL of the attr values are in the list. So I need the result to look like this:

id     attr
1      val1                #val1 is in special_val
2      val1||val2          #both val1 and val2 are in special_val 

I am thinking of using pandas.DataFrame.isin or pandas.Series.isin but I can't come up with the correct syntax. Could you help?

Upvotes: 0

Views: 94

Answers (3)

null

Reputation: 2137

You can try the following.

# Keep a row only if every '||'-separated value is in special_val
mask = df['attr'].apply(lambda x: set(x.split('||')).issubset(special_val))
df[mask]

Output

   id        attr
0   1        val1
1   2  val1||val2

Upvotes: 1

Georgina Skibinski

Reputation: 13407

You can do:

special_val = {'val1', 'val2', 'val4'}

# Split each cell on the literal "||" and turn the pieces into a set
df["attr2"] = df["attr"].str.split(r"\|\|").map(set)
# Keep rows whose set of values is a subset of special_val
df = df.loc[df["attr2"].map(lambda s: s <= special_val)].drop(columns="attr2")

Outputs:

   id        attr
0   1        val1
1   2  val1||val2

Upvotes: 0

Quang Hoang

Reputation: 150815

You can combine str.split, isin(), and groupby():

s = df['attr'].str.split(r'\|+', expand=True).stack().isin(special_val).groupby(level=0).all()
df[s]

Output:

   id        attr
0   1        val1
1   2  val1||val2
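
As a self-contained sketch of the same split / isin / groupby idea, here is a variant that swaps in Series.explode for expand=True plus stack(), so it does not depend on how stack() treats the padding that expand=True introduces. The sample frame is rebuilt from the question, and the names tokens, keep and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (default integer index assumed)
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "attr": ["val1", "val1||val2", "val1||val3", "val3"],
})
special_val = ["val1", "val2", "val4"]

# One entry per individual value; explode() repeats the original row label
tokens = df["attr"].str.split(r"\|\|").explode()

# Flag allowed values, then require every value of a row to be allowed
keep = tokens.isin(special_val).groupby(level=0).all()

result = df[keep]
print(result)
```

Here keep is a boolean Series indexed by the original row labels, so it can be used directly as a mask on df.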

Upvotes: 2
