Use `isin(list1)` in pandas to identify values in a column that has all the items in list1

Question

For a given pandas dataframe like the following,

    h1  h2  h3
    mn  a   1
    mn  b   1
    rs  b   1
    pq  a   1
    we  c   1

if I filter with isin(), say `df[df["h2"].isin(["a","b"])]["h1"].unique()`, I get the following:

    h1
    mn
    rs
    pq

Instead of matching any element of the list, I need to find the entries that match all of the elements in the list, i.e. the desired output should be:

    h1
    mn

How exactly can this be achieved? The number of elements in the list inside isin() is arbitrary, and can be more than 2.

jezrael · Accepted Answer

You can group by h1 and use set.issubset on each group's h2 values to build a boolean mask:

    s = df.groupby('h1')['h2'].apply(lambda x: set(["a","b"]).issubset(x))
    print(s)
    h1
    mn     True
    pq    False
    rs    False
    we    False
    Name: h2, dtype: bool

Then use the mask to filter the index values:

    vals = s.index[s]
    print(vals)
    Index(['mn'], dtype='object', name='h1')
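
Since the list can have any number of elements, the two steps above can be wrapped in a small function. This is only a sketch: the helper name `groups_with_all` and its parameters are my own and not part of pandas, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "h1": ["mn", "mn", "rs", "pq", "we"],
    "h2": ["a", "b", "b", "a", "c"],
    "h3": [1, 1, 1, 1, 1],
})


def groups_with_all(frame, group_col, value_col, required):
    """Hypothetical helper: return the group labels whose value_col
    contains every item in `required`."""
    required = set(required)
    # One boolean per group: True when all required items occur in the group.
    mask = frame.groupby(group_col)[value_col].apply(
        lambda x: required.issubset(x)
    )
    # Keep only the group labels where the mask is True.
    return mask.index[mask].tolist()


result = groups_with_all(df, "h1", "h2", ["a", "b"])
print(result)  # ['mn']
```

Note that an empty `required` list is a subset of every group, so in that case every h1 value is returned.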
