Checking if dataframe groups contain values from list

Question

How do I print/return the value based on values from another column?

analyse = input[['SR. NO', 'COUNTRY_NAME']]
print(analyse)

SR. NO    COUNTRY_NAME
     2          Norway
     2         Denmark
     2         Iceland
     2         Finland
     3         Denmark
     3         Iceland
     4         Finland
     4          Norway

Here, I want to check whether both Norway and Denmark are present for every SR. NO, and return those serial numbers where either one of these two countries isn't found. I tried using groupby and iterating over the countries, but that didn't help, and I'm stuck at that point.

So, the expected output is:

[3,4]

jezrael · Accepted Answer

You can use set.issubset to test whether all values of the list exist in each group:

L = ['Norway', 'Denmark']
s = set(L)
out = df.groupby('SR. NO')['COUNTRY_NAME'].apply(lambda x: s.issubset(x))

Thank you @yatu and @taras for the improvement (the bound method can be passed directly, without a lambda):

s = frozenset(L)
out = df.groupby('SR. NO')['COUNTRY_NAME'].apply(s.issubset)

Then select the index labels of the groups where the result is False, i.e. the groups missing at least one of the countries:

out = out.index[~out].tolist()
print(out)
[3, 4]
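
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the sample data from the question and runs the frozenset variant end to end. The frame is typed in by hand from the printed output, so the dtypes are an assumption, and the names `required`, `has_all` and `missing` are only illustrative:

```python
import pandas as pd

# Sample data copied by hand from the question's printed output (dtypes assumed).
df = pd.DataFrame({
    "SR. NO": [2, 2, 2, 2, 3, 3, 4, 4],
    "COUNTRY_NAME": ["Norway", "Denmark", "Iceland", "Finland",
                     "Denmark", "Iceland", "Finland", "Norway"],
})

required = frozenset(["Norway", "Denmark"])

# One boolean per group: True when every required country occurs in the group.
has_all = df.groupby("SR. NO")["COUNTRY_NAME"].apply(required.issubset)

# Keep the group labels where the check failed.
missing = has_all.index[~has_all].tolist()
print(missing)  # [3, 4]
```

Keeping the boolean Series under its own name (instead of overwriting `out`) makes it easy to inspect which groups passed before filtering.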

Another solution, filtering in a list comprehension:

L = ['Norway', 'Denmark']
s = set(L)
out = [k for k, v in df.groupby('SR. NO')['COUNTRY_NAME'].apply(set).items()
       if not s.issubset(v)]
print(out)
[3, 4]
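
Both variants rely on `issubset` accepting any iterable, not only a set, which is why a group's column can be handed to it directly. A minimal pandas-free sketch of the same per-group check, using a plain dict as a hypothetical stand-in for the grouped data:

```python
# Plain-Python stand-in for the grouped column: group label -> list of countries.
groups = {
    2: ["Norway", "Denmark", "Iceland", "Finland"],
    3: ["Denmark", "Iceland"],
    4: ["Finland", "Norway"],
}

required = frozenset(["Norway", "Denmark"])

# issubset takes any iterable, so the lists need no conversion to sets.
missing = [label for label, countries in groups.items()
           if not required.issubset(countries)]
print(missing)  # [3, 4]
```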
