Pandas: filter the rows based on a column containing lists

Question

How do I filter the rows in a data frame based on the values in other columns?

I have the following data frame:

ip_df:
     class    name     marks          min_marks  min_subjects
0    I        tom      [89,85,80,74]  80         2
1    II       sam      [65,72,43,40]  85         1

Each row should be flagged based on its "min_subjects" and "min_marks" values.

For index 0, "min_subjects" is 2, so at least 2 elements in the "marks" column must be greater than 80 (the "min_marks" value). That condition holds, so the new column "flag" should be 1.
For index 1, "min_subjects" is 1, so at least 1 element in the "marks" column must be greater than 85 (the "min_marks" value). No element satisfies this, so "flag" should be 0.
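
The rule for a single row can be sketched in plain Python. The function name `row_flag` is hypothetical and only illustrates the condition described above:

```python
def row_flag(marks, min_marks, min_subjects):
    """Return 1 if at least `min_subjects` marks are strictly greater than `min_marks`, else 0."""
    # Count the marks that exceed the threshold (True counts as 1 in sum).
    count = sum(mark > min_marks for mark in marks)
    return int(count >= min_subjects)


print(row_flag([89, 85, 80, 74], 80, 2))  # 1: 89 and 85 exceed 80
print(row_flag([65, 72, 43, 40], 85, 1))  # 0: no mark exceeds 85
```

Note that the comparison is strict, so a mark equal to "min_marks" does not count.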

The expected output is:

op_df:
     class    name     marks          min_marks  min_subjects flag
0    I        tom      [89,85,80,74]  80         2            1
1    II       sam      [65,72,43,40]  85         1            0

Can anyone help me achieve this in pandas?

jezrael · Accepted Answer

Use a list comprehension with zip over the 3 columns: compare each mark with min_marks in a generator expression, sum the matches to get a count, then compare that count with min_subjects and convert the result to an integer:

df['flag'] = [1 if sum(x > c for x in a) >= b else 0 
                 for a, b, c in zip(df['marks'], df['min_subjects'], df['min_marks'])]

Alternatively, convert the boolean to 0 or 1 with int:

df['flag'] = [int(sum(x > c for x in a) >= b)
                 for a, b, c in zip(df['marks'], df['min_subjects'], df['min_marks'])]

Or a solution with numpy:

import numpy as np

df['flag'] = [int(np.sum(np.array(a) > c) >= b)
                  for a, b, c in zip(df['marks'], df['min_subjects'], df['min_marks'])]

print(df)
  class name             marks  min_marks  min_subjects  flag
0     I  tom  [89, 85, 80, 74]         80             2     1
1    II  sam  [65, 72, 43, 40]         85             1     0
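
For a self-contained check, the sketch below rebuilds the sample frame from the question and applies the int-based list comprehension. The frame construction is an assumption about how the data is stored, namely that the "marks" column holds Python lists:

```python
import pandas as pd

# Sample data from the question; "marks" is assumed to hold Python lists.
df = pd.DataFrame({
    "class": ["I", "II"],
    "name": ["tom", "sam"],
    "marks": [[89, 85, 80, 74], [65, 72, 43, 40]],
    "min_marks": [80, 85],
    "min_subjects": [2, 1],
})

# For each row: count marks above min_marks, then compare the count with min_subjects.
df["flag"] = [
    int(sum(x > c for x in a) >= b)
    for a, b, c in zip(df["marks"], df["min_subjects"], df["min_marks"])
]

print(df["flag"].tolist())  # [1, 0]
```

If the "marks" column instead holds strings such as "[89,85,80,74]" (for example after reading a CSV), they would need to be parsed into lists first, e.g. with ast.literal_eval.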
