Raven

Reputation: 147

Pandas: result column by aggregating entire row

I have a pandas dataframe whose cells are tuples of booleans (real value, predicted value), and I want to create new columns containing the number of true/false positives/negatives in each row. I know I could loop through the indices and set the column value for each index after looping through the entire row, but I believe that's a pandas anti-pattern. Is there a cleaner and more efficient way to do this?

Upvotes: 2

Views: 88

Answers (2)

n1colas.m

Reputation: 3989

Another alternative would be to compare every cell in the dataframe against (True, False) and sum the number of matches along the columns axis (sum(axis=1)), which gives one count per row.

df['false_positives'] = df.apply(lambda x: x == (True, False)).sum(axis=1)
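For reference, here is a small self-contained sketch of the same idea on made-up data. The column names `model_a` and `model_b` and the sample values are assumptions for illustration only. It compares each cell explicitly with `Series.map` rather than comparing a whole column against a tuple, since how pandas treats a tuple on the right-hand side of `==` may depend on the version.

```python
import pandas as pd

# Hypothetical sample data: every cell is a tuple of two booleans.
df = pd.DataFrame({
    "model_a": [(True, True), (True, False), (False, False)],
    "model_b": [(True, False), (True, False), (False, True)],
})

# Build a boolean frame marking the cells equal to (True, False),
# the pattern used in this answer.
matches = df.apply(lambda col: col.map(lambda cell: cell == (True, False)))

# Summing booleans along axis=1 counts the matching cells in each row.
df["false_positives"] = matches.sum(axis=1)

print(df["false_positives"].tolist())
```

The boolean frame is computed before the new column is assigned, so the count column itself is never included in the comparison.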

Upvotes: 1

Raven

Reputation: 147

This seems to work fine:

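def count_false_positives(row):
    # Count the cells in this row whose first element is True and second is False.
    count = 0
    for cell in row:
        if cell[0] and not cell[1]:
            count += 1
    return count

# Use bracket assignment: df.false_positives = ... sets an attribute and does not create a column.
df['false_positives'] = df.apply(count_false_positives, axis=1)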
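If all four counts are needed, the same per-cell comparison can be repeated for each pattern. The following is only a sketch on made-up data. The column names and the mapping from tuples to labels are assumptions: it follows the (real value, predicted value) order stated in the question, under which a false positive is (False, True). The snippets in this thread match (True, False) instead, so swap the patterns if your tuples are ordered the other way.

```python
import pandas as pd

# Hypothetical sample data with (real, predicted) tuples in every cell.
df = pd.DataFrame({
    "model_a": [(True, True), (True, False), (False, False)],
    "model_b": [(True, False), (True, False), (False, True)],
})

# Remember the tuple columns before any count columns are added.
pair_cols = list(df.columns)

# Assumed mapping, based on the (real, predicted) order from the question.
patterns = {
    "true_positives": (True, True),
    "false_negatives": (True, False),
    "false_positives": (False, True),
    "true_negatives": (False, False),
}

for name, pattern in patterns.items():
    # Mark the cells equal to the pattern, then count them per row.
    is_match = df[pair_cols].apply(
        lambda col: col.map(lambda cell: cell == pattern)
    )
    df[name] = is_match.sum(axis=1)

print(df[list(patterns)])
```

Restricting each pass to `pair_cols` keeps the newly added count columns out of later comparisons, and the four counts in a row should add up to the number of tuple columns.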

Upvotes: 1
