Jonas Palačionis

Reputation: 4842

Doing mathematical operations in a pandas DataFrame if a condition is met

I am new to pandas.

My DataFrame looks like this:

    a1  b1   c1  d1  e1 
A   10  10   1   2   0   
B   20  20   2   1   1
C   30  30   3   1   0
D   40  40   4   1   1
E   40  40   4   1   2
F   40  40   4   1   1

I want to do math operations only for values where e1 is the same.

For example: (a1A + a1C) / (c1A + c1C) for rows where e1 is the same. So I would end up with a DataFrame like this:

    a1  b1   c1  d1  e1     result
A   10  10   1   2   0      (a1A + a1C) / (c1A + c1C)
B   20  20   2   1   1      (a1B + a1D + a1F) / (c1B + c1D + c1F)
C   30  30   3   1   0      Do not calculate it because it is already calculated
D   40  40   4   1   1      Do not calculate it because it is already calculated
E   40  40   4   1   2      (a1E / c1E)
F   40  40   4   1   1      Do not calculate it because it is already calculated

I do not know how to apply a condition to the calculations, or how to skip the calculation for a group once it has already been calculated.

Thank you for your suggestions.

Upvotes: 0

Views: 81

Answers (1)

jezrael

Reputation: 863166

First aggregate the sums per group, then remove duplicates with Series.drop_duplicates, and finally use Series.map with the divided sums:

# select several columns with a list; the tuple form ['a1','c1'] fails in current pandas
s = df.groupby('e1')[['a1', 'c1']].sum()

df['new'] = df['e1'].drop_duplicates().map(s.a1 / s.c1)
print (df)
   a1  b1  c1  d1  e1   new
A  10  10   1   2   0  10.0
B  20  20   2   1   1  10.0
C  30  30   3   1   0   NaN
D  40  40   4   1   1   NaN
E  40  40   4   1   2  10.0
F  40  40   4   1   1   NaN
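
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the names `sums` and `ratio` are only illustrative and are not from the original answer.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame(
    {
        "a1": [10, 20, 30, 40, 40, 40],
        "b1": [10, 20, 30, 40, 40, 40],
        "c1": [1, 2, 3, 4, 4, 4],
        "d1": [2, 1, 1, 1, 1, 1],
        "e1": [0, 1, 0, 1, 2, 1],
    },
    index=list("ABCDEF"),
)

# Sum a1 and c1 within each e1 group (one row per distinct e1 value).
sums = df.groupby("e1")[["a1", "c1"]].sum()

# Ratio of the group sums, indexed by e1.
ratio = sums["a1"] / sums["c1"]

# Keep only the first row of each e1 value, then look up its group ratio.
# Rows removed by drop_duplicates become NaN when the result is aligned back onto df.
df["new"] = df["e1"].drop_duplicates().map(ratio)
print(df)
```

Note that with this sample data every group happens to give the same ratio, so changing a value in a1 or c1 makes it easier to see that the groups are really computed separately.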

Also, I think mapping only the unique values is usually not necessary in pandas. More commonly GroupBy.transform is used, which returns the group sums aligned to the original rows, so the new column is filled for every row:

# select several columns with a list; the tuple form ['a1','c1'] fails in current pandas
df2 = df.groupby('e1')[['a1', 'c1']].transform('sum')
print (df2)
    a1  c1
A   40   4
B  100  10
C   40   4
D  100  10
E   40   4
F  100  10

df['new'] = df2.a1 / df2.c1
print (df)
   a1  b1  c1  d1  e1   new
A  10  10   1   2   0  10.0
B  20  20   2   1   1  10.0
C  30  30   3   1   0  10.0
D  40  40   4   1   1  10.0
E  40  40   4   1   2  10.0
F  40  40   4   1   1  10.0
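
If the value should still appear only once per group, as the question asks, the transform result can be combined with a mask. This is a sketch under the assumption that Series.duplicated is an acceptable way to mark repeated e1 values; `totals` and the column name `first_only` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame(
    {
        "a1": [10, 20, 30, 40, 40, 40],
        "b1": [10, 20, 30, 40, 40, 40],
        "c1": [1, 2, 3, 4, 4, 4],
        "d1": [2, 1, 1, 1, 1, 1],
        "e1": [0, 1, 0, 1, 2, 1],
    },
    index=list("ABCDEF"),
)

# Group sums broadcast back to every original row.
totals = df.groupby("e1")[["a1", "c1"]].transform("sum")

# Ratio filled in for every row.
df["new"] = totals["a1"] / totals["c1"]

# duplicated() is True for rows whose e1 value has already appeared,
# so mask() blanks those rows and keeps the first row of each group.
df["first_only"] = df["new"].mask(df["e1"].duplicated())
print(df)
```

Keeping both columns makes the trade-off visible: `new` is convenient for further row-wise calculations, while `first_only` matches the layout requested in the question.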

Upvotes: 3
