Reputation: 35

Sum one dataframe based on value of other dataframe in same index/row

I would like to sum values of a dataframe conditionally, based on the values of a different dataframe. Say for example I have two dataframes:

import pandas as pd
df1 = pd.DataFrame(data = [[1,-1,5],[2,1,1],[3,0,0]],index=[0,1,2],columns = [0,1,2])

index   0    1   2
------------------
0       1   -1   5
1       2    1   1
2       3    0   0

df2 = pd.DataFrame(data = [[1,1,3],[1,1,2],[0,2,1]],index=[0,1,2],columns = [0,1,2])

index   0   1   2    
-----------------
0       1   1   3   
1       1   1   2  
2       0   2   1

What I would like is, for example, to find every position (same index/row and column) where df1 equals 1 and sum the values of df2 at those positions.

In this example, if the condition is 1, then the sum of df2 would be 4. If the condition was 0, the result would be 3.

Upvotes: 1

Answers (3)

Gonçalo Peres

Reputation: 13582

Assuming you want to store the result in a variable named value, there are several ways to do it. Two of them are shown below.

Option 1

One can simply do the following

value = df2[df1 == 1].sum().sum()

[Out]: 4.0 # numpy.float64

# or

value = sum(df2[df1 == 1].sum())

[Out]: 4.0 # float

Option 2

Using pandas.DataFrame.where

value = df2.where(df1 == 1, 0).sum().sum()

[Out]: 4 # numpy.int64

# or

value = sum(df2.where(df1 == 1, 0).sum())

[Out]: 4 # int

Notes:

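df2[df1 == 1] replaces every non-matching cell with NaN, which forces a float dtype, while df2.where(df1 == 1, 0) puts 0 in those cells and keeps the integer dtype. The first expression gives the following output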

     0    1    2
 0  1.0  NaN  NaN
 1  NaN  1.0  2.0
 2  NaN  NaN  NaN

Depending on the desired output (float, int, numpy.float64,...) one method might be better than the other.
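To make the dtype difference concrete, here is a minimal sketch that rebuilds the two frames from the question and runs both options side by side. The variable names masked, filled, total_masked and total_filled are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, -1, 5], [2, 1, 1], [3, 0, 0]])
df2 = pd.DataFrame([[1, 1, 3], [1, 1, 2], [0, 2, 1]])

# Option 1: boolean indexing leaves NaN in the non-matching cells,
# so every column becomes float
masked = df2[df1 == 1]
total_masked = masked.sum().sum()

# Option 2: where() with a fill value of 0 avoids NaN,
# so the integer dtype is preserved
filled = df2.where(df1 == 1, 0)
total_filled = filled.sum().sum()

print(total_masked, type(total_masked))
print(total_filled, type(total_filled))
```

If the result has to be an integer, Option 2 avoids a later cast; if NaN is useful as a marker for cells that did not match, Option 1 keeps that information.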

Upvotes: 0

yatu

Reputation: 88226

Another option with Pandas' query:

df2.query("@df1==1").sum().sum()
# 4.0

Upvotes: 4

mozway

Reputation: 260335

You can use a mask with where:

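import numpy as np

# nansum skips the NaN cells left by where; a plain .to_numpy().sum() would return nan
np.nansum(df2.where(df1.eq(1)).to_numpy())
# or
# df2.where(df1.eq(1)).sum().sum()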

output: 4.0

intermediate:

df2.where(df1.eq(1))
     0    1    2
0  1.0  NaN  NaN
1  NaN  1.0  2.0
2  NaN  NaN  NaN
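The same mask idea can be wrapped in a small function so the condition value becomes a parameter. This is only a sketch: conditional_sum is a hypothetical helper, not a pandas function, and it assumes both frames share the same index and columns.

```python
import pandas as pd


def conditional_sum(condition_df, values_df, target):
    """Sum the cells of values_df located where condition_df equals target.

    Hypothetical helper; assumes both frames have identical index and columns.
    """
    mask = condition_df.eq(target)
    # Filling non-matching cells with 0 keeps NaN out of the array,
    # so a plain ndarray sum is safe here
    return values_df.where(mask, 0).to_numpy().sum()


df1 = pd.DataFrame([[1, -1, 5], [2, 1, 1], [3, 0, 0]])
df2 = pd.DataFrame([[1, 1, 3], [1, 1, 2], [0, 2, 1]])

total_for_1 = conditional_sum(df1, df2, 1)
total_for_0 = conditional_sum(df1, df2, 0)
print(total_for_1, total_for_0)
```

With the frames from the question this reproduces the two expected results, 4 for the condition 1 and 3 for the condition 0.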

Upvotes: 2
