Add the values of two tables based on a combination of columns

Question

I have two tables:

df1 = pd.DataFrame({
    "c_id": [2000,3000,3000], 
    "cloud":["GCP","GCP","Azure"], 
    "invoice":[100,100,300]
})

c_id    cloud   invoice
2000    GCP     100
3000    GCP     100
3000    Azure   300

df2 = pd.DataFrame({
    "c_id": [1000,2000,2000,3000,3000], 
    "cloud":["Azure","GCP","Azure","AWS","Azure"], 
    "invoice":[200,200,300,100,100]
})

c_id    cloud   invoice
1000    Azure   200
2000    GCP     200
2000    Azure   300
3000    AWS     100
3000    Azure   100

I want to add the two tables based on the combination of columns c_id and cloud. The result I'm looking for is:

c_id    cloud   invoice
1000    Azure   200
2000    Azure   300
2000    GCP     300
3000    AWS     100
3000    Azure   400
3000    GCP     100

In my example, I only displayed the column invoice. In my actual dataset, there are actually more than 40 columns with a lot more constraints. Several of the columns only have value when the cloud is Azure, while others only have value when the cloud is either Azure or GCP

Is there's a clean way to add df1 and df2?

jezrael · Accepted Answer

Use concat with aggregate sum:

df1 = pd.DataFrame({
    "c_id": [2000,3000,3000], 
    "cloud":["GCP","GCP","Azure"], 
    "invoice":[100,100,300]
})
print (df1)
   c_id  cloud  invoice
0  2000    GCP      100
1  3000    GCP      100
2  3000  Azure      300


df2 = pd.DataFrame({
    "c_id": [1000,2000,2000,3000,3000], 
    "cloud":["Azure","GCP","Azure","AWS","Azure"], 
    "invoice":[200,200,300,100,100]
})
print (df2)
   c_id  cloud  invoice
0  1000  Azure      200
1  2000    GCP      200
2  2000  Azure      300
3  3000    AWS      100
4  3000  Azure      100

df = pd.concat([df1, df2]).groupby(['c_id','cloud'], as_index=False).sum()
print (df)
   c_id  cloud  invoice
0  1000  Azure      200
1  2000  Azure      300
2  2000    GCP      300
3  3000    AWS      100
4  3000  Azure      400
5  3000    GCP      100

Add the values of two tables based on a combination of columns

Answers (2)

Related Questions