Summing up two tables by values common in one of their columns

Question

d is holding data about people's planned and actual work duration in hours in either owner or manager role. Note that one person can - but does not have to - be an owner and manager at the same time.

I need to rearrange d in a way that gives me all the names in one column and in the other columns the planned and actual hours in each role.

The code below achieves that, but is not very pretty.

How can I utilize some native Pandas features to get the same result, but with less typing?

import pandas as pd

d = {
    'owner': ['mike', 'john', 'jake', 'lucy', 'mary', 'hans'],
    'owner planned': [54, 67, 52, 19, 87, 45],
    'owner actual': [12, 54, 3, 67, 84, 22],
    'manager': ['andrew', 'tom', 'john', 'mike', 'hans', 'paul'],
    'manager planned': [13, 432, 453, 765, 432, 234], 
    'manager actual': [22, 33, 44, 55, 66, 77],
}

df = pd.DataFrame(d)
names = list(set(df['owner'].tolist() + df['manager'].tolist()))
output = {}

for name in names:
    op = df[df['owner'] == name]['owner planned'].sum()
    oa = df[df['owner'] == name]['owner actual'].sum()
    mp = df[df['manager'] == name]['manager planned'].sum()
    ma = df[df['manager'] == name]['manager actual'].sum()

    output.setdefault('owner_planned', []).append(op)
    output.setdefault('owner_actual', []).append(oa)
    output.setdefault('manager_planned', []).append(mp)
    output.setdefault('manager_actual', []).append(ma)
    output.setdefault('names', []).append(name)

df2 = pd.DataFrame(output)
print(df2)

cs95 · Accepted Answer

Use filter, concat, and DataFrameGroupBy.sum:

u = df.filter(like='owner').rename({'owner':'names'}, axis=1)
v = df.filter(like='manager').rename({'manager':'names'}, axis=1)

pd.concat([u,v], sort=False).groupby('names').sum()

        owner planned  owner actual  manager planned  manager actual
names                                                               
andrew            0.0           0.0             13.0            22.0
hans             45.0          22.0            432.0            66.0
jake             52.0           3.0              0.0             0.0
john             67.0          54.0            453.0            44.0
lucy             19.0          67.0              0.0             0.0
mary             87.0          84.0              0.0             0.0
mike             54.0          12.0            765.0            55.0
paul              0.0           0.0            234.0            77.0
tom               0.0           0.0            432.0            33.0

Summing up two tables by values common in one of their columns

Answers (2)

Related Questions