Reputation: 986
My dataframe A:
id groupA groupB groupC groupD groupE ...
001 0 0 0 0 0
002 0 0 0 0 0
003 0 0 0 0 0
...
Dataframe B:
id value count
001 groupA 2
002 groupB 1
001 groupB 3
001 groupC 1
003 groupC 2
002 groupA 1
dfB.groupby(['id', 'value'])['count'].sum()
By running this groupby I get:
id value
1 groupA 2
groupB 3
groupC 1
2 groupA 1
groupB 1
3 groupC 2
Name: count, dtype: int64
I was trying to transform this result to a dataframe and map it back to dataframe A, but it didn't work out.
My ideal dataframe Aoutput is:
id groupA groupB groupC groupD groupE ...
001 2 3 1 0 0
002 1 1 0 0 0
003 0 0 2 0 0
...
Upvotes: 2
Views: 454
Reputation: 323356
Do with one-line chain with reindex
dfb.groupby(['id', 'value'])['count'].sum().\
unstack(fill_value=0).\
reindex(columns=dfa.columns,index=dfa.id,fill_value=0).\
drop('id',1)
groupA groupB groupC groupD groupE
id
1 2 3 1 0 0
2 1 1 0 0 0
3 0 0 2 0 0
Upvotes: 3
Reputation: 1475
You can use pivot like this too. Seems cleaner
df
id value count
0 1 groupA 2
1 2 groupB 1
2 1 groupB 3
3 1 groupC 1
4 3 groupC 2
5 2 groupA 1
6 3 groupA 5
aa = pd.DataFrame(df.groupby(by=["id", "value"]).sum()).reset_index().pivot(index="id", columns="value", values="count").fillna(0)
aa.columns.name = aa.index.name
aa.index.name = ""
aa
id groupA groupB groupC
1 5.0 3.0 1.0
2 1.0 1.0 0.0
3 5.0 0.0 2.0
Upvotes: 2
Reputation: 38415
You can pivot the df2 and merge the two dataframes,
df2.pivot_table(index = 'id', columns = 'value', values = 'count', aggfunc='sum').reset_index()\
.merge(df1, how = 'left').fillna(0).astype(int)
id groupA groupB groupC groupD groupE
0 1 2 3 1 0 0
1 2 1 1 0 0 0
2 3 0 0 2 0 0
You can do the same with groupby as well,
df2.groupby(['id', 'value'])['count'].sum().unstack().reset_index()\
.merge(df1, how = 'left').fillna(0).astype(int)
Upvotes: 3