Reputation: 891
I have the following df that I group and then try to change values of US with the overall per type sum, but I get nan:
dff=pd.DataFrame({'country':['US','US','UK','UK','FR','FR','ES','ES'],'type':['A','B','C','B','A','C','A','B'],'sales':[100,200,100,100,50,10,20,40]})
grouped=dff.groupby(['country','type'])['sales'].sum()
grouped['US']=dff.loc[dff.country!='US'].groupby('type')['sales'].sum()
print(grouped)
And I want to get:
US A 70
US B 140
Upvotes: 0
Views: 193
Reputation: 28699
We can use pivot
to assign the values to US
:
dff=pd.DataFrame({'country':['US','US','UK','UK','FR','FR','ES','ES'],'type':['A','B','C','B','A','C','A','B'],'sales':[100,200,100,100,50,10,20,40]})
#use as_index=False to keep group keys within the dataframe
grouped=dff.groupby(['country','type'],as_index=False)['sales'].sum()
grouped.loc[lambda x: x.country.eq("US"), "sales"] = (grouped
#filter out US
#and search for A and B in the type column
.query("country != 'US' and type in ['A','B']")
.pivot(columns='type',values='sales')
.sum()
.array)
grouped
country type sales
0 ES A 20
1 ES B 40
2 FR A 50
3 FR C 10
4 UK B 100
5 UK C 100
6 US A 70
7 US B 140
Upvotes: 1
Reputation: 2804
When you call
dff.loc[dff.country != 'US'].groupby('type')['sales'].sum()
you get 3 values
type
A 70
B 140
C 110
Name: sales, dtype: int64
But
grouped["US"]
has only 2 values
type
A 70
B 140
Name: sales, dtype: int64
You can try it:
dff.loc[dff.country != 'US'].groupby('type')['sales'].sum()[["A", "B"]]
type
A 70
B 140
Name: sales, dtype: int64
Upvotes: 1
Reputation: 862801
Create MultiIndex
by MultiIndex.from_product
for assign Series to US
level:
s = dff.loc[dff.country!='US'].groupby('type')['sales'].sum()
s.index = pd.MultiIndex.from_product([['US'], s.index])
grouped['US'] = s
print (grouped)
country type
ES A 20
B 40
FR A 50
C 10
UK B 100
C 100
US A 70
B 140
Name: sales, dtype: int64
Upvotes: 1