Reputation: 1065
I have a data frame that looks like this:
DF =
ID Shop Sales Ind
1 A 554 T
2 B 678 F
3 A 546 T
4 A 896 T
5 B 426 F
6 B 391 T
7 C 998 F
8 C 565 T
9 C 128 T
I am trying to sum for each ID the sales of each shop so that I have them in separate columns as below (where x is the sum ). The values to be summed should only be the ones matching the True values in the Ind
variable
DF2 =
ID Shop Sales Ind A_Sum B_Sum C_Sum
1 A 554 T x 0 0
2 B 678 F 0 x 0
3 A 546 T x 0 0
4 A 896 T x 0 0
5 B 426 F 0 x 0
6 B 391 T 0 x 0
7 C 998 F 0 0 x
8 C 565 T 0 0 x
9 C 128 T 0 0 x
I tried this, but I am far from being right! I am stick at how to encode the boolean index at the sum operation? and also at automatically naming the columns
DF2 = DF.groupby(['ID', 'Shop'])['Sales'].transform('sum')
Any help on this?
Upvotes: 0
Views: 106
Reputation: 323346
Based on your effort
DF['SUM']=DF.groupby(['ID', 'Shop'])['Sales'].transform('sum')
DF.loc[DF.Ind == 'F', 'SUM'] = 0
pd.concat([DF,DF.pivot(columns='Shop',values='SUM').
add_suffix('_Sum').fillna(0)],axis=1).drop(['SUM'],axis=1)
Out[247]:
ID Shop Sales Ind A_Sum B_Sum C_Sum
0 1 A 554 T 554.0 0.0 0.0
1 2 B 678 F 0.0 0.0 0.0
2 3 A 546 T 546.0 0.0 0.0
3 4 A 896 T 896.0 0.0 0.0
4 5 B 426 F 0.0 0.0 0.0
5 6 B 391 T 0.0 391.0 0.0
6 7 C 998 F 0.0 0.0 0.0
7 8 C 565 T 0.0 0.0 565.0
8 9 C 128 T 0.0 0.0 128.0
Upvotes: 1
Reputation: 38415
You can do it this way
df.merge(df.groupby(['ID','Shop']).Sales.sum().unstack(fill_value = 0).reset_index(), on = 'ID').rename(columns = {'A': 'A_sum', 'B': 'B_sum', 'C': 'C_sum'})
ID Shop Sales Ind A_sum B_sum C_sum
0 1 A 554 T 554 0 0
1 2 B 678 F 0 678 0
2 3 A 546 T 546 0 0
3 4 A 896 T 896 0 0
4 5 B 426 F 0 426 0
5 6 B 391 T 0 391 0
6 7 C 998 F 0 0 998
7 8 C 565 T 0 0 565
8 9 C 128 T 0 0 128
Another solution without merge or concat and much faster gives the same result
df[['ID','A_sum', 'B_sum', 'C_sum']] = df.groupby(['ID','Shop']).Sales.sum().unstack(fill_value = 0).reset_index()
Upvotes: 0
Reputation: 152
Did you perhaps want something like this?
Shop = ["A", "B", "A", "A", "B", "B", "C", "C", "C"]
Sales = [554, 678, 546, 896, 426, 319, 998, 565, 128]
List = ["A", "B", "C"]
A = []
B = []
C = []
Ticker = 0
for x in range(len(Sales)):
if Shop[Ticker] == "A":
A.append(Sales[Ticker])
elif Shop[Ticker] == "B":
B.append(Sales[Ticker])
else:
C.append(Sales[Ticker])
Ticker += 1
print(sum(A), sum(B), sum(C))
Upvotes: 0