Reputation: 850
I have this DataFrame:
import pandas as pd

df = pd.DataFrame({'upc': [1, 1, 1],
                   'store': [1, 2, 3],
                   'date': ['jan', 'jan', 'jan'],
                   'pred': [[1, 1, 1], [2, 2, 2], [3, 3, 3]],
                   'act': [[4, 4, 4], [5, 5, 5], [6, 6, 6]]})
It looks like this:
   upc  store date       pred        act
0    1      1  jan  [1, 1, 1]  [4, 4, 4]
1    1      2  jan  [2, 2, 2]  [5, 5, 5]
2    1      3  jan  [3, 3, 3]  [6, 6, 6]
When I group by upc and date and aggregate pred and act across stores with "sum":
df.groupby(by = ["upc","date"]).agg({"pred":"sum","act":"sum"})
I get all the lists concatenated:
                                 pred                          act
upc date
1   jan   [1, 1, 1, 2, 2, 2, 3, 3, 3]  [4, 4, 4, 5, 5, 5, 6, 6, 6]
I want the element-wise sum of the lists, something like this:
   upc date       pred           act
0    1  jan  [6, 6, 6]  [15, 15, 15]
Upvotes: 1
Views: 203
Reputation: 863166
Use a lambda function that converts each group's lists to a 2-D NumPy array and sums along axis=0:
f = lambda x: np.array(x.tolist()).sum(axis=0).tolist()
df = df.groupby(by = ["upc","date"], as_index=False).agg({"pred":f,"act":f})
print(df)
   upc date       pred           act
0    1  jan  [6, 6, 6]  [15, 15, 15]
The same solution with a named function:
def f(x):
    return np.array(x.tolist()).sum(axis=0).tolist()
df = df.groupby(by = ["upc","date"], as_index=False).agg({"pred":f,"act":f})
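For reference, here is a self-contained sketch of the approach above, assuming the sample frame from the question. The names elementwise_sum and out are only illustrative and do not come from the original posts. It also assumes every list in a group has the same length, so the lists stack into a rectangular array.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'upc': [1, 1, 1],
    'store': [1, 2, 3],
    'date': ['jan', 'jan', 'jan'],
    'pred': [[1, 1, 1], [2, 2, 2], [3, 3, 3]],
    'act': [[4, 4, 4], [5, 5, 5], [6, 6, 6]],
})

def elementwise_sum(x):
    # x is the Series of lists for one group. Stack the lists into a
    # 2-D array (one row per list), sum down the rows, and convert the
    # result back to a plain Python list.
    return np.array(x.tolist()).sum(axis=0).tolist()

# Illustrative name: out holds one row per (upc, date) group.
out = df.groupby(["upc", "date"], as_index=False).agg(
    {"pred": elementwise_sum, "act": elementwise_sum}
)
print(out)
```

Returning a list rather than the NumPy array keeps the result in the same shape as the input columns.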
Upvotes: 2
Reputation: 71610
Try this, which builds a temporary DataFrame from each group's lists and sums its columns:
>>> df.groupby(['upc', 'date'], as_index=False).agg({"pred": lambda x: pd.DataFrame(x.values.tolist()).sum().tolist(), "act": lambda x: pd.DataFrame(x.values.tolist()).sum().tolist()})
   upc date       pred           act
0    1  jan  [6, 6, 6]  [15, 15, 15]
>>>
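The repeated lambda can be factored out. This is a rough sketch of the same idea, assuming the sample frame from the question; sum_lists, list_cols and out are illustrative names only, not part of the original.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'upc': [1, 1, 1],
    'store': [1, 2, 3],
    'date': ['jan', 'jan', 'jan'],
    'pred': [[1, 1, 1], [2, 2, 2], [3, 3, 3]],
    'act': [[4, 4, 4], [5, 5, 5], [6, 6, 6]],
})

def sum_lists(x):
    # Each list becomes one row of a temporary DataFrame; the default
    # column-wise sum then gives the element-wise total.
    return pd.DataFrame(x.values.tolist()).sum().tolist()

# Illustrative: the columns that hold lists.
list_cols = ["pred", "act"]

out = df.groupby(['upc', 'date'], as_index=False).agg(
    {col: sum_lists for col in list_cols}
)
print(out)
```

Building the dict from list_cols avoids writing the same lambda once per column.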
Upvotes: 1