Reputation: 423
I am having some issues with my code. Does anybody know what could be wrong?
Code
df=df.groupby(by='store')['transaction_number','transaction_amount'].agg(['count','sum']).reset_index()
Error
ValueError: cannot reindex from a duplicate axis
Upvotes: 1
Views: 43
Reputation: 403120
Change your aggfunc a bit: explicitly tell agg which columns it is aggregating and which functions to apply to each.
c = ['transaction_number','transaction_amount']
f = dict.fromkeys(c, ['count','sum'])
df = df.groupby('store', as_index=False).agg(f)
When performing a groupby, you can specify as_index=False so the grouper is automatically inserted as a column in the final result (this is more efficient than calling reset_index at the end).
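To make the as_index=False point concrete, here is a minimal sketch comparing it with a trailing reset_index. The frame `sales` and the names `spec`, `with_flag`, and `with_reset` are hypothetical and only serve the illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
sales = pd.DataFrame({
    "store": ["x", "x", "y"],
    "transaction_amount": [10, 20, 5],
})

# One column mapped to the list of functions to apply to it.
spec = {"transaction_amount": ["count", "sum"]}

# Grouping key kept as a regular column from the start.
with_flag = sales.groupby("store", as_index=False).agg(spec)

# Grouping key becomes the index first, then is moved back into a column.
with_reset = sales.groupby("store").agg(spec).reset_index()

# Both routes should end up with the same column layout.
print(list(with_flag.columns) == list(with_reset.columns))
```

Both variants should produce the same layout, so the choice is mostly about skipping the extra reset_index step.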
Here's a quick demo with some contrived data:
df
  store  transaction_number  transaction_amount
0     a                   0                 100
1     a                   1                 200
2     a                   2                 100
3     b                   3                 400
4     c                   1                  50
5     c                   3                  45
f
{
"transaction_amount": [
"count",
"sum"
],
"transaction_number": [
"count",
"sum"
]
}
df.groupby('store', as_index=False).agg(f)
  store transaction_number     transaction_amount
                     count sum              count  sum
0     a                  3   3                  3  400
1     b                  1   3                  1  400
2     c                  2   4                  2   95
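For anyone who wants to run the demo end to end, the following is a self-contained sketch that rebuilds the contrived frame above and applies the same aggregation. The variable names `out`, `amount_sum`, and `number_count` are my own additions. Because the result has a two-level column header, a single statistic is selected with a (column, function) tuple.

```python
import pandas as pd

# Same contrived data as in the demo above.
df = pd.DataFrame({
    "store": ["a", "a", "a", "b", "c", "c"],
    "transaction_number": [0, 1, 2, 3, 1, 3],
    "transaction_amount": [100, 200, 100, 400, 50, 45],
})

c = ["transaction_number", "transaction_amount"]
f = dict.fromkeys(c, ["count", "sum"])  # every column gets both functions

out = df.groupby("store", as_index=False).agg(f)

# Address one statistic through its (column, function) pair.
amount_sum = out[("transaction_amount", "sum")].tolist()
number_count = out[("transaction_number", "count")].tolist()

print(amount_sum)    # [400, 400, 95]
print(number_count)  # [3, 1, 2]
```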
Upvotes: 2