aiden rosenblatt

Reputation: 423

reindex error - python

I am having some issues with my code. Does anybody know what could be wrong?

Code

df=df.groupby(by='store')['transaction_number','transaction_amount'].agg(['count','sum']).reset_index()

Error

ValueError: cannot reindex from a duplicate axis

Upvotes: 1

Views: 43

Answers (1)

cs95

Reputation: 403120

Change your aggfunc a bit: explicitly tell agg which columns it is aggregating and which functions to apply to each.

c = ['transaction_number','transaction_amount']
f = dict.fromkeys(c, ['count','sum'])

df = df.groupby('store', as_index=False).agg(f)

When performing a groupby, you can specify as_index=False so that the grouping column is automatically inserted as a regular column in the final result (this is more efficient than calling reset_index at the end).
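As a small illustration of the as_index=False point, here is a minimal sketch that compares it with reset_index on a single-column sum. It uses the same contrived data as the demo in this answer; the names a and b are only for illustration.

```python
import pandas as pd

# Contrived data, matching the demo in this answer.
df = pd.DataFrame({
    'store': ['a', 'a', 'a', 'b', 'c', 'c'],
    'transaction_number': [0, 1, 2, 3, 1, 3],
    'transaction_amount': [100, 200, 100, 400, 50, 45],
})

# Grouping key kept as a regular column straight away.
a = df.groupby('store', as_index=False)['transaction_amount'].sum()

# Grouping key becomes the index first, then is moved back to a column.
b = df.groupby('store')['transaction_amount'].sum().reset_index()

print(a)
print(a.equals(b))
```

Both forms should give the same frame here; as_index=False just skips the extra step.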


Here's a quick demo with some contrived data:

df

  store  transaction_number  transaction_amount
0     a                   0                 100
1     a                   1                 200
2     a                   2                 100
3     b                   3                 400
4     c                   1                  50
5     c                   3                  45

f
{
    "transaction_amount": [
        "count",
        "sum"
    ],
    "transaction_number": [
        "count",
        "sum"
    ]
}


df.groupby('store', as_index=False).agg(f)

  store transaction_number     transaction_amount     
                     count sum              count  sum
0     a                  3   3                  3  400
1     b                  1   3                  1  400
2     c                  2   4                  2   95
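The result above has two-level column headers. If flat column names are easier to work with, one possible alternative (not what the question asked for) is pandas' named aggregation, available from pandas 0.25 onward. This is only a sketch: the output names n_transactions and total_amount are made up for illustration, and it computes just one statistic per column.

```python
import pandas as pd

# Contrived data, matching the demo above.
df = pd.DataFrame({
    'store': ['a', 'a', 'a', 'b', 'c', 'c'],
    'transaction_number': [0, 1, 2, 3, 1, 3],
    'transaction_amount': [100, 200, 100, 400, 50, 45],
})

# Each keyword is an output column name (hypothetical names here);
# each value is a (source column, aggregation function) pair.
out = df.groupby('store', as_index=False).agg(
    n_transactions=('transaction_number', 'count'),
    total_amount=('transaction_amount', 'sum'),
)

print(out)
```

Each output column is named explicitly, so there is no MultiIndex to flatten afterwards.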

Upvotes: 2
