yigitozmen

Reputation: 957

Pandas duplicates when grouped

x = df.groupby(["Customer ID", "Category"]).sum().sort_values(by="VALUE", ascending=False)

I want to group by Customer ID, but when I use the code above, the customers appear duplicated in the result.

Here is the result:

[image: grouped output]

Source DF:

  Customer ID Category  Value
0           A        x      5
1           B        y      5
2           B        z      6
3           C        x      7
4           A        z      2
5           B        x      5
6           A        x      1

new: https://ufile.io/dpruz

Upvotes: 0

Views: 49

Answers (1)

Scott Boston

Reputation: 153460

I think you are looking for something like this:

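df_out = df.groupby(['Customer ID', 'Category']).sum()
# Order customers by total Value, largest first (sum(level=0) was removed in pandas 2.0)
order = df_out.groupby(level=0).sum().sort_values('Value', ascending=False).index
df_out = df_out.reindex(order, level=0)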

Output:

                      Value
Customer ID Category       
B           x             5
            y             5
            z             6
A           x             6
            z             2
C           x             7
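
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the small source DF from the question by hand (the real data in the linked file may differ), and the variable names `totals`, `order` and `df_sorted` are just illustrative. It uses `groupby(level=0).sum()` for the per-customer totals, which should work on both older and current pandas versions.

```python
import pandas as pd

# Sample data copied from the "Source DF" table in the question.
df = pd.DataFrame({
    "Customer ID": ["A", "B", "B", "C", "A", "B", "A"],
    "Category":    ["x", "y", "z", "x", "z", "x", "x"],
    "Value":       [5, 5, 6, 7, 2, 5, 1],
})

# One row per (customer, category) pair.
df_out = df.groupby(["Customer ID", "Category"]).sum()

# Total Value per customer, computed on the first index level.
totals = df_out.groupby(level=0)["Value"].sum()

# Customer labels ordered by total, largest first.
order = totals.sort_values(ascending=False).index

# Reorder only the outer (customer) level; categories stay grouped
# under their customer.
df_sorted = df_out.reindex(order, level=0)

print(df_sorted)
```

With the sample data the customer totals are B = 16, A = 8 and C = 7, so the customers come out in the order B, A, C. Each customer label is shown only once per block because of how a MultiIndex is displayed; the rows themselves are not duplicated.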

Upvotes: 2
