Pandas filter by column value

Question

How can I sort values by columns, and then only keep n number of columns? This is what I'm struggling with:

import pandas as pd
import numpy as np

file = ('example.csv')
df = pd.read_csv(file)

df = df[['ID', 'Hero', 'Customer']].drop_duplicates('ID', keep="first")
df.head()

Output:
ID   |    Hero     | Customer

45        Rambo      Nils
90        Superman   Sophia
33        Superman   Sophia
55        Rambo      Sophia
12        Hulk       Sophia

To get into the format I want it (heatmap in this case), I've done this:

heatmap = df.groupby(['Hero', 'Customer']).size().unstack(fill_value=0)
heatmap

Output:

            Ida      Jonas    Morgan    Sophia    Nils
Rambo        0          0          3        11       1
Superman     2          0          0        66       0
Hulk         0          0          0         7       0

What I want to do is to sort it by descending order so that the column with the highest values get pushed to the left, like this:

Wanted output:

            Sophia      Morgan    Ida    Nils    Jonas
Rambo           11          3       0       1        0
Superman        66          0       2       0        0
Hulk             7          0       0       0        0

And if I would like to keep n-numbers of customers, how do I do that? Another problem I have is that the ID's are getting lost in the data manipulation, and I'm not sure how to deal with it after the count()-function. I feel like I've done it unnecessary complicated (?).

Pandas filter by column value

Answers (1)

Related Questions