Pandas: select column with the highest percentage from a frequency table

Question

Hi, I have a dataframe, and I'd like to select the group with the highest percentage from a frequency table built from it.

import pandas as pd

d = {'c1': ['a', 'a', 'b', 'b', 'c', 'c'], 'c2': ['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)
dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

This returns a frequency table:

c1  c2  
a   High     50.0
    Low      50.0
b   High     50.0
    Low      50.0
c   High    100.0
Name: c2, dtype: float64

I'd like to print c, the c1 value with the highest percentage (100.0).

I can use max() to get 100.0, but I don't know how to get c.

wwnde · Accepted Answer

Let's use reset_index(level=1, drop=True) to drop the c2 level from the index, then use idxmax to get the index label of the maximum value:

dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100).reset_index(level=1, drop=True).idxmax()
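
For anyone who wants to see the steps separately, here is a minimal sketch of the same idea on the question's data. The variable names (freq, top_group, tied) are only illustrative, and the tie handling at the end is an addition that the question did not ask for.

```python
import pandas as pd

d = {'c1': ['a', 'a', 'b', 'b', 'c', 'c'],
     'c2': ['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)

# Percentage of each c2 value within each c1 group (MultiIndex: c1, c2)
freq = dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

# idxmax on the MultiIndexed Series returns a (c1, c2) tuple
top_c1, top_c2 = freq.idxmax()

# Dropping the c2 level first makes idxmax return the c1 label alone
top_group = freq.reset_index(level=1, drop=True).idxmax()

# idxmax reports only the first maximum; this lists every c1 label tied for it
tied = freq[freq == freq.max()].index.get_level_values(0).unique().tolist()

print(top_group, top_c1, top_c2, tied)
```

Note that idxmax returns only the first label when several rows share the maximum, so the boolean mask is the safer choice if ties are possible in your data.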
