Osca

Reputation: 1694

Pandas: select column with the highest percentage from a frequency table

Hi, I have a dataframe and I'd like to select the c1 value with the highest percentage from a frequency table.

import pandas as pd
d = {'c1':['a', 'a', 'b', 'b', 'c', 'c'], 'c2':['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)
dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

It returns a frequency table:

c1  c2  
a   High     50.0
    Low      50.0
b   High     50.0
    Low      50.0
c   High    100.0
Name: c2, dtype: float64

I'd like to print out c, which has the highest percentage (100.0).

I can use max() to print out 100.0, but I don't know how to print out c.

Upvotes: 2

Views: 142

Answers (2)

BENY

Reputation: 323316

Maybe just use idxmax and take the first element of the result:

dd.groupby('c1')['c2'].value_counts(normalize=True).idxmax()[0]
Out[102]: 'c'
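
For context on why the [0] is needed: on a Series with a two-level index, idxmax returns the whole index label as a tuple rather than a single value. A minimal sketch, assuming the dd frame from the question (the names freq, group and level are only illustrative):

```python
import pandas as pd

d = {'c1': ['a', 'a', 'b', 'b', 'c', 'c'],
     'c2': ['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)

# Percentage of each c2 value within each c1 group (MultiIndex: c1, c2)
freq = dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

# idxmax on a MultiIndex Series returns the full label as a tuple
top = freq.idxmax()
print(top)            # ('c', 'High')

# Unpack the tuple to get the group and the c2 value separately
group, level = top
print(group)          # c
print(freq.max())     # 100.0
```

Taking element 0 of the tuple is what gives the c1 group on its own.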

Upvotes: 1

wwnde

Reputation: 26676

Let's try reset_index to drop level 1 of the index, and then find the index label of the maximum using idxmax:

dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100).reset_index(level=1, drop=True).idxmax()
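
As a rough sketch of what this does, assuming the dd frame from the question: dropping the c2 level leaves a Series indexed only by c1, so idxmax returns a plain label. droplevel(1) should be an equivalent way to drop that level. Note that idxmax only reports the first maximum, so the second half shows one possible way to list every tied group; the tied frame there is made-up data for illustration only.

```python
import pandas as pd

d = {'c1': ['a', 'a', 'b', 'b', 'c', 'c'],
     'c2': ['Low', 'High', 'Low', 'High', 'High', 'High']}
dd = pd.DataFrame(data=d)
freq = dd.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

# Drop the c2 level so the index holds only c1 labels
# (same effect as reset_index(level=1, drop=True))
by_group = freq.droplevel(1)
winner = by_group.idxmax()
print(winner)         # c

# Hypothetical data where groups a and c both reach 100%
tied = pd.DataFrame({'c1': ['a', 'a', 'b', 'b', 'c', 'c'],
                     'c2': ['High', 'High', 'Low', 'High', 'High', 'High']})
tied_freq = tied.groupby('c1')['c2'].value_counts(normalize=True).mul(100)

# Keep every row equal to the maximum, then read off the c1 level
is_max = tied_freq == tied_freq.max()
all_winners = tied_freq[is_max].index.get_level_values(0).tolist()
print(all_winners)    # ['a', 'c']
```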

Upvotes: 5
