String mode aggregation with group by function

Question

I have a DataFrame that looks like this:

Country  City
UK       London
USA      Washington
UK       London
UK       Manchester
USA      Washington
USA      Chicago

I want to group by country and aggregate to the most frequent city in each country.

My desired output is:

Country City
UK      London
USA     Washington

This is because London and Washington each appear 2 times, whereas Manchester and Chicago appear only once.

I tried

from scipy.stats import mode
df_summary = df.groupby('Country')['City'].\
                        apply(lambda x: mode(x)[0][0]).reset_index()

But it seems this doesn't work on strings.

jpp · Accepted Answer

I can't reproduce your error, but you can use pd.Series.mode, which accepts strings and returns a Series containing the most frequent value(s). Use iat[0] to extract the first value:

res = df.groupby('Country')['City'].apply(lambda x: x.mode().iat[0]).reset_index()

print(res)

  Country        City
0      UK      London
1     USA  Washington
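
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question, and the helper name most_common is just an illustrative choice, not part of pandas. Note that Series.mode returns tied values in sorted order, so iat[0] picks the alphabetically first city when two cities are equally frequent.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Country": ["UK", "USA", "UK", "UK", "USA", "USA"],
    "City": ["London", "Washington", "London",
             "Manchester", "Washington", "Chicago"],
})

def most_common(values: pd.Series) -> str:
    # Hypothetical helper: Series.mode() returns every tied mode,
    # sorted, so .iat[0] takes the first one.
    return values.mode().iat[0]

# One row per country, holding that country's most frequent city
res = df.groupby("Country")["City"].agg(most_common).reset_index()
print(res)

# Tie behaviour: both values occur once, the sorted-first one is returned
tie_pick = most_common(pd.Series(["Manchester", "London"]))
print(tie_pick)
```

If you need different tie-breaking (for example, the first city in original row order), you would have to replace the helper with your own logic.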
