Reputation: 169
Suppose I have the following dataframe
Country Year Count
0 USA 2021 1500
1 USA 2018 6000
2 India 2019 3000
3 India 2021 5000
4 UK 2019 4000
5 USA 2019 3200
6 India 2018 5000
I want to print the following
Entry with Max count is (USA, 2018, 6000)
Country with max total count is: (India, 13000)
Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000
The code below works. But a couple of questions to see if I can do better
maxidx
and then getting the values in it?# Print (country, year, count) of the row with max count among all entries
max_idx = df['Count'].idxmax()
print("Entry with Max count is (" + \
str(df.loc[max_idx]['Country']) + ", " \
+ str(df.loc[max_idx]['Year']) + ", " \
+ str(df.loc[max_idx]['Count']) + ")" )
# Print country with max total count and print (country, max total count)
country_sum = pd.pivot_table(df, index='Country', aggfunc=np.sum)
print("\nCountry with max total count is: ("\
+ country_sum['Count'].idxmax() + ", "\
+ str(country_sum['Count'].max())\
+ ")")
# Print country with max count in each year
year_country_groupby = df.groupby('Year')
print('\nEntry with max count in each year is:')
for key, gdf in year_country_groupby:
max_idx = gdf['Count'].idxmax()
print(str(key) + ", "\
+ str(gdf.loc[max_idx]['Country']) + ", "\
+ str(df.loc[max_idx]['Count']))
Upvotes: 3
Views: 2291
Reputation: 120391
You can simplify your output like this:
# 1st output
cty, year, cnt = df.loc[df['Count'].idxmax()]
print(f"Entry with Max count is ({cty}, {year}, {cnt})")
# 2nd output
cty, cnt = df.groupby('Country')['Count'].sum().nlargest(1).reset_index().squeeze()
print(f"Country with max total count is: ({cty}, {cnt})")
# 3rd output
print("Entry with max count in each year is:")
for _, (cty, year, cnt) in df.loc[df.groupby('Year')['Count'].idxmax()].iterrows():
print(f"{year}, {cty}, {cnt}")
Output:
Entry with Max count is (USA, 2018, 6000)
Country with max total count is: (India, 13000)
Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000
Update
To get both max index and value, you can use agg
:
idxmax, valmax = df['Count'].agg(['idxmax', 'max'])
print(idxmax, valmax)
# Output:
1 6000
Upvotes: 2