js352
js352

Reputation: 374

Count max depending on two columns

I have a dataframe and I want to get the max number of counts and its corresponding year for movies and series. The df is as follows:

df = pd.DataFrame(
     [
       ("movie", "2020", 45),
       ("movie", "2019", 37),
       ("series", "2018", 16),
       ("series", "2010", 45),
    ],
columns=("type", "year", "count"),
 )

What I've tried:

df[df['type'] == "series"].max()
df[df['type'] == "movie"].max()

However this does not seem to work well as it returns the max year and max count. As for example, for the series one, it will return:

type     series
year       2018
count        45
dtype: object

and it should return:

type     series
year       2010
count        45
dtype: object

How should I build my query so it returns the highest count with its corresponding year?

Thank you

Upvotes: 1

Views: 45

Answers (2)

BENY
BENY

Reputation: 323306

Try with idxmax

out = df.loc[df.groupby('type')['count'].idxmax()]
Out[200]: 
     type  year  count
0   movie  2020     45
3  series  2010     45

Upvotes: 3

Quang Hoang
Quang Hoang

Reputation: 150765

You can try sort and drop duplicates:

df.sort_values('count').drop_duplicates('type', keep='last')

Output:

     type  year  count
0   movie  2020     45
3  series  2010     45

Upvotes: 2

Related Questions