humble

Reputation: 2188

Get the max value of the second column where the first column equals a specified value, in a pandas DataFrame

I have a pandas DataFrame, e.g.:

df=
  paper id  year
0         3  1997
1         3  1999
2         3  1999
3         3  1999
4         6  1997
...  (and so on)

I want the maximum year corresponding to a paper id given as input. For example, if the paper id given is 3, I want 1999 as the answer.

How can I do this?

Upvotes: 2

Views: 40

Answers (1)

jezrael

Reputation: 863791

There are two general approaches. Filter first, then take the max:

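# Boolean mask selects the rows, 'year' selects the column
s = df.loc[df['paper id'] == 3, 'year'].max()
print(s)
1999

# Same idea with 'paper id' as the index, so .loc[3] selects by label
s = df.set_index('paper id').loc[3, 'year'].max()
print(s)
1999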
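
For a self-contained check of the filter-then-max approach, here is a minimal sketch. The sample frame contains only the five rows shown in the question, and the helper name max_year_for is hypothetical. It also assumes you want None back when the id is not present, since calling .max() on an empty selection would otherwise give NaN.

```python
import pandas as pd

# Sample data: only the rows visible in the question.
df = pd.DataFrame({
    "paper id": [3, 3, 3, 3, 6],
    "year": [1997, 1999, 1999, 1999, 1997],
})


def max_year_for(frame, paper_id):
    """Return the largest year for paper_id, or None if the id is absent.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Boolean mask picks the matching rows, 'year' picks the column.
    years = frame.loc[frame["paper id"] == paper_id, "year"]
    if years.empty:
        return None
    return int(years.max())


print(max_year_for(df, 3))   # 1999
print(max_year_for(df, 99))  # None
```

Returning None for a missing id is a design choice made here for the example; raising an error or returning NaN may suit other code better.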

Or aggregate the max for each paper id into a Series, then select by index label:

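# One max per 'paper id'; the group keys become the index
s = df.groupby('paper id')['year'].max()
print(s)
paper id
3    1999
6    1997
Name: year, dtype: int64

# .loc makes it explicit that 3 is an index label, not a position
print(s.loc[3])
1999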
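
The groupby route tends to pay off when several ids need to be looked up, because the aggregation runs once. The sketch below rebuilds the sample rows from the question. The variable names max_by_paper and lookup are illustrative, and the id 99 is a made-up value used only to show what happens when an id is missing.

```python
import pandas as pd

# Sample data: only the rows visible in the question.
df = pd.DataFrame({
    "paper id": [3, 3, 3, 3, 6],
    "year": [1997, 1999, 1999, 1999, 1997],
})

# Aggregate once: one max year per paper id.
max_by_paper = df.groupby("paper id")["year"].max()

# Single lookup by label.
print(max_by_paper.loc[3])

# Series.get returns None (the default) instead of raising for a missing label.
print(max_by_paper.get(99))

# Several ids at once; ids that are absent come back as NaN.
lookup = max_by_paper.reindex([3, 6, 99])
print(lookup)
```

For a single one-off lookup, filtering with df.loc is usually simpler, since it avoids grouping the whole frame.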

Upvotes: 2
