Pandas: retrieve values in one or more columns corresponding to the maximum value in another

Question

Relatively new Python scripter here with a quick question about Pandas and DataFrames. There may be an easier method in Python to do what I am doing (outside of Pandas), so I am open to any and all suggestions.

I have a large data-set (don't we all), with dozens of attributes and tens of thousands of entries. I have successfully opened it (.csv file) and removed the unnecessary columns for the exercise, as well as used pandas techniques I learned from other questions here to pare down the table to something I can use.

As an example, I now have dataframe df, with three columns - A, B and C. I need to find the index of the max of A and then pull the values of B and C at that index. Based off research on the best method, it seemed that idxmax was the best option.

MaxIDX = df['A'].idxmax()

This gives me the correct answer; however, when I then try to grab a value using at based on this variable, I get errors. I believe it is because idxmax produces a series, and not an integer output.

variable = df.at[MaxIDX, 'B']

So my question has two parts.

How do I convert the series to the proper input for at? And, is there an easier way to do this that I am completely missing? All I want to do is get the index of the max of column A, and then pull the values of Column B and C at that index.

Any help is appreciated. Thanks a bunch! Cheers!

Note: Using: Python 3.6.4 and Pandas 0.22.0

cs95 · Accepted Answer

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

df

          A         B         C
0  1.764052  0.400157  0.978738
1  2.240893  1.867558 -0.977278
2  0.950088 -0.151357 -0.103219
3  0.410599  0.144044  1.454274
4  0.761038  0.121675  0.443863


df.A.idxmax()
1

What you say fails seems to work for me:

df.at[df.A.idxmax(), 'B']
1.8675579901499675

Although, based on your explanation, you may instead want loc, not at:

df.loc[df.A.idxmax(), ['B', 'C']]

B    1.867558
C   -0.977278
Name: 1, dtype: float64
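
Putting the pieces together, here is a minimal sketch of the whole lookup, assuming the same seeded frame as above. It also checks that idxmax hands back a single index label rather than a Series, so it can be passed straight to loc or at. The names max_idx, b_val and c_val are just illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))

# idxmax returns the index label of the largest value in A.
max_idx = df['A'].idxmax()
print(pd.api.types.is_scalar(max_idx))  # True: a single label, not a Series

# Pull B and C from that row and unpack them into plain variables.
b_val, c_val = df.loc[max_idx, ['B', 'C']]
print(b_val, c_val)
```

Unpacking works here because iterating over the Series returned by loc yields its values in column order.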

Note: You may want to check that your index does not contain duplicate entries. This is one possible reason for failure.
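
To illustrate the duplicate-index case, here is a rough sketch using a small made-up frame (dup and its values are hypothetical, not your data). When the label returned by idxmax is shared by several rows, a label-based lookup gives back all of them instead of one. Two possible workarounds, assuming you do not need the original index, are to reset it or to look the row up by position.

```python
import pandas as pd

# Hypothetical frame in which index label 1 appears twice.
dup = pd.DataFrame(
    {'A': [1.0, 5.0, 3.0], 'B': [10, 20, 30], 'C': [100, 200, 300]},
    index=[0, 1, 1],
)

print(dup.index.is_unique)  # False

# The label of the max is 1, but two rows carry that label,
# so loc returns a two-row DataFrame rather than a single row.
label = dup['A'].idxmax()
rows = dup.loc[label, ['B', 'C']]
print(rows.shape)  # (2, 2)

# Option 1: rebuild a unique index, then look up by label as before.
clean = dup.reset_index(drop=True)
best = clean.loc[clean['A'].idxmax(), ['B', 'C']]

# Option 2: skip labels and use the position of the max instead.
pos = dup['A'].to_numpy().argmax()
best_pos = dup[['B', 'C']].iloc[pos]

print(best['B'], best['C'])          # 20 200
print(best_pos['B'], best_pos['C'])  # 20 200
```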
