Brigitte Maillère

Reputation: 865

Position of the n highest values in a pandas DataFrame

import pandas as pd
df = pd.DataFrame([[1, 10, 1], [6, 1, 1], [1, 1, 9]])

I'd like to find the position [row index, column index] of each of the 3 highest values (10, 9 and 6) in the DataFrame.

The expected result is:

[[0,1],[2,2],[1,0]]

Upvotes: 0

Views: 823

Answers (3)

Leonardo Alves

Reputation: 533

You can do the following:

df['max'] = df.idxmax(axis=1)

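This adds a column "max" holding, for each row, the label of the column that contains that row's maximum. Note that it yields one position per row, so it matches the 3 highest values overall only when they sit in different rows, as they do here. You can then build the [row, column] pairs with df.apply like this: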

my_list = df.apply(lambda value: [value.name, value['max']], axis=1).to_list()

result:

[[0, 1], [1, 0], [2, 2]]
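
To illustrate how the per-row approach behaves, here is a minimal sketch. It assumes the df from the question, plus a hypothetical second frame df2 (not from the question) in which two of the largest values share a row. It is only meant to show what idxmax(axis=1) returns, not to replace the code above.

```python
import pandas as pd

df = pd.DataFrame([[1, 10, 1], [6, 1, 1], [1, 1, 9]])

# idxmax(axis=1): for each row, the label of the column holding the row maximum.
per_row = [[int(row), int(col)] for row, col in df.idxmax(axis=1).items()]
print(per_row)  # [[0, 1], [1, 0], [2, 2]]

# Hypothetical frame: 10 and 9 now sit in the same row (row 0).
df2 = pd.DataFrame([[1, 10, 9], [6, 1, 1], [1, 1, 2]])
per_row2 = [[int(row), int(col)] for row, col in df2.idxmax(axis=1).items()]
print(per_row2)  # [[0, 1], [1, 0], [2, 2]]
```

For df2 the position of 9, which is [0, 2], is not reported, because each row contributes only its single largest value.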

Upvotes: 1

Horace

Reputation: 1054

You can use the idxmax method:

In [2]: df.idxmax()
Out[2]:
0    1
1    0
2    2
dtype: int64

If you want an array with both coordinates (idxmax works per column by default, so each pair is [column, row index of that column's maximum]):

In [3]: df.idxmax().reset_index().values
Out[3]:
array([[0, 1], [1, 0], [2, 2]])
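
As a small sketch of how to read that output, assuming the df from the question: df.idxmax() is indexed by column label and its values are row labels, so the pairs need swapping to get [row, column]. The names col_max_rows and pairs are just illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 10, 1], [6, 1, 1], [1, 1, 9]])

# Index: column label; value: row label of that column's maximum.
col_max_rows = df.idxmax()

# Swap each (column, row) item into a [row, column] pair.
pairs = [[int(row), int(col)] for col, row in col_max_rows.items()]
print(pairs)  # [[1, 0], [0, 1], [2, 2]]

# Look the values up by position to check what each pair points at.
values = [int(df.iat[r, c]) for r, c in pairs]
print(values)  # [6, 10, 9]
```

Like the per-row variant, this returns one position per column, so it only coincides with the n highest values overall when they fall in different columns.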

Upvotes: 0

jezrael

Reputation: 862511

Use DataFrame.stack with Series.nlargest:

a = df.stack().nlargest(3).index.tolist()
print(a)
[(0, 1), (2, 2), (1, 0)]

If you need nested lists:

a = list(map(list, df.stack().nlargest(3).index))
print(a)
[[0, 1], [2, 2], [1, 0]]
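
The same idea can be wrapped in a small helper. This is only a sketch: the function name top_n_positions and its parameters are made up for illustration, and it assumes a numeric DataFrame. stack() turns the frame into a Series indexed by (row, column), and nlargest(n) keeps the n largest entries in descending order.

```python
import pandas as pd


def top_n_positions(frame, n):
    """Return [row label, column label] pairs of the n largest values.

    Hypothetical helper, not part of pandas.
    """
    stacked = frame.stack()        # Series with a (row, column) MultiIndex
    largest = stacked.nlargest(n)  # n largest values, sorted descending
    return [list(pos) for pos in largest.index]


df = pd.DataFrame([[1, 10, 1], [6, 1, 1], [1, 1, 9]])
positions = top_n_positions(df, 3)
# positions compares equal to [[0, 1], [2, 2], [1, 0]]
```

With tied values, nlargest keeps the first occurrences by default (keep='first'), so the result may not include every position that shares the n-th largest value.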

Upvotes: 2
