hmg

Reputation: 443

How to get the indexes of all minimum values in pandas dataframe?

I have a dataframe:

df = pd.DataFrame({'A': [0, 0, 1], 'B': [1, 0, 0]}, index=['x', 'y', 'z'])

   A  B
x  0  1
y  0  0
z  1  0

For each row, I want the names of all the columns that hold that row's lowest value, something like:

x  A
y  A
y  B
z  B

# or 

x  [A]
y  [A, B]
z  [B]

I know idxmin() gives the first instance of the lowest value:

df.idxmin(axis=1)

x    A
y    A
z    B

But what is an efficient way to get all of them?

A related question shows how to get all of the rows with the minimum value in a specific column, but that's not quite what I want.

Edit: Here's a better toy df to play with for getting the column names with the minimum value in each row:

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

   A  B
x  1  3
y  0  0
z  6  2

Upvotes: 2

Views: 3149

Answers (3)

mozway

Reputation: 262634

You can use groupby+transform('min'):

s = df.stack()
s[s.eq(s.groupby(level=0).transform('min'))]

Output:

x  A    0
y  A    0
   B    0
z  B    0

Alternative format:

s = df.stack()
(s[s.eq(s.groupby(level=0).transform('min'))]
  .reset_index()
  .groupby('level_0')['level_1'].apply(list)
 )

Output:

level_0
x       [A]
y    [A, B]
z       [B]
Name: level_1, dtype: object
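As a quick check that this works per row and not only on the first example, here is a minimal sketch applying the same stack + groupby idea to df2 from the question. The names row_min, min_pairs and min_cols are only illustrative, and the final loop is plain Python rather than part of the answer above.

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

# Long format: one value per (row label, column name) pair.
s = df2.stack()

# Broadcast each row's minimum back onto every entry of that row.
row_min = s.groupby(level=0).transform('min')

# Keep only the (row, column) pairs whose value equals the row minimum.
min_pairs = s[s.eq(row_min)].index.tolist()

# Collect the column names per row label (illustrative helper structure).
min_cols = {}
for row, col in min_pairs:
    min_cols.setdefault(row, []).append(col)

print(min_cols)  # {'x': ['A'], 'y': ['A', 'B'], 'z': ['B']}
```

Ties (row y) keep both columns, which is the difference from idxmin(axis=1).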

Upvotes: 2

user7864386

Reputation:

This is a one-liner, similar to @mozway's second solution, but it uses a boolean mask like @wwnde's:

min_cols = df.eq(df.min(axis=1), axis=0).stack().groupby(level=0).apply(lambda x: x.index.get_level_values(1)[x].tolist())

Output:

x       [A]
y    [A, B]
z       [B]
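If the long format from the question (one line per row/column pair) is preferred, the same boolean mask can probably be reused without the groupby. This is only a sketch on df2, and the names mask, stacked and pairs are made up for illustration:

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

# True where a cell equals the minimum of its own row.
mask = df2.eq(df2.min(axis=1), axis=0)

# Stack to a boolean Series indexed by (row label, column name),
# then keep only the entries that are True.
stacked = mask.stack()
pairs = stacked[stacked].index.tolist()

print(pairs)  # [('x', 'A'), ('y', 'A'), ('y', 'B'), ('z', 'B')]
```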

Upvotes: 1

wwnde

Reputation: 26686

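Build a boolean mask that marks every cell equal to its row's minimum, then collect the column names where the mask is True into a list for each row:

s = df.eq(df.min(axis=1), axis=0)  # compare each row against its own minimum

df['column_min'] = s.agg(lambda row: row.index[row].tolist(), axis=1)

Output:

   A  B column_min
x  0  1        [A]
y  0  0     [A, B]
z  1  0        [B]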
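One detail worth checking with any mask-based approach is the axis of the minimum: df.min() reduces down each column, while df.min(axis=1) reduces across each row. On the first example both happen to give the same answer, so a rough sketch on df2 from the question makes the difference visible. The helper true_cols and the other names here are hypothetical, not part of the answer:

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

col_min = df2.min()         # per-column minimum: A -> 0, B -> 0
row_min = df2.min(axis=1)   # per-row minimum:    x -> 1, y -> 0, z -> 2

by_col = df2.eq(col_min)            # aligns on column labels
by_row = df2.eq(row_min, axis=0)    # aligns on the row index

def true_cols(mask):
    """Hypothetical helper: column names that are True, per row label."""
    return {r: mask.columns[mask.loc[r].to_numpy()].tolist() for r in mask.index}

print(true_cols(by_col))  # {'x': [], 'y': ['A', 'B'], 'z': []}
print(true_cols(by_row))  # {'x': ['A'], 'y': ['A', 'B'], 'z': ['B']}
```

Only the row-wise version matches what the question asks for on df2.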

Upvotes: 2
