Choose two column values based on the smallest value of a column

Question

I have a dataframe as below.

import numpy as np
import pandas as pd

df = pd.DataFrame({'vx_1': [23., 31., 19., np.nan, 23., np.nan, 21.],
                   'ux_1': [13., 14., 11., np.nan, 13., np.nan, 17.],
                   'vx_2': [20., 30., 21., 22., 22., np.nan, 19.6],
                   'ux_2': [11., 4., 12., 9., 14., np.nan, 9.5],
                   'vx_3': [18., 26.5, 29., 19., np.nan, 37., 20.],
                   'ux_3': [8., 14., 15., 9., np.nan, 12., 6.]})

For each row, I want to take the smallest value among the vx_1, vx_2 and vx_3 columns together with the corresponding ux value (the one with the same suffix), and add both to the dataframe as two new columns, vx and ux.

expected result:

pd.DataFrame({'vx_1': [23., 31., 19., np.nan, 23., np.nan, 21.],
              'ux_1': [13., 14., 11., np.nan, 13., np.nan, 17.],
              'vx_2': [20., 30., 21., 22., 22., np.nan, 19.6],
              'ux_2': [11., 4., 12., 9., 14., np.nan, 9.5],
              'vx_3': [18., 26.5, 29., 19., np.nan, 37., 20.],
              'ux_3': [8., 14., 15., 9., np.nan, 12., 6.],
              'vx': [18., 26.5, 19., 19., 22., 37., 19.6],
              'ux': [8., 14., 11., 9., 14., 12., 9.5]})

I tried applying the two functions below, but the results differ from the expected ones.

def v_smallest(df):
    return df[['vx_1', 'vx_2', 'vx_3']].min()

def u_smallest(df):
    return df[['ux_1', 'ux_2', 'ux_3']].min()

df['vx'] = df.apply(v_smallest, axis=1)
df['ux'] = df.apply(u_smallest, axis=1)

sammywemmy · Accepted Answer

You can use the filter method to pick out the vx and ux columns respectively, then take the row-wise minimum of each group independently:

df.assign(vx=df.filter(regex="^vx").min(axis=1), ux=df.filter(regex="^ux").min(axis=1))

    vx_1    ux_1    vx_2    ux_2    vx_3    ux_3    vx      ux
0   23.0    13.0    20.0    11.0    18.0    8.0     18.0    8.0
1   31.0    14.0    30.0    4.0     26.5    14.0    26.5    4.0
2   19.0    11.0    21.0    12.0    29.0    15.0    19.0    11.0
3   NaN     NaN     22.0    9.0     19.0    9.0     19.0    9.0
4   23.0    13.0    22.0    14.0    NaN     NaN     22.0    13.0
5   NaN     NaN     NaN     NaN     37.0    12.0    37.0    12.0
6   21.0    17.0    19.6    9.5     20.0    6.0     19.6    6.0
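
Because the two minima are taken independently, the ux value need not come from the same column pair as the smallest vx. A minimal sketch of a check, comparing the independent ux minimum with the ux column from the question's expected result (the names ux_min, expected_ux and differs are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'vx_1': [23., 31., 19., np.nan, 23., np.nan, 21.],
                   'ux_1': [13., 14., 11., np.nan, 13., np.nan, 17.],
                   'vx_2': [20., 30., 21., 22., 22., np.nan, 19.6],
                   'ux_2': [11., 4., 12., 9., 14., np.nan, 9.5],
                   'vx_3': [18., 26.5, 29., 19., np.nan, 37., 20.],
                   'ux_3': [8., 14., 15., 9., np.nan, 12., 6.]})

# Row-wise minimum over the ux columns only, ignoring the vx columns.
ux_min = df.filter(regex="^ux").min(axis=1)

# The ux column from the question's expected result.
expected_ux = pd.Series([8., 14., 11., 9., 14., 12., 9.5])

# Rows where the independent minimum is not the ux paired with the smallest vx.
differs = ux_min.ne(expected_ux)
print(df.index[differs].tolist())  # [1, 4, 6]
```

In those rows the smallest ux sits under a different suffix than the smallest vx, which is the same mismatch the two apply functions in the question produce.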

If, however, you want the 'ux' value from the same column pair in which 'vx' is smallest, then pd.wide_to_long can be handy:

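# Reshape to long format: one row per (original row, suffix), with columns vx and ux.
res = pd.wide_to_long(df.reset_index(),
                      stubnames=["vx", "ux"],
                      i="index",
                      j="num",
                      sep="_")

# Smallest vx per original row.
vx = res.groupby(level=0).vx.min()
# ux on the rows where vx equals that minimum, in original row order.
# This assumes each row has a single smallest vx; ties would yield extra ux values.
ux = (res.loc[res.vx.eq(vx, axis=0), "ux"]
         .sort_index()
         .array)

df.assign(vx=vx.array, ux=ux)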

    vx_1    ux_1    vx_2    ux_2    vx_3    ux_3    vx      ux
0   23.0    13.0    20.0    11.0    18.0    8.0     18.0    8.0
1   31.0    14.0    30.0    4.0     26.5    14.0    26.5    14.0
2   19.0    11.0    21.0    12.0    29.0    15.0    19.0    11.0
3   NaN     NaN     22.0    9.0     19.0    9.0     19.0    9.0
4   23.0    13.0    22.0    14.0    NaN     NaN     22.0    14.0
5   NaN     NaN     NaN     NaN     37.0    12.0    37.0    12.0
6   21.0    17.0    19.6    9.5     20.0    6.0     19.6    9.5
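
As a possible alternative to reshaping (not part of the approach above), the paired lookup can also be sketched with NumPy positional indexing. This assumes the vx and ux columns are listed in matching order and that no row has NaN in every vx column, since np.nanargmin raises a ValueError for an all-NaN row. The names vx_cols, ux_cols, pos and out are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'vx_1': [23., 31., 19., np.nan, 23., np.nan, 21.],
                   'ux_1': [13., 14., 11., np.nan, 13., np.nan, 17.],
                   'vx_2': [20., 30., 21., 22., 22., np.nan, 19.6],
                   'ux_2': [11., 4., 12., 9., 14., np.nan, 9.5],
                   'vx_3': [18., 26.5, 29., 19., np.nan, 37., 20.],
                   'ux_3': [8., 14., 15., 9., np.nan, 12., 6.]})

# Assumption: the two lists are in matching order, so position k in one
# corresponds to position k in the other.
vx_cols = ["vx_1", "vx_2", "vx_3"]
ux_cols = ["ux_1", "ux_2", "ux_3"]

vx_vals = df[vx_cols].to_numpy()
ux_vals = df[ux_cols].to_numpy()

# Position of the smallest non-NaN vx in each row (first one wins on ties).
pos = np.nanargmin(vx_vals, axis=1)
rows = np.arange(len(df))

# Pick vx and the ux at the same position, row by row.
out = df.assign(vx=vx_vals[rows, pos], ux=ux_vals[rows, pos])
print(out[["vx", "ux"]])
```

On the sample data this gives the same vx and ux columns as the expected result in the question. Unlike the wide_to_long version, a tie in vx simply resolves to the first matching column.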
