Reputation: 109
This is a sample dataframe:
ID,IS,Val1,Val2,Val3
1,100,11,9,1
2,101,3,15,16
3,99,10,18,3
1,97,29,25,26
I am using idxmin to find the column that holds each row's minimum value. Once I have that minimum, I want to check whether it is less than some number: if it is, I want to keep the row, otherwise I want to remove it. This is what I am doing, with help from Stack Overflow:
df1 = df.set_index('ID').iloc[:,1:].idxmin(axis=1).reset_index(name= 'New')
df2 = df1.loc[34 > df.iloc[:, 1:].min(1)]
I got this result
ID New
1 Val3
2 Val1
3 Val3
1 Val2
I also get the same result when I use this code:
df2 = df1.loc[34 > df.iloc[:, 3:].min(1)]
# in this code, I am starting my column from Val2
but it still gives the same result (including Val1)
ID New
1 Val3
2 Val1
3 Val3
1 Val2
Why am I getting the same result even though I am selecting from the third column? What exactly is this line of code doing? df1.loc[34 > df.iloc[:, 1:].min(1)]
Upvotes: 1
Views: 534
Reputation: 9858
Your code for df2 is selecting from the columns headed Val2 and Val3 only, but as long as your code for df1 still includes Val1, you will still see Val1 in the output.
It may be easier to see what's going on if you use the column headers to index the data and add the new columns to the same data frame.
group1 = df[["Val1", "Val2", "Val3"]] # find the min among these 3 cols
group2 = df[["Val2", "Val3"]] # find the min among only these 2 cols
df["min1"] = group1.min(axis=1)
df["col1"] = group1.idxmin(axis=1)
df["min2"] = group2.min(axis=1)
df["col2"] = group2.idxmin(axis=1)
filtered1 = df.loc[12 > df.min1] # Val3, Val1, Val3 contain the minimum values
filtered2 = df.loc[12 > df.min2] # Val3, Val3 contain the minimum values
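The same idea can be written as a self-contained sketch that rebuilds the sample dataframe from the question. The helper name min_with_label and the threshold argument are made up for illustration and are not part of pandas; the point is only that the minimum, its column label, and the filter all come from the same column subset.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})


def min_with_label(frame, cols, threshold):
    """Hypothetical helper: keep rows whose minimum over `cols` is below
    `threshold`, reporting the minimum and the column that holds it."""
    subset = frame[cols]
    out = frame[["ID"]].copy()
    out["min"] = subset.min(axis=1)      # row-wise minimum value
    out["col"] = subset.idxmin(axis=1)   # label of the column holding it
    return out.loc[out["min"] < threshold]


res_all = min_with_label(df, ["Val1", "Val2", "Val3"], 12)
res_23 = min_with_label(df, ["Val2", "Val3"], 12)

print(res_all)  # rows 0, 1, 2 survive: Val3, Val1, Val3
print(res_23)   # rows 0, 2 survive: Val3, Val3
```

Because both the label and the mask are derived from the same subset, dropping Val1 from cols removes it from the output as well.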
Upvotes: 1
Reputation: 323226
Both of your Boolean conditions evaluate to True for every row, which is why you get the same result:
34 > df.iloc[:, 3:].min(1)
Out[202]:
0 True
1 True
2 True
3 True
dtype: bool
34 > df.iloc[:, 1:].min(1)
Out[203]:
0 True
1 True
2 True
3 True
dtype: bool
iloc slices the dataframe by position:
df.iloc[:, 1:]
Out[204]:
IS Val1 Val2 Val3
0 100 11 9 1
1 101 3 15 16
2 99 10 18 3
3 97 29 25 26
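To see the mask actually change the result, a lower threshold helps. The sketch below reuses the question's data and code but swaps 34 for 12, a value chosen here purely for illustration. Note that the labels in df1 are still computed over Val1, Val2 and Val3, since only the mask depends on the iloc slice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

# Labels of the per-row minimum over Val1..Val3 (as in the question).
df1 = df.set_index("ID").iloc[:, 1:].idxmin(axis=1).reset_index(name="New")

# Two masks built from different positional slices of df.
mask_wide = 12 > df.iloc[:, 1:].min(axis=1)    # IS, Val1, Val2, Val3
mask_narrow = 12 > df.iloc[:, 3:].min(axis=1)  # Val2, Val3 only

print(mask_wide.tolist())    # [True, True, True, False]
print(mask_narrow.tolist())  # [True, False, True, False]

# The mask only picks rows; it never changes the labels stored in df1.
print(df1.loc[mask_wide])
print(df1.loc[mask_narrow])
```

With 34 as the threshold, every row minimum (at most 25) passes in both slices, so both masks are all True and df1 comes back unfiltered.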
Upvotes: 2