Ben

Reputation: 109

Problem with selecting data using pandas iloc

This is a sample DataFrame:

ID,IS,Val1,Val2,Val3
1,100,11,9,1
2,101,3,15,16
3,99,10,18,3
1,97,29,25,26

I am using idxmin to find the column that holds each row's minimum value. Once I have that minimum, I want to check whether it is less than some number: if it is, I want to keep the row; otherwise I want to remove it. This is what I have so far, with help from Stack Overflow:

df1 = df.set_index('ID').iloc[:,1:].idxmin(axis=1).reset_index(name= 'New')

df2 = df1.loc[34 > df.iloc[:, 1:].min(1)]

I got this result:

ID   New
 1  Val3
 2  Val1
 3  Val3
 1  Val2

I get the same result when I use this code:

df2 = df1.loc[34 > df.iloc[:, 3:].min(1)] # here the columns start from Val2, but the result is still the same (including Val1)

ID   New
 1  Val3
 2  Val1
 3  Val3
 1  Val2

Why am I getting the same result even though I am selecting columns from position 3 onward? What exactly is this line of code doing? df1.loc[34 > df.iloc[:, 1:].min(1)]

Upvotes: 1

Views: 534

Answers (2)

Stuart

Reputation: 9858

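Your code for df2 takes the minimum over the columns headed Val2 and Val3 only, but that minimum is used just to decide which rows of df1 to keep. The New column was already computed in df1 from Val1, Val2 and Val3, so as long as your code for df1 still includes Val1 you will still see Val1 in the output.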

It may be easier to see what's going on if you use the column headers to index the data and add the new columns to the same data frame.

group1 = df[["Val1", "Val2", "Val3"]] # find the min among these 3 cols
group2 = df[["Val2", "Val3"]]   # find the min among only these 2 cols
df["min1"] = group1.min(axis=1)
df["col1"] = group1.idxmin(axis=1)
df["min2"] = group2.min(axis=1)
df["col2"] = group2.idxmin(axis=1)

filtered1 = df.loc[12 > df.min1]  # Val3, Val1, Val3 contain the minimum values
filtered2 = df.loc[12 > df.min2]  # Val3, Val3 contain the minimum values
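The same idea can be packaged so that the column label and the threshold test always come from the same set of columns. This is only a sketch: the sample data is retyped from the question, and `min_column_below` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

def min_column_below(frame, cols, threshold):
    """Hypothetical helper: for each row, report which of `cols` holds the
    minimum, and keep only rows whose minimum is below `threshold`."""
    subset = frame[cols]
    out = frame[["ID"]].copy()
    out["New"] = subset.idxmin(axis=1)   # label of the column with the row minimum
    out["Min"] = subset.min(axis=1)      # the minimum value itself
    return out[out["Min"] < threshold]

result_all = min_column_below(df, ["Val1", "Val2", "Val3"], 12)
result_23 = min_column_below(df, ["Val2", "Val3"], 12)

print(result_all)  # rows 0, 1, 2 with New = Val3, Val1, Val3
print(result_23)   # rows 0, 2 with New = Val3, Val3
```

Because both `New` and `Min` are derived from the same `subset`, narrowing `cols` changes the reported column as well as the rows that pass the filter.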

Upvotes: 1

BENY

Reputation: 323226

Both of your Boolean conditions evaluate to True for every row, which is why you get the same result:

34 > df.iloc[:, 3:].min(1)
Out[202]: 
0    True
1    True
2    True
3    True
dtype: bool
34 > df.iloc[:, 1:].min(1)
Out[203]: 
0    True
1    True
2    True
3    True
dtype: bool
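To see that the column selection does affect the mask, it may help to try a smaller threshold. The sketch below retypes the sample data from the question and uses 12 instead of 34; the value 12 is just an illustrative choice.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

# Row minimum over IS, Val1, Val2, Val3 (positions 1 onward)
mask_wide = 12 > df.iloc[:, 1:].min(axis=1)
# Row minimum over Val2, Val3 only (positions 3 onward)
mask_narrow = 12 > df.iloc[:, 3:].min(axis=1)

print(mask_wide.tolist())    # [True, True, True, False]
print(mask_narrow.tolist())  # [True, False, True, False]
```

With 34 every row minimum is below the threshold under either selection, so the two masks only start to differ once the threshold drops low enough.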

iloc slices the DataFrame by position:

df.iloc[:, 1:]
Out[204]: 
    IS  Val1  Val2  Val3
0  100    11     9     1
1  101     3    15    16
2   99    10    18     3
3   97    29    25    26
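One way to check what a positional slice covers is to look at the column labels it returns. The sketch below, using the sample data retyped from the question, also shows that the positions shift by one after `set_index('ID')`, since ID is no longer a column.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 1],
    "IS": [100, 101, 99, 97],
    "Val1": [11, 3, 10, 29],
    "Val2": [9, 15, 18, 25],
    "Val3": [1, 16, 3, 26],
})

# On the original frame, position 0 is ID
cols_from_1 = df.columns[1:].tolist()   # ['IS', 'Val1', 'Val2', 'Val3']
cols_from_3 = df.columns[3:].tolist()   # ['Val2', 'Val3']

# After set_index('ID'), position 0 is IS, so 1: starts at Val1
indexed_cols = df.set_index("ID").iloc[:, 1:].columns.tolist()  # ['Val1', 'Val2', 'Val3']

print(cols_from_1, cols_from_3, indexed_cols)
```

So in the question, df1 was built from Val1, Val2 and Val3, while the filter `df.iloc[:, 1:]` also included the IS column.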

Upvotes: 2
