HKE
HKE

Reputation: 473

Pandas where condition

I have a dataset as shown below

test=pd.DataFrame({'number': [1,2,3,4,5,6,7],
                   'A': [0,0, 0, 0,0,0,1],
                   'B': [0,0, 1, 0,0,0,0],
                   'C': [0,1, 0, 0,0,0,1],
                   'D': [0,0, 1, 0,0,0,1],
                   'E': [0,0, 0, 0,0,0,1],
                   })

enter image description here

Trying to creating a column flag at the end with a condition where if 'number'<=5 and (A!=0|B!=0|c!=0|D!=0|E!=0) then 1 else 0.

np.where(((test['number']<=5) & 
((test['A']!=0) | 
 (test['B']!=0) |
   (test['C']!=0) |
    (test['D']!=0) |
     (test['E']!=0))),1,0)

This worked out but I am trying to simplify the query by not hard encoding the columns names A/B/C/D/E as they change(names may change and also number of columns may also change). Only one column remains static which is 'number' column.

Upvotes: 0

Views: 1017

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35626

Let's try with any on axis=1 instead of joining with |:

test['flag'] = np.where(
    test['number'].le(5) &
    test.iloc[:, 1:].ne(0).any(axis=1), 1, 0
)

test:

   number  A  B  C  D  E  flag
0       1  0  0  0  0  0     0
1       2  0  0  1  0  0     1
2       3  0  1  0  1  0     1
3       4  0  0  0  0  0     0
4       5  0  0  0  0  0     0
5       6  0  0  0  0  0     0
6       7  1  0  1  1  1     0

Lot's of options to select columns:

  1. Select by location iloc all columns the first and after -> test.iloc[:, 1:]
  2. Select by loc all columns 'A' and after -> test.loc[:, 'A':]
  3. Select all columns except 'number' with Index.difference -> test[test.columns.difference(['number'])]

Upvotes: 3

Related Questions