Simon

Reputation: 10150

Python pandas remove rows where multiple conditions are not met

Let's say I have a dataframe like this:

   id  num
0   1    1
1   2    2
2   3    1
3   4    2
4   1    1
5   2    2
6   3    1
7   4    2

The above can be generated with this for testing purposes:

import numpy as np
import pandas as pd

test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

Now, I want to keep only the rows where a certain condition is met: id is not 1 AND num is not 1. Essentially I want to remove the rows with index 0 and 4. For my actual dataset it's easier to remove the rows I don't want than to specify the rows that I do want.

I have tried this:

test = test[(test['id'] != 1) & (test['num'] != 1)]

However, that gives me this:

   id  num
1   2    2
3   4    2
5   2    2
7   4    2

It seems to have removed all rows where id is 1 OR num is 1.

I've seen a number of other questions where the answer is the one I used above, but it doesn't seem to work in my case.

Upvotes: 6

Views: 12568

Answers (1)

EdChum

Reputation: 394041

If you change the comparisons to equality, combine them with &, and then invert the combined condition by wrapping it in an extra pair of parentheses prefixed with ~, you get the desired behaviour:

In [14]:
test = test[~((test['id'] == 1) & (test['num'] == 1))]
test

Out[14]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2
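
Since the question mentions that it is easier to describe the rows to remove, here is a minimal sketch of that idea: build a mask of the unwanted rows, then either invert it or pass the matching index labels to drop. The names unwanted, kept_by_mask and kept_by_drop are only illustrative and are not from the original post.

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

# Mask of the rows to discard: id is 1 AND num is 1.
unwanted = (test['id'] == 1) & (test['num'] == 1)

# Option 1: keep everything the mask does not select.
kept_by_mask = test[~unwanted]

# Option 2: drop the index labels of the selected rows.
kept_by_drop = test.drop(test[unwanted].index)

print(kept_by_mask)
```

Both options should leave the same six rows; the mask version avoids the intermediate index lookup.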

I also think your understanding of the boolean logic is incorrect. To drop only the rows where both columns are 1, you need to or (|) the two != conditions rather than and (&) them:

In [22]:
test = test[(test['id'] != 1) | (test['num'] != 1)]
test

Out[22]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2
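
The two approaches give the same result because of De Morgan's law: not (A and B) is the same as (not A) or (not B). A quick sketch to check this on the test frame from the question (the variable names here are only illustrative):

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

# Form 1: invert "id is 1 AND num is 1".
inverted_and = ~((test['id'] == 1) & (test['num'] == 1))

# Form 2: "id is not 1 OR num is not 1".
or_of_negations = (test['id'] != 1) | (test['num'] != 1)

# De Morgan's law says the two masks agree on every row.
masks_agree = bool((inverted_and == or_of_negations).all())
print(masks_agree)
```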

Think about what each condition does on its own: the first excludes every row where 'id' equals 1, and the second excludes every row where 'num' equals 1:

In [24]:
test[test['id'] != 1]

Out[24]:
   id  num
1   2    2
2   3    1
3   4    2
5   2    2
6   3    1
7   4    2

In [25]:
test[test['num'] != 1]

Out[25]:
   id  num
1   2    2
3   4    2
5   2    2
7   4    2

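Combining the two with & keeps only the rows that survive both filters, which is why rows 2 and 6 disappeared in your attempt. So really you wanted to or (|) the above conditions, which keeps every row that survives at least one of them.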
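
One way to see this, sketched below, is to compare index labels: & behaves like the intersection of the two single-condition results, and | behaves like their union. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
                     'num': np.array([1, 2] * 4, dtype='int32')})

# Index labels kept by each condition on its own.
not_id = test[test['id'] != 1].index
not_num = test[test['num'] != 1].index

# Index labels kept by the combined conditions.
and_rows = test[(test['id'] != 1) & (test['num'] != 1)].index
or_rows = test[(test['id'] != 1) | (test['num'] != 1)].index

# & matches the intersection of the two results, | matches the union.
and_is_intersection = sorted(and_rows) == sorted(not_id.intersection(not_num))
or_is_union = sorted(or_rows) == sorted(not_id.union(not_num))
print(and_is_intersection, or_is_union)
```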

Upvotes: 13
