Pandas column filtering on string gives unexpected results

Question

I have a dataframe with a column ClientAccount which contains a lot of test data that I want to filter out.

To find how many rows contain test clients, I do the following:

test_users = order_data[order_data['ClientAccount'].str.contains("DEMO|test")==True]

Which returns Name: ClientAccount, Length: 2493

Cool, so 2,493 rows out of the 71,458 original rows.

Then, to get everything that isn't in these 2,493 rows, shouldn't I just do the opposite?

order_data = order_data[order_data['ClientAccount'].str.contains("DEMO|test")==False]

This gives 48,046 rows, though. How does that make sense? What am I missing?
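One way to check where the remaining rows go is to count each possible outcome of the mask separately, instead of comparing only against True or False. The sketch below is a diagnostic on a hypothetical toy frame, not the real order_data: the sample values, the missing entry, and the explicit object dtype are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical stand-in for order_data; the None entry and dtype=object
# are assumptions, not taken from the real data.
accounts = pd.Series(["DEMO-1", "test_user", "ACME", None, "Globex"], dtype=object)
order_data = pd.DataFrame({"ClientAccount": accounts})

# Same mask as in the question, without the == True / == False comparison
mask = order_data["ClientAccount"].str.contains("DEMO|test")

# Show every distinct mask value, including missing ones
print(mask.value_counts(dropna=False))

n_true = int((mask == True).sum())    # rows kept by the first filter
n_false = int((mask == False).sum())  # rows kept by the "opposite" filter
n_missing = int(mask.isna().sum())    # rows matched by neither comparison

print(n_true, n_false, n_missing, len(order_data))
```

If the same check on the real column shows a non-zero missing count, that count should account for the gap between the two filtered row counts and the original total.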

Answers (1)