Reputation: 2120
I have a dataframe customers with some "bad" rows, the key in this dataframe is CustomerID. I know I should drop these rows. I have a list called badcu that says [23770, 24572, 28773, ...]
each value corresponds to a different "bad" customer.
Then I have another dataframe, lets call it sales, so I want to drop all the records for the bad customers, the ones in the badcu list.
If I do the following
sales[sales.CustomerID.isin(badcu)]
I got a dataframe with precisely the records I want to drop, but if I do a
sales.drop(sales.CustomerID.isin(badcu))
It returns a dataframe with the first row dropped (which is a legitimate order), and the rest of the rows intact (it doesn't delete the bad ones), I think I know why this happens, but I still don't know how to drop the incorrect customer id rows.
Upvotes: 31
Views: 61433
Reputation: 294198
You can also use query
sales.query('CustomerID not in @badcu')
Upvotes: 7
Reputation: 2322
I think the best way is to drop by index,try it and let me know
sales.drop(sales[sales.CustomerId.isin(badcu)].index.tolist())
Upvotes: 5