Reputation: 2130

Pandas: How to remove rows from a dataframe based on a list?

I have a dataframe, customers, with some "bad" rows; the key in this dataframe is CustomerID. I know I should drop these rows. I have a list called badcu, [23770, 24572, 28773, ...], in which each value corresponds to a different "bad" customer.

Then I have another dataframe, let's call it sales, and I want to drop all the records for the bad customers, the ones in the badcu list.

If I do the following

sales[sales.CustomerID.isin(badcu)]

I get a dataframe with precisely the records I want to drop, but if I do

sales.drop(sales.CustomerID.isin(badcu))

it returns a dataframe with the first row dropped (which is a legitimate order) and the rest of the rows intact (it doesn't delete the bad ones). I think I know why this happens, but I still don't know how to drop the incorrect customer ID rows.

Upvotes: 31

Answers (3)

piRSquared

Reputation: 294516

You can also use query

sales.query('CustomerID not in @badcu')
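For anyone who wants to try this, here is a minimal sketch. The sample rows and the Amount column are made up for illustration; only CustomerID and badcu come from the question. The @ prefix tells query to look up badcu as a Python variable in the calling scope.

```python
import pandas as pd

# Hypothetical sample data; only the CustomerID column and badcu are from the question.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [10.0, 20.0, 30.0, 40.0, 50.0],
})
badcu = [23770, 24572, 28773]

# Keep only the rows whose CustomerID is NOT in the badcu list.
good_sales = sales.query("CustomerID not in @badcu")
print(good_sales)
```

query returns a new dataframe and leaves sales untouched, so assign the result if you want to keep it.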

Upvotes: 7

Vaishali

Reputation: 38425

You need

new_df = sales[~sales.CustomerID.isin(badcu)]
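To spell that out: isin builds a boolean mask that is True for the bad customers, and ~ inverts it so the indexing keeps everything else. Note that drop expects index labels rather than a boolean mask, which is why the attempt in the question does not do what you want. A small sketch, with invented sample rows (only CustomerID and badcu come from the question):

```python
import pandas as pd

# Hypothetical sample data for illustration.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [10.0, 20.0, 30.0, 40.0, 50.0],
})
badcu = [23770, 24572, 28773]

# True for every row that belongs to a bad customer.
mask = sales["CustomerID"].isin(badcu)

# ~ flips the mask, so only rows for good customers are kept.
new_df = sales[~mask]
print(new_df)
```

The original row labels are preserved in new_df; call reset_index(drop=True) on it if you want a fresh 0..n-1 index.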

Upvotes: 75

Eliethesaiyan

Reputation: 2322

I think the best way is to drop by index. Try it and let me know.

sales.drop(sales[sales.CustomerID.isin(badcu)].index.tolist())
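Here is roughly how that looks end to end, as a sketch with invented sample rows (only CustomerID and badcu are from the question). It first collects the index labels of the bad rows and then hands those labels to drop.

```python
import pandas as pd

# Hypothetical sample data for illustration.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [10.0, 20.0, 30.0, 40.0, 50.0],
})
badcu = [23770, 24572, 28773]

# Index labels of the rows that belong to bad customers.
bad_index = sales.index[sales["CustomerID"].isin(badcu)]

# drop removes rows by label and returns a new dataframe.
cleaned = sales.drop(bad_index)
print(cleaned)
```

One caveat: drop works on labels, so if the index of sales contains duplicate labels, every row sharing a dropped label is removed as well. In that case the boolean-mask approach is the safer choice.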

Upvotes: 5
