luisfer

Reputation: 2120

Pandas: How to remove rows from a dataframe based on a list?

I have a dataframe, customers, with some "bad" rows; the key in this dataframe is CustomerID. I know I should drop these rows. I have a list called badcu, [23770, 24572, 28773, ...], where each value corresponds to a different "bad" customer.

Then I have another dataframe, let's call it sales, and I want to drop all the records for the bad customers, the ones in the badcu list.

If I do the following

sales[sales.CustomerID.isin(badcu)]

I get a dataframe with precisely the records I want to drop, but if I do

sales.drop(sales.CustomerID.isin(badcu))

it returns a dataframe with the first row dropped (which is a legitimate order) and the rest of the rows intact (it doesn't delete the bad ones). I think I know why this happens, but I still don't know how to drop the rows with the incorrect customer IDs.

Upvotes: 31

Views: 61433

Answers (3)

piRSquared

Reputation: 294198

You can also use query, where @badcu refers to the Python list of that name:

sales.query('CustomerID not in @badcu')
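For anyone who wants to try this, here is a minimal sketch. The sample rows and the Amount column are made up for illustration; only the names sales, badcu and CustomerID come from the question.

```python
import pandas as pd

# Hypothetical sample data: only the CustomerID column name is from the question.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [50.0, 20.0, 75.5, 12.0, 99.9],
})
badcu = [23770, 24572, 28773]

# "@badcu" tells query() to look up the Python variable badcu
# instead of treating it as a column name.
clean = sales.query("CustomerID not in @badcu")
print(clean)
```

query returns a new dataframe and leaves sales untouched, so assign the result if you want to keep it.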

Upvotes: 7

Vaishali

Reputation: 38415

You need to negate the boolean mask with ~ so that only the rows not in badcu are kept:

new_df = sales[~sales.CustomerID.isin(badcu)]
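The reason the drop call in the question does not work is that drop expects index labels, not a boolean mask, whereas indexing with [] accepts a mask directly. A small sketch of the two steps, using invented sample data (the Amount column and the values are assumptions, not from the question):

```python
import pandas as pd

# Hypothetical sample data for illustration.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [50.0, 20.0, 75.5, 12.0, 99.9],
})
badcu = [23770, 24572, 28773]

# Boolean Series: True where the row belongs to a bad customer.
is_bad = sales["CustomerID"].isin(badcu)

# ~ flips True/False, so the selection keeps only the good rows.
new_df = sales[~is_bad]
print(new_df)
```

The result keeps the original index labels; if you want a fresh 0..n-1 index, call new_df.reset_index(drop=True) afterwards.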

Upvotes: 75

Eliethesaiyan

Reputation: 2322

I think the best way is to drop by index. Try it and let me know:

sales.drop(sales[sales.CustomerID.isin(badcu)].index.tolist())
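A rough sketch of the drop-by-index approach, assuming the column is spelled CustomerID as in the question and using made-up sample rows. One caveat: drop works on index labels, so if the index of sales is not unique, every row sharing a dropped label is removed as well.

```python
import pandas as pd

# Hypothetical sample data for illustration.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773],
    "Amount": [50.0, 20.0, 75.5, 12.0, 99.9],
})
badcu = [23770, 24572, 28773]

# Index labels of the rows that belong to bad customers.
bad_index = sales.index[sales["CustomerID"].isin(badcu)]

# drop() returns a new dataframe; sales itself is not modified.
cleaned = sales.drop(bad_index)
print(cleaned)
```

If you want to modify sales itself, assign the result back with sales = sales.drop(bad_index).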

Upvotes: 5
