Chris Fonnesbeck

Reputation: 4203

DataFrame.drop not dropping expected rows in Pandas

I have a Pandas DataFrame that includes rows that I want to drop based on values in a column "population":

data['population'].value_counts()

general population                          21
developmental delay                         20
sibling                                      2
general population + developmental delay     1
dtype: int64

Here, I want to drop the two rows that have "sibling" as the value. I believed the following would do the trick:

data = data.drop(data.population=='sibling', axis=0)

It does drop 2 rows, as the resulting value counts show, but they are not the rows with the specified value:

data.population.value_counts()

developmental delay                         20
general population                          19
sibling                                      2
general population + developmental delay     1
dtype: int64

Any idea what is going on here?

Upvotes: 3

Views: 11214

Answers (1)

joaquin

Reputation: 85603

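DataFrame.drop accepts index labels (a single label or a list of labels) as its parameter, not a boolean mask. When it is given a mask, the True/False values are interpreted as labels rather than as a row selector, so the matching rows are not the ones removed.
To use drop, pass the labels of the rows you want to remove: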

data = data.drop(data.index[data.population == 'sibling'])

However, it is much simpler to filter with the mask directly and keep only the rows that do not match:

data = data[data.population != 'sibling']
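To check that the two approaches agree, here is a minimal sketch on a made-up DataFrame. The sample rows and the "score" column are assumptions for illustration only and are not the asker's data; the default integer index is also assumed.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's "population" column.
data = pd.DataFrame({
    "population": [
        "general population",
        "sibling",
        "developmental delay",
        "sibling",
        "general population",
    ],
    "score": [1, 2, 3, 4, 5],  # illustrative column, not from the question
})

# Boolean mask: True for the rows we want to remove.
mask = data["population"] == "sibling"

# Approach 1: turn the mask into index labels, then drop those labels.
dropped = data.drop(data.index[mask])

# Approach 2: keep only the rows where the mask is False.
filtered = data[~mask]

print(dropped["population"].value_counts())
print(dropped.equals(filtered))
```

Both results keep the original index labels of the surviving rows; call reset_index(drop=True) afterwards if you want a fresh 0..n-1 index.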

Upvotes: 7
