lespaul

Reputation: 527

Remove rows that have a specific value in any column

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'fav-animal-sound': ['meow', 'woof', 'quack', 'moo', '?'],
                   'fav-word': ['foo', 'bar', '?', 'ho', 'hum'],
                   'fav-celeb': ['cher', 'britney', 'bono', '?', 'big_bird']})

In this dataset, '?' is used as a placeholder for unknown values in several columns. I want to remove every row that contains one.

This works with one column at a time:

valid_entries = df.loc[:, "fav-celeb"] != '?'

But this does not work:

valid_entries = df.loc[:, "fav-celeb", "fav-word", "fav-animal-sound"] != '?'

I would like valid_entries to flag each row that does not have a '?' in any of the selected columns, so that I can drop the remaining rows with something like:

df = df.loc[valid_entries]

Upvotes: 1

Views: 1461

Answers (1)

cs95

Reputation: 402813

You can perform an element-wise comparison on the whole DataFrame, then keep only the rows where every column passes:

df[(df != '?').all(axis=1)]

  fav-animal-sound fav-word fav-celeb
0             meow      foo      cher
1             woof      bar   britney
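If only some columns should be checked, as in the question, the same idea works on a column subset. The attempt in the question fails because .loc accepts one row indexer and one column indexer, so several column labels have to be passed as a single list. The following is a minimal sketch rather than the only way to do it; the names cols and filtered and the choice of two columns are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'fav-animal-sound': ['meow', 'woof', 'quack', 'moo', '?'],
                   'fav-word': ['foo', 'bar', '?', 'ho', 'hum'],
                   'fav-celeb': ['cher', 'britney', 'bono', '?', 'big_bird']})

# Hypothetical subset of columns to check; pass the labels as one list.
cols = ['fav-celeb', 'fav-word']

# True for rows where none of the selected columns contains '?'.
valid_entries = (df.loc[:, cols] != '?').all(axis=1)

# Keep only the flagged rows.
filtered = df.loc[valid_entries]

# Rows 0, 1 and 4 remain: row 4 has '?' only in 'fav-animal-sound',
# which is not among the checked columns.
print(filtered)
```

Checking every column instead (cols = list(df.columns)) gives the same two rows as the whole-DataFrame comparison above.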

Upvotes: 1
