Using boolean masks in Pandas

Question

This is probably a trivial query but I can't work it out.

Essentially, I want to filter noisy tweets out of the dataframe below:


Int64Index: 140381 entries, 0 to 140380
Data columns:
text          140381  non-null values
created_at    140381  non-null values
id            140381  non-null values
from_user     140381  non-null values
geo           5493  non-null values
dtypes: float64(1), object(4)

I can create a dataframe based on unwanted keywords thus:

junk = df[df.text.str.contains("Swans")]

But what's the best way to use this to see what's left?

waitingkuo · Accepted Answer

df[~df.text.str.contains("Swans")]
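To spell out what the one-liner above does, here is a minimal sketch on a small made-up dataframe (the sample rows and the `keywords` list are assumptions for illustration, not data from the question). `str.contains` returns a boolean Series, and `~` inverts it, so indexing with the inverted mask keeps every row that does not match.

```python
import pandas as pd

# Hypothetical sample data standing in for the tweets dataframe in the question.
df = pd.DataFrame({
    "text": ["Swans win again", "Lovely weather today", "Go Swans!", "Coffee time"],
    "from_user": ["a", "b", "c", "d"],
})

# Boolean Series: True where the tweet text contains the unwanted keyword.
mask = df["text"].str.contains("Swans")

junk = df[mask]    # rows that match the keyword
clean = df[~mask]  # ~ inverts the mask, so this is everything that is left

# For several keywords, a regex alternation is one option (hypothetical list).
# na=False treats missing text as "no match" instead of propagating NaN.
keywords = ["Swans", "Coffee"]
multi_mask = df["text"].str.contains("|".join(keywords), na=False)
clean_multi = df[~multi_mask]
```

Storing the mask in a variable means the match is computed once and `junk` and `clean` are guaranteed to partition the original rows. If a keyword contains regex metacharacters, it would need escaping (for example with `re.escape`) before being joined into the pattern.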

Answers (2)

option 1:

option 2:
