Reputation: 7782
This is probably a trivial question, but I can't work it out.
Essentially, I want to filter noisy tweets out of the dataframe below:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 140381 entries, 0 to 140380
Data columns:
text 140381 non-null values
created_at 140381 non-null values
id 140381 non-null values
from_user 140381 non-null values
geo 5493 non-null values
dtypes: float64(1), object(4)
I can create a dataframe based on unwanted keywords thus:
junk = df[df.text.str.contains("Swans")]
But what's the best way to use this to see what's left?
Upvotes: 3
Views: 5428
Reputation: 14699
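You can also invert the boolean mask, using either of the following two options (current pandas needs ~ here; unary minus on a boolean Series only worked in older versions):
df[~df.text.str.contains("Swans")]
import numpy as np
df[np.invert(df.text.str.contains("Swans"))]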
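To make the idea concrete, here is a minimal sketch on a small made-up dataframe. The sample rows and the names is_junk and clean are my own assumptions for illustration, not part of the original data. It builds the mask once and then uses it, and its inverse, to split the frame into the junk rows and what's left:

```python
import pandas as pd

# Hypothetical sample data standing in for the tweets dataframe.
df = pd.DataFrame({
    "text": ["Swans win again", "Nice weather today", "Go Swans!", "Lunch time"],
    "from_user": ["a", "b", "c", "d"],
})

# Boolean mask: True where the tweet text contains the unwanted keyword.
# na=False makes missing text count as "no match", so the mask stays boolean.
is_junk = df["text"].str.contains("Swans", na=False)

junk = df[is_junk]    # rows that match the keyword
clean = df[~is_junk]  # everything else, i.e. "what's left"

print(clean)
```

Keeping the mask in a variable means the string match is only computed once, and the two resulting frames are guaranteed to cover every row of the original between them.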
Upvotes: 1