Reputation: 2520
I have a dataframe with a column 'text'.
I want to keep only the rows whose 'text' column
contains at least one of certain strings.
My list of words is long: for example, crime, taxation, etc.
This works for one word:
data_cleaned = data_cleaned.loc[data_cleaned['text'].str.contains('population')].reset_index(drop = True)
How can I match multiple words, so that rows containing not only population but also crime, etc. are kept?
I have seen answers like this, but they do not work for me.
Update:
My full list of words looks like this:
key_words = ['population',
'migration',
'crime',
'safety',
'taxation',
'taxes',
'weather',
'climate',
'opportunities',
'employment',
'unemployment',
'cultural life',
'services',
'jobs',
'economic growth',
'economic decline',
'pollution',
'environment',
'health',
'insurance',
'education',
'natural disaster',
'retirement']
Upvotes: 0
Views: 83
Reputation: 7873
Assuming that lst is the list of strings, the following would work:
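def selector(s):
    # Return True if any keyword from lst occurs as a substring of s.
    for w in lst:
        if w in s:
            return True
    return False

# Keep only the rows whose 'text' value contains at least one keyword.
data_cleaned = data_cleaned.loc[data_cleaned['text'].apply(selector)]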
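As an alternative, the same filter can probably be written without a Python-level loop by joining the keywords into one regular expression and passing it to str.contains. This is only a sketch: the sample dataframe and the shortened key_words list are made up for illustration, and case=False and na=False are my assumptions about how case and missing values should be handled.

```python
import re

import pandas as pd

# Hypothetical sample data standing in for the question's data_cleaned.
data_cleaned = pd.DataFrame({
    "text": [
        "Population grew last year",
        "Nothing relevant here",
        "Crime rates and taxes",
        None,
    ]
})

# Shortened, hypothetical version of the keyword list from the question.
key_words = ["population", "crime", "taxes", "cultural life"]

# Escape each keyword so regex metacharacters are matched literally,
# then join them into one alternation pattern: population|crime|...
pattern = "|".join(re.escape(w) for w in key_words)

# case=False ignores letter case; na=False treats missing text as no match.
mask = data_cleaned["text"].str.contains(pattern, case=False, na=False)
filtered = data_cleaned.loc[mask].reset_index(drop=True)
```

Note that, like the selector function, this matches substrings, so 'employment' also matches inside 'unemployment'. If whole words are needed, wrapping each keyword in \b word boundaries is one option.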
Upvotes: 1