Reputation: 3
I'm cleaning form results from a survey.
Under the column named 'Which source do you trust the most to get insights on politics?', every entry that contains the substring 'news' should be replaced with the string 'newspapers or news apps'.
Here, 'responses' is the DataFrame loaded from the CSV file of survey responses.
if responses['Which source do you trust the most to get insights on politics?'].str.contains('news') == True:
responses['Which source do you trust the most to get insights on politics?'] = 'newspapers or news apps'
I got the following error for the code:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Please help! Any leads appreciated :)
Upvotes: 0
Views: 60
Reputation: 998
Colname = 'Which source do you trust the most to get insights on politics?'
responses.loc[responses[Colname].str.contains('news'), Colname] = 'newspapers or news apps'
Upvotes: 0
Reputation: 13417
The trick is to create a boolean index with .str.contains("news"), then use .loc to update your original dataframe and overwrite only those specific values. The following code should do the trick:
source_colname = 'Which source do you trust the most to get insights on politics?'
contains_news = responses[source_colname].str.contains('news')
responses.loc[contains_news, source_colname] = "newspapers or news apps"
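For anyone who wants to try this without the survey file, here is a minimal self-contained sketch of the same approach. The sample answers are made up for illustration and are not from the real data. It also passes na=False to .str.contains, which is my addition on the assumption that some respondents left the question blank: without it, missing values would show up as NaN in the mask and .loc would refuse to index with it.

```python
import pandas as pd

source_colname = 'Which source do you trust the most to get insights on politics?'

# Hypothetical sample data standing in for the real survey CSV
responses = pd.DataFrame({
    source_colname: ['newspaper', 'TV', 'news apps', None, 'Social media'],
})

# `if <Series> == True:` fails because a whole column of True/False values
# has no single truth value. Build a row-wise boolean mask instead.
# na=False (an assumption about the data) treats blank answers as "no match".
contains_news = responses[source_colname].str.contains('news', na=False)

# Overwrite only the rows where the mask is True
responses.loc[contains_news, source_colname] = 'newspapers or news apps'

print(responses[source_colname].tolist())
```

Note that the match is case-sensitive by default, so an answer like 'News channel' would be left alone; passing case=False to .str.contains should cover that if your data needs it.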
Upvotes: 1