Reputation: 29
I need to run sentiment analysis on two columns of an Excel file and then write the resulting polarity values into two other columns that already exist in the same file. So far I can calculate the polarity of a single sentence. I need suggestions for calculating the polarity of an entire column. I am using pandas to process the Excel file.
from textblob import TextBlob
import pandas as pd

Input_file = 'filepath'
df = pd.read_excel(Input_file, sheet_name='Sheet1')

# Column values as plain Python lists
col1 = df['video_title'].tolist()
# col2 = df['description'].tolist()

# TextBlob expects a single string, so this only scores the first title
blob = TextBlob(col1[0])
# blob1 = TextBlob(col2[0])
polarity_score = blob.sentiment.polarity
polarity_rounded = round(polarity_score, 6)
print(polarity_rounded)
As I showed in the image above, I need to replace the 'None' values in the 'title_sentiment' column with the calculated polarity values. Likewise, I need to fill the 'description_sentiment' column with its calculated polarity values.
Upvotes: 1
Views: 1583
Reputation: 7225
Let's blackbox your sentiment analysis stuff and reduce your problem to:
I have a dataframe with a text column that I want to apply a function to, and then store the result as a new numeric column in the correct row.
Stealing this person's example dataframe with a text column to get started:
In [1]: import pandas as pd
   ...:
   ...: df = pd.DataFrame({
   ...:     'title': ['foo', 'bar', 'baz', 'baz', 'foo', 'bar'],
   ...:     'contents': [
   ...:         'Lorem ipsum dolor sit amet.',
   ...:         'Lorem ipsum dolor sit amet.',
   ...:         'Lorem ipsum dolor sit amet.',
   ...:         'Consectetur adipiscing elit.',
   ...:         'Lorem ipsum dolor sit amet.',
   ...:         'Lorem ipsum dolor sit amet.'
   ...:     ],
   ...:     'year': [2010, 2011, 2000, 2005, 2010, 2011]
   ...: })
   ...:
   ...: df
Out[1]:
   title                      contents  year
0    foo   Lorem ipsum dolor sit amet.  2010
1    bar   Lorem ipsum dolor sit amet.  2011
2    baz   Lorem ipsum dolor sit amet.  2000
3    baz  Consectetur adipiscing elit.  2005
4    foo   Lorem ipsum dolor sit amet.  2010
5    bar   Lorem ipsum dolor sit amet.  2011
Now we want to define a function to apply to "contents" and store the result in a new column. For this, we can use pd.Series.apply():
In [2]: def sentiment_function(text):
   ...:     # Put all your fancy sentiment stuff here; I will just use `len` as a dummy function.
   ...:     return len(text)
   ...:
   ...: df['sentiment_score'] = df['contents'].apply(sentiment_function)
   ...: df
Out[2]:
   title                      contents  year  sentiment_score
0    foo   Lorem ipsum dolor sit amet.  2010               27
1    bar   Lorem ipsum dolor sit amet.  2011               27
2    baz   Lorem ipsum dolor sit amet.  2000               27
3    baz  Consectetur adipiscing elit.  2005               28
4    foo   Lorem ipsum dolor sit amet.  2010               27
5    bar   Lorem ipsum dolor sit amet.  2011               27
You can do this for both of your columns, title_sentiment and description_sentiment.
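Putting the pieces together for your case, here is a rough sketch of how the dummy function could be swapped for TextBlob's polarity and applied to both columns. The column names, 'filepath' and 'Sheet1' come from your question. The helper names (polarity, fill_sentiment_columns, update_workbook, COLUMN_MAP) are made up for illustration, and I am assuming empty cells should stay empty rather than be scored. Also note that to_excel rewrites the file with just this one sheet, so keep a backup if the workbook has other sheets.

```python
import pandas as pd

# Source column -> target column, using the names from the question.
COLUMN_MAP = {
    "video_title": "title_sentiment",
    "description": "description_sentiment",
}


def polarity(text):
    """Return TextBlob's polarity for one cell, or None for a missing/blank cell."""
    if pd.isna(text) or not str(text).strip():
        return None
    # Imported lazily so the generic helper below works without TextBlob installed.
    from textblob import TextBlob
    return round(TextBlob(str(text)).sentiment.polarity, 6)


def fill_sentiment_columns(df, column_map, scorer=polarity):
    """Apply `scorer` to each source column and store the result in its target column."""
    for source, target in column_map.items():
        df[target] = df[source].apply(scorer)
    return df


def update_workbook(path, sheet_name="Sheet1"):
    """Read the sheet, fill both sentiment columns, and write the sheet back."""
    df = pd.read_excel(path, sheet_name=sheet_name)
    fill_sentiment_columns(df, COLUMN_MAP)
    # Assumption: overwriting the input file is acceptable.
    df.to_excel(path, sheet_name=sheet_name, index=False)
    return df
```

Calling update_workbook('filepath') should then fill both sentiment columns in one go. The scorer is a parameter so you can try the column plumbing with something trivial like len before plugging in the real sentiment function.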
Upvotes: 1