Reputation: 29
I've just run VADER sentiment analysis on my dataset:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize

sid = SentimentIntensityAnalyzer()
for sentence in filtered_lines2:
    print(sentence)
    ss = sid.polarity_scores(sentence)
    for k in sorted(ss):
        print('{0}: {1}, '.format(k, ss[k]), )
        print()
Here is a sample of my results:
Are these guests on Samsung and Google event mostly Chinese Wow Theyre
boring
Google Samsung
('compound: 0.3612, ',)
()
('neg: 0.12, ',)
()
('neu: 0.681, ',)
()
('pos: 0.199, ',)
()
Adobe lose 135bn to piracy Report
('compound: -0.4019, ',)
()
('neg: 0.31, ',)
()
('neu: 0.69, ',)
()
('pos: 0.0, ',)
()
Samsung Galaxy Nexus announced
('compound: 0.0, ',)
()
('neg: 0.0, ',)
()
('neu: 1.0, ',)
()
('pos: 0.0, ',)
()
I want to know how many times "compound" is equal to, greater than, or less than zero.
I know this is probably very easy, but I'm really new to Python and to coding in general. I've tried many different ways to get what I need, but I can't find a solution.
(Please edit my question if the "sample of results" is formatted incorrectly, because I don't know the right way to write it.)
Upvotes: 0
Views: 2892
Reputation: 16935
I might define a function that returns the type of inequality that's being represented by a document:
def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"
Then use this on the compound scores of all the sentences to increment the count of the corresponding inequality type.
from collections import defaultdict

def count_sentiments(sentences):
    # Create a dictionary with values defaulted to 0
    counts = defaultdict(int)
    # Create a polarity score for each sentence
    for score in map(sid.polarity_scores, sentences):
        # Increment the dictionary entry for that inequality type
        counts[inequality_type(score["compound"])] += 1
    return counts
You could then call it on your filtered lines.
However, the explicit counting loop can be avoided by just using collections.Counter:
from collections import Counter

def count_sentiments(sentences):
    # Count the inequality type for each score in the sentences' polarity scores
    return Counter(
        inequality_type(score["compound"])
        for score in map(sid.polarity_scores, sentences)
    )
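A minimal usage sketch for the Counter version. To keep it runnable without NLTK or the VADER lexicon, the scorer is passed in as a parameter (`score_fn`, a name introduced here, not part of the answer above), and `sample_scores` is a hypothetical stand-in that returns the compound values from the question's sample output. With the real analyzer you would pass `sid.polarity_scores` and `filtered_lines2` instead.

```python
from collections import Counter

def inequality_type(val):
    """Classify a compound score relative to zero."""
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"

def count_sentiments(sentences, score_fn):
    """Count how many sentences fall into each inequality type."""
    return Counter(inequality_type(score_fn(s)["compound"]) for s in sentences)

# Hypothetical stand-in data: sentences and compound scores copied from
# the question's sample output.
SAMPLE_COMPOUND = {
    "Are these guests on Samsung and Google event mostly Chinese "
    "Wow Theyre boring Google Samsung": 0.3612,
    "Adobe lose 135bn to piracy Report": -0.4019,
    "Samsung Galaxy Nexus announced": 0.0,
}

def sample_scores(sentence):
    # Mimics the shape of sid.polarity_scores(sentence), compound key only.
    return {"compound": SAMPLE_COMPOUND[sentence]}

# Iterating over the dict yields its keys, i.e. the sentences.
counts = count_sentiments(SAMPLE_COMPOUND, sample_scores)
print(counts["greater"], counts["equal"], counts["less"])
```

A Counter returns 0 for keys it has never seen, so `counts["less"]` is safe to read even when no sentence scored below zero.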
Upvotes: 0
Reputation: 5940
This is far from the most Pythonic way of doing it, but I think it is the easiest to understand if you don't have much experience with Python. Essentially, you create a dictionary whose values start at 0 and increment the matching value in each of the three cases.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize

sid = SentimentIntensityAnalyzer()
res = {"greater": 0, "less": 0, "equal": 0}
for sentence in filtered_lines2:
    ss = sid.polarity_scores(sentence)
    if ss["compound"] == 0.0:
        res["equal"] += 1
    elif ss["compound"] > 0.0:
        res["greater"] += 1
    else:
        res["less"] += 1
print(res)
Upvotes: 1
Reputation: 5817
You can use a simple counter for each of the classes:
positive, negative, neutral = 0, 0, 0
Then, inside the sentence loop, test the compound value and increase the corresponding counter:
...
    if ss['compound'] > 0:
        positive += 1
    elif ss['compound'] == 0:
        neutral += 1
    elif ...
etc.
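Filled in, the loop might look like the sketch below. It assumes the elided last branch counts scores below zero as negative, which the answer does not spell out. To run without NLTK, it iterates over a hard-coded list of compound values taken from the question's sample output instead of calling `sid.polarity_scores`.

```python
# Compound scores copied from the question's sample output; in the real
# loop each value would come from ss['compound'].
compound_scores = [0.3612, -0.4019, 0.0]

positive, negative, neutral = 0, 0, 0

for compound in compound_scores:
    if compound > 0:
        positive += 1
    elif compound == 0:
        neutral += 1
    else:
        # Assumed final branch: anything below zero counts as negative.
        negative += 1

print("positive:", positive, "neutral:", neutral, "negative:", negative)
```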
Upvotes: 1