Luca Perinati

Reputation: 29

NLTK sentiment vader: ordering results

I've just run the Vader sentiment analysis on my dataset:

from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize
sid = SentimentIntensityAnalyzer()
for sentence in filtered_lines2:
    print(sentence)
    ss = sid.polarity_scores(sentence)
    for k in sorted(ss):
        print('{0}: {1}, '.format(k, ss[k]), )
        print()

Here is a sample of my results:

Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring

Google Samsung

('compound: 0.3612, ',)
()
('neg: 0.12, ',)
()
('neu: 0.681, ',)
()
('pos: 0.199, ',)
()

Adobe lose 135bn to piracy Report

('compound: -0.4019, ',)
()
('neg: 0.31, ',)
()
('neu: 0.69, ',)
()
('pos: 0.0, ',)
()

Samsung Galaxy Nexus announced

('compound: 0.0, ',)
()
('neg: 0.0, ',)
()
('neu: 1.0, ',)
()
('pos: 0.0, ',)
()

I want to know how many times "compound" is equal to, greater than, or less than zero.

I know this is probably very easy, but I'm really new to Python and coding in general. I've tried many different ways to get what I need, but I can't find a solution.

(Please edit my question if the "sample of results" is formatted incorrectly, because I don't know the right way to write it.)

Upvotes: 0

Views: 2892

Answers (3)

erip

Reputation: 16935

I might define a function that returns the type of inequality that's being represented by a document:

def inequality_type(val):
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"

Then use this on the compound scores of all the sentences to increment the count of the corresponding inequality type.

from collections import defaultdict

def count_sentiments(sentences):
    # Create a dictionary with values defaulted to 0
    counts = defaultdict(int)

    # Create a polarity score for each sentence
    for score in map(sid.polarity_scores, sentences):
        # Increment the dictionary entry for that inequality type
        counts[inequality_type(score["compound"])] += 1

    return counts

You could then call it on your filtered lines.

However, the explicit loop can be avoided entirely by using collections.Counter:

from collections import Counter

def count_sentiments(sentences):
    # Count the inequality type for each score in the sentences' polarity scores
    return Counter(inequality_type(score["compound"]) for score in map(sid.polarity_scores, sentences))
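To make the call concrete, here is a minimal sketch of the Counter version run on the three sample sentences from the question. So that it runs without NLTK installed, the scorer is passed in as a parameter (`score_fn`, my addition, not part of the code above), and `sample_scorer` with its `SAMPLE_COMPOUND` table is a hypothetical stand-in that just returns the compound values copied from the question's output. With NLTK you would pass `sid.polarity_scores` instead.

```python
from collections import Counter


def inequality_type(val):
    """Classify a compound score relative to zero."""
    if val == 0.0:
        return "equal"
    elif val > 0.0:
        return "greater"
    return "less"


def count_sentiments(sentences, score_fn):
    """Count how many sentences fall into each inequality type.

    score_fn is any callable that returns a dict with a "compound" key,
    for example SentimentIntensityAnalyzer().polarity_scores.
    """
    return Counter(inequality_type(score_fn(s)["compound"]) for s in sentences)


# Hypothetical stand-in data: compound values copied from the question's sample output.
SAMPLE_COMPOUND = {
    "Are these guests on Samsung and Google event mostly Chinese Wow Theyre boring": 0.3612,
    "Adobe lose 135bn to piracy Report": -0.4019,
    "Samsung Galaxy Nexus announced": 0.0,
}


def sample_scorer(sentence):
    """Stand-in for sid.polarity_scores; only the "compound" key is provided."""
    return {"compound": SAMPLE_COMPOUND[sentence]}


counts = count_sentiments(list(SAMPLE_COMPOUND), sample_scorer)
print(counts["greater"], counts["less"], counts["equal"])  # 1 1 1
```

A Counter returns 0 for keys it has never seen, so `counts["less"]` is safe to read even if no sentence scored below zero.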

Upvotes: 0

Alexander Ejbekov

Reputation: 5940

This is far from the most Pythonic way of doing it, but I think it is the easiest to understand if you don't have much experience with Python. Essentially, you create a dictionary whose values all start at 0 and increment the matching value in each of the three cases.

from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()
# One counter per case, all starting at zero
res = {"greater": 0, "less": 0, "equal": 0}
for sentence in filtered_lines2:
    # Look up the compound score once per sentence
    compound = sid.polarity_scores(sentence)["compound"]
    if compound == 0.0:
        res["equal"] += 1
    elif compound > 0.0:
        res["greater"] += 1
    else:
        res["less"] += 1
print(res)

Upvotes: 1

lenz

Reputation: 5817

You can use a simple counter for each of the classes:

positive, negative, neutral = 0, 0, 0

Then, inside the sentence loop, test the compound value and increase the corresponding counter:

    ...
    if ss['compound'] > 0:
        positive += 1
    elif ss['compound'] == 0:
        neutral += 1
    elif ...

etc.
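Spelled out in full, the idea might look like the sketch below. The final `else` branch is my assumption about what the elided part does (count everything below zero as negative). To keep it runnable without NLTK, the loop iterates over a list of compound scores copied from the question's sample output; in the real loop each value would come from `sid.polarity_scores(sentence)['compound']`.

```python
# Compound scores copied from the question's sample output (stand-in data).
compound_scores = [0.3612, -0.4019, 0.0]

positive, negative, neutral = 0, 0, 0

for compound in compound_scores:
    if compound > 0:
        positive += 1
    elif compound == 0:
        neutral += 1
    else:
        # Assumed final case: anything below zero counts as negative.
        negative += 1

print(positive, negative, neutral)  # 1 1 1
```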

Upvotes: 1
