Neir0

Reputation: 13367

Good algorithm for sentiment analysis

I tried a naive Bayes classifier and it works very badly. An SVM works a little better but is still horrible. Most of the papers I have read are about SVM and naive Bayes with some variations (n-grams, POS tags, etc.), but all of them give me results close to 50% (the authors report 80% and higher, but I cannot get the same accuracy on real data).

Are there any more powerful methods besides lexical analysis? SVM and naive Bayes assume that words are independent. This approach is called "bag of words". What if we assume that words are associated?

For example: use the Apriori algorithm to detect that if a sentence contains "bad and horrible", then there is a 70% probability that the sentence is negative. We could also use the distance between words, and so on.
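To make the idea concrete, here is a minimal sketch of the rule-scoring step only (not the full Apriori candidate generation). It computes the support and confidence of a rule such as {bad, horrible} → negative. The toy sentences, the labels, and the helper name `rule_stats` are all made up for illustration.

```python
# Hypothetical toy data: (sentence, label) pairs invented for this sketch.
LABELED = [
    ("the plot was bad and horrible", "neg"),
    ("bad acting and a horrible script", "neg"),
    ("not bad at all , horrible title though", "pos"),
    ("a good and lovely film", "pos"),
    ("horrible", "neg"),
]


def rule_stats(labeled, itemset, label):
    """Return (support, confidence) for the rule itemset -> label.

    support    = fraction of sentences containing every word in itemset
    confidence = fraction of those sentences that carry the given label
    """
    itemset = set(itemset)
    # Labels of the sentences whose word set contains the whole itemset.
    covered = [lab for text, lab in labeled if itemset <= set(text.split())]
    if not covered:
        return 0.0, None  # the rule never fires, so confidence is undefined
    support = len(covered) / len(labeled)
    confidence = sum(lab == label for lab in covered) / len(covered)
    return support, confidence


support, confidence = rule_stats(LABELED, {"bad", "horrible"}, "neg")
print(f"support={support:.2f} confidence={confidence:.2f}")
```

On this toy data the rule fires on three of the five sentences, and two of those three are negative. The third ("not bad at all") shows why plain word co-occurrence can mislead when negation is involved.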

Is this a good idea, or am I reinventing the wheel?

Upvotes: 7

Views: 16713

Answers (4)

Sven Büchel

Reputation: 719

Sentiment analysis is an area of ongoing research. For an overview of the most recent and most successful approaches, I would advise you to have a look at the shared tasks of SemEval. They usually run a competition on sentiment analysis in Twitter every year. You can find the paper describing the task and the results for 2016 here (it might be a bit technical, though): http://alt.qcri.org/semeval2016/task4/data/uploads/semeval2016_task4_report.pdf

Starting from there, you can look at the papers describing the individual systems (as referenced there).

Upvotes: 0

Nitin Pawar

Reputation: 936

You can find some useful material on sentiment analysis using Python. This presentation summarizes sentiment analysis as 3 simple steps:

  • Labeling data
  • Preprocessing
  • Model learning

Upvotes: 2

Aravind Asok

Reputation: 514

Algorithms like SVM, naive Bayes and maximum entropy are supervised machine learning algorithms, so the output of your program depends on the training set you provide. For large-scale sentiment analysis I prefer an unsupervised learning method, in which one can determine the sentiment of adjectives by clustering documents into same-oriented parts and labeling the clusters positive or negative. More information can be found in this paper: http://icwsm.org/papers/3--Godbole-Srinivasaiah-Skiena.pdf

Hope this helps you in your work :)

Upvotes: 5

Fred Foo

Reputation: 363607

You're confusing a couple of concepts here. Neither Naive Bayes nor SVMs are tied to the bag of words approach. Neither SVMs nor the BOW approach have an independence assumption between terms.

Here are some things you can try:

  • include punctuation marks in your bags of words; esp. ! and ? can be helpful for sentiment analysis, while many feature extractors geared toward document classification throw them away
  • same for stop words: words like "I" and "my" may be indicative of subjective text
  • build a two-stage classifier; first determine whether any opinion is expressed, then whether it's positive or negative
  • try a quadratic kernel SVM instead of a linear one to capture interactions between features.
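As a rough illustration of the first, second and last suggestions, the sketch below builds a bag of words that keeps "!" and "?" as tokens and does not filter stop words, then compares two texts with a linear and a quadratic (degree-2 polynomial) kernel. The tokenizer pattern and the function names are assumptions made for this example, not part of any particular library.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters/apostrophes are words (stop words such
# as "i" and "my" are kept), and each "!" or "?" becomes its own token.
TOKEN_RE = re.compile(r"[a-z']+|[!?]")


def bag_of_words(text):
    """Map a text to token counts, keeping stop words and ! / ? marks."""
    return Counter(TOKEN_RE.findall(text.lower()))


def linear_kernel(a, b):
    """Dot product of two sparse count vectors."""
    return sum(count * b[tok] for tok, count in a.items())


def quadratic_kernel(a, b, c=1.0):
    """Degree-2 polynomial kernel (a.b + c)^2.

    Expanding the square yields products of pairs of features, which is
    how this kernel lets an SVM weigh feature interactions.
    """
    return (linear_kernel(a, b) + c) ** 2


x = bag_of_words("I love my phone!")
y = bag_of_words("Do I love it? No!")

print(sorted(x))               # tokens kept for the first text
print(linear_kernel(x, y))     # number of shared tokens, weighted by count
print(quadratic_kernel(x, y))  # the same similarity after the quadratic map
```

If you use scikit-learn, the corresponding classifier setting would be `SVC(kernel="poly", degree=2)`. The hand-written kernel here is only meant to show what that setting computes on such features.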

Upvotes: 6
