Good algorithm for sentiment analysis

Neir0 · Jun 11, 2012

I tried a naive Bayes classifier and it works very badly. An SVM works a little better, but is still horrible. Most of the papers I have read are about SVM and naive Bayes with some variations (n-grams, POS tags, etc.), but all of these approaches give me results close to 50% (the authors of the articles talk about 80% and higher, but I cannot get the same accuracy on real data).

Are there any more powerful methods besides lexical analysis? SVM and naive Bayes assume that words are independent. This approach is called "bag of words". What if we assume that words are associated?

For example: use the Apriori algorithm to detect that if a sentence contains "bad and horrible", then there is a 70% probability that the sentence is negative. We could also use the distance between words, and so on.
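To make the idea concrete, here is a minimal sketch of measuring one such rule. It is not the full Apriori algorithm (no candidate generation or pruning); it only computes the support and confidence of a single hand-picked rule. It also assumes the rule means that both words occur somewhere in the sentence. The function name `rule_stats` and the four toy sentences are made up for illustration.

```python
def rule_stats(sentences, antecedent, label):
    """Support and confidence of the rule: all antecedent words present -> label.

    sentences: list of (list_of_words, label) pairs.
    """
    antecedent = set(antecedent)
    # Labels of the sentences that contain every antecedent word.
    matched = [lab for words, lab in sentences if antecedent <= set(words)]
    if not matched:
        return 0.0, 0.0
    hits = matched.count(label)
    support = hits / len(sentences)    # share of all sentences obeying the rule
    confidence = hits / len(matched)   # share of matching sentences with label
    return support, confidence


# Toy, hand-written data (hypothetical, for illustration only).
data = [
    ("the food was bad and horrible".split(), "neg"),
    ("bad service and horrible food".split(), "neg"),
    ("not bad and not horrible at all".split(), "pos"),
    ("great place".split(), "pos"),
]

support, confidence = rule_stats(data, ["bad", "horrible"], "neg")
print(support, confidence)
```

The third toy sentence shows the weak spot of pure co-occurrence rules: negation flips the sentiment while the rule still fires.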

Is this a good idea, or am I reinventing the wheel?

Answer

Fred Foo · Jun 11, 2012

You're confusing a couple of concepts here. Neither Naive Bayes nor SVMs are tied to the bag of words approach. Neither SVMs nor the BOW approach has an independence assumption between terms.

Here are some things you can try:

  • include punctuation marks in your bags of words; esp. ! and ? can be helpful for sentiment analysis, while many feature extractors geared toward document classification throw them away
  • same for stop words: words like "I" and "my" may be indicative of subjective text
  • build a two-stage classifier; first determine whether any opinion is expressed, then whether it's positive or negative
  • try a quadratic kernel SVM instead of a linear one to capture interactions between features.
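As a rough illustration of the first, second and last suggestions, the sketch below builds bags of words that keep "!" and "?" and apply no stop-word list, then compares a linear kernel with a degree-2 polynomial kernel on two made-up sentences. The names `TOKEN_RE`, `bag_of_words`, `linear_kernel` and `quadratic_kernel` are illustrative and not taken from any library, and the kernel form (a·b + 1)^2 is one common choice of quadratic kernel, assumed here.

```python
import re
from collections import Counter

# Word tokens plus "!" and "?" as separate tokens.
# No stop-word filtering, so "i" and "my" are kept as features.
TOKEN_RE = re.compile(r"[a-z']+|[!?]")


def bag_of_words(text):
    """Return token counts for one lower-cased document."""
    return Counter(TOKEN_RE.findall(text.lower()))


def linear_kernel(a, b):
    """Dot product of two sparse count vectors."""
    return sum(count * b[token] for token, count in a.items())


def quadratic_kernel(a, b):
    """Degree-2 polynomial kernel (a.b + 1)^2."""
    return (linear_kernel(a, b) + 1) ** 2


x = bag_of_words("I love my phone!")
y = bag_of_words("Do I love it? No!")

print(sorted(x.items()))
print(linear_kernel(x, y), quadratic_kernel(x, y))
```

Expanding (a·b + 1)^2 yields terms in products of pairs of features, which is how a quadratic kernel lets the SVM weigh word pairs without enumerating them explicitly. In practice you would pass such a kernel, or the equivalent built-in option, to your SVM library rather than compute it by hand.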