I need an algorithm to determine if a sentence, paragraph or article is negative or positive in tone... or better yet, how negative or positive.
For instance:
Jason is the worst SO user I have ever witnessed (-10)
Jason is an SO user (0)
Jason is the best SO user I have ever seen (+10)
Jason is the best at sucking with SO (-10)
While, okay at SO, Jason is the worst at doing bad (+10)
Not easy, huh? :)
I don't expect somebody to explain this algorithm to me, but I assume there is already much work on something like this in academia somewhere. If you can point me to some articles or research, I would love it.
Thanks.
There is a sub-field of natural language processing called sentiment analysis that deals specifically with this problem domain. There is a fair amount of commercial work done in the area because consumer products are so heavily reviewed in online user forums (ugc or user-generated-content). There is also a prototype platform for text analytics called GATE from the university of sheffield, and a python project called nltk. Both are considered flexible, but not very high performance. One or the other might be good for working out your own ideas.