Negation handling in sentiment analysis

user1565960 · Mar 31, 2015

I need a little help here. I need to identify negated phrases such as "not good" and "not bad", and then determine the polarity (negative or positive) of the sentiment. I have done everything except handle the negations. How can I incorporate negation handling into my analysis?

Answer

manan · Apr 5, 2015

Negation handling is quite a broad field, with many possible implementations. Here is sample code that marks the words following a negation and returns the resulting unigrams, bigrams, and trigrams, with negated words in not_ form. Note that nltk isn't used here; simple text processing is enough.

# negate_sequence(text)
#   text: sentence to process (creation of uni/bi/trigrams
#    is handled here)
#
# Detects negations and transforms negated words into 'not_' form
#
def negate_sequence(text):
    negation = False
    delims = "?.,!:;"
    result = []
    words = text.split()
    prev = None
    pprev = None
    for word in words:
        stripped = word.strip(delims).lower()
        negated = "not_" + stripped if negation else stripped
        result.append(negated)
        if prev:
            bigram = prev + " " + negated
            result.append(bigram)
            if pprev:
                trigram = pprev + " " + bigram
                result.append(trigram)
            pprev = prev
        prev = negated

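        # Compare whole lowercased tokens so that words such as "know" or
        # "note" do not toggle negation through a substring match
        if stripped in ("not", "no", "cannot") or stripped.endswith("n't"):
            negation = not negation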

        if any(c in word for c in delims):
            negation = False

    return result

If we run this program on a sample input text = "I am not happy today, and I am not feeling well", we obtain the following sequences of unigrams, bigrams, and trigrams:

[   'i',
    'am',
    'i am',
    'not',
    'am not',
    'i am not',
    'not_happy',
    'not not_happy',
    'am not not_happy',
    'not_today',
    'not_happy not_today',
    'not not_happy not_today',
    'and',
    'not_today and',
    'not_happy not_today and',
    'i',
    'and i',
    'not_today and i',
    'am',
    'i am',
    'and i am',
    'not',
    'am not',
    'i am not',
    'not_feeling',
    'not not_feeling',
    'am not not_feeling',
    'not_well',
    'not_feeling not_well',
    'not not_feeling not_well']

We may subsequently store these n-grams in an array for future retrieval and analysis. Treat each not_ word as having the opposite [sentiment, polarity] of the one you have defined for its un-negated counterpart.
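
As a rough illustration of that last step, here is a minimal sketch of how not_ tokens could be scored. It assumes a toy word-to-polarity dictionary; LEXICON, token_polarity and sentence_polarity are hypothetical names introduced only for this example, and scoring unigrams only is a simplification, not part of the code above.

```python
# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
# The entries are illustrative assumptions, not a real sentiment resource.
LEXICON = {"happy": 1.0, "well": 1.0, "good": 1.0, "bad": -1.0, "sad": -1.0}

NEG_PREFIX = "not_"


def token_polarity(token, lexicon=LEXICON):
    """Return the polarity of a unigram, flipping the sign for not_ tokens."""
    negated = token.startswith(NEG_PREFIX)
    base = token[len(NEG_PREFIX):] if negated else token
    score = lexicon.get(base, 0.0)  # unknown words count as neutral
    return -score if negated and score else score


def sentence_polarity(tokens, lexicon=LEXICON):
    """Sum polarities over unigrams only (entries that contain no space)."""
    unigrams = [t for t in tokens if " " not in t]
    return sum(token_polarity(t, lexicon) for t in unigrams)


if __name__ == "__main__":
    # A shortened token list in the same shape as the output shown above
    tokens = ["i", "am", "i am", "not", "am not",
              "not_happy", "not not_happy", "not_well"]
    print(sentence_polarity(tokens))  # -2.0
```

Simply flipping the sign is the crudest option. Depending on your lexicon you might instead dampen the score or look up not_ forms as separate entries.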