Is there a way to get the subject of a sentence using OpenNLP?

rockit · Apr 5, 2011

Is there a way to get the subject of a sentence using OpenNLP? I'm trying to identify the most important part of a user's sentence. Generally, users will be submitting sentences to our "engine", and we want to know exactly what the core topic of each sentence is.

Currently we are using OpenNLP to:

  1. Chunk the sentence
  2. Identify the noun phrases, verbs, etc. of the sentence
  3. Identify all "topics" of the sentence
  4. (NOT YET DONE!) Identify the "core topic" of the sentence

Please let me know if you have any bright ideas.

Answer

dmcer · Apr 6, 2011

Dependency Parser

If you're interested in extracting grammatical relations such as what word or phrase is the subject of a sentence, you should really use a dependency parser. While OpenNLP does support phrase structure parsing, I don't think it does dependency parsing yet.
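To make the idea concrete, here is a minimal sketch of how the subject falls out of a dependency parse once you have one. It does not call any parser. It assumes the parser's output has already been written in the tab-separated CoNLL-X column layout (token id in column 1, word in column 2, head id in column 7, relation label in column 8) and that subjects are labelled `nsubj` or `nsubjpass`, as in Stanford-style dependencies. The class name `SubjectFinder`, its helper methods, and the sample sentence are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class SubjectFinder {

    /** One token of a dependency parse. A head of 0 means "attached to the root". */
    static final class Token {
        final int id;
        final String form;
        final int head;
        final String deprel;

        Token(int id, String form, int head, String deprel) {
            this.id = id;
            this.form = form;
            this.head = head;
            this.deprel = deprel;
        }
    }

    /** Reads CoNLL-X style lines; only columns 1, 2, 7 and 8 are used. */
    static List<Token> parseConll(String conll) {
        List<Token> tokens = new ArrayList<>();
        for (String line : conll.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank separator lines
            }
            String[] cols = line.split("\t");
            tokens.add(new Token(Integer.parseInt(cols[0]), cols[1],
                                 Integer.parseInt(cols[6]), cols[7]));
        }
        return tokens;
    }

    /** Returns the head word of the main clause's subject, if the parse has one. */
    static Optional<String> findSubject(List<Token> tokens) {
        // The main predicate is the token whose head is the artificial root (0).
        Optional<Token> root = tokens.stream().filter(t -> t.head == 0).findFirst();
        if (!root.isPresent()) {
            return Optional.empty();
        }
        int rootId = root.get().id;
        // The subject is the dependent of that predicate with a subject label.
        return tokens.stream()
                .filter(t -> t.head == rootId)
                .filter(t -> t.deprel.equals("nsubj") || t.deprel.equals("nsubjpass"))
                .map(t -> t.form)
                .findFirst();
    }

    public static void main(String[] args) {
        // Hand-written parse of "The user wants cheap flights" (illustrative only).
        String parse = String.join("\n",
                "1\tThe\t_\tDT\tDT\t_\t2\tdet\t_\t_",
                "2\tuser\t_\tNN\tNN\t_\t3\tnsubj\t_\t_",
                "3\twants\t_\tVBZ\tVBZ\t_\t0\troot\t_\t_",
                "4\tcheap\t_\tJJ\tJJ\t_\t5\tamod\t_\t_",
                "5\tflights\t_\tNNS\tNNS\t_\t3\tdobj\t_\t_");
        System.out.println(findSubject(parseConll(parse)).orElse("(none)")); // prints: user
    }
}
```

This returns only the head word of the subject. To get the whole subject phrase ("The user"), you would also collect every token whose chain of heads leads back to that word.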

Open Source Software

Packages written in Java that support dependency parsing include the Stanford Parser and MaltParser.

Of these, the Stanford Parser is the most accurate. However, some configurations of the MaltParser can be insanely fast (Cer et al. 2010).