How to extract noun phrases using OpenNLP's chunking parser

user2024234 · Feb 5, 2013 · Viewed 10.5k times

I am new to natural language processing and need to extract the noun phrases from some text. So far I have used OpenNLP's chunking parser to parse the text into a tree structure, but I am not able to extract the noun phrases from that tree. Is there a regular expression pattern in OpenNLP that I can use to extract them?

Below is the code that I am using:

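    import java.io.FileInputStream;
    import java.io.InputStream;
    import opennlp.tools.cmdline.parser.ParserTool;
    import opennlp.tools.parser.*;

    InputStream is = new FileInputStream("en-parser-chunking.bin"); // pre-trained model file
    ParserModel model = new ParserModel(is);
    Parser parser = ParserFactory.create(model);
    Parse[] topParses = ParserTool.parseLine(line, parser, 1); // keep only the best parse
    for (Parse p : topParses) {
        p.show(); // print the bracketed parse tree
    }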

This gives me the following output:

(TOP (S (S (ADJP (JJ welcome) (PP (TO to) (NP (NNP Big) (NNP Data.))))) (S (NP (PRP We)) (VP (VP (VBP are) (VP (VBG working) (PP (IN on) (NP (NNP Natural) (NNP Language) (NNP Processing.can))))) (NP (DT some) (CD one) (NN help)) (NP (PRP us)) (PP (IN in) (S (VP (VBG extracting) (NP (DT the) (NN noun) (NNS phrases)) (PP (IN from) (NP (DT the) (NN tree) (WP stucture.))))))))))

Can someone please help me get the noun phrases, such as NP, NNP, NN, etc.? Do I need to use a separate NP chunker to get them, or is there a regex pattern that achieves the same thing?

Please help me on this.

Thanks in advance

Gouse.

Answer

icecream · Mar 23, 2013

The Parse object is a tree; you can use getParent(), getChildren(), and getType() to navigate it.

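    import java.util.ArrayList;
    import java.util.List;
    import opennlp.tools.parser.Parse;

    // Must be initialized, otherwise add() throws a NullPointerException.
    List<Parse> nounPhrases = new ArrayList<>();

    // Recursively collects every node labeled "NP" (nested NPs included).
    public void getNounPhrases(Parse p) {
        if (p.getType().equals("NP")) {
            nounPhrases.add(p);
        }
        for (Parse child : p.getChildren()) {
            getNounPhrases(child);
        }
    }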
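
On the regex part of the question: the bracketed output is nested, so a single regular expression is a poor fit, and walking the tree as above is usually simpler. To try the same recursion without loading a model, here is a rough, standard-library-only sketch that reads a bracketed string like the one in the question and collects the text under every NP node. The class BracketedNounPhrases and its Node type are hypothetical helpers made up for illustration, not part of OpenNLP. It assumes the string is well formed and that tokens contain no spaces or raw parentheses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of OpenNLP): collects NP text from a bracketed parse string. */
class BracketedNounPhrases {

    /** Minimal tree node: a label such as "NP" or "NN", plus children or a leaf token. */
    static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();
        String token; // set only for leaves such as (NN tree)

        Node(String type) { this.type = type; }

        /** Joins the leaf tokens under this node with single spaces. */
        String coveredText() {
            if (token != null) return token;
            StringBuilder sb = new StringBuilder();
            for (Node c : children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(c.coveredText());
            }
            return sb.toString();
        }
    }

    private final String s;
    private int pos = 0;

    private BracketedNounPhrases(String s) { this.s = s; }

    /** Parses one "(LABEL child...)" or "(LABEL token)" group starting at pos. */
    private Node parseNode() {
        pos++; // skip '('
        int start = pos;
        while (s.charAt(pos) != ' ' && s.charAt(pos) != '(' && s.charAt(pos) != ')') pos++;
        Node node = new Node(s.substring(start, pos));
        while (s.charAt(pos) != ')') {
            if (s.charAt(pos) == ' ') {
                pos++; // skip separator
            } else if (s.charAt(pos) == '(') {
                node.children.add(parseNode()); // nested constituent
            } else {
                int t = pos; // leaf token
                while (s.charAt(pos) != ')' && s.charAt(pos) != ' ') pos++;
                node.token = s.substring(t, pos);
            }
        }
        pos++; // skip ')'
        return node;
    }

    /** Same recursion as the answer above, applied to the minimal Node type. */
    private static void collect(Node n, List<String> out) {
        if (n.type.equals("NP")) out.add(n.coveredText());
        for (Node c : n.children) collect(c, out);
    }

    /** Returns the text of every NP node, in pre-order (outer NPs before nested ones). */
    public static List<String> extract(String bracketed) {
        Node root = new BracketedNounPhrases(bracketed.trim()).parseNode();
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    public static void main(String[] args) {
        String tree = "(TOP (S (VP (VBG extracting) (NP (DT the) (NN noun) (NNS phrases)) "
                + "(PP (IN from) (NP (DT the) (NN tree))))))";
        System.out.println(extract(tree)); // prints [the noun phrases, the tree]
    }
}
```

Note that NP is the phrase-level label, while NN and NNP are part-of-speech tags on single tokens, so matching only "NP" returns whole phrases. When working with real Parse objects instead of this sketch, I believe getCoveredText() gives the text spanned by a node, but check that against the OpenNLP version you are using.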