Stanford Parser and NLTK

ThanaDaray picture ThanaDaray · Dec 14, 2012 · Viewed 95.1k times · Source

Is it possible to use Stanford Parser in NLTK? (I am not talking about Stanford POS.)

Answer

danger89 picture danger89 · Mar 8, 2014

Note that this answer applies to NLTK v 3.0, and not to more recent versions.

Sure, try the following in Python:

import os
from nltk.parse import stanford
os.environ['STANFORD_PARSER'] = '/path/to/standford/jars'
os.environ['STANFORD_MODELS'] = '/path/to/standford/jars'

parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz")
sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?"))
print sentences

# GUI
for line in sentences:
    for sentence in line:
        sentence.draw()

Output:

[Tree('ROOT', [Tree('S', [Tree('INTJ', [Tree('UH', ['Hello'])]), Tree(',', [',']), Tree('NP', [Tree('PRP$', ['My']), Tree('NN', ['name'])]), Tree('VP', [Tree('VBZ', ['is']), Tree('ADJP', [Tree('JJ', ['Melroy'])])]), Tree('.', ['.'])])]), Tree('ROOT', [Tree('SBARQ', [Tree('WHNP', [Tree('WP', ['What'])]), Tree('SQ', [Tree('VBZ', ['is']), Tree('NP', [Tree('PRP$', ['your']), Tree('NN', ['name'])])]), Tree('.', ['?'])])])]

Note 1: In this example both the parser & model jars are in the same folder.

Note 2:

  • File name of stanford parser is: stanford-parser.jar
  • File name of stanford models is: stanford-parser-x.x.x-models.jar

Note 3: The englishPCFG.ser.gz file can be found inside the models.jar file (/edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz). Please use come archive manager to 'unzip' the models.jar file.

Note 4: Be sure you are using Java JRE (Runtime Environment) 1.8 also known as Oracle JDK 8. Otherwise you will get: Unsupported major.minor version 52.0.

Installation

  1. Download NLTK v3 from: https://github.com/nltk/nltk. And install NLTK:

    sudo python setup.py install

  2. You can use the NLTK downloader to get Stanford Parser, using Python:

    import nltk
    nltk.download()
    
  3. Try my example! (don't forget the change the jar paths and change the model path to the ser.gz location)

OR:

  1. Download and install NLTK v3, same as above.

  2. Download the latest version from (current version filename is stanford-parser-full-2015-01-29.zip): http://nlp.stanford.edu/software/lex-parser.shtml#Download

  3. Extract the standford-parser-full-20xx-xx-xx.zip.

  4. Create a new folder ('jars' in my example). Place the extracted files into this jar folder: stanford-parser-3.x.x-models.jar and stanford-parser.jar.

    As shown above you can use the environment variables (STANFORD_PARSER & STANFORD_MODELS) to point to this 'jars' folder. I'm using Linux, so if you use Windows please use something like: C://folder//jars.

  5. Open the stanford-parser-3.x.x-models.jar using an Archive manager (7zip).

  6. Browse inside the jar file; edu/stanford/nlp/models/lexparser. Again, extract the file called 'englishPCFG.ser.gz'. Remember the location where you extract this ser.gz file.

  7. When creating a StanfordParser instance, you can provide the model path as parameter. This is the complete path to the model, in our case /location/of/englishPCFG.ser.gz.

  8. Try my example! (don't forget the change the jar paths and change the model path to the ser.gz location)