Coreference resolution in Python NLTK using Stanford CoreNLP

Riken Shah · Sep 9, 2016

Stanford CoreNLP provides coreference resolution, and a couple of existing threads give some insight into its implementation in Java.

However, I am using Python and NLTK, and I am not sure how to use CoreNLP's coreference resolution from my Python code. I have been able to set up StanfordParser in NLTK; this is my code so far:

from nltk.parse.stanford import StanfordDependencyParser

# Directory that holds the Stanford parser distribution
stanford_parser_dir = 'stanford-parser/'
# English model and the jar files that the NLTK wrapper needs
eng_model_path = stanford_parser_dir + "stanford-parser-models/edu/stanford/nlp/models/lexparser/englishRNN.ser.gz"
my_path_to_models_jar = stanford_parser_dir + "stanford-parser-3.5.2-models.jar"
my_path_to_jar = stanford_parser_dir + "stanford-parser.jar"

How can I use CoreNLP's coreference resolution in Python?

Answer

Deesha · Mar 16, 2017

As mentioned by @Igor, you can try the Python wrapper implemented in this GitHub repo: https://github.com/dasmith/stanford-corenlp-python

This repo contains two main files: corenlp.py and client.py.

Make the following changes to get CoreNLP working:

  1. In corenlp.py, change the path of the CoreNLP folder. Set it to the location of the CoreNLP folder on your local machine, at line 144 of corenlp.py:

    if not corenlp_path: corenlp_path = <path to the corenlp file>

  2. The jar file version number in corenlp.py may differ from yours. Set it to match the CoreNLP version that you have, at line 135 of corenlp.py:

    jars = ["stanford-corenlp-3.4.1.jar", "stanford-corenlp-3.4.1-models.jar", "joda-time.jar", "xom.jar", "jollyday.jar"]

In this list, replace 3.4.1 with the version of the jars that you have downloaded.

  3. Run the command:

    python corenlp.py

This will start a server.

  4. Now run the main client program:

    python client.py

This returns a dictionary, and you can access the coreference output using 'coref' as the key.

For example, for the input "John is a Computer Scientist. He likes coding.":

{
     "coref": [[[["a Computer Scientist", 0, 4, 2, 5], ["John", 0, 0, 0, 1]], [["He", 1, 0, 0, 1], ["John", 0, 0, 0, 1]]]]
}
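
The nested list above can be hard to read at a glance. Here is a minimal sketch of one way to flatten it into mention/antecedent pairs. It assumes, based only on the sample output shown, that the 'coref' value is a list of chains, each chain is a list of [mention, antecedent] pairs, and each entry starts with the text followed by the sentence index. The helper name coref_pairs is hypothetical and is not part of the wrapper.

```python
def coref_pairs(result):
    """Flatten the 'coref' value into a list of mention/antecedent dicts.

    Assumed layout (inferred from the sample output, not from the wrapper's
    documentation): each entry is [text, sentence_index, ...other indices].
    """
    pairs = []
    for chain in result.get("coref", []):
        for mention, antecedent in chain:
            pairs.append({
                "mention": mention[0],
                "mention_sentence": mention[1],
                "antecedent": antecedent[0],
                "antecedent_sentence": antecedent[1],
            })
    return pairs


# The sample dictionary from the answer above.
result = {
    "coref": [[[["a Computer Scientist", 0, 4, 2, 5], ["John", 0, 0, 0, 1]],
               [["He", 1, 0, 0, 1], ["John", 0, 0, 0, 1]]]]
}

for pair in coref_pairs(result):
    print("%s (sentence %d) -> %s (sentence %d)" % (
        pair["mention"], pair["mention_sentence"],
        pair["antecedent"], pair["antecedent_sentence"]))
```

With the sample input, both "a Computer Scientist" and "He" are reported as referring back to "John" in sentence 0.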

I have tried this on Ubuntu 16.04. Use Java version 7 or 8.
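
Since a mismatched jar version is an easy mistake in step 2, it may help to check the jar names before starting the server. The sketch below builds the same list as in corenlp.py for a given version and reports which files are missing from a folder. The function names expected_jars and missing_jars and the example path are hypothetical; only the jar file names come from the list shown in step 2.

```python
import os


def expected_jars(version):
    """Return the jar names from step 2 with the given CoreNLP version filled in."""
    return [
        "stanford-corenlp-%s.jar" % version,
        "stanford-corenlp-%s-models.jar" % version,
        "joda-time.jar",
        "xom.jar",
        "jollyday.jar",
    ]


def missing_jars(corenlp_path, version):
    """Return the expected jar names that are not present in corenlp_path."""
    return [name for name in expected_jars(version)
            if not os.path.isfile(os.path.join(corenlp_path, name))]


# Hypothetical folder name; substitute the path you set in step 1.
for name in missing_jars("path/to/corenlp", "3.4.1"):
    print("missing: " + name)
```

If nothing is printed, all five jars were found under the given folder and version.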