How to get the dependency tree with spaCy?

Nicolas Joseph picture Nicolas Joseph · Apr 13, 2016 · Viewed 33.2k times · Source

I have been trying to find how to get the dependency tree with spaCy but I can't find anything on how to get the tree, only on how to navigate the tree.

Answer

Christos Baziotis picture Christos Baziotis · Aug 23, 2016

In case someone wants to easily view the dependency tree produced by spacy, one solution would be to convert it to an nltk.tree.Tree and use the nltk.tree.Tree.pretty_print method. Here is an example:

import spacy
from nltk import Tree


en_nlp = spacy.load('en')

doc = en_nlp("The quick brown fox jumps over the lazy dog.")

def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_


[to_nltk_tree(sent.root).pretty_print() for sent in doc.sents]

Output:

                jumps                  
  ________________|____________         
 |    |     |     |    |      over     
 |    |     |     |    |       |        
 |    |     |     |    |      dog      
 |    |     |     |    |    ___|____    
The quick brown  fox   .  the      lazy

Edit: For changing the token representation you can do this:

def tok_format(tok):
    return "_".join([tok.orth_, tok.tag_])


def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(tok_format(node), [to_nltk_tree(child) for child in node.children])
    else:
        return tok_format(node)

Which results in:

                         jumps_VBZ                           
   __________________________|___________________             
  |       |        |         |      |         over_IN        
  |       |        |         |      |            |            
  |       |        |         |      |          dog_NN        
  |       |        |         |      |     _______|_______     
The_DT quick_JJ brown_JJ   fox_NN  ._. the_DT         lazy_JJ