I have been trying to find how to get the dependency tree with spaCy but I can't find anything on how to get the tree, only on how to navigate the tree.
In case someone wants to easily view the dependency tree produced by spacy, one solution would be to convert it to an nltk.tree.Tree
and use the nltk.tree.Tree.pretty_print
method. Here is an example:
import spacy
from nltk import Tree
en_nlp = spacy.load('en')
doc = en_nlp("The quick brown fox jumps over the lazy dog.")
def to_nltk_tree(node):
if node.n_lefts + node.n_rights > 0:
return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
else:
return node.orth_
[to_nltk_tree(sent.root).pretty_print() for sent in doc.sents]
Output:
jumps
________________|____________
| | | | | over
| | | | | |
| | | | | dog
| | | | | ___|____
The quick brown fox . the lazy
Edit: For changing the token representation you can do this:
def tok_format(tok):
return "_".join([tok.orth_, tok.tag_])
def to_nltk_tree(node):
if node.n_lefts + node.n_rights > 0:
return Tree(tok_format(node), [to_nltk_tree(child) for child in node.children])
else:
return tok_format(node)
Which results in:
jumps_VBZ
__________________________|___________________
| | | | | over_IN
| | | | | |
| | | | | dog_NN
| | | | | _______|_______
The_DT quick_JJ brown_JJ fox_NN ._. the_DT lazy_JJ