Programmatically install NLTK corpora / models, i.e. without the GUI downloader?

Bluu · Apr 30, 2011 · Viewed 35.9k times

My project uses NLTK. How can I list the project's corpus and model requirements so they can be installed automatically? I don't want to click through the nltk.download() GUI, installing packages one by one.

Also, any way to freeze that same list of requirements (like pip freeze)?

Answer

burgersmoke · May 19, 2011

The NLTK site lists a command-line interface for downloading packages and collections at the bottom of this page:

http://www.nltk.org/data

The command-line usage varies depending on which version of Python you are using, but on my Python 2.6 install I noticed I was missing the 'spanish_grammars' model, and this worked fine:

python -m nltk.downloader spanish_grammars

You mention listing the project's corpus and model requirements. I'm not sure of a way to do that automatically, but I figured I would at least share this.
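
For the requirements-list part of the question, here is a minimal sketch of one possible approach, not an official NLTK feature. It assumes Python 3 and a hand-maintained plain-text file, here given the hypothetical name nltk_requirements.txt, with one package id per line (the same ids the command-line downloader accepts, e.g. spanish_grammars). The helper names parse_requirements and install_requirements are made up for illustration; the only NLTK call used is nltk.download().

```python
from pathlib import Path


def parse_requirements(text):
    """Return the package ids listed in text, one per line.

    Blank lines and '#' comments are ignored, and duplicates are dropped
    while keeping first-seen order.
    """
    ids = []
    for line in text.splitlines():
        package_id = line.split("#", 1)[0].strip()
        if package_id and package_id not in ids:
            ids.append(package_id)
    return ids


def install_requirements(path="nltk_requirements.txt"):
    """Download every package id listed in the file at path.

    The default file name is an assumption for this sketch, not an NLTK
    convention. Returns the list of ids whose download did not succeed.
    """
    import nltk  # imported here so the parser works without NLTK installed

    failed = []
    for package_id in parse_requirements(Path(path).read_text()):
        # nltk.download() returns a truthy value when the download succeeds.
        if not nltk.download(package_id, quiet=True):
            failed.append(package_id)
    return failed
```

Calling install_requirements() from a setup or bootstrap script would then fetch everything in the file without opening the GUI, and the returned list shows which ids need a second look. This does not cover the "freeze" half of the question; the file still has to be written by hand.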