NLTK - how to find out what corpora are installed from within python?

Rafael S. Calsaverini · Dec 14, 2009

I'm trying to load some corpora I installed with the NLTK installer, but I get an ImportError:

>>> from nltk.corpus import machado
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name machado

But in the download manager (nltk.download()) the machado package is marked as installed, and I have an nltk_data/corpus/machado folder.

How can I see, from inside the Python interpreter, which corpora are installed?

Also, what package should I install to work with this how-to? http://nltk.googlecode.com/svn/trunk/doc/howto/portuguese_en.html

I can't find the module nltk.examples referred to in the how-to.

Answer

Hank Gay · Dec 14, 2009

Try:

import nltk.corpus
dir(nltk.corpus)

The first call will probably show you something about __LazyModule__ rather than the real contents. If so, run dir(nltk.corpus) a second time to get the actual list of names.

If that doesn't work, try tab completion in IPython.
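
Another way to check, independent of what dir() reports, is to look at the data directories on disk. The following is only a sketch: list_installed_corpora is a hypothetical helper, not part of NLTK, and it assumes the downloader stores corpora as folders or .zip files in a "corpora" subdirectory of each entry in nltk.data.path. If your layout differs (the question mentions nltk_data/corpus), pass a different subdir.

```python
import os


def list_installed_corpora(data_dirs, subdir="corpora"):
    """Hypothetical helper: collect names found under <data_dir>/<subdir>.

    Assumes each corpus is either a folder or a .zip archive there.
    """
    found = set()
    for base in data_dirs:
        corpora_dir = os.path.join(base, subdir)
        if not os.path.isdir(corpora_dir):
            continue  # this search path has no corpora folder
        for entry in os.listdir(corpora_dir):
            name, ext = os.path.splitext(entry)
            # Report "brown.zip" as "brown"; keep other entries unchanged
            found.add(name if ext == ".zip" else entry)
    return sorted(found)


try:
    import nltk.data
    search_dirs = nltk.data.path  # list of directories NLTK searches for data
except ImportError:
    # Fallback guess when NLTK itself is not importable
    search_dirs = [os.path.expanduser("~/nltk_data")]

print(list_installed_corpora(search_dirs))
```

If machado shows up in this listing but the import still fails, the data is probably on disk and the installed NLTK version may simply not define a reader for it under nltk.corpus.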