I'm trying to learn NLTK (the Natural Language Toolkit, written in Python), and I want to install a sample data set to run some examples.
My web connection uses a proxy server, and I'm trying to specify the proxy address as follows:
>>> nltk.set_proxy('http://proxy.example.com:3128' ('USERNAME', 'PASSWORD'))
>>> nltk.download()
But I get an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
I decided to set up a ProxyBasicAuthHandler before calling nltk.download():
import urllib2
auth_handler = urllib2.ProxyBasicAuthHandler(urllib2.HTTPPasswordMgrWithDefaultRealm())
auth_handler.add_password(realm=None, uri='http://proxy.example.com:3128/', user='USERNAME', passwd='PASSWORD')
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)
import nltk
nltk.download()
But now I get HTTP Error 407 - Proxy Authentication Required.
The documentation says that if the proxy is set to None, this function will attempt to detect the system proxy. But it isn't working.
How can I install a sample data set for NLTK?
The website where you got the code for your first attempt contains an error (I have seen that same error).
The line in error is:
nltk.set_proxy('http://proxy.example.com:3128' ('USERNAME', 'PASSWORD'))
You need a comma to separate the arguments. The correct line should be:
nltk.set_proxy('http://proxy.example.com:3128', ('USERNAME', 'PASSWORD'))
This will work just fine.
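To see why the missing comma produces that particular TypeError, here is a minimal sketch that needs neither NLTK nor a proxy. Without the comma, Python parses the string followed by a parenthesized tuple as a call on the string itself. The names `proxy`, `credentials`, and `show_args` are hypothetical and exist only for this illustration; `show_args` is a stand-in that returns its arguments, not the real `nltk.set_proxy`.

```python
# Minimal sketch: why the missing comma raises TypeError.
proxy = 'http://proxy.example.com:3128'
credentials = ('USERNAME', 'PASSWORD')

# Without the comma, 'http://...' ('USERNAME', 'PASSWORD') is parsed as
# calling the string with two arguments, equivalent to proxy(*credentials).
try:
    proxy(*credentials)
    message = None
except TypeError as exc:
    message = str(exc)

def show_args(*args):
    # Hypothetical stand-in for nltk.set_proxy; it only returns what it received.
    return args

# With the comma, the address and the credentials are two separate arguments.
received = show_args(proxy, credentials)

print(message)        # the same message as in the traceback from the question
print(len(received))  # number of arguments actually passed
```

So the traceback has nothing to do with the proxy itself: the call fails before `set_proxy` is ever reached.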