Are urllib2 and httplib thread safe?

Piotr Dobrogost · Apr 28, 2011 · Viewed 9.8k times

I'm looking for information on thread safety of urllib2 and httplib. The official documentation (http://docs.python.org/library/urllib2.html and http://docs.python.org/library/httplib.html) lacks any information on this subject; the word thread is not even mentioned there...

UPDATE

OK, they are not thread-safe out of the box. What is required to make them thread-safe, or is there a scenario in which they can be used safely from multiple threads? I'm asking because it seems that

  • using a separate OpenerDirector in each thread
  • not sharing an HTTP connection among threads

would suffice to safely use these libs in threads. A similar usage scenario was proposed in the question "urllib2 and cookielib thread safety".

Answer

Gregg · Apr 28, 2011

httplib and urllib2 are not thread-safe.

urllib2 does not provide serialized access to a global (shared) OpenerDirector object, which is used by urllib2.urlopen().

Similarly, httplib does not provide serialized access to HTTPConnection objects (i.e. by using a thread-safe connection pool), so sharing HTTPConnection objects between threads is not safe.

I suggest using httplib2 or urllib3 as an alternative if thread-safety is required.

Generally, if a module's documentation does not mention thread-safety, I would assume it is not thread-safe. You can look at the module's source code for verification.

When browsing the source code to determine whether a module is thread-safe, you can start by looking for uses of thread synchronization primitives from the threading or multiprocessing modules, or use of Queue.Queue (queue.Queue in Python 3).
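
As a rough first pass, that search can be automated. The sketch below is only a heuristic: the helper name find_sync_hints and the list of names it looks for are assumptions made for illustration. A match does not prove a module is thread-safe, and an empty result does not prove the opposite; it only tells you where to start reading.

```python
import inspect
import re

# Names that usually indicate explicit synchronization.
# This list is an assumption for illustration; extend it as needed.
_SYNC_PATTERN = re.compile(
    r"\b(threading|multiprocessing|Lock|RLock|Semaphore|Condition|Queue)\b"
)


def find_sync_hints(module):
    """Return the sorted, de-duplicated synchronization-related names
    that appear anywhere in the module's Python source."""
    # inspect.getsource() raises TypeError or OSError (IOError on Python 2)
    # when no Python source is available, e.g. for C extension modules.
    source = inspect.getsource(module)
    return sorted(set(_SYNC_PATTERN.findall(source)))


if __name__ == "__main__":
    try:
        import urllib2 as target          # Python 2
    except ImportError:
        import urllib.request as target   # Python 3 counterpart of urllib2
    print(find_sync_hints(target))
```

The regular expression also matches names inside comments and docstrings, so treat every hit as a pointer into the source rather than as evidence.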

UPDATE

Here is a relevant source code snippet from urllib2.py (Python 2.7.2):

_opener = None
def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    global _opener
    if _opener is None:
        _opener = build_opener()
    return _opener.open(url, data, timeout)

def install_opener(opener):
    global _opener
    _opener = opener

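The global _opener is read and written without any locking, so there is an obvious race condition when concurrent threads call install_opener() and urlopen(): a thread calling urlopen() may end up using a different opener than it expects, and two threads that both see _opener as None will each build their own.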

Also, note that calling urlopen() with a Request object as the url parameter may mutate the Request object (see the source for OpenerDirector.open()), so it is not safe to concurrently call urlopen() with a shared Request object.
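
One such mutation can be observed without any network access: OpenerDirector.open() stores its timeout argument on the Request it is given. The following is a minimal sketch, assuming a made-up "dummy" URL scheme and a hypothetical DummyHandler class (neither is part of urllib2); on Python 3 it falls back to urllib.request, the successor of urllib2.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 counterpart of urllib2


class DummyHandler(urllib2.BaseHandler):
    """Hypothetical handler for a made-up "dummy" scheme; does no network I/O."""

    def dummy_open(self, req):
        # Any non-empty value is accepted as the "response" here.
        return "response for %s" % req.get_full_url()


# A bare OpenerDirector with only our handler, so nothing else touches the request.
opener = urllib2.OpenerDirector()
opener.add_handler(DummyHandler())

shared = urllib2.Request("dummy://example/resource")

opener.open(shared, timeout=5)
timeout_after_first = shared.timeout    # set by OpenerDirector.open()

opener.open(shared, timeout=30)
timeout_after_second = shared.timeout   # overwritten by the second call

print("timeout stored on the shared Request: %s, then %s"
      % (timeout_after_first, timeout_after_second))
```

If two threads made these calls concurrently on the same Request, each could overwrite the other's timeout before its handler ran, which is why the Request should not be shared.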

All told, urlopen() is thread-safe if the following conditions are met:

  • install_opener() is not called from another thread.
  • A non-shared Request object, or a string, is used as the url parameter.
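
One way to stay within these conditions is to bypass the module-level opener entirely: give each thread its own OpenerDirector and build a fresh Request for every call. The following is a minimal sketch under that assumption, not a verified guarantee of thread safety; get_opener and fetch are hypothetical helper names, and on Python 3 the code falls back to urllib.request.

```python
import threading

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 counterpart of urllib2

_local = threading.local()  # attributes set on this object are per-thread


def get_opener():
    """Return an OpenerDirector private to the calling thread (hypothetical helper)."""
    opener = getattr(_local, "opener", None)
    if opener is None:
        opener = urllib2.build_opener()
        _local.opener = opener
    return opener


def fetch(url, timeout=10):
    """Hypothetical helper: fetch url with this thread's opener and a fresh Request."""
    request = urllib2.Request(url)  # created per call, never shared between threads
    response = get_opener().open(request, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()


# Usage (performs network I/O, so it is left commented out):
# workers = [threading.Thread(target=fetch, args=("http://example.com/",))
#            for _ in range(4)]
# for w in workers:
#     w.start()
# for w in workers:
#     w.join()
```

Because neither urlopen() nor install_opener() is called, the global _opener shown above is never read or written by this code.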