Python urllib2 Response header

looter · Oct 31, 2009 · Viewed 55.8k times

I'm trying to extract the response headers of a URL request. When I use Firebug to inspect the response, it shows:

Content-Type text/html

However, when I use the Python code:

urllib2.urlopen(URL).info()

the output shows:

Content-Type: video/x-flv

I am new to Python, and to web programming in general; any helpful insight is much appreciated. Also, if more info is needed, please let me know.

Thanks in advance for reading this post.

Answer

qingbo · Mar 26, 2010

Try making the request the way Firefox does. The server may be choosing its response based on the request headers, and you can see the headers Firefox sends in Firebug, so add them to your request object:

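import urllib2

# Build a request object so custom headers can be attached.
request = urllib2.Request('http://your.tld/...')
# Copy the header values Firebug shows for the browser's own request.
request.add_header('User-Agent', 'some fake agent string')
request.add_header('Referer', 'fake referrer')
# ... add any other request headers Firebug lists
response = urllib2.urlopen(request)
# check content type:
print response.info().getheader('Content-Type')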
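Before sending anything, it can help to list the headers the request object will carry and compare them with what Firebug shows. The following is only a sketch: the helper name `build_browser_like_request`, the `example.com` URL, and the header values are placeholders, not part of the original answer, and the import fallback to `urllib.request` assumes you may be running Python 3, where `urllib2` no longer exists.

```python
try:
    import urllib2 as urlrequest         # Python 2, as in the answer
except ImportError:
    import urllib.request as urlrequest  # Python 3 equivalent (assumption)


def build_browser_like_request(url, browser_headers):
    """Return a Request carrying the headers copied from the browser."""
    request = urlrequest.Request(url)
    for name, value in browser_headers.items():
        request.add_header(name, value)
    return request


# Placeholder URL and values; copy the real ones from Firebug.
request = build_browser_like_request(
    'http://example.com/video',
    {'User-Agent': 'some fake agent string', 'Referer': 'fake referrer'},
)

# List what will be sent, to compare against the browser's request.
for name, value in sorted(request.header_items()):
    print('%s: %s' % (name, value))
```

The names are printed in the capitalized form the library stores internally (for example `User-agent`); HTTP header names are case-insensitive, so this does not change what the server sees.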

There's also HTTPCookieProcessor, which adds cookie handling so the request looks even more like a browser's, but I don't think you'll need it in most cases. Have a look at Python's documentation:

http://docs.python.org/library/urllib2.html
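If cookies do turn out to matter, a minimal sketch of the HTTPCookieProcessor approach might look like the following. The helper name `build_cookie_opener` is hypothetical, and the fallback imports assume Python 3's `urllib.request` and `http.cookiejar` as stand-ins for `urllib2` and `cookielib`.

```python
try:
    import urllib2 as urlrequest          # Python 2
    import cookielib as cookiejar
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalents (assumption)
    import http.cookiejar as cookiejar


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = cookiejar.CookieJar()
    opener = urlrequest.build_opener(urlrequest.HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = build_cookie_opener()
# opener.open(request) would then take the place of urlopen(request);
# cookies set by the server are kept in jar and sent on later requests.
```

Keeping the jar as a separate object makes it easy to inspect which cookies the server set after the first response.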