Python check if website exists

Question 1

Python check if website exists

python html urlopen

James Hallen · May 27, 2013 · Viewed 124.3k times · Source

Answer

Answer

You can use HEAD request instead of GET. It will only download the header, but not the content. Then you can check the response status from the headers.

For python 2.7.x, you can use httplib:

import httplib
c = httplib.HTTPConnection('www.example.com')
c.request("HEAD", '')
if c.getresponse().status == 200:
   print('web site exists')

or urllib2:

import urllib2
try:
    urllib2.urlopen('http://www.example.com/some_page')
except urllib2.HTTPError, e:
    print(e.code)
except urllib2.URLError, e:
    print(e.args)

or for 2.7 and 3.x, you can install requests

import requests
request = requests.get('http://www.example.com')
if request.status_code == 200:
    print('Web site exists')
else:
    print('Web site does not exist')

Question 2

I wanted to check if a certain website exists, this is what I'm doing:

user_agent = 'Mozilla/20.0.1 (compatible; MSIE 5.5; Windows NT)'
headers = { 'User-Agent':user_agent }
link = "http://www.abc.com"
req = urllib2.Request(link, headers = headers)
page = urllib2.urlopen(req).read() - ERROR 402 generated here!

If the page doesn't exist (error 402, or whatever other errors), what can I do in the page = ... line to make sure that the page I'm reading does exit?

Python check if website exists

Answer

Related questions