Validating URLs in Python

mp94 picture mp94 · Mar 7, 2014 · Viewed 48k times · Source

I've been trying to figure out what the best way to validate a URL is (specifically in Python) but haven't really been able to find an answer. It seems like there isn't one known way to validate a URL, and it depends on what URLs you think you may need to validate. As well, I found it difficult to find an easy to read standard for URL structure. I did find the RFCs 3986 and 3987, but they contain much more than just how it is structured.

Am I missing something, or is there no one standard way to validate a URL?

Answer

bgschiller picture bgschiller · Mar 7, 2014

This looks like it might be a duplicate of How do you validate a URL with a regular expression in Python?

You should be able to use the urlparse library described there.

>>> from urllib.parse import urlparse # python2: from urlparse import urlparse
>>> urlparse('actually not a url')
ParseResult(scheme='', netloc='', path='actually not a url', params='', query='', fragment='')
>>> urlparse('http://google.com')
ParseResult(scheme='http', netloc='google.com', path='', params='', query='', fragment='')

call urlparse on the string you want to check and then make sure that the ParseResult has attributes for scheme and netloc