Parse custom URIs with urlparse (Python)

Question 1

python url python-2.6 urlparse

u0b34a0f6ae · Sep 13, 2009 · Viewed 17.5k times · Source

Answer

You can also register a custom scheme with urlparse:

import urlparse

def register_scheme(scheme):
    # Append the scheme to every uses_* list in the urlparse module.
    for name in dir(urlparse):
        if name.startswith('uses_'):
            getattr(urlparse, name).append(scheme)

register_scheme('moose')

This appends your URL scheme to the following lists:

uses_fragment
uses_netloc
uses_params
uses_query
uses_relative

The URI will then be treated like an http URL, so urlparse correctly returns the path, fragment, username, password and so on.

urlparse.urlparse('moose://username:password@hostname:port/path?query=value#fragment')._asdict()
=> {'fragment': 'fragment', 'netloc': 'username:password@hostname:port', 'params': '', 'query': 'query=value', 'path': '/path', 'scheme': 'moose'}
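To get at the username, password and so on, you can read them as attributes of the parse result. Here is a minimal sketch of that, with a few assumptions of mine: the port 8080 is a made-up value (the `.port` attribute needs a numeric port, unlike the literal `port` placeholder above), the import fallback is only there so the snippet also runs on Python 3, and this `register_scheme` variant skips lists that already contain the scheme.

```python
try:
    import urlparse  # Python 2
except ImportError:
    import urllib.parse as urlparse  # Python 3 (assumption: same uses_* lists exist)


def register_scheme(scheme):
    # Add the scheme to every uses_* list, skipping lists that already have it.
    for name in dir(urlparse):
        if name.startswith('uses_'):
            schemes = getattr(urlparse, name)
            if scheme not in schemes:
                schemes.append(scheme)


register_scheme('moose')

# Port 8080 is a hypothetical value chosen for illustration.
result = urlparse.urlparse(
    'moose://username:password@hostname:8080/path?query=value#fragment')

print(result.username)  # username
print(result.password)  # password
print(result.hostname)  # hostname
print(result.port)      # 8080
```

Checking for an existing entry before appending keeps the lists from growing if `register_scheme` is called more than once for the same scheme.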

Question 2

My application creates custom URIs (or URLs?) to identify objects and resolve them. The problem is that Python's urlparse module refuses to parse unknown URL schemes like it parses http.

If I do not adjust urlparse's uses_* lists I get this:

>>> urlparse.urlparse("qqqq://base/id#hint")
('qqqq', '', '//base/id#hint', '', '', '')
>>> urlparse.urlparse("http://base/id#hint")
('http', 'base', '/id', '', '', 'hint')

Here is what I do, and I wonder if there is a better way to do it:

import urlparse

SCHEME = "qqqq"

# One would hope that there was a better way to do this
urlparse.uses_netloc.append(SCHEME)
urlparse.uses_fragment.append(SCHEME)

Why is there no better way to do this?
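One possible alternative that leaves the module-level uses_* lists alone is to swap the scheme for http just for the parse and then put the original scheme back. This is only a sketch of that workaround, not something urlparse provides: `split_custom_uri` is a hypothetical helper name, and it assumes the custom URI follows the same layout as an http URL.

```python
try:
    from urlparse import urlsplit  # Python 2
except ImportError:
    from urllib.parse import urlsplit  # Python 3


def split_custom_uri(uri):
    """Split `uri` as if it were an http URL, then restore its real scheme."""
    scheme, sep, rest = uri.partition(':')
    if not sep:
        raise ValueError('no scheme found in %r' % (uri,))
    # Parse with a scheme urlsplit already knows about.
    parts = urlsplit('http:' + rest)
    # The result is a named tuple, so _replace returns a copy with the scheme swapped back.
    return parts._replace(scheme=scheme)


parts = split_custom_uri('qqqq://base/id#hint')
print(tuple(parts))  # ('qqqq', 'base', '/id', '', 'hint')
```

The trade-off is that nothing global is modified, but every caller has to go through the helper instead of calling urlparse directly.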
