How to parse a data URI in Python?

blueFast · Nov 23, 2015

HTML image elements have this simplified format:

<img src='something'>

That something can be a data URI, for example:

...

Is there a standard way of parsing this with Python, so that I get the content_type and the base64 data separately, or should I write my own parser for this?

Answer

JRodDynamite · Nov 23, 2015

Split the data URI on the first comma: the part before it is the header (which carries the content type), and the part after it is the base64-encoded data. Call base64.b64decode to decode that data to bytes. Finally, write the bytes to a file.

from base64 import b64decode

data_uri = "..."

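# Any Python version (needed on Python 2 and Python < 3.4)
# header is the part before the first comma, e.g. "data:image/png;base64"
header, encoded = data_uri.split(",", 1)
data = b64decode(encoded)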

# Python 3.4+ (urllib.request can open data: URLs directly)
# from urllib import request
# with request.urlopen(data_uri) as response:
#     data = response.read()

# The file name assumes the URI holds a PNG image
with open("image.png", "wb") as f:
    f.write(data)
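
The question also asks for the content type on its own, which the snippet above leaves buried in header. Below is a minimal sketch of one way to pull it out on Python 3. parse_data_uri is a hypothetical helper name, not a standard-library function. The sketch assumes the URI follows the usual data:[<mediatype>][;base64],<data> layout, and the sample URI at the end is made up for illustration.

```python
from base64 import b64decode
from urllib.parse import unquote_to_bytes


def parse_data_uri(data_uri):
    """Split a data URI into (content_type, data_bytes).

    Hypothetical helper for illustration; Python 3 only.
    """
    if not data_uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, e.g. "image/png;base64".
    header, payload = data_uri[len("data:"):].split(",", 1)
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[:-len(";base64")]
    # Default media type when the header omits it.
    content_type = header or "text/plain;charset=US-ASCII"
    # The payload may be percent-encoded, so unquote it in either case.
    raw = unquote_to_bytes(payload)
    data = b64decode(raw) if is_base64 else raw
    return content_type, data


# Made-up sample URI: base64 of the text "Hello".
content_type, data = parse_data_uri("data:text/plain;base64,SGVsbG8=")
print(content_type, data)
```

The returned content_type keeps any parameters such as charset, so it may need a further split on ";" if only the bare MIME type is wanted.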