HTML image elements have this simplified format:
<img src='something'>
That something can be a data URI, for example:
...
Is there a standard way of parsing this with Python, so that I get the content type
and the base64 data separately, or should I write my own parser for this?
Split the data URI on the first comma to separate the header (which carries the content type) from the base64-encoded data. Call base64.b64decode
to decode that data to bytes. Finally, write the bytes to a file.
from base64 import b64decode
data_uri = "..."
# Python 2 and <Python 3.4
header, encoded = data_uri.split(",", 1)
data = b64decode(encoded)
# Python 3.4+
# from urllib import request
# with request.urlopen(data_uri) as response:
#     data = response.read()
with open("image.png", "wb") as f:
    f.write(data)
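Since the question also asks for the content type, here is a minimal Python 3 sketch that returns it alongside the decoded bytes. The helper name parse_data_uri is hypothetical (it is not a standard library function), and the sketch assumes the common data:[<media type>][;base64],<data> layout, falling back to text/plain when the media type is omitted. Any extra parameters such as ;charset=... are ignored.

```python
from base64 import b64decode
from urllib.parse import unquote_to_bytes


def parse_data_uri(data_uri):
    """Hypothetical helper: return (content_type, data_bytes) for a data URI."""
    if not data_uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header; the rest is the payload.
    header, sep, payload = data_uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma separating header and data")
    params = header.split(";")
    # Assumption: fall back to text/plain when no media type is given.
    content_type = params[0] or "text/plain"
    if "base64" in params[1:]:
        data = b64decode(payload)
    else:
        # Non-base64 payloads are percent-encoded.
        data = unquote_to_bytes(payload)
    return content_type, data


content_type, data = parse_data_uri("data:image/png;base64,iVBORw0KGgo=")
print(content_type)  # image/png
print(data)          # b'\x89PNG\r\n\x1a\n'
```

The returned bytes can be written to a file exactly as in the snippet above; the content type can be used to pick a file extension.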