Convert HTML to data:text/html link using JavaScript

Web_Designer picture Web_Designer · Feb 11, 2012 · Viewed 68.1k times · Source

Here's my HTML:

<a>View it in your browser</a>
<div id="html">
    <h1>Doggies</h1>
    <p style="color:blue;">Kitties</p>
</div>

How do I use JavaScript to make the href attribute of my link point to a base64 encoded webpage whose source is the innerHTML of div#html?

I basically want to do the same conversion done here (with the base64 checkbox checked) except for in JavaScript.

Answer

Rob W picture Rob W · Feb 11, 2012

Characteristics of a data-URI

A data-URI with MIME-type text/html has to be in one of these formats:

data:text/html,<HTML HERE>
data:text/html;charset=UTF-8,<HTML HERE>

Base-64 encoding is not necessary. If your code contains non-ASCII characters, such as éé, charset=UTF-8 has to be added.

The following characters have to be escaped:

  • # - Firefox and Opera interpret this character as the marker of a hash (as in location.hash).
  • % - This character is used to escape characters. Escape this character to make sure that no side effects occur.

Additionally, if you want to embed the code in an anchor tag, the following characters should also be escaped:

  • " and/or ' - Quotes mark the value of the attribute.
  • & - The ampersand is used to mark HTML entities.
  • < and > do not have to be escaped inside a HTML attribute. However, if you're going to embed the link in the HTML, these should also be escaped (%3C and %3E)

JavaScript implementation

If you don't mind the size of the data-URI, the easiest method to do so is using encodeURIComponent:

var html = document.getElementById("html").innerHTML;
var dataURI = 'data:text/html,' + encodeURIComponent(html);

If size matters, you'd better strip out all consecutive white-space (this can safely be done, unless the HTML contains a <pre> element/style). Then, only replace the significant characters:

var html = document.getElementById("html").innerHTML;
html = html.replace(/\s{2,}/g, '')   // <-- Replace all consecutive spaces, 2+
           .replace(/%/g, '%25')     // <-- Escape %
           .replace(/&/g, '%26')     // <-- Escape &
           .replace(/#/g, '%23')     // <-- Escape #
           .replace(/"/g, '%22')     // <-- Escape "
           .replace(/'/g, '%27');    // <-- Escape ' (to be 100% safe)
var dataURI = 'data:text/html;charset=UTF-8,' + html;