GitHub Markdown - rendering code blocks with XML & HTML - kramdown vs redcarpet

Leah Wasser · Jan 6, 2016

I'm running into some unusual errors on my GitHub Pages site.

NOTE: I am using Kramdown + highlight.js to highlight code.

Below is a lesson that I am working on. In this lesson, I need to render EML (similar to XML) in code blocks. I also want to render HTML in code blocks. For example, I need a code block that explains how to add an include file:

{% include _toc.html %}

Case in point - the lesson below: http://neoninc.github.io/NEON-DataSkills-Lesson-Development/R/EML/

GitHub URL: https://github.com/NEONInc/NEON-DataSkills-Lesson-Development/blob/gh-pages/_posts/EML/2015-12-12-Intro-to-EML.md

Oddly enough, when Jekyll builds the page, it interprets the XML and HTML inside the code blocks as markup instead of displaying it as code.

I have tried:

<code>
<creator>
      <individualName>
        <givenName>Emery</givenName>
        <surName>Boose</surName>
      </individualName>
</creator>
</code>

This renders the closing </code> tag on the page, which is weird. If I don't use the code tag, then Jekyll attempts to parse the XML.

<pre><code class="xml">
<creator>
          <individualName>
            <givenName>Emery</givenName>
            <surName>Boose</surName>
          </individualName>
</creator>
</code></pre>

This still attempts to parse the XML as HTML tags.

I have tried switching to redcarpet. However, redcarpet breaks my breakout boxes, which I code as follows, using {: .notice } to apply a style to the div:

<i class="fa fa-star"></i> **Data Tip:**  To figure out the full slot string, 
in `RStudio` we can use Tab Complete as we type.

FINALLY -- my workflow is knitting from RMD to md, so adding custom code to each code block (pre and code) is problematic when I build.

In short - I'm having a hard time finding a solution that

  1. Allows code blocks with HTML / XML
  2. Allows me to apply a class/id to my breakout divs ({: .notice })

Thank you very much for any feedback / guidance with this issue. Leah

Answer

Waylan · Jan 7, 2016

If you want to represent XML/HTML in code blocks there are a few ways to do it.

The first and most basic (easy and works with every implementation) is to simply indent your code block by at least four spaces:

    <creator>
        <individualName>
            <givenName>Emery</givenName>
            <surName>Boose</surName>
        </individualName>
    </creator>

Note that no <pre> or <code> tags were used. In almost any implementation, this will render as:

<pre><code>&lt;creator&gt;
    &lt;individualName&gt;
        &lt;givenName&gt;Emery&lt;/givenName&gt;
        &lt;surName&gt;Boose&lt;/surName&gt;
    &lt;/individualName&gt;
&lt;/creator&gt;
</code></pre>

Notice that the XML is escaped (angle brackets are replaced with HTML entities) so that the browser will not interpret it as HTML, yet it will still display properly for the reader.
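To make the escaping concrete, here is a minimal Python sketch (Python is only used for illustration; it is not part of the Jekyll or Kramdown toolchain). It uses the standard library's `html.escape` to mimic what a Markdown processor does to the content of a code block, and `html.unescape` to mimic what the browser shows the reader. The `snippet` variable is just an example value.

```python
import html

snippet = "<givenName>Emery</givenName>"

# Replace &, < and > with HTML entities, roughly as a Markdown
# processor does for the content of a code block.
escaped = html.escape(snippet, quote=False)
print(escaped)

# The browser decodes the entities again when it displays the text,
# so the reader sees the original XML.
displayed = html.unescape(escaped)
print(displayed)
```

The first line printed is the entity-encoded form that ends up in the page source; the second is identical to the original snippet.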

As Highlight.js has automatic language detection, this should be sufficient for most code blocks. If you are concerned that Highlight.js will be confused by the escaped XML, don't be: Highlight.js actually expects to get escaped code like this. It would more likely be confused if given unescaped XML. However, if you want to explicitly define the language, you still have a couple of options.

If you want to create your own code block using raw HTML (perhaps to assign a class for syntax highlighting), then you need to manually escape the XML yourself. Insert the text below directly into your Markdown document:

<pre><code class="xml">&lt;creator&gt;
    &lt;individualName&gt;
        &lt;givenName&gt;Emery&lt;/givenName&gt;
        &lt;surName&gt;Boose&lt;/surName&gt;
    &lt;/individualName&gt;
&lt;/creator&gt;
</code></pre>

Notice in this case the code block is not indented as it is raw HTML and should be interpreted as-is. Of course, it is not very convenient to manually escape the XML like that every time.
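If you do go the raw HTML route, the escaping could be scripted rather than done by hand. The following is only a sketch under that assumption: `xml_code_block` is a hypothetical helper name, not something provided by Jekyll, Kramdown, or Highlight.js, and it relies solely on Python's standard `html.escape`.

```python
import html


def xml_code_block(source: str, language: str = "xml") -> str:
    """Return a raw HTML code block with the source escaped for display.

    Hypothetical helper for illustration only.
    """
    # quote=False leaves quotation marks alone; only &, < and > are replaced.
    escaped = html.escape(source.strip("\n"), quote=False)
    return f'<pre><code class="{language}">{escaped}\n</code></pre>'


example = "<creator>\n    <surName>Boose</surName>\n</creator>"
print(xml_code_block(example))
```

The printed block can be pasted into a Markdown document unindented, in the same way as the hand-escaped example above.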

As an alternative, some Markdown implementations have included an add-on called "fenced code blocks" which do not require the indentation but allow the assignment of a class to the code block and provide automated escaping. As it turns out, Kramdown supports this feature. To use it, do this:

~~~ xml
<creator>
    <individualName>
        <givenName>Emery</givenName>
        <surName>Boose</surName>
    </individualName>
</creator>
~~~

And Kramdown will output this:

<pre><code class="language-xml">&lt;creator&gt;
    &lt;individualName&gt;
        &lt;givenName&gt;Emery&lt;/givenName&gt;
        &lt;surName&gt;Boose&lt;/surName&gt;
    &lt;/individualName&gt;
&lt;/creator&gt;
</code></pre>

Note that the XML is properly escaped and a class is assigned identifying the language. While Kramdown prepends language- to the class name, Highlight.js understands this just fine (in fact, this is the format recommended by the HTML5 spec). As its docs state:

Classes can also be prefixed with either language- or lang-.

As a side note, at first I was thinking that Kramdown did not support "fenced code blocks," but that was because I tried the three back-ticks syntax (``` as opposed to ~~~) popularized by GitHub. However, Kramdown only supports the older, original tilde syntax first discussed on the Markdown mailing list and introduced simultaneously by Python Markdown and PHP Markdown Extra many years ago. As far as I know, Kramdown is the only implementation that still uses only the older tilde syntax (for example, modern versions of both the Python and PHP implementations support both tildes and back-ticks).
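If the Markdown you feed to Kramdown is generated by another tool and arrives with back-tick fences (an assumption about your knitting step, not something I have verified), one option is a small post-processing pass that rewrites the fences to tildes. The sketch below is a hypothetical helper, `backticks_to_tildes`, written in plain Python with only the `re` module. It assumes fences start at the beginning of a line and that no code block contains a line consisting solely of a fence marker.

```python
import re

# Matches a line that opens or closes a back-tick fence,
# optionally followed by a language name such as "xml" or "r".
FENCE = re.compile(r"^```[ \t]*([\w+-]*)[ \t]*$")


def backticks_to_tildes(markdown: str) -> str:
    """Rewrite back-tick fence lines as tilde fence lines, keeping the language."""
    out = []
    for line in markdown.split("\n"):
        match = FENCE.match(line)
        if match:
            language = match.group(1)
            # "~~~ xml" for an opening fence with a language, "~~~" otherwise.
            out.append(f"~~~ {language}".rstrip())
        else:
            out.append(line)
    return "\n".join(out)


sample = "Some text\n```xml\n<creator/>\n```\nMore text"
print(backticks_to_tildes(sample))
```

Lines that are not fence markers pass through untouched, so the XML inside the block is left for Kramdown to escape.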