I often find this strange CDATA tag in XML files:
<![CDATA[some stuff]]>
I have observed that this CDATA tag always comes at the beginning, followed by some stuff.
Sometimes it is used and sometimes it is not. I assume it marks that some stuff is the "data" that will be inserted after it. But what kind of data is some stuff? Isn't anything I write in XML tags some sort of data?
CDATA stands for Character Data. It means that the text between the opening <![CDATA[ and the closing ]]> contains characters that could be interpreted as XML markup, but should not be.
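To make this concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element name note and the sample text are made up for illustration. It suggests that a CDATA section is just another way of writing text that would otherwise need entity escapes:

```python
import xml.etree.ElementTree as ET

# The same text written two ways: inside a CDATA section and with entity escapes.
# The <note> element and its content are illustrative assumptions.
with_cdata = "<note><![CDATA[if (a < b && b > c) { <tag> }]]></note>"
with_escapes = "<note>if (a &lt; b &amp;&amp; b &gt; c) { &lt;tag&gt; }</note>"

text_from_cdata = ET.fromstring(with_cdata).text
text_from_escapes = ET.fromstring(with_escapes).text

print(text_from_cdata)                        # the raw text, markup characters intact
print(text_from_cdata == text_from_escapes)   # both spellings give the same text
```

With this parser, both documents yield the same text content, so whether to use CDATA is mostly a question of readability for whoever writes the XML.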
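The key differences between CDATA and comments are:
- CDATA content is part of the document's character data, while a comment is not.
- In CDATA you cannot include the string ]]> (CDEnd), while in a comment -- is invalid.
This means given these four snippets of XML from one well-formed document: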
<!ENTITY % MyParamEntity "Has been expanded">
<!--
Within this comment I can use ]]>
and other reserved characters like <,
&, ', and ", but %MyParamEntity; will not be expanded
(if I retrieve the text of this node it will contain
%MyParamEntity; and not "Has been expanded")
and I can't place two dashes next to each other.
-->
<![CDATA[
Within this Character Data block I can
use double dashes as much as I want (along with <, &, ', and ")
and %MyParamEntity; stays literal here too (parameter entity
references are only recognized in the DTD) ... however, I can't use
the CEND sequence. If I need to use CEND I must escape one of the
brackets or the greater-than sign using concatenated CDATA sections.
]]>
<description>An example of escaped CENDs</description>
<!-- This text contains a CEND ]]> -->
<!-- In this first case we put the ]] at the end of the first CDATA block
and the > in the second CDATA block -->
<data><![CDATA[This text contains a CEND ]]]]><![CDATA[>]]></data>
<!-- In this second case we put a ] at the end of the first CDATA block
and the ]> in the second CDATA block -->
<alternative><![CDATA[This text contains a CEND ]]]><![CDATA[]>]]></alternative>
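The first splitting strategy above (leave ]] at the end of one section and start the next section with >) can be sketched in Python. The helper name to_cdata is hypothetical, not part of any library, and the round trip is checked with the standard xml.etree.ElementTree parser:

```python
import xml.etree.ElementTree as ET

CDATA_START = "<![CDATA["
CDATA_END = "]]>"

def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every ]]> across two sections.

    Hypothetical helper for illustration only.
    """
    # Each "]]>" becomes "]]" + end of section + start of a new section + ">".
    safe = text.replace(CDATA_END, "]]" + CDATA_END + CDATA_START + ">")
    return CDATA_START + safe + CDATA_END

original = "This text contains a CEND ]]>"
xml_doc = "<data>" + to_cdata(original) + "</data>"

# The parser joins the adjacent CDATA sections back into one text value.
parsed_text = ET.fromstring(xml_doc).text

print(xml_doc)
print(parsed_text == original)
```

Under these assumptions the generated markup matches the data element shown above, and parsing it gives back the original string, including the ]]> sequence.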