What actually are PCDATA and CDATA?

nonopolarity · May 13, 2009 · Viewed 24.4k times

It seems that a loose definition of PCDATA and CDATA is:

  1. PCDATA is character data, but is to be parsed.
  2. CDATA is character data, and is not to be parsed.

But then someone told me that CDATA is actually parsed, or that PCDATA is actually not parsed, so it is a bit confusing. Does anyone know what the real deal is?

Update: I actually added the PCDATA definition on Wikipedia... so don't take that answer too seriously as that's only my rough understanding of it.

Answer

Ólafur Waage · May 13, 2009

From Wikipedia:

PCDATA

Simply speaking, PCDATA stands for Parsed Character Data. That means the characters are to be parsed by the XML, XHTML, or HTML parser (&lt; will be changed to <, <p> will be taken to mean a paragraph tag, etc.). Compare that with CDATA, where the characters are not to be parsed by the XML, XHTML, or HTML parser.
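
To make the PCDATA behaviour concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names (note, p) and the sample text are made up for illustration; they are not from the Wikipedia text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the content of <note> is PCDATA, so the parser interprets it.
doc = "<note>1 &lt; 2 <p>hello</p></note>"

note = ET.fromstring(doc)

# The entity reference &lt; has been replaced with the literal character.
leading_text = note.text

# <p> was recognised as markup and became a child element, not text.
child_tags = [child.tag for child in note]

print(repr(leading_text))
print(child_tags)
```

The text before the child element comes back with a real < in it, and p shows up as an element rather than as characters, which is what "parsed" means here.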

CDATA

The term CDATA, meaning character data, is used for distinct, but related purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.
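
In XML, one common use of the term is the CDATA section, written <![CDATA[ ... ]]>. As a rough sketch (again with Python's standard xml.etree.ElementTree, and with a made-up note element), the characters inside such a section are handed over as plain text instead of being interpreted as markup:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: entity-like and tag-like characters wrapped in a CDATA section.
doc = "<note><![CDATA[1 &lt; 2 <p>hello</p>]]></note>"

note = ET.fromstring(doc)

# Inside the CDATA section, &lt; and <p> are kept as literal characters.
cdata_text = note.text

# No child elements are created, because nothing inside was treated as markup.
cdata_children = [child.tag for child in note]

print(repr(cdata_text))
print(cdata_children)
```

The parser still has to scan the section to find the closing ]]>, which may be why people sometimes say CDATA is "parsed" in a loose sense, but it does not expand entity references or recognise tags inside it.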