Illegal XML Characters When Declaring XML Variable In SQL

jackskis picture jackskis · Apr 8, 2016 · Viewed 8.4k times · Source

The Problem

I am attempting to declare an XML variable and and set it equal to this XML file within my SQL query like this: DECLARE @x XML = 'xml content here';. However, when I try to do this with my current XML file, I get this error message printed to my screen:

Msg 9420, Level 16, State 1, Line 1 XML parsing: line 1132, character 265, illegal xml character

Ways I've Already Tried To Solve It

  • I've tried to escape the single quotes by replacing all the single quotes in the XML file with two single quotes and then copying into the query. No luck.
  • I've tried taking out the ampersands, but that doesn't seem to work. Near where the error is detected by SQL, there don't seem to be any special characters. No luck again.

The Question

It seems to me that single quotes aren't the problem. I am having trouble finding exactly what the illegal character is, how I filter these characters out. Maybe importing the XML file from an external file and setting the contents equal to the variable instead of heard coding it will solve the issue. How would this work? I am also aware that single quotes have to be replaced by two single quotes. I did that, but I still have the exact same problem. Any help understanding the issue is appreciated.

System Parameters

  • MS SQL Server Management Studio
  • Windows Server 2012 R2 Standard

The Full Code

Here's the full code that I'm executing that's giving me the error:

DECLARE @x XML = ' copy xml file here... ';

With MyPersonCTE AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PersonID
          ,p.value('FirstName[1]','varchar(max)') AS FirstName
          ,p.value('LastName[1]','varchar(max)') AS LastName
          ,p.value('Biography[1]','varchar(max)') AS Biography
          ,p.value('Expertise[1]','varchar(max)') AS Expertise
          ,p.value('Image[1]','varchar(max)') AS Image
          ,p.value('Link[1]','varchar(max)') AS Link
          ,p.query('Books') AS BookNode
          ,p.query('Articles') AS ArticleNode
          --same for Papers, Artwork...
    FROM @x.nodes('/People/Person') AS A(p) 
)
,MyBooksCTE AS
(
    SELECT MyPersonCTE.*
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS BookID
          ,x.value('Year[1]','int') AS BookYear
          ,x.value('Details[1]','varchar(max)') AS BookDetails
    FROM MyPersonCTE
    CROSS APPLY MyPersonCTE.BookNode.nodes('/Books/Book') A(x)  
)
,MyArticlesCTE AS
(
    SELECT MyPersonCTE.*
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ArticleID
          ,x.value('Year[1]','int') AS ArticleYear
          ,x.value('Details[1]','varchar(max)') AS ArticleDetails
    FROM MyPersonCTE
    CROSS APPLY MyPersonCTE.ArticleNode.nodes('/Articles/Article') A(x)  
)
--same for Papers, Artwork...
SELECT p.*
      ,b.BookID
      ,b.BookYear
      ,b.BookDetails
      ,a.ArticleID
      ,a.ArticleYear
      ,a.ArticleDetails  
INTO #tempAllData
FROM MyPersonCTE AS p
LEFT JOIN MyBooksCTE AS b ON p.PersonID=b.PersonID
LEFT JOIN MyArticlesCTE AS a ON p.PersonID=a.PersonID ;

--#tempAllData is now filled with all data, copied in all combination: much to much
--but DISTINCT is your friend
--in this case you'd use the PersonID as FK in all related tables

SELECT DISTINCT PersonID,FirstName,LastName,Biography,Expertise --other fields
FROM #tempAllData;

SELECT DISTINCT PersonID,BookID,BookYear,BookDetails
FROM #tempAllData;

SELECT DISTINCT PersonID,ArticleID,ArticleYear,ArticleDetails
FROM #tempAllData;

DROP TABLE #tempAllData;

Answer

Shnugo picture Shnugo · Apr 9, 2016

Look, this is a reduced example to reproduce the error

In this snippet you've got twice the name "O’Brien" where the ’ makes troubles And here “Experiencing physical warmth promotes interpersonal warmth” the opening and closing qoutes are problematic

DECLARE @x XML=
'<?xml version="1.0" encoding="UTF-8"?>
<People>
<Person>
    <FirstName>Katherine</FirstName>
    <LastName>Corker</LastName>
    <Articles>
                <Article>
            <Year></Year>
            <Details><![CDATA[<p>Corker, K. S., Lynott, D., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). High quality direct replications matter: Response to Williams (2014). Social Psychology, 45, 324-326. Available <a href="https://www.academia.edu/attachments/35015451/download_file?st=MTQxMzMwNDE0MiwxMzguMjguOS4xNjEsMjEzMTg%3D&s=work_strip">here.</a></p>
<p>Lynott, D., Corker, K. S., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). Replication of “Experiencing physical warmth promotes interpersonal warmth” by Williams & Bargh (2008, Science). Social Psychology, 45, 216-222. Available <a href="https://www.academia.edu/attachments/33247494/download_file?st=MTQxMzMwNDE0MiwxMzguMjguOS4xNjEsMjEzMTg%3D&s=work_strip">here.</a></p>]]></Details>
        </Article>
            </Articles>
</Person>
</People>';
SELECT @x;

And now just change this

'<?xml version="1.0" encoding="UTF-8"?>

to this (don't forget the "N" in the beginning to force this to Unicode)

N'<?xml version="1.0" encoding="UTF-16"?>

But just try this:

DECLARE @yourXML AS XML=
(
SELECT CONVERT(XML, BulkColumn,2) AS BulkColumn
FROM OPENROWSET(BULK 'X:\Path2file\faculty-xml.xml', SINGLE_BLOB) AS x
);
SELECT @yourXML;

With this you do not have to bother about encondings, you just read the file from a location on the disk...