The Problem
I am attempting to declare an XML variable and and set it equal to this XML file within my SQL query like this: DECLARE @x XML = 'xml content here';
.
However, when I try to do this with my current XML file, I get this error message printed to my screen:
Msg 9420, Level 16, State 1, Line 1
XML parsing: line 1132, character 265, illegal xml character
Ways I've Already Tried To Solve It
The Question
It seems to me that single quotes aren't the problem. I am having trouble finding exactly what the illegal character is, how I filter these characters out. Maybe importing the XML file from an external file and setting the contents equal to the variable instead of heard coding it will solve the issue. How would this work? I am also aware that single quotes have to be replaced by two single quotes. I did that, but I still have the exact same problem. Any help understanding the issue is appreciated.
System Parameters
The Full Code
Here's the full code that I'm executing that's giving me the error:
DECLARE @x XML = ' copy xml file here... ';
With MyPersonCTE AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PersonID
,p.value('FirstName[1]','varchar(max)') AS FirstName
,p.value('LastName[1]','varchar(max)') AS LastName
,p.value('Biography[1]','varchar(max)') AS Biography
,p.value('Expertise[1]','varchar(max)') AS Expertise
,p.value('Image[1]','varchar(max)') AS Image
,p.value('Link[1]','varchar(max)') AS Link
,p.query('Books') AS BookNode
,p.query('Articles') AS ArticleNode
--same for Papers, Artwork...
FROM @x.nodes('/People/Person') AS A(p)
)
,MyBooksCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS BookID
,x.value('Year[1]','int') AS BookYear
,x.value('Details[1]','varchar(max)') AS BookDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.BookNode.nodes('/Books/Book') A(x)
)
,MyArticlesCTE AS
(
SELECT MyPersonCTE.*
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS ArticleID
,x.value('Year[1]','int') AS ArticleYear
,x.value('Details[1]','varchar(max)') AS ArticleDetails
FROM MyPersonCTE
CROSS APPLY MyPersonCTE.ArticleNode.nodes('/Articles/Article') A(x)
)
--same for Papers, Artwork...
SELECT p.*
,b.BookID
,b.BookYear
,b.BookDetails
,a.ArticleID
,a.ArticleYear
,a.ArticleDetails
INTO #tempAllData
FROM MyPersonCTE AS p
LEFT JOIN MyBooksCTE AS b ON p.PersonID=b.PersonID
LEFT JOIN MyArticlesCTE AS a ON p.PersonID=a.PersonID ;
--#tempAllData is now filled with all data, copied in all combination: much to much
--but DISTINCT is your friend
--in this case you'd use the PersonID as FK in all related tables
SELECT DISTINCT PersonID,FirstName,LastName,Biography,Expertise --other fields
FROM #tempAllData;
SELECT DISTINCT PersonID,BookID,BookYear,BookDetails
FROM #tempAllData;
SELECT DISTINCT PersonID,ArticleID,ArticleYear,ArticleDetails
FROM #tempAllData;
DROP TABLE #tempAllData;
Look, this is a reduced example to reproduce the error
In this snippet you've got twice the name "O’Brien" where the ’ makes troubles And here “Experiencing physical warmth promotes interpersonal warmth” the opening and closing qoutes are problematic
DECLARE @x XML=
'<?xml version="1.0" encoding="UTF-8"?>
<People>
<Person>
<FirstName>Katherine</FirstName>
<LastName>Corker</LastName>
<Articles>
<Article>
<Year></Year>
<Details><![CDATA[<p>Corker, K. S., Lynott, D., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). High quality direct replications matter: Response to Williams (2014). Social Psychology, 45, 324-326. Available <a href="https://www.academia.edu/attachments/35015451/download_file?st=MTQxMzMwNDE0MiwxMzguMjguOS4xNjEsMjEzMTg%3D&s=work_strip">here.</a></p>
<p>Lynott, D., Corker, K. S., Wortman, J., Connell, L., Donnellan, M. B., Lucas, R. E., & O’Brien, K. (2014). Replication of “Experiencing physical warmth promotes interpersonal warmth” by Williams & Bargh (2008, Science). Social Psychology, 45, 216-222. Available <a href="https://www.academia.edu/attachments/33247494/download_file?st=MTQxMzMwNDE0MiwxMzguMjguOS4xNjEsMjEzMTg%3D&s=work_strip">here.</a></p>]]></Details>
</Article>
</Articles>
</Person>
</People>';
SELECT @x;
And now just change this
'<?xml version="1.0" encoding="UTF-8"?>
to this (don't forget the "N" in the beginning to force this to Unicode)
N'<?xml version="1.0" encoding="UTF-16"?>
DECLARE @yourXML AS XML=
(
SELECT CONVERT(XML, BulkColumn,2) AS BulkColumn
FROM OPENROWSET(BULK 'X:\Path2file\faculty-xml.xml', SINGLE_BLOB) AS x
);
SELECT @yourXML;
With this you do not have to bother about encondings, you just read the file from a location on the disk...