I am trying to insert the following string into an sql xml field
<?xml version="1.0" encoding="UTF-8"?>
<Response>
<Ip>x.x.x.x</Ip>
<CountryCode>CA</CountryCode>
<CountryName>Canada</CountryName>
<RegionCode>QC</RegionCode>
<RegionName>Québec</RegionName>
<City>Dorval</City>
<ZipCode>h9p1j3</ZipCode>
<Latitude>45.45000076293945</Latitude>
<Longitude>-73.75</Longitude>
<MetroCode></MetroCode>
<AreaCode></AreaCode>
</Response>
The insert code looks like:
INSERT
INTO Traffic(... , xmlGeoLocation, ...)
VALUES (
...
<!---
<cfqueryparam CFSQLType="cf_sql_varchar" value="#xmlGeoLocation#">,
--->
'#xmlGeoLocation#',
...
)
Two bad things happen:
Québec gets turned into Québec
I get an error saying [Macromedia][SQLServer JDBC Driver][SQLServer]XML parsing: line 8, character 16, illegal xml character
UPDATE:
The incoming test stream is mostly single byte characters.
The é is a two byte character. In particular C3A9
Also I don't have control over the incoming xml stream
I'm going to strip the header...
I'm having the same issue with a funny little apostrophe thing. I think the issue is that by the time the string is getting converted to XML, it's not UTF-8 anymore, but sql server is trying to use the header to decode it. If it's VARCHAR, it's in the client's encoding. If it's NVARCHAR, it's UTF-16. Here are some variations I tested:
SQL (varchar, UTF-8):
SELECT CONVERT(XML,'<?xml version="1.0" encoding="UTF-8"?><t>We’re sorry</t>')
Error:
XML parsing: line 1, character 44, illegal xml character
SQL (nvarchar, UTF-8):
SELECT CONVERT(XML,N'<?xml version="1.0" encoding="UTF-8"?><t>We’re sorry</t>')
Error: XML parsing: line 1, character 38, unable to switch the encoding
SQL (varchar, UTF-16)
SELECT CONVERT(XML,'<?xml version="1.0" encoding="UTF-16"?><t>We’re sorry</t>')
Error:
XML parsing: line 1, character 39, unable to switch the encoding
SQL (nvarchar, UTF-16)
SELECT CONVERT(XML,N'<?xml version="1.0" encoding="UTF-16"?><t>We’re sorry</t>')
Worked!