XmlException while parsing XML with encoding specified as "utf-16"

user845405 · Oct 9, 2013

I have an issue parsing XML that declares utf-16 encoding, although the same code works perfectly fine with utf-8.
Can anyone help me out with this issue?

I get the following error:

Exception of type 'System.Web.HttpUnhandledException' was thrown.
System.Xml.XmlException: There is no Unicode byte order mark. Cannot switch to Unicode.

XML Header:

<?xml version="1.0" encoding="utf-16"?>
<RiskAssessmentRequestValue xmlns:xsd="http://www.w3.org/2001/XMLSchema" 


    rptTransformedXml.DataSource = parser.ExtractData(xml);

    public List<XmlDataExtract> ExtractData(string xml)
    {
        MemoryStream stream = new MemoryStream(Encoding.ASCII.GetBytes(xml));
        return ExtractData(stream);
    }

    public List<XmlDataExtract> ExtractData(Stream xmlStream)
    {
        XmlReaderSettings settings = new XmlReaderSettings
                                         {
                                             IgnoreComments = true,
                                             IgnoreWhitespace = true,
                                             CloseInput = true
                                         };

        XmlReader reader = XmlReader.Create(xmlStream, settings);
        XmlPathBuilder pathBuilder = new XmlPathBuilder(reader);
        List<XmlDataExtract> xmlDataList = new List<XmlDataExtract>();

        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.XmlDeclaration)
            {
                // skip the XML declaration itself
                continue;
            }
            CollectAttributeData(reader, xmlDataList, pathBuilder);
            CollectElementData(reader, xmlDataList, pathBuilder);
        }
        return xmlDataList;
    }


Amitd · Oct 9, 2013

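Your code turns the string into bytes with Encoding.ASCII while the declaration says utf-16, so the reader finds single-byte data and cannot switch to Unicode. You can instead create the encoding from the one declared in the XML content: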

using System.IO;
using System.Text;
using System.Xml;

string encoding = "UTF-8"; // should match the encoding declared in the XML
string xml = @"<?xml version='1.0' encoding='UTF-8'?><table><row>1</row></table>";

// Encode the string with the same encoding that its declaration names
var ms = new MemoryStream(Encoding.GetEncoding(encoding).GetBytes(xml));

var xdrs = new XmlReaderSettings
{
    IgnoreComments = true,
    IgnoreWhitespace = true,
    CloseInput = true
};

var xdr = XmlReader.Create(ms, xdrs);
while (xdr.Read())
{
    // process each node here
}
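
If the XML is already held in a string, another option is to skip the byte conversion altogether and give the reader a StringReader. The following is only a minimal sketch of that idea, not the asker's actual parser: the class XmlStringReaderSketch and the helper ReadRootName are hypothetical names made up for illustration, and the sample document is the one used above with its declaration changed to utf-16. Because the reader receives characters rather than bytes, it does not need a byte order mark.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlStringReaderSketch
{
    // Hypothetical helper: returns the name of the first element in the XML string,
    // or an empty string if the document contains no element.
    public static string ReadRootName(string xml)
    {
        var settings = new XmlReaderSettings
        {
            IgnoreComments = true,
            IgnoreWhitespace = true,
            CloseInput = true
        };

        // A StringReader supplies already-decoded characters, so the reader
        // never has to decode bytes according to the declared encoding.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    return reader.Name;
                }
            }
        }
        return string.Empty;
    }

    public static void Main()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"utf-16\"?><table><row>1</row></table>";
        Console.WriteLine(ReadRootName(xml)); // prints "table"
    }
}
```

In the question's code this would mean adding an ExtractData overload that takes a TextReader, assuming the rest of the parser only needs an XmlReader. The MemoryStream approach above is still the one to use when the input really arrives as bytes.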

For more information about encoding, there is a related question