XmlWriter encoding issues

John picture John · May 14, 2009 · Viewed 10.6k times · Source

I have the following code:

    MemoryStream ms = new MemoryStream();
    XmlWriter w = XmlWriter.Create(ms);

    w.WriteStartDocument(true);
    w.WriteStartElement("data");

    w.WriteElementString("child", "myvalue");

    w.WriteEndElement();//data
    w.Close();
    ms.Close();

    string test = UTF8Encoding.UTF8.GetString(ms.ToArray());

The XML is generated correctly; however, my problem is the first character of the string 'test' is ï (char #239), making it invalid to some xml parsers: where is this coming from? What exactly am I doing incorrectly?

I know I can resolve the issue by just starting after the first character, but I'd rather know why it's there than simply patching over the problem.

Thanks!

Answer

John picture John · May 14, 2009

Found one solution here: https://timvw.be/2007/01/08/generating-utf-8-with-systemxmlxmlwriter/

I was missing this at the top:

XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
xmlWriterSettings.Encoding = new UTF8Encoding(false);
MemoryStream ms = new MemoryStream();
XmlWriter w = XmlWriter.Create(ms, xmlWriterSettings);

Thanks for the help everyone!