How to remove BOM from byte array

Ravi Gupta · Mar 18, 2013

I have XML data in byte[] byteArray which may or may not contain a BOM. Is there a standard way in C# to remove the BOM from it? If not, what is the best way to do so that handles all cases, including all types of encoding?

I am fixing a bug and don't want to change much of the code, so it would be best if someone could give me code that removes the BOM.

I know that I could look for the first byte with value 60, the ASCII code for '<', and ignore all bytes before it, but I don't want to do that.

Answer

Rich O'Kelly · Mar 18, 2013

All of the .NET XML parsers will automatically handle the BOM for you. I'd recommend using XDocument; in my opinion it provides the cleanest abstraction of XML data.

Using XDocument as an example:

using (var stream = new MemoryStream(byteArray))
{
  var document = XDocument.Load(stream);
  ...
}

Once you have an XDocument you can then use it to emit the bytes without the BOM:

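// The settings of an existing XmlWriter are read-only, so the encoding
// has to be chosen before the writer is created.
// UTF8Encoding(false) means "do not emit a BOM".
var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };
using (var stream = new MemoryStream())
{
  using (var writer = XmlWriter.Create(stream, settings))
  {
    document.WriteTo(writer);
  } // disposing the writer flushes its buffered output into the stream
  var bytesWithoutBOM = stream.ToArray();
}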
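
If re-parsing and re-serializing the document is more of a change than you want, the bytes can also be trimmed directly. The following is only a minimal sketch, not a framework API: BomStripper and StripBom are hypothetical names, and it assumes the data starts with at most one of the UTF-8, UTF-16 or UTF-32 byte order marks. It leaves the encoding of the remaining bytes untouched.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the .NET framework.
public static class BomStripper
{
    // Known BOM byte sequences. The 4-byte entries come first because the
    // UTF-32 little-endian BOM (FF FE 00 00) starts with the UTF-16
    // little-endian BOM (FF FE).
    private static readonly byte[][] Boms =
    {
        new byte[] { 0x00, 0x00, 0xFE, 0xFF }, // UTF-32, big-endian
        new byte[] { 0xFF, 0xFE, 0x00, 0x00 }, // UTF-32, little-endian
        new byte[] { 0xEF, 0xBB, 0xBF },       // UTF-8
        new byte[] { 0xFE, 0xFF },             // UTF-16, big-endian
        new byte[] { 0xFF, 0xFE },             // UTF-16, little-endian
    };

    // Returns a copy of data without its leading BOM, or data itself
    // when it does not start with one of the sequences above.
    public static byte[] StripBom(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));

        foreach (var bom in Boms)
        {
            if (data.Length >= bom.Length &&
                data.Take(bom.Length).SequenceEqual(bom))
            {
                var result = new byte[data.Length - bom.Length];
                Array.Copy(data, bom.Length, result, 0, result.Length);
                return result;
            }
        }
        return data;
    }
}
```

Calling BomStripper.StripBom(byteArray) would then replace the original array at the point where the bug occurs. Keep in mind that a consumer of UTF-16 or UTF-32 data may rely on the BOM to detect the encoding, so this shortcut is safest when the data is known to be UTF-8.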