How can I detect if a .NET StreamReader found a UTF8 BOM on the underlying stream?

bookclub · Feb 16, 2011 · Viewed 10.5k times

I open a FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.ReadWrite) and then wrap it in a StreamReader(stream, true).

Is there a way I can check whether the stream started with a UTF-8 BOM? I have noticed that files without the BOM are also read as UTF-8 by the StreamReader.

How can I tell them apart?

Answer

Carlo V. Dango · Feb 27, 2012

Rather than hardcoding the BOM bytes, it is cleaner to get them from the API:

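using System;
using System.Linq;
using System.Text;

public string ConvertFromUtf8(byte[] bytes)
{
  var enc = new UTF8Encoding(true);
  var preamble = enc.GetPreamble(); // the UTF-8 BOM: EF BB BF
  // Reject input that is too short to hold the BOM or does not start with it.
  if (bytes.Length < preamble.Length || !bytes.Take(preamble.Length).SequenceEqual(preamble))
    throw new ArgumentException("Not utf8-BOM");
  // Decode everything after the BOM.
  return enc.GetString(bytes, preamble.Length, bytes.Length - preamble.Length);
}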
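
The method above works on a byte array, while the question starts from a stream. One possible way to apply the same preamble comparison there is to peek at the first bytes of the stream and then rewind it before the StreamReader is created. This is only a sketch: the class name BomSniffer and the method StartsWithUtf8Bom are hypothetical, and it assumes the stream is seekable (as a FileStream on a regular file is).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomSniffer
{
    // Hypothetical helper: returns true if the stream begins with the UTF-8 BOM (EF BB BF).
    // Assumes a seekable stream; the original position is restored before returning.
    public static bool StartsWithUtf8Bom(Stream stream)
    {
        byte[] preamble = new UTF8Encoding(true).GetPreamble();
        byte[] head = new byte[preamble.Length];
        long start = stream.Position;

        // Stream.Read may return fewer bytes than requested,
        // so keep reading until the buffer is full or the stream ends.
        int read = 0;
        while (read < head.Length)
        {
            int n = stream.Read(head, read, head.Length - read);
            if (n == 0) break;
            read += n;
        }

        // Rewind so that a reader created afterwards sees the whole file.
        stream.Position = start;
        return read == preamble.Length && head.SequenceEqual(preamble);
    }
}
```

Calling BomSniffer.StartsWithUtf8Bom(stream) on the FileStream before constructing StreamReader(stream, true) would tell the two cases apart, since the reader itself decodes both as UTF-8.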