XML - Data At Root Level is Invalid

George Stocker · Nov 14, 2008

I have an XSD file that is encoded in UTF-8. No text editor I open it in shows any character at the beginning of the file, but when I inspect it in Visual Studio's debugger, I clearly see an empty box in front of the file's content.

Box in file

I also get the error:

Data at the root level is invalid. Line 1, position 1.


Anyone know what this is?

Update: Edited post to qualify type of file. It's an XSD file created by Microsoft's XSD creator.

Answer

George Stocker · Nov 14, 2008

It turns out that what I was seeing is a byte order mark (BOM): the character U+FEFF placed at the very start of the file, which tells whatever is loading the document how it is encoded. My file is encoded in UTF-8, so the BOM was the byte sequence EF BB BF. To remove it, I opened the file in Notepad++ and clicked "Encode in UTF-8 without BOM", as shown below:

Saving in NotePad++.
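If you would rather not do this by hand in an editor, the same removal can be scripted. This is only a minimal sketch in Python, not what I actually did; the function names `strip_utf8_bom` and `rewrite_without_bom` are made up for illustration, and it assumes the file is small enough to read into memory.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(path: str) -> bool:
    """Rewrite the file at path without a BOM. Return True if one was removed."""
    # Read raw bytes so no decoding step can hide or alter the BOM.
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False  # no BOM found, leave the file untouched
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Calling `rewrite_without_bom` on a path of your choosing should leave a BOM-free file alone and strip the three bytes from one that has them.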

To actually see the BOM, I had to open the file in TextPad in binary mode and then run a Google search for "EF BB BF":

binary mode
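You can also check for the BOM without a hex-capable editor. The sketch below is in Python rather than anything I used at the time, and the helper names `first_bytes_hex` and `has_utf8_bom` are hypothetical. It prints the leading bytes the way a binary view would show them, and it illustrates why some tools show a stray character: decoding as plain "utf-8" keeps U+FEFF at the front of the text, while Python's "utf-8-sig" codec drops it.

```python
import codecs


def first_bytes_hex(data: bytes, count: int = 3) -> str:
    """Format the first `count` bytes as uppercase hex, e.g. 'EF BB BF'."""
    return data[:count].hex(" ").upper()


def has_utf8_bom(data: bytes) -> bool:
    """Report whether data starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


# Sample bytes standing in for the start of a file that was saved with a BOM.
sample = codecs.BOM_UTF8 + b"<?xml version='1.0'?>"

print(first_bytes_hex(sample))
print(has_utf8_bom(sample))

# Plain "utf-8" keeps the BOM as the character U+FEFF at the start of the text.
print(repr(sample.decode("utf-8")[0]))
# "utf-8-sig" strips the BOM while decoding.
print(repr(sample.decode("utf-8-sig")[0]))
```

For a real file, read it with `open(path, "rb")` and pass the bytes to these helpers.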

It took me about 8 hours to find out this was what was causing it, so I thought I'd share this with everyone.

Update: If I had read Joel Spolsky's blog post, "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)", I might not have had this problem.