Clean source code files of invisible characters

deceze · Jul 1, 2009 · Viewed 43.8k times

I have a bizarre problem: somewhere in my HTML/PHP code there's a hidden, invisible character that I can't seem to get rid of. By copying it from Firebug and converting it, I identified it as U+FEFF, the 'Zero width no-break space'. It shows up as a non-empty text node in my website and is causing a serious layout problem.

The problem is, I can't get rid of it. I can't see it in my files even with Invisibles turned on (duh), and no search tool seems to pick up on it. I rewrote the code around where it could be, but it seems to sit somewhere deeper in one of the framework files.

Any good tools to find characters by charcode across files or something like that? (Mac OS X)

Answer

Boldewyn · Jul 1, 2009

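You don't see the character because most text editors simply don't display it. U+FEFF is the so-called byte-order mark (BOM). It is placed at the start of a Unicode file to tell a reader in which order multi-byte code units are stored; read with the wrong byte order, it shows up as U+FFFE. In UTF-8 it is stored as the three bytes EF BB BF and carries no byte-order information, but some editors, notably on Windows, write it anyway.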
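To make the byte-order point concrete, here is a small sketch (assuming Python 3 is at hand; it is not part of the original answer) showing how the single character U+FEFF is serialised under the common Unicode encodings:

```python
import codecs

BOM = "\ufeff"  # U+FEFF, reported by Firebug as a zero width no-break space

# The same character becomes different bytes depending on the encoding.
utf8 = BOM.encode("utf-8")
utf16_be = BOM.encode("utf-16-be")
utf16_le = BOM.encode("utf-16-le")

print(utf8.hex(), utf16_be.hex(), utf16_le.hex())  # efbbbf feff fffe

# The standard library exposes the same byte sequences as constants.
assert utf8 == codecs.BOM_UTF8
assert utf16_be == codecs.BOM_UTF16_BE
assert utf16_le == codecs.BOM_UTF16_LE
```

The little-endian bytes FF FE are where the "FFFE" form comes from: it is the same mark with its two bytes swapped.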

To get rid of it, tell your editor to save the file either as ANSI/ISO-8859 or as Unicode without BOM. If your editor can't do that, you'll either have to switch editors (sadly) or use a lower-level tool, e.g. a hex editor, that shows you the file's actual bytes so you can delete the mark by hand.
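If no suitable editor is available, the truncation can also be scripted. The following is only a sketch, assuming Python 3 and a UTF-8 file; the function name `strip_utf8_bom` is made up for illustration. It rewrites the file in place, so try it on a copy first.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`.

    Returns True if a BOM was found and removed, False otherwise.
    (Hypothetical helper, not part of any library.)
    """
    file = Path(path)
    data = file.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, file is left untouched
    # Write back everything after the three BOM bytes.
    file.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

The file is handled as raw bytes on purpose, so nothing else in it gets re-encoded along the way.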

From a quick search, it seems that TextWrangler has a "UTF-8, no BOM" mode. Otherwise, if you're comfortable with the terminal, you can use Vim:

:set nobomb

and save the file. Presto!

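A BOM is always the very first thing in a text file. Editors with support for the BOM will not, as I mentioned, show it to you at all. In PHP, a BOM at the top of an included file is sent to the browser as output, which would explain why it turns up in the middle of your page.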
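Since the mark sits at the very start of a file, hunting for it across a whole framework only requires looking at the first three bytes of each file. A rough sketch, again assuming Python 3 and UTF-8 sources; `find_bom_files` and the default extension list are my own placeholders, so adjust them to your project:

```python
import codecs
import os


def find_bom_files(root, extensions=(".php", ".html")):
    """Yield paths under `root` whose first bytes are a UTF-8 BOM.

    (Hypothetical helper; the extension filter is an assumption.)
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Only the first three bytes matter, so large files stay cheap.
            with open(path, "rb") as handle:
                if handle.read(3) == codecs.BOM_UTF8:
                    yield path


# Example use: list offending files below the current directory.
# for path in find_bom_files("."):
#     print(path)
```

This only catches a BOM at the start of a file; a U+FEFF pasted into the middle of a file would need a search over the full contents instead.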

Cheers,