How to get vim to show a byte-by-byte representation of file data

Jesse Hogan · Aug 31, 2012

I don't want vim to ever interpret my data in any encoding-specific way. In other words, when I'm in vim, I want the character under my cursor to correspond to the actual byte, not a UTF-* (etc.) interpretation of that byte.

I need to use vim to analyze issues caused by Unicode conversion errors made by other people (using other software) so it's important that I see what is actually there.

For example, in Cygwin's vim, I have been able to see UTF-8 BOMs as

ï»¿[START OF FILE DATA]

This is perfect. I recognize this as a UTF-8 BOM and if I want to know what the hex for each character is, I can put the cursor on the characters and use 'ga'.

I recently got a proper Linux machine (Fedora). Its /etc/vimrc contains this line:

set fileencodings=ucs-bom,utf-8,latin1

When I look at a UTF-8 BOM on this machine, the BOM is completely hidden.

When I add the following line to ~/.vimrc

set fileencodings=latin1

I see



The first 3 characters are the BOM (when ga is used against them). I don't know what the last 3 characters are.

At one point, I even saw the UTF-8 BOM represented as "feff" - the UTF-16 BOM.

Anyway, you see my problem. I need to see exactly what is in my file without vim interpreting the bytes for me. I know I could use xxd, od, etc., but vim has always been very convenient as an analysis tool. Plus, I want to be able to edit the files and save them without any conversion problems.

Thanks for your help.

Answer

Ingo Karkat · Aug 31, 2012

Use 'binary' mode:

:edit ++bin file

or

vim -b file

From :help 'binary':

The 'fileencoding' and 'fileencodings' options will not be used, the file is read without conversion.
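
To try this on a known input, here is a minimal sketch that builds a small file starting with the UTF-8 BOM bytes (EF BB BF), dumps those bytes with od as a reference, and then shows the binary-mode invocation. The file name bom-sample.txt and the variable name sample are made up for the illustration.

```shell
# Hypothetical sample file name, used only for this illustration.
sample=bom-sample.txt

# Write the three UTF-8 BOM bytes (EF BB BF, given here in octal) followed by some text.
printf '\357\273\277hello\n' > "$sample"

# Dump the raw bytes outside of vim as a reference point.
od -An -tx1 "$sample"

# Then open the file without any encoding conversion (interactive, so left commented out):
# vim -b "$sample"
```

Inside vim, 'ga' can then be compared against the od output. How the bytes are drawn on screen may still depend on the 'encoding' option, so treat this as a starting point rather than a guarantee.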