Converting ANSI to UTF-8 in shell

Neringan · Nov 28, 2013

I'm writing a parser script that splits one CSV file into three, and I have a problem. I'm French, so my data contains letters like é, è, à, ...

A customer sent me a CSV file that Linux recognizes as "unknown-8bit" (ANSI, I guess).

My script writes three new CSV files, but Vim creates them as ISO Latin-1 because that is the closest match to the input, and my é, è, à, ... come out broken. I need UTF-8.

So I tried to convert the original ANSI CSV to UTF-8:

iconv -f "windows-1252" -t "UTF-8" import.csv -o import.csv

The problem is that this breaks my CSV: everything is now on a single row, although my special characters are OK. Is there a way to convert ANSI to UTF-8 while keeping my rows?

Answer

Grzegorz Żur · Nov 28, 2013

Put the output into another file. Don't overwrite the old one.

iconv -f "windows-1252" -t "UTF-8" import.csv -o new_import.csv

iconv cannot safely read from and write to the same file: the output can overwrite the input before it has been fully read, which corrupts the result.
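
If the converted file really has to end up under the original name, one common workaround is to convert into a temporary file and only replace the original once iconv has succeeded. The following is a minimal sketch, not part of the original answer: the function name to_utf8_in_place is hypothetical, and it assumes the input is Windows-1252 as in the question.

```shell
# Hypothetical helper: convert a Windows-1252 file to UTF-8 "in place"
# by writing to a temporary file first, then replacing the original.
to_utf8_in_place() {
    src=$1
    # Create the temporary file next to the source so that mv is a
    # plain rename on the same filesystem.
    tmp=$(mktemp "$src.XXXXXX") || return 1

    # Replace the original only if iconv converted the whole file.
    if iconv -f WINDOWS-1252 -t UTF-8 "$src" > "$tmp"; then
        # Note: the result keeps the temporary file's permissions
        # (mktemp typically creates it with mode 0600).
        mv "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Calling to_utf8_in_place import.csv would then leave import.csv in UTF-8, and leave it untouched if the conversion fails partway through.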