Unaccent string in bash script (RHEL)

Petr Kozelka · Mar 27, 2012

On Debian-based distributions, there is a utility called unaccent which can be used to remove accents from accented letters in a text.

I looked for a package containing this on Red Hat distros, but the only one I found was unac, which is available for Mandriva only.

I tried iconv, but it does not seem to support my case.

What is the best lightweight approach that is easily usable in a bash script? Are there any secret options to iconv that allow this?

Answer

kev · Mar 27, 2012

You can use the -c option of iconv, which omits characters that cannot be converted, to drop non-ASCII characters entirely:

$ echo 'été' | iconv -c -f utf8 -t ascii
t

If you just want to remove the accent:

$ echo 'été' | iconv -f utf8 -t ascii//TRANSLIT
ete
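
To use this inside a bash script, the //TRANSLIT command can be wrapped in a small function. The following is only a sketch: the name unaccent is a hypothetical helper chosen here, not a standard command, and it assumes the input is UTF-8. The result of //TRANSLIT depends on the iconv implementation and on the active locale, so in a non-UTF-8 locale (for example LC_ALL=C) accented letters may come out as ? instead of their base letter.

```shell
#!/usr/bin/env bash

# unaccent: hypothetical helper name, not a standard command.
# Transliterates UTF-8 text to ASCII using iconv's //TRANSLIT suffix.
# With arguments it converts the arguments; without arguments it
# filters standard input, so it can sit in the middle of a pipeline.
# Assumption: the input is valid UTF-8.
unaccent() {
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$*" | iconv -f UTF-8 -t ASCII//TRANSLIT
    else
        iconv -f UTF-8 -t ASCII//TRANSLIT
    fi
}

# Usage:
#   name=$(unaccent 'été')
#   unaccent < accented.txt > plain.txt
```

Keeping the conversion in one function makes it easy to swap in another tool later if iconv's transliteration turns out to be insufficient on a given system.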