How to remove all special characters in Linux text

vinllen · Mar 30, 2017 · Viewed 98.7k times

How can I remove the special characters shown in blue in the vim screenshot, such as ^M, ^A, ^@, and ^[? As I understand it, ^M is the carriage return from Windows line endings, and I can remove it with sed -i 's/^M//g', but that does not remove the others. The dos2unix command does not work either. Is there a way to remove all of them?

Answer

heemayl · Mar 30, 2017

Remove everything except printable characters (character class [:print:]) and tabs, with sed:

sed $'s/[^[:print:]\t]//g' file.txt

[:print:] includes:

  • [:alnum:] (alphanumerics)
  • [:punct:] (punctuation)
  • space
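To see what the command keeps and what it drops, here is a small sketch run on made-up input. The function name strip_nonprintable and the sample string are only illustrative; the sed expression is the one from the answer.

```shell
# Requires bash (or another shell that supports $'...' quoting).

# Hypothetical helper: wraps the sed command from the answer.
# Reads the named files, or standard input when no file is given.
strip_nonprintable() {
  sed $'s/[^[:print:]\t]//g' "$@"
}

# Sample line containing ^A (\001), ^[ (\033) and ^M (\r), plus a tab to keep.
# ^@ (the NUL byte) is left out because a bash variable cannot hold it.
sample=$'foo\001bar\033\tbaz\r'

cleaned=$(printf '%s\n' "$sample" | strip_nonprintable)

# od -c makes the remaining bytes visible: only letters, the tab and the
# trailing newline should be left.
printf '%s\n' "$cleaned" | od -c
```

Which characters count as printable depends on the current locale, so non-ASCII text may be treated differently from one system to another. With GNU sed, adding -i edits the file in place, as the command in the question does.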

The ANSI-C quoting ($'...') makes the shell turn \t into a literal tab before sed sees the expression (this works in bash and similar shells).
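The difference between the two quoting styles can be checked directly in the shell. This is a minimal sketch with made-up variable names, assuming bash:

```shell
# Requires bash (or another shell that supports $'...' quoting).

ansi=$'a\tb'   # the shell expands \t to a real tab: 3 characters
plain='a\tb'   # backslash and "t" are kept as typed: 4 characters

printf '%s\n' "${#ansi}" "${#plain}"

# With $'...', sed receives an actual tab inside the bracket expression,
# so tabs in the input survive while other control characters are removed.
with_tab=$(printf 'x\ty\001z\n' | sed $'s/[^[:print:]\t]//g')
printf '%s\n' "$with_tab" | od -c
```

Whether sed itself understands \t inside a bracket expression varies between implementations, so letting the shell insert the tab avoids relying on that.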