Using regex backreferences in sed

jgua1 · Aug 26, 2010 · Viewed 14.4k times

I would like to replace each run of multiple spaces in a file with a single character. Example:

cat      kill    rat
dog      kill    cat

I used the following regex, which seemed to match on http://www.regexpal.com/ but didn't work in sed.

([^ ])*([ ])*

I used the sed command like so:

sed s/\(\[\^\ \]\)*\(\[\ \]\)*/\$1\|/g < inputfile

I expect,

cat|kill|rat
dog|kill|cat

But I couldn't get it to work. Any help would be much appreciated. Thanks.

Edit: kindly note that cat/dog could be any characters other than whitespace.

Answer

Lekensteyn · Aug 26, 2010

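sed writes backreferences with backslashes, so use \1 instead of $1. Surround your expression with single quotes so the shell does not interpret it, which also saves you from escaping every bracket and space: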

sed 's/match/replace/g' < inputfile
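As a minimal sketch of the \1 syntax, the command below swaps two space-separated words. The sample line is made up for illustration, and the pattern uses basic regular expressions, where groups are written \( and \):

```shell
# Swap the first two space-separated words with backreferences \1 and \2.
# 'cat kill' is a made-up sample line, not the asker's file.
swapped=$(printf 'cat kill\n' | sed 's/\([^ ]*\) \([^ ]*\)/\2 \1/')
echo "$swapped"
```

Because the whole script sits inside single quotes, only sed sees the backslashes.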

Manpages are the best invention in the Linux world: man sed

Watch out for *: it means "zero or more", so it can match nothing at all. If you want to replace each run of spaces with a single '|', use this RE:

sed -r 's/ +/|/g' < inputfile
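To make the difference between * and + concrete, here is a small sketch that feeds the question's sample lines through printf instead of a file. It assumes a sed that accepts -E, which GNU and BSD sed both treat as the equivalent of the -r shown above:

```shell
# Sample input copied from the question.
input='cat      kill    rat
dog      kill    cat'

# ' *' can match the empty string, so '|' is also inserted
# at positions where there is no space at all.
with_star=$(printf 'a  b\n' | sed 's/ */|/g')

# ' +' needs at least one space, so only real runs of spaces are replaced.
with_plus=$(printf '%s\n' "$input" | sed -E 's/ +/|/g')

echo "$with_star"
echo "$with_plus"
```

The second command should print the two lines the question asks for, cat|kill|rat and dog|kill|cat.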

From man sed:

-r, --regexp-extended
   use extended regular expressions in the script.

You don't need any backreferences if you just want to replace all spaces. Replace the space in the pattern with \s (a GNU sed extension) if you want to match tabs too.
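A short sketch of the tab-aware variant, assuming GNU sed for \s. The POSIX class [[:space:]] is shown alongside as a more portable spelling, and the mixed space/tab line is invented for the example:

```shell
# printf expands \t into real tab characters in this made-up sample line.
line='cat \t kill\t\trat\n'

# GNU sed: \s matches any whitespace character, including tabs.
with_s=$(printf "$line" | sed -E 's/\s+/|/g')

# Portable alternative: the POSIX bracket class [[:space:]].
with_class=$(printf "$line" | sed -E 's/[[:space:]]+/|/g')

echo "$with_s"
echo "$with_class"
```

sed removes the trailing newline from each line before matching, so neither pattern replaces the line ending.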