Should I use cut or awk to extract fields and field substrings?

user3486154 · Apr 1, 2014

I have a file with pipe-separated fields. I want to print a subset of field 1 and all of field 2:

cat tmpfile.txt

# 10 chars.|variable length num|text
ABCDEFGHIJ|99|U|HOMEWORK
JIDVESDFXW|8|C|CHORES
DDFEXFEWEW|73|B|AFTER-HOURS

I'd like the output to look like this:

# 6 chars.|variable length num
ABCDEF|99
JIDVES|8
DDFEXF|73

I know how to get fields 1 & 2:

cat tmpfile.txt | awk 'BEGIN {FS="|"} {print $1"|"$2}'

And know how to get the first 6 characters of field 1:

cat tmpfile.txt | cut -c 1-6

I know this is fairly simple, but I can't figure out how to combine the awk and cut commands.

Any suggestions would be greatly appreciated.

Answer

devnull · Apr 1, 2014

You could use awk. Use the substr() function to trim the first field:

awk -F'|' '{print substr($1,1,6),$2}' OFS='|' inputfile

For your input, it'd produce:

ABCDEF|99
JIDVES|8
DDFEXF|73
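
To try this end to end, here is a minimal sketch that recreates the three data rows from the question in a scratch file and runs the same awk command on it. The mktemp scratch file and the awk_out variable are assumptions for illustration, not part of the original answer; the header comment line from the question is left out of the sample data.

```shell
#!/bin/sh
# Recreate the sample rows from the question in a scratch file.
# (The mktemp path is an assumption for illustration only.)
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
ABCDEFGHIJ|99|U|HOMEWORK
JIDVESDFXW|8|C|CHORES
DDFEXFEWEW|73|B|AFTER-HOURS
EOF

# -F'|'           split each input line on a literal pipe
# substr($1,1,6)  take 6 characters of field 1, starting at position 1
# OFS='|'         the comma in print is emitted as a pipe
awk_out=$(awk -F'|' '{print substr($1,1,6),$2}' OFS='|' "$tmpfile")
printf '%s\n' "$awk_out"

rm -f "$tmpfile"
```

awk string positions start at 1, so substr($1,1,6) is the counterpart of cut -c 1-6 applied to the first field only. If field 1 is shorter than 6 characters, substr simply returns the whole field.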

Using sed (-r is the GNU option for extended regular expressions; -E is the more portable spelling), you could say:

sed -r 's/^(.{6})[^|]*([|][^|]*).*/\1\2/' inputfile

to produce the same output.
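
As a rough sketch of how that expression works, the same substitution is broken down piece by piece below. The trim function name and the sample short row 'ABC|99|U|X' are hypothetical, added only for illustration; -E is used in place of -r on the assumption that your sed supports it (GNU and BSD sed both do).

```shell
#!/bin/sh
# ^(.{6})      capture the first 6 characters of the line
# [^|]*        skip whatever is left of field 1
# ([|][^|]*)   capture the first pipe together with field 2
# .*           discard everything after field 2
# trim is a hypothetical helper name wrapping the answer's expression.
trim() {
  sed -E 's/^(.{6})[^|]*([|][^|]*).*/\1\2/'
}

# Rows from the question: field 1 is always 10 characters long.
sed_out=$(printf '%s\n' 'ABCDEFGHIJ|99|U|HOMEWORK' 'JIDVESDFXW|8|C|CHORES' | trim)
printf '%s\n' "$sed_out"

# Hypothetical row whose first field is shorter than 6 characters:
# .{6} is not field-aware, so it runs across the first pipe.
short_out=$(printf '%s\n' 'ABC|99|U|X' | trim)
printf '%s\n' "$short_out"
```

The sed version matches the awk output only when field 1 has at least 6 characters, as it does in the question's data. For shorter first fields the awk substr() approach is the safer choice, because it works on the field rather than on the raw line.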