How to split a tab-delimited string in bash script WITHOUT collapsing blanks?

Neil C. Obremski · Nov 1, 2013 · Viewed 11k times

I have my string in $LINE and I want $ITEMS to be the array version of this, split on single tabs and retaining blanks. Here's where I'm at now:

IFS=$'\n' ITEMS=($(echo "$LINE" | tr "\t" "\n"))

The issue here is that IFS splitting is one-or-more, so it gobbles up runs of newlines, tabs, whatever, and the blank fields disappear. I've tried a few other things based on other questions posted here, but they assume that every field always has a value and is never blank. The one that seems to hold the key is far beyond me and operates on an entire file (I am just splitting a single string).

My preference here is a pure-BASH solution.

Answer

rici · Nov 1, 2013

IFS is only one-or-more if the characters are whitespace (space, tab, newline). Non-whitespace characters are single delimiters: every occurrence separates two fields, so adjacent delimiters produce an empty field. So a simple solution, if there is some non-whitespace character which you are confident is not in your string, is to translate tabs to that character and then split on it:

IFS=$'\2' read -ra ITEMS <<<"${LINE//$'\t'/$'\2'}"
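
To see the difference in practice, here is a small sketch that splits the same string both ways. The sample value of LINE and the array name COLLAPSED are made up for illustration; only the second read is the line from above.

```bash
#!/usr/bin/env bash
# Hypothetical sample: four tab-separated fields, the second one empty.
LINE=$'a\t\tc\td'

# Tab is IFS whitespace, so the two adjacent tabs act as one separator.
IFS=$'\t' read -ra COLLAPSED <<<"$LINE"

# \2 is not whitespace, so every occurrence delimits a field.
IFS=$'\2' read -ra ITEMS <<<"${LINE//$'\t'/$'\2'}"

echo "${#COLLAPSED[@]}"   # 3 (the empty field is gone)
echo "${#ITEMS[@]}"       # 4 (the empty field is kept)
declare -p ITEMS          # show each element, including the empty one
```

Because IFS is set only as a prefix to read, the shell's global IFS is left untouched.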

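Unfortunately, assumptions like "there is no instance of \2 in the input" tend to fail in the long run, where "in the long run" translates to "at the worst possible time". So you might want to do it in two steps, first swapping tabs and \2 so that any literal \2 in the data survives the split, then swapping them back inside each field: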

IFS=$'\2' read -ra TEMP < <(tr $'\t\2' $'\2\t' <<<"$LINE")
ITEMS=("${TEMP[@]//$'\t'/$'\2'}")
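
If you need this in more than one place, the two steps could be wrapped in a function along these lines. This is only a sketch: the name split_tabs is hypothetical, and the extra \2 appended to the input is an addition that is not part of the two lines above. It is there because read treats a single trailing delimiter as a terminator, so a final empty field would otherwise be dropped.

```bash
#!/usr/bin/env bash
# split_tabs is a hypothetical helper name, not from the original answer.
# It fills the global array ITEMS from one tab-delimited line.
split_tabs() {
  local line=$1 swapped
  local -a temp
  # Swap tab and \2 so that a literal \2 in the data survives the split.
  swapped=$(tr $'\t\2' $'\2\t' <<<"$line")
  # Append one extra delimiter (an addition to the original two-liner):
  # read drops a single trailing delimiter, which would lose a final blank.
  IFS=$'\2' read -ra temp <<<"$swapped"$'\2'
  # Undo the swap inside each field.
  ITEMS=("${temp[@]//$'\t'/$'\2'}")
}

split_tabs $'a\t\tc\t'    # sample input with a blank middle and last field
echo "${#ITEMS[@]}"       # 4
```

Like the original, this handles a single line only: read stops at the first newline in the input.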