How can I escape white space in a bash loop list?

MCS picture MCS · Nov 19, 2008 · Viewed 130.1k times · Source

I have a bash shell script that loops through all child directories (but not files) of a certain directory. The problem is that some of the directory names contain spaces.

Here are the contents of my test directory:

$ls -F test
Baltimore/  Cherry Hill/  Edison/  New York City/  Philadelphia/  cities.txt

And the code that loops through the directories:

for f in `find test/* -type d`; do
  echo $f
done

Here's the output:

test/Baltimore
test/Cherry
Hill
test/Edison 
test/New
York
City
test/Philadelphia

Cherry Hill and New York City are treated as 2 or 3 separate entries.

I tried quoting the filenames, like so:

for f in `find test/* -type d | sed -e 's/^/\"/' | sed -e 's/$/\"/'`; do
  echo $f
done

but to no avail.

There's got to be a simple way to do this.


The answers below are great. But to make this more complicated - I don't always want to use the directories listed in my test directory. Sometimes I want to pass in the directory names as command-line parameters instead.

I took Charles' suggestion of setting the IFS and came up with the following:

dirlist="${@}"
(
  [[ -z "$dirlist" ]] && dirlist=`find test -mindepth 1 -type d` && IFS=$'\n'
  for d in $dirlist; do
    echo $d
  done
)

and this works just fine unless there are spaces in the command line arguments (even if those arguments are quoted). For example, calling the script like this: test.sh "Cherry Hill" "New York City" produces the following output:

Cherry
Hill
New
York
City

Answer

Charles Duffy picture Charles Duffy · Nov 19, 2008

First, don't do it that way. The best approach is to use find -exec properly:

# this is safe
find test -type d -exec echo '{}' +

The other safe approach is to use NUL-terminated list, though this requires that your find support -print0:

# this is safe
while IFS= read -r -d '' n; do
  printf '%q\n' "$n"
done < <(find test -mindepth 1 -type d -print0)

You can also populate an array from find, and pass that array later:

# this is safe
declare -a myarray
while IFS= read -r -d '' n; do
  myarray+=( "$n" )
done < <(find test -mindepth 1 -type d -print0)
printf '%q\n' "${myarray[@]}" # printf is an example; use it however you want

If your find doesn't support -print0, your result is then unsafe -- the below will not behave as desired if files exist containing newlines in their names (which, yes, is legal):

# this is unsafe
while IFS= read -r n; do
  printf '%q\n' "$n"
done < <(find test -mindepth 1 -type d)

If one isn't going to use one of the above, a third approach (less efficient in terms of both time and memory usage, as it reads the entire output of the subprocess before doing word-splitting) is to use an IFS variable which doesn't contain the space character. Turn off globbing (set -f) to prevent strings containing glob characters such as [], * or ? from being expanded:

# this is unsafe (but less unsafe than it would be without the following precautions)
(
 IFS=$'\n' # split only on newlines
 set -f    # disable globbing
 for n in $(find test -mindepth 1 -type d); do
   printf '%q\n' "$n"
 done
)

Finally, for the command-line parameter case, you should be using arrays if your shell supports them (i.e. it's ksh, bash or zsh):

# this is safe
for d in "$@"; do
  printf '%s\n' "$d"
done

will maintain separation. Note that the quoting (and the use of $@ rather than $*) is important. Arrays can be populated in other ways as well, such as glob expressions:

# this is safe
entries=( test/* )
for d in "${entries[@]}"; do
  printf '%s\n' "$d"
done