Why piping input to "read" only works when fed into "while read ..." construct?

huoneusto picture huoneusto · Dec 7, 2012 · Viewed 32.5k times · Source

I've been trying to read input into environment variables from program output like this:

echo first second | read A B ; echo $A-$B 

And the result is:

-

Both A and B are always empty. I read about bash executing piped commands in sub-shell and that basically preventing one from piping input to read. However, the following:

echo first second | while read A B ; do echo $A-$B ; done

Seems to work, the result is:

first-second

Can someone please explain what is the logic here? Is it that the commands inside the while ... done construct are actually executed in the same shell as echo and not in a sub-shell?

Answer

F. Hauri picture F. Hauri · Dec 7, 2012

How to do a loop against stdin and get result stored in a variable

Under (and other also), when you pipe something by using | to another command, you will implicitely create a fork, a subshell who's a child of current session and who can't affect current session's environ.

So this:

TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
  while read A B;do
      ((TOTAL+=A-B))
      printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $[A-B] $TOTAL
    done
echo final total: $TOTAL

won't give expected result! :

  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343
echo final total: $TOTAL
final total: 0

Where computed TOTAL could'nt be reused in main script.

Inverting the fork

By using Process Substitution, Here Documents or Here Strings, you could inverse the fork:

Here strings

read A B <<<"first second"
echo $A
first

echo $B
second

Here Documents

while read A B;do
    echo $A-$B
    C=$A-$B
  done << eodoc
first second
third fourth
eodoc
first-second
third-fourth

outside of the loop:

echo : $C
: third-fourth

Here Commands

TOTAL=0
while read A B;do
    ((TOTAL+=A-B))
    printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $[A-B] $TOTAL
  done < <(
    printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664
)
  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343

# and finally out of loop:
echo $TOTAL
-343

Now you could use $TOTAL in your main script.

Piping to a command list

But for working only against stdin, you may create a kind of script into the fork:

printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 | {
    TOTAL=0
    while read A B;do
        ((TOTAL+=A-B))
        printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $[A-B] $TOTAL
    done
    echo "Out of the loop total:" $TOTAL
  }

Will give:

  9 -   4 =    5 -> TOTAL=    5
  3 -   1 =    2 -> TOTAL=    7
 77 -   2 =   75 -> TOTAL=   82
 25 -  12 =   13 -> TOTAL=   95
226 - 664 = -438 -> TOTAL= -343
Out of the loop total: -343

Note: $TOTAL could not be used in main script (after last right curly bracket } ).

Using lastpipe bash option

As @CharlesDuffy correctly pointed out, there is a bash option used to change this behaviour. But for this, we have to first disable job control:

shopt -s lastpipe           # Set *lastpipe* option
set +m                      # Disabling job control
TOTAL=0
printf "%s %s\n" 9 4 3 1 77 2 25 12 226 664 |
  while read A B;do
      ((TOTAL+=A-B))
      printf "%3d - %3d = %4d -> TOTAL= %4d\n" $A $B $[A-B] $TOTAL
    done

  9 -   4 =    5 -> TOTAL= -338
  3 -   1 =    2 -> TOTAL= -336
 77 -   2 =   75 -> TOTAL= -261
 25 -  12 =   13 -> TOTAL= -248
226 - 664 = -438 -> TOTAL= -686

echo final total: $TOTAL
-343

This will work, but I (personally) don't like this because this is not standard and won't help to make script readable. Also disabling job control seem expensive for accessing this behaviour.

Note: Job control is enabled by default only in interactive sessions. So set +m is not required in normal scripts.

So forgotten set +m in a script would create different behaviours if run in a console or if run in a script. This will not going to make this easy to understand or to debug...