xargs is widely used in shell scripting; it is usually easy to recast these uses in bash using while read -r; do ... done or while read -ar; do ... done loops.
When should xargs be preferred, and when should while-read loops be preferred?
The thing with while loops is that they tend to process one item at a time, often when that's unnecessary. This is where xargs has an advantage: it can batch up the arguments so that a single command invocation processes many items.
For example, a while loop:
pax> echo '1
2
3
4
5' | while read -r; do echo "$REPLY"; done
1
2
3
4
5
and the corresponding xargs:
pax> echo '1
2
3
4
5' | xargs echo
1 2 3 4 5
Here you can see that the lines are processed one by one with the while loop and all together with xargs. In other words, the former is equivalent to echo 1 ; echo 2 ; echo 3 ; echo 4 ; echo 5 while the latter is equivalent to echo 1 2 3 4 5 (five invocations as opposed to one). Because echo is a bash builtin, the loop in this example does not actually start new processes, but with an external command each invocation is a separate process. This really makes a difference when processing thousands or tens of thousands of lines, since process creation takes time.
xargs is most advantageous with commands that can accept multiple arguments, since it reduces the number of individual processes started, which can make things much faster.
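To make the difference in invocation counts concrete, here is a minimal sketch that counts how many times the command body runs under each approach. It assumes seq is available to generate the input (it is not part of POSIX, though most systems ship it), and the variable names are only illustrative. The -n option to xargs is shown to illustrate that the batch size can also be capped explicitly.

```shell
#!/bin/sh
# Each run of the command prints exactly one line, so counting output
# lines with wc -l counts invocations.

# while-read: the loop body runs once per input line.
loop_runs=$(seq 1 1000 | while read -r line; do echo run; done | wc -l)

# xargs: 1000 short arguments fit on one command line on typical systems,
# so echo is started once and prints a single line.
xargs_runs=$(seq 1 1000 | xargs echo | wc -l)

# xargs -n 100 limits each batch to at most 100 arguments.
capped_runs=$(seq 1 1000 | xargs -n 100 echo | wc -l)

echo "while: $loop_runs, xargs: $xargs_runs, xargs -n 100: $capped_runs"
```

On a typical system this should report 1000 runs for the loop, 1 for plain xargs, and 10 for xargs -n 100. With very long inputs, xargs splits the arguments into several batches on its own to stay within the system's command-line length limit.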
When I'm processing small files, or the commands to run on each item are complicated (and I'm too lazy to write a separate script to give to xargs), I will use the while variant.
Where I'm interested in performance (large files), I will use xargs, even if I have to write a separate script.
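As a middle ground between the two, the per-item logic can sometimes be passed inline with sh -c rather than kept in a separate script file. The following is only a sketch: the "squared" logic is a made-up stand-in for whatever complicated processing is needed, and the trailing sh argument is there to fill $0 so that the batched items arrive as "$@".

```shell
#!/bin/sh
# Hypothetical per-item logic: print each number with its square.
# xargs starts a single sh process for the whole batch; the for loop
# inside that process then handles the items one by one.
result=$(printf '%s\n' 1 2 3 4 5 |
    xargs sh -c 'for n in "$@"; do echo "$n squared is $((n * n))"; done' sh)

echo "$result"
```

This keeps the batching benefit of xargs (one sh process per batch rather than one per item), though anything the loop body itself runs as an external command is still started once per item.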