I'm trying to learn GNU Parallel because I have a case where I think I could easily parallelize a bash function. So in trying to learn, I went to the GNU Parallel manual where there is an example...but I can't even get it working! To wit:
(232) $ bash --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
(233) $ cat tpar.bash
#!/bin/bash
echo `which parallel`
doit() {
echo Doing it for $1
sleep 2
echo Done with $1
}
export -f doit
parallel doit ::: 1 2 3
doubleit() {
echo Doing it for $1 $2
sleep 2
echo Done with $1 $2
}
export -f doubleit
parallel doubleit ::: 1 2 3 ::: a b
(234) $ bash tpar.bash
/home/mathomp4/bin/parallel
doit: Command not found.
doit: Command not found.
doit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
As you can see, I can't even get the simple example to run. Thus, I'm probably doing something amazingly stupid and basic...but I'm at a loss.
ETA: As suggested by commenters (chmod +x, set -vx):
(27) $ ./tpar.bash
echo `which parallel`
which parallel
++ which parallel
+ echo /home/mathomp4/bin/parallel
/home/mathomp4/bin/parallel
doit() {
echo Doing it for $1
sleep 2
echo Done with $1
}
export -f doit
+ export -f doit
parallel doit ::: 1 2 3
+ parallel doit ::: 1 2 3
doit: Command not found.
doit: Command not found.
doit: Command not found.
doubleit() {
echo Doing it for $1 $2
sleep 2
echo Done with $1 $2
}
export -f doubleit
+ export -f doubleit
parallel doubleit ::: 1 2 3 ::: a b
+ parallel doubleit ::: 1 2 3 ::: a b
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
doubleit: Command not found.
ETA2: Note, I can, in the script, just call 'doit 1', say, and it will do that. So the function is valid, it just isn't...exported?
You cannot call a shell function from outside the shell where it was defined. A shell function is a concept inside the shell. The parallel command itself has no way to access it.
Calling export -f doit in bash exports the function via the environment so that it is picked up by child processes. But only bash understands bash functions. A (grand)*child bash process can call it, but not other programs, for example not other shells.
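To make the point above concrete, here is a minimal sketch (not taken from the manual). It assumes bash and env are on the PATH and that no real program named doit exists there; env is used only as a stand-in for "some program that is not bash".

```shell
#!/bin/bash
# Define and export a function, as in the question.
doit() {
    echo "Doing it for $1"
}
export -f doit

# A child bash process imports the function from its environment.
bash -c 'doit 1'

# env is not a shell: it looks for an executable file named "doit"
# and fails, because the function exists only inside bash processes.
env doit 1 2>/dev/null || echo "env could not run doit"
```

Any other non-bash program is in the same position as env here, and that includes csh and tcsh.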
Going by the message “Command not found”, it appears that your preferred shell is (t)csh. You need to tell parallel to invoke bash instead. parallel invokes the shell indicated by the SHELL environment variable¹, so set it to point to bash.
export SHELL=$(type -p bash)
doit () { … }
export -f doit
parallel doit ::: 1 2 3
If you only want to set SHELL for the execution of the parallel command and not for the rest of the script:
doit () { … }
export -f doit
SHELL=$(type -p bash) parallel doit ::: 1 2 3
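The one-line form relies on the shell's rule that a VAR=value prefix applies only to the command it precedes. A small sketch of that rule follows; /bin/csh is just an illustrative value that is never executed, and a plain bash -c stands in for parallel so the example runs without it.

```shell
#!/bin/bash
# Pretend the login shell is csh (illustrative value only).
export SHELL=/bin/csh

# The prefix assignment is visible to this one child process...
SHELL=$(type -p bash) bash -c 'echo "child sees: $SHELL"'

# ...while the script's own value is left untouched.
echo "script still has: $SHELL"
```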
I'm not sure how to deal with remote jobs; you may need to pass --env=SHELL in addition to --env=doit (note that this assumes that the path to bash is the same everywhere).
And yes, this oddity should be mentioned more prominently in the manual. There's a brief note in the description of the command argument, but it isn't very explicit (it should explain that the command words are concatenated with a space as a separator and then passed to $SHELL -c), and SHELL isn't even listed in the environment variables section. (I encourage you to report this as a bug; I'm not doing it because I hardly ever use this program.)
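As a rough model of the behavior described above (a simplification, not parallel's actual implementation), the hypothetical helper run_like_parallel below joins its command words with spaces and hands the result to a chosen shell with -c. The tcsh branch runs only if tcsh happens to be installed.

```shell
#!/bin/bash
doit() {
    echo "Doing it for $1"
}
export -f doit

# Hypothetical helper: the first argument is the shell to use, the rest
# are the command words. "$*" joins them with a space (default IFS).
run_like_parallel() {
    local shell=$1
    shift
    "$shell" -c "$*"
}

# With bash as the shell, the exported function is found.
run_like_parallel "$(type -p bash)" doit 1

# With tcsh, the same command line is looked up as an external command.
if command -v tcsh >/dev/null 2>&1; then
    run_like_parallel "$(command -v tcsh)" doit 1
fi
```

Under this model, the only thing that changes between the failing and the working run is which shell receives the command string.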
¹ which is bad design, since SHELL is supposed to indicate a user interface preference for an interactive command line shell, and not to change the behavior of programs.