How do I manipulate $PATH elements in shell scripts?

dmckee --- ex-moderator kitten picture dmckee --- ex-moderator kitten · Nov 7, 2008 · Viewed 21k times · Source

Is there a idiomatic way of removing elements from PATH-like shell variables?

That is I want to take

PATH=/home/joe/bin:/usr/local/bin:/usr/bin:/bin:/path/to/app/bin:.

and remove or replace the /path/to/app/bin without clobbering the rest of the variable. Extra points for allowing me put new elements in arbitrary positions. The target will be recognizable by a well defined string, and may occur at any point in the list.

I know I've seen this done, and can probably cobble something together on my own, but I'm looking for a nice approach. Portability and standardization a plus.

I use bash, but example are welcome in your favorite shell as well.


The context here is one of needing to switch conveniently between multiple versions (one for doing analysis, another for working on the framework) of a large scientific analysis package which produces a couple dozen executables, has data stashed around the filesystem, and uses environment variable to help find all this stuff. I would like to write a script that selects a version, and need to be able to remove the $PATH elements relating to the currently active version and replace them with the same elements relating to the new version.


This is related to the problem of preventing repeated $PATH elements when re-running login scripts and the like.


Answer

Jonathan Leffler picture Jonathan Leffler · Nov 16, 2008

Addressing the proposed solution from dmckee:

  1. While some versions of Bash may allow hyphens in function names, others (MacOS X) do not.
  2. I don't see a need to use return immediately before the end of the function.
  3. I don't see the need for all the semi-colons.
  4. I don't see why you have path-element-by-pattern export a value. Think of export as equivalent to setting (or even creating) a global variable - something to be avoided whenever possible.
  5. I'm not sure what you expect 'replace-path PATH $PATH /usr' to do, but it does not do what I would expect.

Consider a PATH value that starts off containing:

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

The result I got (from 'replace-path PATH $PATH /usr') is:

.
/Users/jleffler/bin
/local/postgresql/bin
/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/local/bin
/bin
/bin
/sw/bin
/sbin
/sbin

I would have expected to get my original path back since /usr does not appear as a (complete) path element, only as part of a path element.

This can be fixed in replace-path by modifying one of the sed commands:

export $path=$(echo -n $list | tr ":" "\n" | sed "s:^$removestr\$:$replacestr:" |
               tr "\n" ":" | sed "s|::|:|g")

I used ':' instead of '|' to separate parts of the substitute since '|' could (in theory) appear in a path component, whereas by definition of PATH, a colon cannot. I observe that the second sed could eliminate the current directory from the middle of a PATH. That is, a legitimate (though perverse) value of PATH could be:

PATH=/bin::/usr/local/bin

After processing, the current directory would no longer be on the PATH.

A similar change to anchor the match is appropriate in path-element-by-pattern:

export $target=$(echo -n $list | tr ":" "\n" | grep -m 1 "^$pat\$")

I note in passing that grep -m 1 is not standard (it is a GNU extension, also available on MacOS X). And, indeed, the-n option for echo is also non-standard; you would be better off simply deleting the trailing colon that is added by virtue of converting the newline from echo into a colon. Since path-element-by-pattern is used just once, has undesirable side-effects (it clobbers any pre-existing exported variable called $removestr), it can be replaced sensibly by its body. This, along with more liberal use of quotes to avoid problems with spaces or unwanted file name expansion, leads to:

# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
#    replace_path         PATH $PATH /exact/path/to/remove
#    replace_path_pattern PATH $PATH <grep pattern for target path>
#
# To replace a path:
#    replace_path         PATH $PATH /exact/path/to/remove /replacement/path
#    replace_path_pattern PATH $PATH <target pattern> /replacement/path
#
###############################################################################

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a ":" delimited list to work from (e.g. $PATH)
#   $3 the precise string to be removed/replaced
#   $4 the replacement string (use "" for removal)
function replace_path () {
    path=$1
    list=$2
    remove=$3
    replace=$4        # Allowed to be empty or unset

    export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
                   tr "\n" ":" | sed 's|:$||')
}

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a ":" delimited list to work from (e.g. $PATH)
#   $3 a grep pattern identifying the element to be removed/replaced
#   $4 the replacement string (use "" for removal)
function replace_path_pattern () {
    path=$1
    list=$2
    removepat=$3
    replacestr=$4        # Allowed to be empty or unset

    removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
    replace_path "$path" "$list" "$removestr" "$replacestr"
}

I have a Perl script called echopath which I find useful when debugging problems with PATH-like variables:

#!/usr/bin/perl -w
#
#   "@(#)$Id: echopath.pl,v 1.7 1998/09/15 03:16:36 jleffler Exp $"
#
#   Print the components of a PATH variable one per line.
#   If there are no colons in the arguments, assume that they are
#   the names of environment variables.

@ARGV = $ENV{PATH} unless @ARGV;

foreach $arg (@ARGV)
{
    $var = $arg;
    $var = $ENV{$arg} if $arg =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
    $var = $arg unless $var;
    @lst = split /:/, $var;
    foreach $val (@lst)
    {
            print "$val\n";
    }
}

When I run the modified solution on the test code below:

echo
xpath=$PATH
replace_path xpath $xpath /usr
echopath $xpath

echo
xpath=$PATH
replace_path_pattern xpath $xpath /usr/bin /work/bin
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath $xpath "/usr/.*/bin" /work/bin
echopath xpath

The output is:

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/work/bin
/bin
/sw/bin
/usr/sbin
/sbin

.
/Users/jleffler/bin
/work/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin

This looks correct to me - at least, for my definition of what the problem is.

I note that echopath LD_LIBRARY_PATH evaluates $LD_LIBRARY_PATH. It would be nice if your functions were able to do that, so the user could type:

replace_path PATH /usr/bin /work/bin

That can be done by using:

list=$(eval echo '$'$path)

This leads to this revision of the code:

# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
#    replace_path         PATH /exact/path/to/remove
#    replace_path_pattern PATH <grep pattern for target path>
#
# To replace a path:
#    replace_path         PATH /exact/path/to/remove /replacement/path
#    replace_path_pattern PATH <target pattern> /replacement/path
#
###############################################################################

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 the precise string to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace_path () {
    path=$1
    list=$(eval echo '$'$path)
    remove=$2
    replace=$3            # Allowed to be empty or unset

    export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
                   tr "\n" ":" | sed 's|:$||')
}

# Remove or replace an element of $1
#
#   $1 name of the shell variable to set (e.g. PATH)
#   $2 a grep pattern identifying the element to be removed/replaced
#   $3 the replacement string (use "" for removal)
function replace_path_pattern () {
    path=$1
    list=$(eval echo '$'$path)
    removepat=$2
    replacestr=$3            # Allowed to be empty or unset

    removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
    replace_path "$path" "$removestr" "$replacestr"
}

The following revised test now works too:

echo
xpath=$PATH
replace_path xpath /usr
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath /usr/bin /work/bin
echopath xpath

echo
xpath=$PATH
replace_path_pattern xpath "/usr/.*/bin" /work/bin
echopath xpath

It produces the same output as before.