Using getopts to process long and short command line options

gagneet picture gagneet · Dec 31, 2008 · Viewed 425.4k times · Source

I wish to have long and short forms of command line options invoked using my shell script.

I know that getopts can be used, but like in Perl, I have not been able to do the same with shell.

Any ideas on how this can be done, so that I can use options like:

./shell.sh --copyfile abc.pl /tmp/
./shell.sh -c abc.pl /tmp/

In the above, both the commands mean the same thing to my shell, but using getopts, I have not been able to implement these?

Answer

Urban Vagabond picture Urban Vagabond · Oct 31, 2011

getopt and getopts are different beasts, and people seem to have a bit of misunderstanding of what they do. getopts is a built-in command to bash to process command-line options in a loop and assign each found option and value in turn to built-in variables, so you can further process them. getopt, however, is an external utility program, and it doesn't actually process your options for you the way that e.g. bash getopts, the Perl Getopt module or the Python optparse/argparse modules do. All that getopt does is canonicalize the options that are passed in — i.e. convert them to a more standard form, so that it's easier for a shell script to process them. For example, an application of getopt might convert the following:

myscript -ab infile.txt -ooutfile.txt

into this:

myscript -a -b -o outfile.txt infile.txt

You have to do the actual processing yourself. You don't have to use getopt at all if you make various restrictions on the way you can specify options:

  • only put one option per argument;
  • all options go before any positional parameters (i.e. non-option arguments);
  • for options with values (e.g. -o above), the value has to go as a separate argument (after a space).

Why use getopt instead of getopts? The basic reason is that only GNU getopt gives you support for long-named command-line options.1 (GNU getopt is the default on Linux. Mac OS X and FreeBSD come with a basic and not-very-useful getopt, but the GNU version can be installed; see below.)

For example, here's an example of using GNU getopt, from a script of mine called javawrap:

# NOTE: This requires GNU getopt.  On Mac OS X and FreeBSD, you have to install this
# separately; see below.
TEMP=`getopt -o vdm: --long verbose,debug,memory:,debugfile:,minheap:,maxheap: \
             -n 'javawrap' -- "$@"`

if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi

# Note the quotes around `$TEMP': they are essential!
eval set -- "$TEMP"

VERBOSE=false
DEBUG=false
MEMORY=
DEBUGFILE=
JAVA_MISC_OPT=
while true; do
  case "$1" in
    -v | --verbose ) VERBOSE=true; shift ;;
    -d | --debug ) DEBUG=true; shift ;;
    -m | --memory ) MEMORY="$2"; shift 2 ;;
    --debugfile ) DEBUGFILE="$2"; shift 2 ;;
    --minheap )
      JAVA_MISC_OPT="$JAVA_MISC_OPT -XX:MinHeapFreeRatio=$2"; shift 2 ;;
    --maxheap )
      JAVA_MISC_OPT="$JAVA_MISC_OPT -XX:MaxHeapFreeRatio=$2"; shift 2 ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

This lets you specify options like --verbose -dm4096 --minh=20 --maxhe 40 --debugfi="/Users/John Johnson/debug.txt" or similar. The effect of the call to getopt is to canonicalize the options to --verbose -d -m 4096 --minheap 20 --maxheap 40 --debugfile "/Users/John Johnson/debug.txt" so that you can more easily process them. The quoting around "$1" and "$2" is important as it ensures that arguments with spaces in them get handled properly.

If you delete the first 9 lines (everything up through the eval set line), the code will still work! However, your code will be much pickier in what sorts of options it accepts: In particular, you'll have to specify all options in the "canonical" form described above. With the use of getopt, however, you can group single-letter options, use shorter non-ambiguous forms of long-options, use either the --file foo.txt or --file=foo.txt style, use either the -m 4096 or -m4096 style, mix options and non-options in any order, etc. getopt also outputs an error message if unrecognized or ambiguous options are found.

NOTE: There are actually two totally different versions of getopt, basic getopt and GNU getopt, with different features and different calling conventions.2 Basic getopt is quite broken: Not only does it not handle long options, it also can't even handle embedded spaces inside of arguments or empty arguments, whereas getopts does do this right. The above code will not work in basic getopt. GNU getopt is installed by default on Linux, but on Mac OS X and FreeBSD it needs to be installed separately. On Mac OS X, install MacPorts (http://www.macports.org) and then do sudo port install getopt to install GNU getopt (usually into /opt/local/bin), and make sure that /opt/local/bin is in your shell path ahead of /usr/bin. On FreeBSD, install misc/getopt.

A quick guide to modifying the example code for your own program: Of the first few lines, all is "boilerplate" that should stay the same, except the line that calls getopt. You should change the program name after -n, specify short options after -o, and long options after --long. Put a colon after options that take a value.

Finally, if you see code that has just set instead of eval set, it was written for BSD getopt. You should change it to use the eval set style, which works fine with both versions of getopt, while the plain set doesn't work right with GNU getopt.

1Actually, getopts in ksh93 supports long-named options, but this shell isn't used as often as bash. In zsh, use zparseopts to get this functionality.

2Technically, "GNU getopt" is a misnomer; this version was actually written for Linux rather than the GNU project. However, it follows all the GNU conventions, and the term "GNU getopt" is commonly used (e.g. on FreeBSD).