Argparse: how to handle variable number of arguments (nargs='*')

rubik picture rubik · Nov 23, 2013 · Viewed 87k times · Source

I thought that nargs='*' was enough to handle a variable number of arguments. Apparently it's not, and I don't understand the cause of this error.

The code:

p = argparse.ArgumentParser()
p.add_argument('pos')
p.add_argument('foo')
p.add_argument('--spam', default=24, type=int, dest='spam')
p.add_argument('vars', nargs='*')

p.parse_args('1 2 --spam 8 8 9'.split())

I think the resulting namespace should be Namespace(pos='1', foo='2', spam='8', vars=['8', '9']). Instead, argparse gives this error:

usage: prog.py [-h] [--spam SPAM] pos foo [vars [vars ...]]
error: unrecognized arguments: 9 8

Basically, argparse doesn't know where to put those additional arguments... Why is that?

Answer

hpaulj picture hpaulj · Nov 23, 2013

The relevant Python bug is Issue 15112.

argparse: nargs='*' positional argument doesn't accept any items if preceded by an option and another positional

When argparse parses ['1', '2', '--spam', '8', '8', '9'] it first tries to match ['1','2'] with as many of the positional arguments as possible. With your arguments the pattern matching string is AAA*: 1 argument each for pos and foo, and zero arguments for vars (remember * means ZERO_OR_MORE).

['--spam','8'] are handled by your --spam argument. Since vars has already been set to [], there is nothing left to handle ['8','9'].

The programming change to argparse checks for the case where 0 argument strings is satisfying the pattern, but there are still optionals to be parsed. It then defers the handling of that * argument.

You might be able to get around this by first parsing the input with parse_known_args, and then handling the remainder with another call to parse_args.

To have complete freedom in interspersing optionals among positionals, in issue 14191, I propose using parse_known_args with just the optionals, followed by a parse_args that only knows about the positionals. The parse_intermixed_args function that I posted there could be implemented in an ArgumentParser subclass, without modifying the argparse.py code itself.


Here's a way of handling subparsers. I've taken the parse_known_intermixed_args function, simplified it for presentation sake, and then made it the parse_known_args function of a Parser subclass. I had to take an extra step to avoid recursion.

Finally I changed the _parser_class of the subparsers Action, so each subparser uses this alternative parse_known_args. An alternative would be to subclass _SubParsersAction, possibly modifying its __call__.

from argparse import ArgumentParser

def parse_known_intermixed_args(self, args=None, namespace=None):
    # self - argparse parser
    # simplified from http://bugs.python.org/file30204/test_intermixed.py
    parsefn = super(SubParser, self).parse_known_args # avoid recursion

    positionals = self._get_positional_actions()
    for action in positionals:
        # deactivate positionals
        action.save_nargs = action.nargs
        action.nargs = 0

    namespace, remaining_args = parsefn(args, namespace)
    for action in positionals:
        # remove the empty positional values from namespace
        if hasattr(namespace, action.dest):
            delattr(namespace, action.dest)
    for action in positionals:
        action.nargs = action.save_nargs
    # parse positionals
    namespace, extras = parsefn(remaining_args, namespace)
    return namespace, extras

class SubParser(ArgumentParser):
    parse_known_args = parse_known_intermixed_args

parser = ArgumentParser()
parser.add_argument('foo')
sp = parser.add_subparsers(dest='cmd')
sp._parser_class = SubParser # use different parser class for subparsers
spp1 = sp.add_parser('cmd1')
spp1.add_argument('-x')
spp1.add_argument('bar')
spp1.add_argument('vars',nargs='*')

print parser.parse_args('foo cmd1 bar -x one 8 9'.split())
# Namespace(bar='bar', cmd='cmd1', foo='foo', vars=['8', '9'], x='one')