calling rsync from python subprocess.call

rdietrick · Jan 11, 2013

I'm trying to execute rsync over ssh from a subprocess in a python script to copy images from one server to another. I have a function defined as:

def rsyncBookContent(bookIds, serverEnv):
    bookPaths = ""
    if len(bookIds) > 1:
        bookPaths = "{" + ",".join(("book_"+str(x)) for x in bookIds) + "}"
    else:
        bookPaths = "book_" + str(bookIds[0])

    for host in serverEnv['content.hosts']:
        args = ["rsync", "-avz", "--include='*/'", "--include='*.jpg'", "--exclude='*'", "-e", "ssh", options.bookDestDir + "/" + bookPaths, "jill@" + host + ":/home/jill/web/public/static/"]
        print "executing " + ' '.join(args)
        subprocess.call(args)

What I'm ultimately trying to do is have Python execute this (which works from a bash shell):

rsync -avz --include='*/' --include='*.jpg' --exclude='*' -e ssh /shared/books/{book_482,book_347} jill@<host>:/home/jill/web/public/static/

And indeed my print statement outputs:

executing rsync -avz --include='*/' --include='*.jpg' --exclude='*' -e ssh /shared/books/{book_482,book_347} jill@<host>:/home/jill/web/public/static/

But when executed from within this python script, there are two problems:

  1. if len(bookIds) > 1, the list of sub-directories under /shared/books/ is somehow misinterpreted by bash or rsync. The error message is:
    • rsync: link_stat "/shared/books/{book_482,book_347}" failed: No such file or directory (2)
  2. if len(bookIds) == 1, all files under the source directory are rsynced (not just *.jpg, as is my intention)

Seems as if the subprocess.call function requires some characters to be escaped or something, no?

Answer

rdietrick · Jan 11, 2013

Figured out my issues. My problems were the result of my misunderstanding of how the subprocess.call function executes and bash's expansion of lists inside curly braces.

When I issued the rsync command in a bash shell with the subdirectories in curly braces, bash expanded that into multiple arguments before passing them to rsync (/shared/books/book_1, /shared/books/book_2, etc.). When I passed the same string with curly braces, "/shared/books/{book_1,book_2}", to subprocess.call, the expansion never happened because the command wasn't going through bash, so rsync received the literal argument "/shared/books/{book_1,book_2}".
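
To see the difference without touching a real server, here is a minimal sketch (Python 3.7+ assumed, unlike the Python 2 code in this thread). A tiny child process that prints its own arguments stands in for rsync; the `child_argv` helper and the book IDs are made up for illustration.

```python
import subprocess
import sys

# A tiny child program that prints the arguments it receives (stand-in for rsync).
ECHO_ARGV = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]


def child_argv(*args):
    """Run the child without a shell and return its printed argv as text."""
    return subprocess.check_output(ECHO_ARGV + list(args), text=True).strip()


# No shell is involved, so the braces arrive as one literal argument.
literal = child_argv("/shared/books/{book_1,book_2}")

# Doing the expansion in Python yields one argument per directory.
book_ids = [1, 2]
expanded = child_argv(*["/shared/books/book_%d" % b for b in book_ids])

print(literal)
print(expanded)
```

The first call hands the child a single argument with the braces intact, which is what rsync was complaining about; the second hands it one path per book.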

Similarly, the single quotes around the file patterns ('*', '*.jpg', etc.) work on the bash command line because bash strips them and passes only the values inside to rsync. In the argument list given to subprocess.call, the single quotes are part of the argument itself, so rsync sees the file pattern "'*.jpg'", quotes included, and it matches nothing.
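
The quoting difference can be checked with the standard library's shlex.split, which applies POSIX-shell-style quoting rules to a string (it handles quotes only, not brace expansion or globbing). This is just an illustrative sketch in Python 3; the variable names are mine.

```python
import shlex

command_line = "rsync -avz --include='*/' --include='*.jpg' --exclude='*' -e ssh"

# shlex.split removes the quotes the way a POSIX shell would
# before it starts rsync.
shell_view = shlex.split(command_line)

# The list from the question, which subprocess.call passes along unchanged.
question_args = ["rsync", "-avz", "--include='*/'", "--include='*.jpg'",
                 "--exclude='*'", "-e", "ssh"]

print(shell_view[3])      # what rsync receives when launched from bash
print(question_args[3])   # what rsync receives from the question's list
```

The two lists differ only in the quote characters, which is enough to make the include/exclude filters behave differently.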

New (working) code looks like this:

def rsyncBookContent(bookIds, serverEnv):
    bookPaths = []
    for b in bookIds:
        bookPaths.append(options.bookDestDir + "/book_" + str(b))
    args = []
    for host in serverEnv['content.hosts']:
        # copy all *.jpg files via ssh
        args = ["rsync", "-avz", "--include", "*/", "--include", "*.jpg", "--exclude", "*", "-e", "ssh"]
        args.extend(bookPaths)
        args.append("jill@" + host + ":/home/jill/web/public/static/")
        print "executing " + ' '.join(args)
        subprocess.call(args)
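
For anyone on Python 3, the same idea might look roughly like the sketch below. It is not a drop-in replacement: the function names are hypothetical, `book_dest_dir` and `hosts` are plain parameters standing in for `options.bookDestDir` and `serverEnv['content.hosts']`, and I swapped subprocess.call for subprocess.check_call so that a failing rsync raises an exception instead of being silently ignored.

```python
import subprocess


def build_rsync_args(book_ids, book_dest_dir, host):
    """Build the rsync argument list for one destination host (hypothetical helper)."""
    # One source path per book, since no shell will expand braces for us.
    sources = ["%s/book_%s" % (book_dest_dir, b) for b in book_ids]
    return (
        ["rsync", "-avz",
         "--include", "*/", "--include", "*.jpg", "--exclude", "*",
         "-e", "ssh"]
        + sources
        + ["jill@" + host + ":/home/jill/web/public/static/"]
    )


def rsync_book_content(book_ids, book_dest_dir, hosts):
    """Run rsync once per host; check_call raises CalledProcessError on a nonzero exit."""
    for host in hosts:
        args = build_rsync_args(book_ids, book_dest_dir, host)
        # Print the list itself so the real argument boundaries are visible.
        print("executing", args)
        subprocess.check_call(args)
```

Printing the list rather than a space-joined string is deliberate: the joined string in my original print statement looked identical to the working bash command, which hid the fact that the arguments themselves were different.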