How to remove bad path characters in Python?

Martin · Jun 23, 2009 · Viewed 28.4k times

What is the most cross-platform way to remove bad path characters (e.g. "\" or ":" on Windows) in Python?


Because there seems to be no ideal solution, I decided to be relatively restrictive and used the following code:

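def remove(value, deletechars):
    # Delete every character listed in deletechars from value.
    for c in deletechars:
        value = value.replace(c, '')
    return value

filename = 'some\\*-file?.txt'  # example input, for illustration only
print(remove(filename, '\\/:*?"<>|'))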


Josh · Nov 27, 2012

I think the safest approach here is to replace (or get rid of) any suspicious character, meaning anything that isn't alphanumeric, '-', '_', a space, or a period. Here's how you do that:

import re
# Raw string, so the backslashes reach the regex engine unchanged.
re.sub(r'[^\w\-_\. ]', '_', filename)

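The above replaces every character that's not a letter, digit, '_', '-', '.' or space with an '_'. So, if you're looking at an entire path rather than a single file name, you'll want to add os.sep to the list of approved characters as well (passed through re.escape, since os.sep is a backslash on Windows).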

Here's some sample output:

In [27]: re.sub(r'[^\w\-_\. ]', '_', 'some\\*-file._n\\\\ame')
Out[27]: 'some__-file._n__ame'
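
For the whole-path case mentioned above, here is a minimal sketch of one way to add the separator to the approved characters. The function name sanitize_path and its sep and replacement parameters are made up for illustration and are not part of the original answer.

```python
import os
import re

def sanitize_path(path, sep=os.sep, replacement='_'):
    """Replace every character outside the allow-list, leaving sep intact.

    sanitize_path is a hypothetical helper; sep defaults to os.sep.
    """
    # re.escape matters on Windows, where os.sep is a backslash and would
    # otherwise be read as an escape inside the character class.
    pattern = r'[^\w\-. ' + re.escape(sep) + ']'
    return re.sub(pattern, replacement, path)

# The separator is passed explicitly so the result is the same on any platform.
print(sanitize_path('dir/sub:dir/fi*le?.txt', sep='/'))
```

Note that ':' is still not on the approved list, so a Windows drive prefix such as C: would have its colon replaced too. If you need to keep the drive, split it off before sanitizing.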