I need to keep only the strings that consist entirely of digits and/or a fixed set of punctuation.
I've tried checking each character and summing the Boolean results to see whether the total equals len(str). Is there a more Pythonic way to do this?
>>> import string
>>> x = ['12,523', '3.46', "this is not", "foo bar 42", "23fa"]
>>> [i for i in x if [True if j.isdigit() else False for j in i] ]
['12,523', '3.46', 'this is not', 'foo bar 42', '23fa']
>>> [i for i in x if sum([True if j.isdigit() or j in string.punctuation else False for j in i]) == len(i)]
['12,523', '3.46']
Using all() with a generator expression, you don't need to count the matches and compare the count against the length:
>>> [i for i in x if all(j.isdigit() or j in string.punctuation for j in i)]
['12,523', '3.46']
BTW, both the code above and the OP's code will also accept strings that contain only punctuation:
>>> x = [',,,', '...', '123', 'not number']
>>> [i for i in x if all(j.isdigit() or j in string.punctuation for j in i)]
[',,,', '...', '123']
To handle that, add a second condition requiring at least one digit:
>>> [i for i in x if all(j.isdigit() or j in string.punctuation for j in i) and any(j.isdigit() for j in i)]
['123']
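If the combined all()/any() condition gets hard to read inline, it can be pulled into a small function that walks each string once. This is only a sketch: the name has_only_digits_and_punct and the allowed parameter are made up for illustration and are not part of the original answer.

```python
import string

def has_only_digits_and_punct(s, allowed=string.punctuation):
    """Hypothetical helper: True if s has at least one digit and every
    character is either a digit or one of the allowed punctuation characters."""
    seen_digit = False
    for ch in s:
        if ch.isdigit():
            seen_digit = True
        elif ch not in allowed:
            # Any other character disqualifies the string immediately.
            return False
    # Strings with no digit at all (including '') are rejected.
    return seen_digit

samples = ['12,523', '3.46', ',,,', '...', '123', 'not number', '']
kept = [s for s in samples if has_only_digits_and_punct(s)]
print(kept)  # ['12,523', '3.46', '123']
```

Passing a narrower allowed value, for example allowed=',.', restricts the check to just those punctuation characters.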
You can make it a little faster by storing the characters of string.punctuation in a set, so that each membership test is a hash lookup instead of a scan through the string:
>>> puncs = set(string.punctuation)
>>> [i for i in x if all(j.isdigit() or j in puncs for j in i) and any(j.isdigit() for j in i)]
['123']
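As an alternative that is not part of the answer above, the same filter can be sketched with a regular expression. The variable names here are illustrative. Note that \d and str.isdigit() do not agree on every Unicode character, so the two approaches can differ on non-ASCII input.

```python
import re
import string

# Escape the punctuation so characters such as ']' , '^' and '-' are
# treated literally inside the character class.
punct_class = re.escape(string.punctuation)

# (?=.*\d) requires at least one digit somewhere in the string;
# the character class then allows only digits and punctuation.
pattern = re.compile(r'(?=.*\d)[\d' + punct_class + r']+')

samples = ['12,523', '3.46', 'foo bar 42', '23fa', ',,,', '...', '123']
matches = [s for s in samples if pattern.fullmatch(s)]
print(matches)  # ['12,523', '3.46', '123']
```

fullmatch() is used so that the whole string has to match, not just a prefix.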