I came across a strange problem with Python's isdigit method.
For example:
>>> a = u'\u2466'
>>> a.isdigit()
Out[1]: True
>>> a.isnumeric()
Out[2]: True
Why is this character a digit?
Is there any way to make this return False instead? Thanks.
Edit: If I don't want to treat it as a digit, how can I filter it out?
For example, when I try to convert it to an int:
>>> int(u'\u2466')
This raises a UnicodeEncodeError.
U+2466 is the CIRCLED DIGIT SEVEN (⑦), so yes, it's a digit.
If your definition of what is a digit differs from that of the Unicode Consortium, you might have to write your own isdigit() method.
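As a minimal sketch of such a replacement, assuming "digit" should mean only the ASCII characters 0 through 9 (the name is_ascii_digit is made up for this illustration, it is not a built-in):

```python
def is_ascii_digit(text):
    """Return True only if text is non-empty and every character is 0-9.

    Hypothetical helper: it mirrors the all-characters, non-empty
    behaviour of isdigit(), but restricted to ASCII digits.
    """
    return len(text) > 0 and all('0' <= ch <= '9' for ch in text)


print(is_ascii_digit(u'12434'))   # True
print(is_ascii_digit(u'\u2466'))  # False
print(is_ascii_digit(u''))        # False
```

Like the built-in, this returns False for an empty string; drop the length check if you would rather treat the empty string as valid.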
Edit: If I don't want to treat it as a digit, how can I filter it out?
If you are only interested in the ASCII digits 0...9, you could do something like:
In [4]: s = u'abc 12434 \u2466 5 def'
In [5]: u''.join(c for c in s if '0' <= c <= '9')
Out[5]: u'124345'
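If it helps to see why isdigit() accepts the character while int() rejects it, the standard unicodedata module exposes the relevant properties. This is only an illustrative sketch: U+2466 has a digit value but no decimal value, and int() only accepts characters that have a decimal value. The isdecimal() method tests for exactly that, so it may be a closer match than isdigit() when the goal is to check what int() can parse.

```python
import unicodedata

ch = u'\u2466'

# The official Unicode name of the character.
print(unicodedata.name(ch))           # CIRCLED DIGIT SEVEN

# It carries a digit value, which is why isdigit() is True...
print(unicodedata.digit(ch, None))    # 7

# ...but no decimal value, so int() cannot parse it.
print(unicodedata.decimal(ch, None))  # None

print(ch.isdigit())                   # True
print(ch.isdecimal())                 # False
```

Note that isdecimal() still returns True for decimal digits from other scripts (for example Arabic-Indic digits), so the explicit '0' <= c <= '9' comparison remains the stricter filter.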