What is the difference between isinstance('aaa', basestring) and isinstance('aaa', str)?

zjm1126 · Dec 30, 2009
a = 'aaaa'
print isinstance(a, basestring)  # True
print isinstance(a, str)  # True

Answer

Tendayi Mawushe · Dec 30, 2009

In Python versions prior to 3.0 there are two kinds of strings: "plain strings" and "unicode strings". Plain strings (str) are sequences of bytes and cannot represent characters outside of the Latin alphabet (ignoring details of code pages for simplicity). Unicode strings (unicode) can represent characters from virtually any writing system.
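To make the byte-versus-character distinction concrete, here is a minimal sketch. The variable names `text` and `data` are only illustrative, and UTF-8 is an assumed encoding chosen for the example. The snippet is written so that it should run unchanged on Python 2 and on Python 3.3 or later.

```python
# A unicode string holding two Chinese characters, written with escapes
# so the example does not depend on the source file's encoding.
text = u"\u4f60\u597d"

# Encoding produces raw bytes: a plain str in Python 2, bytes in Python 3.
# UTF-8 is an assumption made for this example.
data = text.encode("utf-8")

print(len(text))  # number of characters
print(len(data))  # number of bytes after encoding

# Decoding the bytes with the same encoding restores the original text.
print(data.decode("utf-8") == text)
```

The character count and the byte count differ because each of these characters takes several bytes in UTF-8, which is why a byte-oriented plain string cannot be treated as a sequence of characters.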

So why have two kinds of strings? Would it not be better to have only Unicode, since that would cover all the cases? It would, but Python was created before Unicode became the preferred method for representing strings. Transitioning the string type in a language with many users takes time; in Python 3.0 it is finally the case that all strings are Unicode.

The inheritance hierarchy of Python strings pre-3.0 is:

          object
             |
             |
         basestring
            / \
           /   \
         str  unicode

basestring, introduced in Python 2.3, can be thought of as a step in the direction of string unification: it can be used to check whether an object is an instance of either str or unicode, whereas a check against str alone is False for unicode strings:

>>> string1 = "I am a plain string"
>>> string2 = u"I am a unicode string"
>>> isinstance(string1, str)
True
>>> isinstance(string2, str)
False
>>> isinstance(string1, unicode)
False
>>> isinstance(string2, unicode)
True
>>> isinstance(string1, basestring)
True
>>> isinstance(string2, basestring)
True