I am not sure why strings and tuples were made immutable; what are the advantages and disadvantages of making them immutable?
Imagine a language called FakeMutablePython, where you can alter strings using list-style item assignment (such as mystr[0] = 'a').
a = "abc"
That creates an entry in memory at address 0x1 containing "abc", with the identifier a pointing to it.
Now, say you do:
b = a
This creates the identifier b and points it at the same memory address, 0x1.
Now, if the string were mutable, and you changed it through b:
b[0] = 'z'
This alters the first byte of the string stored at 0x1 to z. Since the identifier a points to the same address, the string it refers to would be altered as well, so:
print a
print b
...would both output zbc.
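You can see the same aliasing in real Python with a type that actually is mutable. This is only a sketch: bytearray stands in for the hypothetical mutable string, and the names a and b mirror the ones above.

```python
# bytearray is a real mutable byte-string type, so it shows the aliasing
# that a FakeMutablePython string would suffer from.
a = bytearray(b"abc")
b = a              # b refers to the same object as a; nothing is copied
b[0] = ord("z")    # mutate the shared object in place

print(a)           # bytearray(b'zbc')
print(b)           # bytearray(b'zbc')
print(a is b)      # True
```

Changing the object through b is visible through a, because both names refer to one object.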
This could make for some really weird, unexpected behaviour. Dictionary keys would be a good example of this:
mykey = 'abc'
mydict = {
    mykey: 123,
    'zbc': 321
}
anotherstring = mykey
anotherstring[0] = 'z'
Now in FakeMutablePython, things become rather odd: you initially have two keys in the dictionary, "abc" and "zbc". Then you alter the "abc" string (via the identifier anotherstring) to "zbc", so the dict has two keys, "zbc" and "zbc".
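Real Python avoids this for its built-in mutable containers by making them unhashable, but the problem can be reproduced with a user-defined class. A minimal sketch, assuming a hypothetical MutableKey class (it is not part of Python) that hashes by its current contents:

```python
class MutableKey:
    """Hypothetical mutable string-like key, hashed by its current text."""

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.text == other.text


key = MutableKey("abc")
table = {key: 123}

key.text = "zbc"  # mutate the key after it has been stored in the dict

# The entry was filed under the hash of "abc" but now compares equal to "zbc",
# so neither spelling finds it by lookup.
found_new = MutableKey("zbc") in table
found_old = MutableKey("abc") in table
print(found_new, found_old, len(table))  # False False 1
```

The entry is still in the dict, but it can no longer be looked up by value. That is the kind of breakage immutable keys rule out.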
One solution to this weirdness would be to copy the string at 0x1 to 0x2 whenever you assign it to an identifier (or use it as a dict key).
This prevents the above, but what if you have a string that requires 200MB of memory?
a = "really, really long string [...]"
b = a
Suddenly your script takes up 400MB of memory? This isn't very good.
What if we point both identifiers at the same memory address until one of them is modified? That is copy-on-write. The problem is that it can be quite complicated to implement.
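To give a feel for the bookkeeping involved, here is a rough sketch of copy-on-write. CowString and its share() method are hypothetical names invented for this illustration; this is not how any real Python type works.

```python
class CowString:
    """Hypothetical copy-on-write string: shares a buffer until first write."""

    def __init__(self, text):
        self._chars = list(text)
        self._owns_buffer = True   # True while no other handle shares _chars

    def share(self):
        """Return a second handle on the same buffer without copying it."""
        other = CowString.__new__(CowString)
        other._chars = self._chars
        other._owns_buffer = False
        self._owns_buffer = False  # from now on, neither side may write in place
        return other

    def __setitem__(self, index, char):
        if not self._owns_buffer:
            # First write through this handle: take a private copy.
            # Without reference counting, the last remaining handle also
            # copies once, even though nobody else uses the buffer any more.
            self._chars = list(self._chars)
            self._owns_buffer = True
        self._chars[index] = char

    def __str__(self):
        return "".join(self._chars)


a = CowString("abc")
b = a.share()      # cheap: both handles use one buffer
b[0] = "z"         # b copies the buffer, then modifies its own copy

print(a)  # abc
print(b)  # zbc
```

Even this toy version needs an ownership flag on every handle and still makes a redundant copy in some cases. Doing it properly means tracking how many handles share each buffer.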
This is where immutability comes in. Instead of requiring the .replace() method to copy the string to a new address, then modify it and return it, we just make all strings immutable, so the method must create a new string to return. This explains the following code:
a = "abc"
b = a.replace("a", "z")
This is demonstrated by:
>>> a = 'abc'
>>> b = a
>>> id(a) == id(b)
True
>>> b = b.replace("a", "z")
>>> id(a) == id(b)
False
(In CPython, the id() function returns the memory address of the object.)
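For completeness, here is what real Python does if you try the FakeMutablePython-style assignment on an actual string. The variable name rejected is just for this illustration.

```python
a = "abc"

try:
    a[0] = "z"          # str does not support item assignment
except TypeError:
    rejected = True
else:
    rejected = False

# To "change" a string you build a new one and rebind the name.
b = "z" + a[1:]

print(rejected)  # True
print(a)         # abc
print(b)         # zbc
```

The original string is untouched, so any other identifier or dict key that refers to it keeps working.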