A common antipattern in Python is to concatenate a sequence of strings using `+` in a loop. This is bad because the Python interpreter has to create a new string object on each iteration, so the whole loop ends up taking quadratic time. (Recent versions of CPython can apparently optimize this in some cases, but other implementations can't, so programmers are discouraged from relying on it.) `''.join` is the right way to do this.
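
To make the two approaches concrete, here is a minimal sketch. The function names and the sample list are made up for illustration; both functions return the same string and differ only in how they build it.

```python
def concat_with_plus(parts):
    """Build the result with repeated +=; each step may create a new string."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_with_join(parts):
    """Build the result in a single call to str.join."""
    return "".join(parts)


words = ["spam", "eggs", "ham"]  # illustrative data, not from the post
plus_version = concat_with_plus(words)
join_version = concat_with_join(words)
```
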
However, I've heard it said (including here on Stack Overflow) that you should never, ever use `+` for string concatenation, but instead always use `''.join` or a format string. I don't understand why this is the case if you're only concatenating two strings. If my understanding is correct, it shouldn't take quadratic time, and I think `a + b` is cleaner and more readable than either `''.join((a, b))` or `'%s%s' % (a, b)`.
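
For reference, the three spellings being compared look like this side by side. The values of `a` and `b` are placeholders chosen for illustration; all three expressions produce the same string.

```python
a = "foo"  # placeholder values for illustration
b = "bar"

plus_result = a + b               # plain concatenation
join_result = "".join((a, b))     # join over a two-element tuple
format_result = "%s%s" % (a, b)   # printf-style format string
```
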
Is it good practice to use `+` to concatenate two strings? Or is there a problem I'm not aware of?
There is nothing wrong with concatenating two strings with `+`. Indeed, it's easier to read than `''.join([a, b])`.
You are right, though, that concatenating more than two strings with `+` is an O(n^2) operation (compared to O(n) for `join`) and thus becomes inefficient. However, this has nothing to do with using a loop. Even `a + b + c + ...` is O(n^2), because each concatenation produces a new string and copies everything accumulated so far into it.
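
The quadratic growth can be seen with a simplified cost model that counts copied characters. This is only a sketch: it assumes every `+` builds a brand-new string with no in-place optimization, and that `join` copies each part exactly once. The function names are hypothetical.

```python
def chars_copied_by_plus(parts):
    """Characters copied by a + b + c + ... if every + builds a new string."""
    copied = 0
    length = 0
    for index, part in enumerate(parts):
        length += len(part)
        if index > 0:
            # Each concatenation copies the whole accumulated result.
            copied += length
    return copied


def chars_copied_by_join(parts):
    """Characters copied if join writes each part once into the result."""
    return sum(len(part) for part in parts)


two_strings = ["foo", "bar"]
many_strings = ["x"] * 1000
```

Under this model, two strings cost the same either way (6 characters copied), while 1000 one-character strings cost 500499 copies with `+` against 1000 with `join`.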
CPython 2.4 and above try to mitigate that, but it's still advisable to use `join` when concatenating more than two strings.