C++ std::string append vs push_back()

Memento Mori · Feb 26, 2013 · Viewed 37.2k times

This is really a question just for my own interest; I haven't been able to answer it from the documentation.

I see on http://www.cplusplus.com/reference/string/string/ that append has complexity:

"Unspecified, but generally up to linear in the new string length."

while push_back() has complexity:

"Unspecified; Generally amortized constant, but up to linear in the new string length."

As a toy example, suppose I wanted to append the characters "foo" to a string. Would

myString.push_back('f');
myString.push_back('o');
myString.push_back('o');

and

myString.append("foo");

amount to exactly the same thing? Or is there any difference? You might figure that append would be more efficient because it would know up front how much memory is required to extend the string by the specified number of characters, while push_back may need to secure memory on each call?

Answer

Billy ONeal · Feb 26, 2013

In C++03 (which most of cplusplus.com's documentation was written for), the complexities were unspecified because library implementers were allowed to use Copy-On-Write (COW) or "rope-style" internal representations for strings. For instance, a COW implementation might have to copy the entire string when a character is modified while the buffer is shared with another string.

In C++11, COW and rope implementations are banned. For appending at the end of a string, you should expect amortized constant time per character added, or equivalently amortized linear time in the number of characters added. Implementers may still do relatively crazy things with strings (in comparison to, say, std::vector), but most implementations are going to be limited to things like the "small string optimization".
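As a rough illustration of the amortized behaviour, the sketch below counts how often capacity() changes while characters are pushed one at a time. The function name count_capacity_changes is made up for this example, and the exact counts depend on the standard library's growth policy, so treat the numbers you see as implementation details rather than guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any library): appends n characters one at
// a time and counts how many times capacity() changes, i.e. how often the
// string had to switch to a larger buffer.
std::size_t count_capacity_changes(std::size_t n) {
    std::string s;
    std::size_t changes = 0;
    std::size_t last = s.capacity();  // usually non-zero thanks to the small string optimization
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    assert(s.size() == n);  // sanity check: every character was appended
    return changes;
}
```

On mainstream implementations, which grow the buffer geometrically, calling count_capacity_changes(100000) should return a few dozen at most, not anything close to 100000. That gap is what "amortized constant per character" means in practice.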

In comparing push_back and append, push_back deprives the underlying implementation of potentially useful length information which it might use to preallocate space. On the other hand, append with a C string requires that the implementation walk over the input twice: once to find the length (by scanning for the terminating null character) and once to copy the characters. So the performance gain or loss is going to depend on a number of unknowable factors, such as the length of the string before you attempt the append. That said, the difference is probably extremely Extremely EXTREMELY small. Go with append for this. It is far more readable.
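To make the comparison concrete, here is a minimal sketch of the variants discussed. The function names and the "bar" starting value are invented for the example. The third variant uses the append(const char*, size_type) overload, which takes the length from the caller and so has no need to scan for the terminator.

```cpp
#include <cassert>
#include <string>

// One character at a time: the string never learns the total length up front.
std::string with_push_back() {
    std::string s = "bar";
    s.push_back('f');
    s.push_back('o');
    s.push_back('o');
    return s;
}

// C-string overload: the length is found by scanning for '\0',
// then the characters are copied.
std::string with_append() {
    std::string s = "bar";
    s.append("foo");
    return s;
}

// Pointer-plus-count overload: the caller supplies the length,
// so no scan for the terminator is needed.
std::string with_append_known_length() {
    std::string s = "bar";
    s.append("foo", 3);
    return s;
}

// All three variants produce the same contents.
void check_all_agree() {
    assert(with_push_back() == with_append());
    assert(with_append() == with_append_known_length());
}
```

All three leave the string holding "barfoo". Any difference is in how much work is done to get there, and for three characters that difference is unlikely to be measurable.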