Efficient string concatenation in C++

sneg · Mar 4, 2009 · Viewed 95k times

I heard a few people expressing worries about "+" operator in std::string and various workarounds to speed up concatenation. Are any of these really necessary? If so, what is the best way to concatenate strings in C++?

Answer

Brian R. Bondy · Mar 4, 2009

The extra work is probably not worth it unless you really need the efficiency. You will probably get most of the benefit simply by using operator += instead of operator +.
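As a minimal sketch of the operator += approach, the function below appends every piece to one string in place. The name join_words and its parameters are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins words with a separator by appending in place.
std::string join_words(const std::vector<std::string>& words,
                       const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) {
            out += sep;      // appends to the existing string, no new object
        }
        out += words[i];     // same here: out grows in place
    }
    // Sanity check: an empty input must yield an empty result.
    assert(!words.empty() || out.empty());
    return out;
}
```

Calling join_words({"a", "b", "c"}, ", ") builds the result in a single std::string, which only reallocates when its capacity runs out.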

Now after that disclaimer, I will answer your actual question...

The efficiency of std::string depends on the standard library implementation you are using.

You could get predictable efficiency and greater control by doing the concatenation manually with C library functions such as strcpy and memcpy.

Why operator+ is not efficient:

Take a look at this interface:

template <class charT, class traits, class Alloc>
basic_string<charT, traits, Alloc>
operator+(const basic_string<charT, traits, Alloc>& s1,
          const basic_string<charT, traits, Alloc>& s2);

You can see that a new object is returned after each +, which means that a new buffer is used each time. If you chain many + operations, this is not efficient.
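To make the difference concrete, here is a small sketch with two hypothetical functions that produce the same result. The first relies on chained +, where each + returns a new string object; the second appends into a single string.

```cpp
#include <cassert>
#include <string>

// Chained +: (a + b) yields a temporary string, then (temporary + c)
// yields the result, so intermediate objects are created along the way.
std::string build_with_plus(const std::string& a,
                            const std::string& b,
                            const std::string& c) {
    return a + b + c;
}

// In-place appends: one string object grows as pieces are added.
std::string build_in_place(const std::string& a,
                           const std::string& b,
                           const std::string& c) {
    std::string out;
    out += a;
    out += b;
    out += c;
    assert(out.size() == a.size() + b.size() + c.size());
    return out;
}
```

Since C++11 the standard library also provides overloads of operator+ that take rvalue strings and may reuse a temporary's buffer, so the real cost of the first form depends on your compiler and library.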

Why you can make it more efficient:

  • You are guaranteeing efficiency yourself instead of trusting the library implementation to do it efficiently for you.
  • The std::string class knows nothing about the maximum size of your string, nor how often you will be concatenating to it. You may have this knowledge and can act on it, which leads to fewer reallocations.
  • You will be controlling the buffers manually, so you can be sure that you won't copy the whole string into a new buffer when you don't want that to happen.
  • You can use the stack for your buffers instead of the heap, which avoids the cost of a heap allocation.
  • The string + operator creates a new string object and returns it, hence using a new buffer.
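If you want to act on a known final size without leaving std::string, std::string::reserve is one way to do it. The sketch below uses a hypothetical helper named repeat and assumes the final length can be computed up front.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: repeats `piece` `count` times, reserving the
// full length once so the appends should not need to reallocate.
std::string repeat(const std::string& piece, std::size_t count) {
    std::string out;
    out.reserve(piece.size() * count);   // final length is known in advance
    for (std::size_t i = 0; i < count; ++i) {
        out += piece;
    }
    assert(out.size() == piece.size() * count);
    return out;
}
```

For example, repeat("ab", 3) reserves room for six characters before the loop starts and then fills it.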

Considerations for implementation:

  • Keep track of the string length.
  • Keep a pointer to both the start and the end of the string, or keep just the start and use start + length as the offset of the end.
  • Make sure the buffer you are storing your string in is big enough that you don't need to reallocate.
  • Use strcpy to the tracked end of the string instead of strcat, so you don't need to scan the whole string to find its end on every append.
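The considerations above can be sketched as a small fixed-capacity buffer that lives wherever you declare it, including on the stack. FixedBuffer and its members are hypothetical names. The sketch uses std::memcpy rather than strcpy because the length of the new piece is already computed for the bounds check; strcpy to the same end position would work as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical fixed-capacity buffer. Capacity includes the trailing '\0'.
template <std::size_t Capacity>
struct FixedBuffer {
    static_assert(Capacity > 0, "need room for the terminator");

    char data[Capacity];
    std::size_t length = 0;   // tracked so the end is never rescanned

    FixedBuffer() { data[0] = '\0'; }

    // Appends `s` at data + length. Returns false and leaves the buffer
    // unchanged if the piece plus its terminator does not fit.
    bool append(const char* s) {
        const std::size_t n = std::strlen(s);
        if (n >= Capacity - length) {
            return false;
        }
        std::memcpy(data + length, s, n + 1);   // copy including '\0'
        length += n;
        assert(data[length] == '\0');
        return true;
    }
};
```

Declaring FixedBuffer<64> buf; as a local variable puts the storage on the stack, and each buf.append(...) copies only the new piece.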

Rope data structure:

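If you need really fast concatenations, consider using a rope data structure, which stores a string as a tree of smaller pieces so that concatenating two ropes does not copy their characters.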
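To give a rough idea of the technique, here is a deliberately minimal rope sketch. All names (RopeNode, Rope, make_leaf, concat, flatten) are hypothetical, and the sketch omits the balancing, indexing, and splitting that a real rope implementation would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal rope: a leaf holds text, an internal node
// holds two children and no text of its own.
struct RopeNode {
    std::string text;                       // used only by leaves
    std::shared_ptr<const RopeNode> left;   // null for leaves
    std::shared_ptr<const RopeNode> right;  // null for leaves
    std::size_t length = 0;                 // characters below this node
};

using Rope = std::shared_ptr<const RopeNode>;

Rope make_leaf(std::string s) {
    auto node = std::make_shared<RopeNode>();
    node->length = s.size();
    node->text = std::move(s);
    return node;
}

// Concatenation allocates one node and copies no characters.
Rope concat(const Rope& a, const Rope& b) {
    assert(a && b);
    auto node = std::make_shared<RopeNode>();
    node->left = a;
    node->right = b;
    node->length = a->length + b->length;
    return node;
}

// Walks the tree left to right, appending each leaf's text to `out`.
void flatten_into(const Rope& r, std::string& out) {
    if (!r->left) {          // leaf
        out += r->text;
        return;
    }
    flatten_into(r->left, out);
    flatten_into(r->right, out);
}

// Produces an ordinary std::string, reserving the full length once.
std::string flatten(const Rope& r) {
    std::string out;
    out.reserve(r->length);
    flatten_into(r, out);
    return out;
}
```

With this sketch, concat(make_leaf("foo"), make_leaf("bar")) is cheap no matter how long the pieces are; the characters are copied only once, when flatten is called.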