String or StringBuilder return values?

John Bubriski · May 7, 2009

If I am building a string using a StringBuilder object in a method, would it make sense to:

Return the StringBuilder object, and let the calling code call ToString()?

return sb;

Or return the string by calling ToString() myself?

return sb.ToString();

I guess it makes a difference whether we're returning small or large strings. What would be appropriate in each case? Thanks in advance.

Edit: I don't plan on further modifying the string in the calling code, but good point Colin Burnett.

Mainly, is it more efficient to return the StringBuilder object, or the string? Would a reference to the string get returned, or a copy?

Answer

Colin Burnett · May 7, 2009

Return the StringBuilder if you're going to further modify the string, otherwise return the string. This is an API question.

Regarding efficiency: since this is a general question without any specifics, I think mutable vs. immutable matters more than performance. Mutability is an API issue: do you want your API to return modifiable objects? String length is irrelevant to that decision.

That said, if you look at StringBuilder.ToString with Reflector:
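To make the API point concrete, here is a minimal sketch of the two shapes a method could take. The class and method names (GreetingApi, BuildGreeting, StartGreeting) are hypothetical and exist only for illustration; they are not from the question.

```csharp
using System;
using System.Text;

// Hypothetical class, for illustration only.
public static class GreetingApi
{
    // Immutable result: the caller receives a finished string.
    public static string BuildGreeting(string name)
    {
        var sb = new StringBuilder();
        sb.Append("Hello, ").Append(name).Append('!');
        return sb.ToString();
    }

    // Mutable result: the caller is expected to keep appending.
    public static StringBuilder StartGreeting(string name)
    {
        var sb = new StringBuilder();
        sb.Append("Hello, ").Append(name);
        return sb;
    }

    public static void Main()
    {
        string done = GreetingApi.BuildGreeting("World");
        Console.WriteLine(done); // Hello, World!

        StringBuilder partial = GreetingApi.StartGreeting("World");
        partial.Append(" and friends!");
        Console.WriteLine(partial.ToString()); // Hello, World and friends!
    }
}
```

The first signature tells callers the text is final; the second invites further modification. Which one fits depends on what the caller is supposed to do with the result, not on how long the string is.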

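public override string ToString()
{
    string stringValue = this.m_StringValue;
    // The builder is not owned by the calling thread (another thread used it,
    // or the string was already handed out): return a copy.
    if (this.m_currentThread != Thread.InternalGetCurrentThread())
    {
        return string.InternalCopy(stringValue);
    }
    // Less than half of the allocated buffer is in use: return a copy,
    // presumably so the oversized buffer is not exposed as the result.
    if ((2 * stringValue.Length) < stringValue.ArrayLength)
    {
        return string.InternalCopy(stringValue);
    }
    // Otherwise hand out the internal string itself and clear the owner,
    // so that the next modification has to copy first.
    stringValue.ClearPostNullChar();
    this.m_currentThread = IntPtr.Zero;
    return stringValue;
}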

You can see that ToString may make a copy. If it does not, and you later modify the StringBuilder, the copy is made at that point instead. As far as I can tell, that is the purpose of m_currentThread: Append checks it and copies the string if it does not match the current thread.

The upshot is that if you do not modify the StringBuilder after calling ToString, the string is not copied, and length is irrelevant to efficiency (unless you hit that second if, where the string fills less than half of the allocated buffer).

UPDATE

System.String is a class, which means it is a reference type (as opposed to a value type), so "string foo;" is essentially a pointer. (When you pass a string into a method, the pointer is passed, not a copy of the characters.) System.String is mutable inside mscorlib but immutable outside of it, which is how StringBuilder can manipulate a string.
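A small check of the "reference, not copy" claim, as a sketch: the class ReferenceDemo and the helper PassThrough are hypothetical names introduced here. Object.ReferenceEquals reports whether two variables refer to the same object.

```csharp
using System;

// Hypothetical class, for illustration only.
public static class ReferenceDemo
{
    // Returns the very reference it was given; nothing is copied.
    public static string PassThrough(string s)
    {
        return s;
    }

    public static void Main()
    {
        string original = new string('x', 1000);
        string returned = PassThrough(original);

        // Both variables refer to the same string object, regardless of its length.
        Console.WriteLine(object.ReferenceEquals(original, returned)); // True
    }
}
```

Passing or returning a string therefore costs the same whether it holds ten characters or ten thousand.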

So when ToString() is called, it returns its internal string object by reference. At that point you cannot modify the string because your code is not in mscorlib. Because ToString() sets the m_currentThread field to zero, any further operation on the StringBuilder causes it to copy the string object first, so it modifies the copy and not the string object it returned from ToString(). Consider this:

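StringBuilder sb = new StringBuilder();
sb.Append("Hello ");

string foo = sb.ToString();   // foo is "Hello "

sb.Append("World");           // the builder works on its own copy from here on

string bar = sb.ToString();   // bar is "Hello World"; foo is still "Hello "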

If StringBuilder did not make a copy, then at the end foo would be "Hello World" because the StringBuilder would have modified it in place. Since it does make a copy, foo is still just "Hello " and bar is "Hello World".

Does that clarify the whole return/reference thing?