I want to implement a copy-on-write on my custom C++ String class, and I wonder how to...
I tried to implement some options, but they all turned out very inefficient.
Thank you guys :-)
In a multi-threaded environemnt (which is most of them nowadays) CoW is frequently a huge performance hit rather than a gain. And with careful use of const references, it's not much of a performance gain even in a single threaded environment.
This old DDJ article explains just how bad CoW can be in a multithreaded environment, even if there's only one thread.
Additionally, as other people have pointed out, CoW strings are really tricky to implement, and it's easy to make mistakes. That coupled with their poor performance in threading situations makes me really question their usefulness in general. This becomes even more true once you start using C++11 move construction and move assignment.
But, to answer your question....
Here are a couple of implementation techniques that may help with performance.
First, store the length in the string itself. The length is accessed quite frequently and eliminating the pointer dereference would probably help. I would, just for consistency put the allocated length there too. This will cost you in terms of your string objects being a bit bigger, but the overhead there in space and copying time is very small, especially since these values will then become easier for the compiler to play interesting optimization tricks with.
This leaves you with a string class that looks like this:
class MyString {
...
private:
class Buf {
...
private:
::std::size_t refct_;
char *data_;
};
::std::size_t len_;
::std::size_t alloclen_;
Buf *data_;
};
Now, there are further optimizations you can perform. The Buf class there looks like it doesn't really contain or do much, and this is true. Additionally, it requires allocating both an instance of Buf and a buffer to hold the characters. This seems rather wasteful. So, we'll turn to a common C implementation technique, stretchy buffers:
class MyString {
...
private:
struct Buf {
::std::size_t refct_;
char data_[1];
};
void resizeBufTo(::std::size_t newsize);
void dereferenceBuf();
::std::size_t len_;
::std::size_t alloclen_;
Buf *data_;
};
void MyString::resizeBufTo(::std::size_t newsize)
{
assert((data_ == 0) || (data_->refct_ == 1));
if (newsize != 0) {
// Yes, I'm using C's allocation functions on purpose.
// C++'s new is a poor match for stretchy buffers.
Buf *newbuf = ::std::realloc(data_, sizeof(*newbuf) + (newsize - 1));
if (newbuf == 0) {
throw ::std::bad_alloc();
} else {
data_ = newbuf_;
}
} else { // newsize is 0
if (data_ != 0) {
::std::free(data_);
data_ = 0;
}
}
alloclen_ = newsize;
}
When you do things this way, you can then treat data_->data_
as if it contained alloclen_
bytes instead of just 1.
Keep in mind that in all of these cases you will have to make sure that you either never ever use this in a multi-threaded environment, or that you make sure that refct_
is a type that you have both an atomic increment, and an atomic decrement and test instruction for.
There is an even more advanced optimization technique that involves using a union to store short strings right inside the bits of data that you would use to describe a longer string. But that's even more complex, and I don't think I will feel inclined to edit this to put a simplified example here later, but you never can tell.