A dynamic buffer type in C++?

Vilx- · Dec 9, 2009 · Viewed 38k times

I'm not exactly a C++ newbie, but I have had few serious dealings with it in the past, so my knowledge of its facilities is rather sketchy.

I'm writing a quick proof-of-concept program in C++ and I need a dynamically sizeable buffer of binary data. That is, I'm going to receive data from a network socket and I don't know how much there will be (although not more than a few MB). I could write such a buffer myself, but why bother if the standard library probably has something already? I'm using VS2008, so some Microsoft-specific extension is just fine by me. I only need four operations:

  • Create the buffer
  • Write data to the buffer (binary junk, not zero-terminated)
  • Get the written data as a char array (together with its length)
  • Free the buffer

What is the name of the class/function set/whatever that I need?

Added: Several votes go to std::vector. All nice and fine, but I don't want to push several MB of data byte-by-byte. The socket will give me data in chunks of a few KB, so I'd like to write each chunk all at once. Also, at the end I will need to get the data as a simple char*, because I will need to pass the whole blob along to some Win32 API functions unmodified.

Answer

GManNickG · Dec 9, 2009

You want a std::vector:

std::vector<char> myData;

vector will automatically allocate and deallocate its memory for you. Use push_back to add new data (vector will resize for you if required), and the indexing operator [] to retrieve data.

If at any point you can guess how much memory you'll need, I suggest calling reserve so that subsequent push_back calls won't have to reallocate as often.
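To make the effect of reserve concrete, here is a minimal sketch that counts how often the vector's capacity changes while bytes are appended one at a time. The function name count_reallocations and its reserveFirst parameter are made up for illustration; how many reallocations happen without reserve depends on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends `count` bytes one at a time and returns how many times the
// vector's capacity changed, i.e. how many reallocations took place.
// `reserveFirst` (illustrative parameter) toggles the up-front reserve().
std::size_t count_reallocations(std::size_t count, bool reserveFirst) {
    std::vector<char> data;
    if (reserveFirst) {
        data.reserve(count);  // one allocation, large enough for everything
    }

    std::size_t reallocations = 0;
    std::size_t lastCapacity = data.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        data.push_back('x');
        if (data.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = data.capacity();
        }
    }
    assert(data.size() == count);
    return reallocations;
}
```

Calling count_reallocations(100000, true) should report no reallocations after the initial reserve, while passing false reports however many times the implementation had to grow the buffer.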

If you want to read in a chunk of memory and append it to your buffer, easiest would probably be something like:

std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;
    char rawBuffer[BufferSize];

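    // Signed on purpose (assuming get_network_data returns a signed count):
    // an unsigned value can never be negative, so an error such as -1
    // would be mistaken for a huge byte count.
    const int bytesRead = get_network_data(rawBuffer, sizeof(rawBuffer));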
    if (bytesRead <= 0) {
        break;
    }

    myData.insert(myData.end(), rawBuffer, rawBuffer + bytesRead);
}

myData now holds all the data, read chunk by chunk. However, every byte is copied twice: once into rawBuffer and again into the vector.
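For anyone who wants to try the chunked append without a real socket, the following is a rough, self-contained sketch. FakeSource is a hypothetical stand-in for get_network_data that hands out an in-memory array in pieces; only the insert call reflects the technique described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket: hands out `source` in pieces.
// Returns the number of bytes copied, or 0 once everything has been read.
struct FakeSource {
    const char* source;
    std::size_t remaining;

    int read(char* out, int capacity) {
        assert(capacity > 0);
        const std::size_t n =
            std::min(remaining, static_cast<std::size_t>(capacity));
        std::memcpy(out, source, n);
        source += n;
        remaining -= n;
        return static_cast<int>(n);
    }
};

// Reads everything from `src` through a fixed-size stack buffer.
std::vector<char> read_all(FakeSource& src) {
    std::vector<char> data;
    char chunk[1024];
    for (;;) {
        const int bytesRead = src.read(chunk, static_cast<int>(sizeof(chunk)));
        if (bytesRead <= 0) {
            break;  // end of data (or an error, for a real socket)
        }
        // Range insert: appends the whole chunk in a single call.
        data.insert(data.end(), chunk, chunk + bytesRead);
    }
    return data;
}
```

The range form of insert appends a whole chunk per call, so nothing is pushed byte by byte from the caller's point of view.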

To avoid the extra copy, we can instead try something like this:

std::vector<char> myData;
for (;;) {
    const int BufferSize = 1024;

    const size_t oldSize = myData.size();
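    myData.resize(oldSize + BufferSize);

    // Signed (assuming get_network_data returns a signed count), so that an
    // error return such as -1 is not mistaken for a huge byte count.
    const int bytesRead = get_network_data(&myData[oldSize], BufferSize);
    if (bytesRead <= 0) {
        myData.resize(oldSize);  // drop the unused space and stop
        break;
    }
    myData.resize(oldSize + bytesRead);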
}

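This reads directly into the vector and avoids the intermediate copy, at the cost of occasionally over-allocating (and of resize zero-filling the new bytes before they are overwritten).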

This can be made smarter by e.g. doubling the vector size for each resize to amortize resizes, as the first solution does implicitly. And of course, you can reserve() a much larger buffer up front if you have a priori knowledge of the probable size of the final buffer, to minimize resizes.

Both are left as an exercise for the reader. :)
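One possible take on the doubling variant is sketched below; treat it as an illustration rather than the intended solution. The name read_all_in_place, the initialSize parameter, and the readSome callable (which is assumed to return the number of bytes written, or a value <= 0 when there is nothing more) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reads until `readSome` reports end of data, writing straight into the
// vector's storage. `readSome(char* out, int capacity)` is a hypothetical
// callable that returns the number of bytes written, or <= 0 when done.
template <typename ReadFn>
std::vector<char> read_all_in_place(ReadFn readSome,
                                    std::size_t initialSize = 4096) {
    assert(initialSize > 0);
    std::vector<char> data(initialSize);  // zero-filled scratch space
    std::size_t used = 0;                 // bytes actually received so far

    for (;;) {
        if (used == data.size()) {
            data.resize(data.size() * 2);  // grow geometrically when full
        }
        const int bytesRead =
            readSome(&data[used], static_cast<int>(data.size() - used));
        if (bytesRead <= 0) {
            break;
        }
        used += static_cast<std::size_t>(bytesRead);
    }

    data.resize(used);  // trim the unused tail
    return data;
}
```

Tracking the used byte count separately from size() means the vector is only resized when it is actually full, plus once at the end to trim it. If the final size is roughly known in advance, passing it as initialSize plays the role of the up-front reserve.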

Finally, if you need to treat your data as a raw array:

some_c_function(myData.data(), myData.size());

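std::vector is guaranteed to be contiguous. Note that data() was only added in C++11; with an older compiler, use &myData[0] instead, which is valid only when the vector is not empty.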
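As a small sketch of handing the buffer to a C-style API, the snippet below uses the pre-C++11 &data[0] spelling and guards against the empty case. sum_bytes is a made-up stand-in for whatever function takes a pointer and a length (a Win32 call, for instance), and sum_vector is likewise just an illustrative wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style consumer: sums `length` bytes starting at `bytes`.
unsigned long sum_bytes(const char* bytes, std::size_t length) {
    unsigned long total = 0;
    for (std::size_t i = 0; i < length; ++i) {
        total += static_cast<unsigned char>(bytes[i]);
    }
    return total;
}

// Passes the vector's contents to the C-style function without copying.
unsigned long sum_vector(const std::vector<char>& data) {
    if (data.empty()) {
        return 0;  // &data[0] must not be evaluated on an empty vector
    }
    const char* first = &data[0];  // pre-C++11 spelling of data.data()
    // Contiguity: the last element sits exactly size() - 1 bytes after the first.
    assert(&data[data.size() - 1] == first + (data.size() - 1));
    return sum_bytes(first, data.size());
}
```

The pointer stays valid only until the vector next reallocates, so it is safest to take it right before the call rather than storing it.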