Memorystream and Large Object Heap

flayn picture flayn · May 12, 2010 · Viewed 11k times · Source

I have to transfer large files between computers on via unreliable connections using WCF.

Because I want to be able to resume the file and I don't want to be limited in my filesize by WCF, I am chunking the files into 1MB pieces. These "chunk" are transported as stream. Which works quite nice, so far.

My steps are:

  1. open filestream
  2. read chunk from file into byte[] and create memorystream
  3. transfer chunk
  4. back to 2. until the whole file is sent

My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an outofmemory exception. I could not actually create this error, maybe I am wrong in my assumption.

Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the max allowed array size and/or the size of my chunk, but I hope there is another solution.

My actual question(s):

  • Will my current solution create objects on the LOH and will that cause me problem?
  • Is there a better way to solve this?

Btw.: On the receiving side I simple read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays involved.

Edit:

current solution:

for (int i = resumeChunk; i < chunks; i++)
{
 byte[] buffer = new byte[chunkSize];
 fileStream.Position = i * chunkSize;
 int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
 Array.Resize(ref buffer, actualLength);
 using (MemoryStream stream = new MemoryStream(buffer)) 
 {
  UploadFile(stream);
 }
}

Answer

Joe Simmonds picture Joe Simmonds · May 18, 2010

I hope this is okay. It's my first answer on StackOverflow.

Yes absolutely if your chunksize is over 85000 bytes then the array will get allocated on the large object heap. You will probably not run out of memory very quickly as you are allocating and deallocating contiguous areas of memory that are all the same size so when memory fills up the runtime can fit a new chunk into an old, reclaimed memory area.

I would be a little worried about the Array.Resize call as that will create another array (see http://msdn.microsoft.com/en-us/library/1ffy6686(VS.80).aspx). This is an unecessary step if actualLength==Chunksize as it will be for all but the last chunk. So I would as a minimum suggest:

if (actualLength != chunkSize) Array.Resize(ref buffer, actualLength);

This should remove a lot of allocations. If the actualSize is not the same as the chunkSize but is still > 85000 then the new array will also be allocated on the Large object heap potentially causing it to fragment and possibly causing apparent memory leaks. It would I believe still take a long time to actually run out of memory as the leak would be quite slow.

I think a better implementation would be to use some kind of Buffer Pool to provide the arrays. You could roll your own (it would be too complicated) but WCF does provide one for you. I have rewritten your code slightly to take advatage of that:

BufferManager bm = BufferManager.CreateBufferManager(chunkSize * 10, chunkSize);

for (int i = resumeChunk; i < chunks; i++)
{
    byte[] buffer = bm.TakeBuffer(chunkSize);
    try
    {
        fileStream.Position = i * chunkSize;
        int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
        if (actualLength == 0) break;
        //Array.Resize(ref buffer, actualLength);
        using (MemoryStream stream = new MemoryStream(buffer))
        {
            UploadFile(stream, actualLength);
        }
    }
    finally
    {
        bm.ReturnBuffer(buffer);
    }
}

this assumes that the implementation of UploadFile Can be rewritten to take an int for the no. of bytes to write.

I hope this helps

joe