I'm trying to implement a fully asynchronous blob download with .NET 4.5 async & await.
Let's assume the entire blob can fit in memory at once, and we want to hold it in a string.
Code:
public async Task<string> DownloadTextAsync(ICloudBlob blob)
{
    using (Stream memoryStream = new MemoryStream())
    {
        IAsyncResult asyncResult = blob.BeginDownloadToStream(memoryStream, null, null);
        await Task.Factory.FromAsync(asyncResult, (r) => { blob.EndDownloadToStream(r); });
        memoryStream.Position = 0;
        using (StreamReader streamReader = new StreamReader(memoryStream))
        {
            // is this good enough?
            return streamReader.ReadToEnd();
            // or do we need this?
            return await streamReader.ReadToEndAsync();
        }
    }
}
Usage:
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageAccountConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("container1");
CloudBlockBlob blockBlob = container.GetBlockBlobReference("blob1.txt");
string text = await DownloadTextAsync(blockBlob);
Is this code correct, and is it indeed fully asynchronous? Would you implement this differently?
I'd appreciate some extra clarifications:
GetContainerReference and GetBlockBlobReference don't need to be async since they don't contact the server yet, right?
Does streamReader.ReadToEnd need to be async or not?
I'm a little confused about what BeginDownloadToStream does. By the time EndDownloadToStream is called, does my memory stream have all the data inside? Or has the stream only been opened for reading?
Update: (as of Storage 2.1.0.0 RC)
Async is now supported natively.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageAccountConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("container1");
CloudBlockBlob blockBlob = container.GetBlockBlobReference("blob1.txt");
string text = await blockBlob.DownloadTextAsync();
Is this code correct, and is it indeed fully asynchronous?
Yes.
Would you implement this differently?
Yes. In particular, the TaskFactory.FromAsync wrappers are much more efficient if you pass in a Begin/End method pair instead of passing in an existing IAsyncResult. Like this:
await Task.Factory.FromAsync(blob.BeginDownloadToStream,
blob.EndDownloadToStream, memoryStream, null);
I also prefer to wrap these up into separate extension methods so I can call them like this:
await blob.DownloadToStreamAsync(memoryStream);
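A minimal sketch of such an extension method, assuming the 2.0 client library, where ICloudBlob exposes BeginDownloadToStream(Stream, AsyncCallback, object) and EndDownloadToStream(IAsyncResult). The class name BlobAsyncExtensions is hypothetical, and the explicit type argument is only there to make the overload choice obvious:

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical helper class; not part of the storage client library.
public static class BlobAsyncExtensions
{
    // Wraps the Begin/End pair in a Task so callers can simply await it.
    public static Task DownloadToStreamAsync(this ICloudBlob blob, Stream target)
    {
        return Task.Factory.FromAsync<Stream>(
            blob.BeginDownloadToStream, // begin method (assumed: Stream, AsyncCallback, object)
            blob.EndDownloadToStream,   // end method, invoked when the operation completes
            target,                     // argument forwarded to the begin method
            null);                      // no async state object
    }
}
```

Once a library version with a built-in DownloadToStreamAsync is in use, the instance method takes precedence over an extension method of the same name, so a helper like this can simply be deleted.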
Note that the next version of the client libraries (2.1, currently in RC) will have async-ready methods, i.e., DownloadToStreamAsync.
GetContainerReference and GetBlockBlobReference don't need to be async since they don't contact the server yet, right?
Correct.
Does streamReader.ReadToEnd need to be async or not?
It does not (and should not). Stream is a bit of an unusual case with async programming. Usually, if there's an async method then you should use it in your async code, but that guideline doesn't hold for Stream types. The reason is that the base Stream class doesn't know whether its implementation is synchronous or asynchronous, so it assumes that it's synchronous and by default will fake its asynchronous operations by just doing the synchronous work on a background thread. Truly asynchronous streams (e.g., NetworkStream) override this and provide true asynchronous operations. Synchronous streams (e.g., MemoryStream) keep this default behavior.
So you don't want to call ReadToEndAsync on a MemoryStream.
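As a small illustration, here is a sketch of a synchronous read that can be called from an async method once the download has completed. It uses only System.IO and System.Text; the class and method names are hypothetical, and UTF-8 is an assumption about the blob's encoding. Since the bytes are already in memory, the synchronous read never waits on I/O:

```csharp
using System.IO;
using System.Text;

// Hypothetical helper; the names are for illustration only.
public static class MemoryStreamText
{
    // Reads the whole stream as text, starting from the beginning.
    public static string ReadAllText(MemoryStream stream)
    {
        stream.Position = 0; // rewind, since writing leaves the position at the end
        // Disposing the reader also disposes the underlying stream.
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd(); // plain in-memory copy, nothing to await
        }
    }
}
```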
I'm a little confused about what BeginDownloadToStream does. By the time EndDownloadToStream is called, does my memory stream have all the data inside?
Yes. The operation is DownloadToStream; that is, it downloads a blob into a stream. Since you are downloading a blob into a MemoryStream, the blob is entirely in memory by the time this operation completes.