Best way to upload a huge blob (multiple GBs) to Azure in the fastest time

Shraddha Bandekar · Feb 15, 2013 · Viewed 30.3k times

Can someone please suggest the best way to upload/download a multi-GB video blob to Azure storage in the fastest possible time?

Answer

Stopped Contributing · Feb 15, 2013

The best way to upload/download large blobs to/from Windows Azure Storage is to chunk the transfer and make proper use of multi-threading. There are a few things you would need to consider:

  1. Chunk size should depend on your Internet connection. For example, if you're on a really slow connection, uploading large individual chunks will almost invariably result in request timeouts.
  2. The number of concurrent threads should depend on the number of processor cores on the machine where your application code runs. In my experience, on an 8-core machine you get the best performance by spawning 8 threads, each uploading/downloading part of the data. One may be tempted to run hundreds of threads and leave thread management to the OS, but what I have observed is that in such cases most requests end up timing out. (A minimal parameter-selection sketch follows this list.)
  3. Upload/download operations should be asynchronous, so your application doesn't block or hog resources on your computer.
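As a rough illustration of points 1 and 2, one might derive these two knobs like so in Python. The specific values are assumptions for illustration, not Azure recommendations; the sketches further down redefine the same constants so each stands alone:

```python
import os

# Chunk size (assumption): 1 MB, matching the example below; shrink this on
# slow connections to avoid per-request timeouts (point 1).
CHUNK_SIZE = 1 * 1024 * 1024

# Concurrency (assumption): one worker per processor core (point 2), rather
# than hundreds of threads, which tends to cause request timeouts.
MAX_WORKERS = os.cpu_count() or 8
```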

For uploading a large file, you could decide on a chunk size (say 1 MB) and a number of concurrent threads (say 8), then read 8 MB from the file into an array of 8 elements and upload those 8 elements in parallel using the upload block functionality. Once those 8 blocks are uploaded, repeat the logic for the next 8 MB and continue until all bytes are uploaded. After that, call the commit block list functionality to commit the blob in blob storage. A sketch of this loop follows.
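Here is a minimal sketch of that upload loop in Python using the azure-storage-blob (v12) SDK, which postdates this 2013 answer: `stage_block` and `commit_block_list` correspond to the "upload block" and "commit block list" operations described above. The function name `upload_in_blocks` and the constants are assumptions for illustration:

```python
import base64
from concurrent.futures import ThreadPoolExecutor

from azure.storage.blob import BlobBlock, BlobClient

CHUNK_SIZE = 1 * 1024 * 1024  # 1 MB chunks (assumption, as above)
MAX_WORKERS = 8               # 8 concurrent uploads (assumption, as above)

def upload_in_blocks(conn_str: str, container: str, blob_name: str, path: str) -> None:
    blob = BlobClient.from_connection_string(conn_str, container, blob_name)

    def stage(index: int, data: bytes) -> str:
        # Block IDs must be base64-encoded and equal length within one blob.
        block_id = base64.b64encode(f"{index:08d}".encode()).decode()
        blob.stage_block(block_id=block_id, data=data)  # the "upload block" call
        return block_id

    block_ids = []
    with open(path, "rb") as f, ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        index = 0
        while True:
            # Read one batch of MAX_WORKERS chunks (e.g. 8 x 1 MB) so memory
            # stays bounded, then upload the whole batch in parallel.
            batch = []
            for _ in range(MAX_WORKERS):
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                batch.append((index, chunk))
                index += 1
            if not batch:
                break
            block_ids.extend(pool.map(lambda args: stage(*args), batch))

    # "Commit block list": assembles the staged blocks, in order, into the blob.
    blob.commit_block_list([BlobBlock(block_id=bid) for bid in block_ids])
```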

Similarly, for downloading a large file, you could again decide on a chunk size and thread count, then read parts of the blob by specifying the "Range" header in the Get Blob functionality. Once these chunks are downloaded, you will need to rearrange them based on their actual positions (the 3-4 MB chunk may well arrive before the 0-1 MB chunk) before writing them to the file, repeating the process until all bytes are downloaded. A download sketch follows.
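A corresponding download sketch, again assuming the v12 Python SDK: `download_blob(offset=..., length=...)` issues the ranged Get Blob request, and writing each chunk at its own offset in a pre-sized file takes care of the rearranging mentioned above. `download_in_ranges` and the constants are illustrative names:

```python
from concurrent.futures import ThreadPoolExecutor

from azure.storage.blob import BlobClient

CHUNK_SIZE = 1 * 1024 * 1024  # 1 MB chunks (assumption, as above)
MAX_WORKERS = 8               # 8 concurrent downloads (assumption, as above)

def download_in_ranges(conn_str: str, container: str, blob_name: str, path: str) -> None:
    blob = BlobClient.from_connection_string(conn_str, container, blob_name)
    total = blob.get_blob_properties().size

    # Pre-size the local file so each chunk can be written at its final
    # offset; this replaces explicit rearranging of out-of-order chunks.
    with open(path, "wb") as f:
        f.truncate(total)

    def fetch(offset: int) -> None:
        length = min(CHUNK_SIZE, total - offset)
        # Ranged read: the SDK sends the "Range" header on Get Blob for us.
        data = blob.download_blob(offset=offset, length=length).readall()
        with open(path, "r+b") as f:
            f.seek(offset)
            f.write(data)

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # Each worker fetches and writes one range; offsets are independent,
        # so no coordination between threads is needed.
        list(pool.map(fetch, range(0, total, CHUNK_SIZE)))
```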