Huge JavaScript HTML5 blob (from large ArrayBuffers) to build a giant file on the client side

sgmonda · Dec 17, 2013

I'm writing a client-side browser app that downloads a huge number of chunks from many locations and joins them to build a blob. That blob is then saved to the local filesystem as an ordinary file. I'm doing this by means of ArrayBuffer objects and a Blob.

var blob = new Blob([ArrayBuffer1, ArrayBuffer2, ArrayBuffer3, ...], {type: mimetype})

This works fine for small and medium-sized files (up to approximately 700 MB), but the browser crashes with larger files. I understand that RAM has its limits. The problem is that I need to build the blob in order to generate a file, but I want to allow users to download files much larger than that (for instance, files of about 8 GB).

How can I build the blob while avoiding size limits? localStorage is more limited than RAM, so I don't know what to use or how to do it.

Answer

Arthur Weborg · Dec 17, 2013

It looks like you are just concatenating arrays of data together. Why not append the ArrayBuffers to one giant blob stored in a file instead? You'd iterate and append each ArrayBuffer one at a time, seeking to the end of the FileWriter before each write. To read only portions of the giant blob back, take a slice of it rather than loading it whole, which keeps the browser from crashing.

Appending Function

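// Appends `data` to the end of an existing file in the sandboxed filesystem.
// `fs` is the FileSystem object handed to the requestFileSystem success callback;
// `errorHandler` is your own error callback.
function appendToFile(fPath, data, callback) {
    fs.root.getFile(fPath, {
        create: false // the file must already exist
    }, function(fileEntry) {
        fileEntry.createWriter(function(writer) {
            writer.onwriteend = function(e) {
                callback();
            };
            writer.onerror = errorHandler;
            // Move to the end of the file so the write appends instead of overwriting.
            writer.seek(writer.length);
            var blob = new Blob([data]);
            writer.write(blob);
        }, errorHandler);
    }, errorHandler);
}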
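
As a rough sketch of how the function above could be driven, the helper below appends a list of ArrayBuffers strictly one after another, starting the next write only once the previous one has reported completion. The names `appendAll` and `done` are hypothetical, and the append function is passed in as a parameter so that the sketch does not depend on a particular filesystem object. It also clears each array slot after handing the chunk over, on the assumption that the caller no longer needs the buffers in memory.

```javascript
// Appends each buffer in order, waiting for the previous write to finish.
// `appendFn` is expected to have the shape appendToFile(fPath, data, callback).
// `appendAll` and `done` are hypothetical names used for illustration only.
function appendAll(fPath, buffers, appendFn, done) {
    var i = 0;
    function next() {
        if (i >= buffers.length) {
            done();
            return;
        }
        var data = buffers[i];
        buffers[i] = null; // drop our reference so the chunk can be garbage-collected
        i += 1;
        appendFn(fPath, data, next);
    }
    next();
}
```

A call might look like `appendAll('giant.bin', downloadedBuffers, appendToFile, onAllWritten)`, where the file name and the last callback are placeholders.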

Again, to avoid reading the entire blob back, only read portions (chunks) of your giant blob when generating the file you mention.

Partial Read Function

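// Reads the byte range [start, stop) of a file and passes it to `callback`
// as an ArrayBuffer. `fs` and `errorHandler` are the same as in appendToFile.
function getPartialBlobFromFile(fPath, start, stop, callback) {
    fs.root.getFile(fPath, {
        create: false
    }, function(fileEntry) {
        fileEntry.file(function(file) {
            var reader = new FileReader();
            reader.onloadend = function(evt) {
                if (evt.target.readyState == FileReader.DONE) {
                    callback(evt.target.result);
                }
            };
            // Clamp the end of the range to the file size.
            stop = Math.min(stop, file.size);
            reader.readAsArrayBuffer(file.slice(start, stop));
        }, errorHandler);
    }, errorHandler);
}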
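
To read a large file back piece by piece, the byte ranges to request have to be computed first. The helper below is a minimal sketch of that step; `chunkRanges` is a hypothetical name, and the chunk size is whatever you decide your memory budget allows.

```javascript
// Splits [0, totalSize) into consecutive ranges of at most `chunkSize` bytes.
// Each range uses the same half-open [start, stop) convention as Blob.slice.
// `chunkRanges` is a hypothetical helper, not part of any browser API.
function chunkRanges(totalSize, chunkSize) {
    if (chunkSize <= 0) {
        throw new RangeError('chunkSize must be positive');
    }
    var ranges = [];
    for (var start = 0; start < totalSize; start += chunkSize) {
        ranges.push({ start: start, stop: Math.min(start + chunkSize, totalSize) });
    }
    return ranges;
}
```

Each resulting `{start, stop}` pair can then be passed to `getPartialBlobFromFile(fPath, range.start, range.stop, callback)`, one range at a time, so that only a single chunk is held in memory.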

You may have to keep indexes, perhaps in a header section of your giant blob. I would need to know more before I could give more precise feedback.
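
One possible shape for such an index, assuming each downloaded chunk has some identifier and chunks are appended in arrival order, is a small in-memory table that records where each chunk landed in the file. Everything here (`createChunkIndex`, `record`, `lookup`, `totalSize`) is a hypothetical illustration; how the index would be persisted, for example in a header, is left open.

```javascript
// Tracks the byte range each chunk occupies in the file, in append order.
// All names are hypothetical; this is a sketch, not an established API.
function createChunkIndex() {
    var entries = {};
    var nextOffset = 0;
    return {
        // Call once per chunk, in the same order the chunks are appended.
        record: function (id, byteLength) {
            entries[id] = { start: nextOffset, stop: nextOffset + byteLength };
            nextOffset += byteLength;
            return entries[id];
        },
        // Returns the {start, stop} range of a chunk, or null if unknown.
        lookup: function (id) {
            return Object.prototype.hasOwnProperty.call(entries, id) ? entries[id] : null;
        },
        // Total number of bytes recorded so far.
        totalSize: function () {
            return nextOffset;
        }
    };
}
```

The range returned by `lookup` uses the same start and stop convention as `getPartialBlobFromFile`, so a chunk can be read back by its identifier even if it arrived out of order.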


Update: avoiding quota limits (temporary vs. persistent storage), in response to your comments below
It appears that you are running into storage quota issues because you are using temporary storage. The following snippet is borrowed from Google's documentation:

Temporary storage is shared among all web apps running in the browser. The shared pool can be up to half of the available disk space. Storage already used by apps is included in the calculation of the shared pool; that is to say, the calculation is based on (available storage space + storage being used by apps) * 0.5.

Each app can have up to 20% of the shared pool. As an example, if the total available disk space is 50 GB, the shared pool is 25 GB, and the app can have up to 5 GB. This is calculated from 20% (up to 5 GB) of half (up to 25 GB) of the available disk space (50 GB).
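
The arithmetic in the quoted passage can be written out as a short sketch. The function name `temporaryQuota` and its parameters are hypothetical; the factors 0.5 and 0.2 are taken directly from the quote above.

```javascript
// Estimates temporary-storage limits from the rule quoted above:
// shared pool = (free space + space already used by apps) * 0.5,
// per-app limit = 20% of the shared pool.
// `temporaryQuota` is a hypothetical helper for illustration.
function temporaryQuota(freeBytes, usedByAppsBytes) {
    var sharedPool = (freeBytes + usedByAppsBytes) * 0.5;
    var perApp = sharedPool * 0.2;
    return { sharedPool: sharedPool, perApp: perApp };
}

var GB = 1024 * 1024 * 1024;
// The example from the quote: 50 GB available, nothing used by apps yet.
var quota = temporaryQuota(50 * GB, 0);
console.log(quota.sharedPool / GB, quota.perApp / GB); // about 25 and about 5
```

Under this rule, an 8 GB file would only fit in temporary storage if roughly 80 GB or more of disk space were available.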

To avoid this limit you'll have to switch to persistent storage, which lets you request a quota of up to the available free space on the disk. To do this, initialize the filesystem with the following instead of the temporary storage request.

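// Chrome exposes the API under a webkit prefix.
window.requestFileSystem = window.requestFileSystem || window.webkitRequestFileSystem;
// The first argument is the number of bytes requested (5 MB here);
// raise it to cover the size of the file you expect to build.
navigator.webkitPersistentStorage.requestQuota(1024*1024*5,
  function(grantedBytes){
    window.requestFileSystem(window.PERSISTENT, grantedBytes, onInitFs, errorHandler);
  }, function(e){
    console.log('Error', e);
  });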