How to download, zip and save multiple files with JavaScript and get progress?

guari · Jun 24, 2013 · Viewed 24.5k times

I'm creating a Chrome extension that needs to download multiple files (images and/or videos) from a website. These files may be very large, so I want to show the download progress to the user. After some research, I found that a possible solution at the moment might be:

  1. Download all the files with XMLHttpRequests.
  2. When downloaded, zip all the files into one archive with a JavaScript library (e.g. JSZip.js, zip.js).
  3. Prompt the user to save the zip with a Save As dialog.

I'm stuck at step 2: how can I zip the downloaded files?

To illustrate, here is a code sample:

var fileURLs = ['http://www.test.com/img.jpg',...];
var zip = new JSZip();

var count = 0;
for (var i = 0; i < fileURLs.length; i++){
    var xhr = new XMLHttpRequest();
    xhr.onprogress = calculateAndUpdateProgress;
    xhr.open('GET', fileURLs[i], true);
    xhr.responseType = "blob";
    xhr.onreadystatechange = function () {
        if (xhr.readyState == 4) {
               var blob_url = URL.createObjectURL(response);
            // add downloaded file to zip:
            var fileName = fileURLs[count].substring(fileURLs[count].lastIndexOf('/')+1);
            zip.file(fileName, blob_url); // <- here's one problem

            count++;
            if (count == fileURLs.length){
                // all download are completed, create the zip
                var content = zip.generate();

                // then trigger the download link:
                var zipName = 'download.zip';
                var a = document.createElement('a'); 
                a.href = "data:application/zip;base64," + content;
                a.download = zipName;
                a.click();
            }
        }
    };
    xhr.send();
}

function calculateAndUpdateProgress(evt) {
    if (evt.lengthComputable) {
        // get download progress by performing some average 
        // calculations with evt.loaded, evt.total and the number
        // of file to download / already downloaded
        ...
        // then update the GUI elements (eg. page-action icon and popup if showed)
        ...
    }
}

The code above generates a downloadable archive containing small, corrupted files. There is also an issue with file name synchronization: a blob object does not contain the file name, so if, e.g., fileURLs[0] takes longer to download than fileURLs[1], the names end up wrong (swapped).

NOTE: I know that Chrome has a download API, but it's only in the dev channel, so unfortunately it's not a solution right now, and I would like to avoid using NPAPI for such a simple task.

Answer

guari · Jul 17, 2013

I was reminded of this question. Since it has no answers yet, I'll write up a possible solution in case it is useful to someone else:

  • As said, the first problem is passing a blob URL to JSZip: it does not support blobs, but it also does not throw any error to signal that, and it successfully generates an archive of corrupted files. To correct this, simply pass a base64 string of the data instead of its blob object URL.
  • The second problem is file name synchronization: the easiest workaround here is to download one file at a time instead of using parallel XHR requests.
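
If parallel requests are still wanted, one possible alternative to the sequential workaround (not what this answer uses) is to bind each file name to its own request, so that the arrival order no longer matters. The sketch below is only an illustration: fileNameFromUrl, downloadAllInParallel and the fetchOne parameter are hypothetical names, and fetchOne is assumed to be some function that downloads one URL and passes the data to a callback.

```javascript
// Hypothetical helper: derive a file name from a URL,
// ignoring any query string or fragment.
function fileNameFromUrl(url) {
    var path = url.split('#')[0].split('?')[0];
    return path.substring(path.lastIndexOf('/') + 1);
}

// Hypothetical helper: starts every download at once and reports the results
// in the same order as `urls`, whatever order the responses arrive in.
// `fetchOne(url, callback)` is an assumed function that downloads one file and
// calls `callback(data)` when done (e.g. a wrapper around XMLHttpRequest).
function downloadAllInParallel(urls, fetchOne, onAllDone) {
    var results = new Array(urls.length);
    var remaining = urls.length;
    if (remaining === 0) {
        onAllDone(results);
        return;
    }
    urls.forEach(function (url, index) {
        // `url` and `index` are local to this callback,
        // so each request keeps its own pair.
        fetchOne(url, function (data) {
            results[index] = { name: fileNameFromUrl(url), data: data };
            remaining--;
            if (remaining === 0) onAllDone(results);
        });
    });
}
```

Each entry of the results array then carries both the name and the data, so the files could be added to the archive in one pass once everything has arrived.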

So the code from the question can be modified as follows:

var fileURLs = ['http://www.test.com/img.jpg',...];
var zip = new JSZip();
var count = 0;

downloadFile(fileURLs[count], onDownloadComplete);


function downloadFile(url, onSuccess) {
    var xhr = new XMLHttpRequest();
    xhr.onprogress = calculateAndUpdateProgress;
    xhr.open('GET', url, true);
    xhr.responseType = "blob";
    xhr.onreadystatechange = function () {
        // readyState 4 means the request has finished
        if (xhr.readyState == 4) {
            if (onSuccess) onSuccess(xhr.response);
        }
    };
    xhr.send();
}

function onDownloadComplete(blobData){
    if (count < fileURLs.length) {
        blobToBase64(blobData, function(binaryData){
                // add downloaded file to zip:
                var fileName = fileURLs[count].substring(fileURLs[count].lastIndexOf('/')+1);
                zip.file(fileName, binaryData, {base64: true});
                if (count < fileURLs.length -1){
                    count++;
                    downloadFile(fileURLs[count], onDownloadComplete);
                }
                else {
                    // all files have been downloaded, create the zip
                    var content = zip.generate();

                    // then trigger the download link:        
                    var zipName = 'download.zip';
                    var a = document.createElement('a'); 
                    a.href = "data:application/zip;base64," + content;
                    a.download = zipName;
                    a.click();
                }
            });
    }
}

function blobToBase64(blob, callback) {
    var reader = new FileReader();
    reader.onload = function() {
        var dataUrl = reader.result;
        var base64 = dataUrl.split(',')[1];
        callback(base64);
    };
    reader.readAsDataURL(blob);
}

function calculateAndUpdateProgress(evt) {
    if (evt.lengthComputable) {
        ...
    }
}
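
The body of calculateAndUpdateProgress is left out above. As a rough illustration of the kind of average calculation it could perform when files are downloaded one at a time, here is a minimal sketch. The function name overallProgress and the equal weighting of every file are assumptions, not part of the original code.

```javascript
// Hypothetical helper: overall progress (0 to 1) for sequential downloads.
// Every file gets the same weight, because the sizes of the files that have
// not been requested yet are unknown.
//   filesCompleted: number of files already fully downloaded
//   totalFiles:     number of files to download in total
//   loaded, total:  evt.loaded and evt.total of the current progress event
function overallProgress(filesCompleted, totalFiles, loaded, total) {
    if (totalFiles <= 0) return 0;
    // Fraction of the file currently being downloaded (0 if its size is unknown).
    var currentFraction = total > 0 ? Math.min(loaded / total, 1) : 0;
    return Math.min((filesCompleted + currentFraction) / totalFiles, 1);
}
```

Inside calculateAndUpdateProgress this could be called as overallProgress(count, fileURLs.length, evt.loaded, evt.total), and the result used to update the GUI elements.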

One last note: this solution works quite well if you download a few small files (roughly less than 1 MB in total for fewer than 10 files). In other cases JSZip will crash the browser tab when the archive is being generated, so it is a better choice to use a separate thread for compression (a Web Worker, as zip.js does).

If the browser still keeps crashing with big files, without reporting any errors, after the archive has been generated, try to trigger the Save As window by passing a blob reference instead of the binary data (a.href = URL.createObjectURL(zippedBlobData); where zippedBlobData is the blob object that refers to the generated archive data).
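
The blob-reference approach could look roughly like the sketch below, assuming the archive is available as a base64 string as in the code above. The helper names base64ToBytes and saveBytesAsZip are hypothetical, and the second function only works in a browser page. If the zip library can produce a blob directly, the decoding step would not be needed.

```javascript
// Hypothetical helper: decode a base64 string (such as the archive content
// used for the data: URL above) into raw bytes.
function base64ToBytes(base64) {
    var binary = atob(base64);
    var bytes = new Uint8Array(binary.length);
    for (var i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}

// Hypothetical helper: offer the bytes for download through a blob URL
// instead of a data: URL. Browser-only (needs document, Blob and URL).
function saveBytesAsZip(bytes, zipName) {
    var zippedBlobData = new Blob([bytes], { type: 'application/zip' });
    var a = document.createElement('a');
    a.href = URL.createObjectURL(zippedBlobData);
    a.download = zipName;
    a.click();
    // URL.revokeObjectURL(a.href) can be called once the download has
    // started, to release the memory held by the blob URL.
}
```

With these helpers, the last step of the code above might become saveBytesAsZip(base64ToBytes(content), zipName).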