fileReader.readAsBinaryString to upload files

Smudge picture Smudge · Sep 15, 2011 · Viewed 134k times · Source

Trying to use fileReader.readAsBinaryString to upload a PNG file to the server via AJAX, stripped down code (fileObject is the object containing info on my file);

var fileReader = new FileReader();

fileReader.onload = function(e) {
    var xmlHttpRequest = new XMLHttpRequest();
    //Some AJAX-y stuff - callbacks, handlers etc.
    xmlHttpRequest.open("POST", '/pushfile', true);
    var dashes = '--';
    var boundary = 'aperturephotoupload';
    var crlf = "\r\n";

    //Post with the correct MIME type (If the OS can identify one)
    if ( fileObject.type == '' ){
        filetype = 'application/octet-stream';
    } else {
        filetype = fileObject.type;
    }

    //Build a HTTP request to post the file
    var data = dashes + boundary + crlf + "Content-Disposition: form-data;" + "name=\"file\";" + "filename=\"" + unescape(encodeURIComponent(fileObject.name)) + "\"" + crlf + "Content-Type: " + filetype + crlf + crlf + e.target.result + crlf + dashes + boundary + dashes;

    xmlHttpRequest.setRequestHeader("Content-Type", "multipart/form-data;boundary=" + boundary);

    //Send the binary data
    xmlHttpRequest.send(data);
}

fileReader.readAsBinaryString(fileObject);

Examining the first few lines of a file before upload (using VI) gives me

enter image description here

The same file after upload shows

enter image description here

So it looks like a formatting/encoding issue somewhere, I tried using a simple UTF8 encode function on the raw binary data

    function utf8encode(string) {
        string = string.replace(/\r\n/g,"\n");
        var utftext = "";

        for (var n = 0; n < string.length; n++) {

            var c = string.charCodeAt(n);

            if (c < 128) {
                utftext += String.fromCharCode(c);
            }
            else if((c > 127) && (c < 2048)) {
                utftext += String.fromCharCode((c >> 6) | 192);
                utftext += String.fromCharCode((c & 63) | 128);
            }
            else {
                utftext += String.fromCharCode((c >> 12) | 224);
                utftext += String.fromCharCode(((c >> 6) & 63) | 128);
                utftext += String.fromCharCode((c & 63) | 128);
            }

        }

        return utftext;
    )

Then in the original code

//Build a HTTP request to post the file
var data = dashes + boundary + crlf + "Content-Disposition: form-data;" + "name=\"file\";" + "filename=\"" + unescape(encodeURIComponent(file.file.name)) + "\"" + crlf + "Content-Type: " + filetype + crlf + crlf + utf8encode(e.target.result) + crlf + dashes + boundary + dashes;

which gives me the output of

enter image description here

Still not what the raw file was =(

How do I encode/load/process the file to avoid the encoding issues, so the file being received in the HTTP request is the same as the file before it was uploaded.

Some other possibly useful information, if instead of using fileReader.readAsBinaryString() I use fileObject.getAsBinary() to get the binary data, it works fine. But getAsBinary only works in Firefox. I've been testing this in Firefox and Chrome, both on Mac, getting the same result in both. The backend uploads are being handled by the NGINX Upload Module, again running on Mac. The server and client are on the same machine. The same thing is happening with any file I try to upload, I just chose PNG because it was the most obvious example.

Answer

KrisWebDev picture KrisWebDev · Jul 7, 2013

(Following is a late but complete answer)

FileReader methods support


FileReader.readAsBinaryString() is deprecated. Don't use it! It's no longer in the W3C File API working draft:

void abort();
void readAsArrayBuffer(Blob blob);
void readAsText(Blob blob, optional DOMString encoding);
void readAsDataURL(Blob blob);

NB: Note that File is a kind of extended Blob structure.

Mozilla still implements readAsBinaryString() and describes it in MDN FileApi documentation:

void abort();
void readAsArrayBuffer(in Blob blob); Requires Gecko 7.0
void readAsBinaryString(in Blob blob);
void readAsDataURL(in Blob file);
void readAsText(in Blob blob, [optional] in DOMString encoding);

The reason behind readAsBinaryString() deprecation is in my opinion the following: the standard for JavaScript strings are DOMString which only accept UTF-8 characters, NOT random binary data. So don't use readAsBinaryString(), that's not safe and ECMAScript-compliant at all.

We know that JavaScript strings are not supposed to store binary data but Mozilla in some sort can. That's dangerous in my opinion. Blob and typed arrays (ArrayBuffer and the not-yet-implemented but not necessary StringView) were invented for one purpose: allow the use of pure binary data, without UTF-8 strings restrictions.

XMLHttpRequest upload support


XMLHttpRequest.send() has the following invocations options:

void send();
void send(ArrayBuffer data);
void send(Blob data);
void send(Document data);
void send(DOMString? data);
void send(FormData data);

XMLHttpRequest.sendAsBinary() has the following invocations options:

void sendAsBinary(   in DOMString body );

sendAsBinary() is NOT a standard and may not be supported in Chrome.

Solutions


So you have several options:

  1. send() the FileReader.result of FileReader.readAsArrayBuffer ( fileObject ). It is more complicated to manipulate (you'll have to make a separate send() for it) but it's the RECOMMENDED APPROACH.
  2. send() the FileReader.result of FileReader.readAsDataURL( fileObject ). It generates useless overhead and compression latency, requires a decompression step on the server-side BUT it's easy to manipulate as a string in Javascript.
  3. Being non-standard and sendAsBinary() the FileReader.result of FileReader.readAsBinaryString( fileObject )

MDN states that:

The best way to send binary content (like in files upload) is using ArrayBuffers or Blobs in conjuncton with the send() method. However, if you want to send a stringifiable raw data, use the sendAsBinary() method instead, or the StringView (Non native) typed arrays superclass.