Trying to use fileReader.readAsBinaryString to upload a PNG file to the server via AJAX, stripped down code (fileObject is the object containing info on my file);
var fileReader = new FileReader();
fileReader.onload = function(e) {
var xmlHttpRequest = new XMLHttpRequest();
//Some AJAX-y stuff - callbacks, handlers etc.
xmlHttpRequest.open("POST", '/pushfile', true);
var dashes = '--';
var boundary = 'aperturephotoupload';
var crlf = "\r\n";
//Post with the correct MIME type (If the OS can identify one)
if ( fileObject.type == '' ){
filetype = 'application/octet-stream';
} else {
filetype = fileObject.type;
}
//Build a HTTP request to post the file
var data = dashes + boundary + crlf + "Content-Disposition: form-data;" + "name=\"file\";" + "filename=\"" + unescape(encodeURIComponent(fileObject.name)) + "\"" + crlf + "Content-Type: " + filetype + crlf + crlf + e.target.result + crlf + dashes + boundary + dashes;
xmlHttpRequest.setRequestHeader("Content-Type", "multipart/form-data;boundary=" + boundary);
//Send the binary data
xmlHttpRequest.send(data);
}
fileReader.readAsBinaryString(fileObject);
Examining the first few lines of a file before upload (using VI) gives me
The same file after upload shows
So it looks like a formatting/encoding issue somewhere, I tried using a simple UTF8 encode function on the raw binary data
function utf8encode(string) {
string = string.replace(/\r\n/g,"\n");
var utftext = "";
for (var n = 0; n < string.length; n++) {
var c = string.charCodeAt(n);
if (c < 128) {
utftext += String.fromCharCode(c);
}
else if((c > 127) && (c < 2048)) {
utftext += String.fromCharCode((c >> 6) | 192);
utftext += String.fromCharCode((c & 63) | 128);
}
else {
utftext += String.fromCharCode((c >> 12) | 224);
utftext += String.fromCharCode(((c >> 6) & 63) | 128);
utftext += String.fromCharCode((c & 63) | 128);
}
}
return utftext;
)
Then in the original code
//Build a HTTP request to post the file
var data = dashes + boundary + crlf + "Content-Disposition: form-data;" + "name=\"file\";" + "filename=\"" + unescape(encodeURIComponent(file.file.name)) + "\"" + crlf + "Content-Type: " + filetype + crlf + crlf + utf8encode(e.target.result) + crlf + dashes + boundary + dashes;
which gives me the output of
Still not what the raw file was =(
How do I encode/load/process the file to avoid the encoding issues, so the file being received in the HTTP request is the same as the file before it was uploaded.
Some other possibly useful information, if instead of using fileReader.readAsBinaryString() I use fileObject.getAsBinary() to get the binary data, it works fine. But getAsBinary only works in Firefox. I've been testing this in Firefox and Chrome, both on Mac, getting the same result in both. The backend uploads are being handled by the NGINX Upload Module, again running on Mac. The server and client are on the same machine. The same thing is happening with any file I try to upload, I just chose PNG because it was the most obvious example.
(Following is a late but complete answer)
FileReader.readAsBinaryString()
is deprecated. Don't use it! It's no longer in the W3C File API working draft:
void abort();
void readAsArrayBuffer(Blob blob);
void readAsText(Blob blob, optional DOMString encoding);
void readAsDataURL(Blob blob);
NB: Note that File
is a kind of extended Blob
structure.
Mozilla still implements readAsBinaryString()
and describes it in MDN FileApi documentation:
void abort();
void readAsArrayBuffer(in Blob blob); Requires Gecko 7.0
void readAsBinaryString(in Blob blob);
void readAsDataURL(in Blob file);
void readAsText(in Blob blob, [optional] in DOMString encoding);
The reason behind readAsBinaryString()
deprecation is in my opinion the following: the standard for JavaScript strings are DOMString
which only accept UTF-8 characters, NOT random binary data. So don't use readAsBinaryString(), that's not safe and ECMAScript-compliant at all.
We know that JavaScript strings are not supposed to store binary data but Mozilla in some sort can. That's dangerous in my opinion. Blob
and typed arrays
(ArrayBuffer
and the not-yet-implemented but not necessary StringView
) were invented for one purpose: allow the use of pure binary data, without UTF-8 strings restrictions.
XMLHttpRequest.send()
has the following invocations options:
void send();
void send(ArrayBuffer data);
void send(Blob data);
void send(Document data);
void send(DOMString? data);
void send(FormData data);
XMLHttpRequest.sendAsBinary()
has the following invocations options:
void sendAsBinary( in DOMString body );
sendAsBinary() is NOT a standard and may not be supported in Chrome.
So you have several options:
send()
the FileReader.result
of FileReader.readAsArrayBuffer ( fileObject )
. It is more complicated to manipulate (you'll have to make a separate send() for it) but it's the RECOMMENDED APPROACH.send()
the FileReader.result
of FileReader.readAsDataURL( fileObject )
. It generates useless overhead and compression latency, requires a decompression step on the server-side BUT it's easy to manipulate as a string in Javascript.sendAsBinary()
the FileReader.result
of FileReader.readAsBinaryString( fileObject )
MDN states that:
The best way to send binary content (like in files upload) is using ArrayBuffers or Blobs in conjuncton with the send() method. However, if you want to send a stringifiable raw data, use the sendAsBinary() method instead, or the StringView (Non native) typed arrays superclass.