I have hacked together a small tool to extract shipping data from Amazon CSV order data. It works so far. Here is a simple version as a JS Bin: http://output.jsbin.com/jarako
To print stamps/shipping labels, I need a file to upload to Deutsche Post and to other parcel services. I used a small function, saveTextAsFile, which I found on Stack Overflow. Everything is good so far: no wrongly displayed special characters (äöüß...) in the output textarea or in the downloaded files.
All these German post / parcel service sites accept only Latin-1 / ISO-8859-1 encoded files for upload. But my downloaded file is always UTF-8. If I upload it, all special characters (äöüß...) come out wrong.
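To illustrate what goes wrong on upload, here is a minimal sketch of a Latin-1 reader being handed UTF-8 bytes. The sample string and the variable names are only assumptions for the illustration; the upload sites' actual parsing is not known to me.

```javascript
// Sketch: what a Latin-1 reader sees when it is handed UTF-8 bytes.
const original = "Müller";

// TextEncoder always produces UTF-8; "ü" becomes the two bytes 0xC3 0xBC.
const utf8Bytes = new TextEncoder().encode(original);

// Interpret every byte as one ISO-8859-1 character (byte value == code point).
const misread = Array.from(utf8Bytes, (b) => String.fromCharCode(b)).join("");

console.log(utf8Bytes.length); // 7 bytes for 6 characters
console.log(misread);          // "MÃ¼ller"
```

Each special character takes two bytes in UTF-8, and a reader that expects one byte per character shows them as two wrong characters.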
How can I change this? I have already searched a lot. I have tried, for example:
Setting the charset of the tool to iso-8859-1:
<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
But the result is that the special characters are now wrong in the output textarea as well as in the downloaded file. If I upload it to the post site, I get even more wrong characters. Also, if I check the encoding in the Coda editor, it still says the downloaded file is UTF-8.
The saveTextAsFile function uses var textFileAsBlob = new Blob([textToWrite], {type:'text/plain'}); to build the file. Maybe there is a way to set the charset for the download there?
function saveTextAsFile()
{
    var textToWrite = $('#dataOutput').val();
    var textFileAsBlob = new Blob([textToWrite], {type:'text/plain'});
    var fileNameToSaveAs = "Brief.txt";
    var downloadLink = document.createElement("a");
    downloadLink.download = fileNameToSaveAs;
    downloadLink.innerHTML = "Download File";
    if (window.webkitURL != null)
    {
        // Chrome allows the link to be clicked
        // without actually adding it to the DOM.
        downloadLink.href = window.webkitURL.createObjectURL(textFileAsBlob);
    }
    else
    {
        // Firefox requires the link to be added to the DOM
        // before it can be clicked.
        downloadLink.href = window.URL.createObjectURL(textFileAsBlob);
        // destroyClickedElement is a helper that is not shown in this snippet.
        downloadLink.onclick = destroyClickedElement;
        downloadLink.style.display = "none";
        document.body.appendChild(downloadLink);
    }
    downloadLink.click();
}
Anyhow, there has to be a way to download files in an encoding other than the one the site itself uses. The Amazon site where I download the CSV file from is UTF-8 encoded, but the CSV file downloaded from there is Latin-1 (ISO-8859-1) when I check it in Coda.
SCROLL DOWN TO THE UPDATE for the real solution!
Because I got no answer, I searched more and more. It looks like there is NO SOLUTION in JavaScript. Every test download I made that was generated in JavaScript was UTF-8 encoded. It looks like JavaScript is made only for Unicode / UTF-8, or another encoding would (possibly) only apply if the data were transported again over an actual HTTP transfer. But for JavaScript that runs on the client, no additional HTTP transport happens, because the data is already on the client.
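The observation that every generated download ends up as UTF-8 can be checked with a small sketch. It assumes an environment with a global Blob (a browser or Node 18+), and the variable names are made up for the illustration. It suggests that a string part is always serialized as UTF-8 no matter which charset the type claims, while raw bytes are stored unchanged.

```javascript
// Sketch (assumes a global Blob, e.g. a browser or Node 18+).
const text = "ä"; // one character, U+00E4

// A string part is serialized as UTF-8; the charset in `type` is only a label.
const labelledBlob = new Blob([text], { type: "text/plain;charset=iso-8859-1" });

// A byte array part is written as-is, so the caller controls the encoding.
const latin1Blob = new Blob([new Uint8Array([0xe4])], { type: "text/plain;charset=iso-8859-1" });

console.log(labelledBlob.size); // 2 (UTF-8 bytes 0xC3 0xA4)
console.log(latin1Blob.size);   // 1 (Latin-1 byte 0xE4)
```

So the charset label alone does not help; the text would have to be converted to bytes in the target encoding before it goes into the Blob.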
I have helped myself for now by building a small PHP script on my server, to which I send the data via a GET or POST request. It converts the encoding to Latin-1 / ISO-8859-1 and sends it back as a file download. The result is an ISO-8859-1 file with correctly encoded special characters, which I can upload to the mentioned postal and parcel service sites, and everything looks good.
latin-download.php: (It is VERY IMPORTANT to save the PHP file itself also in ISO-8859-1, to make it work!!)
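<?php
// PHP has already URL-decoded the values in $_REQUEST, so no extra urldecode() is needed.
$data = $_REQUEST["a"];
$converted_to_latin = mb_convert_encoding($data, 'ISO-8859-1', 'UTF-8');
// basename() strips any path components from the requested file name.
$filename = basename($_REQUEST["filename"]);
// Content-Type and Content-Disposition are two separate headers.
header('Content-Type: text/plain; charset=iso-8859-1');
header('Content-Disposition: attachment; filename="' . $filename . '"');
echo $converted_to_latin;
?>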
In my JavaScript code I use:
<a id="downloadlink">Download File</a>
<script>
    var mydata = "this is testdata containing äöüß";
    document.getElementById("downloadlink").addEventListener("click", function() {
        var mydataToSend = encodeURIComponent(mydata);
        window.open("latin-download.php?a=" + mydataToSend + "&filename=letter-max.csv");
    }, false);
</script>
For larger amounts of data you have to switch from GET to POST.
UPDATE 08-Feb-2016
Half a year later, I have now found a solution in PURE JAVASCRIPT, using inexorabletash/text-encoding. This is a polyfill for the Encoding Living Standard. The standard includes decoding of legacy encodings like latin1 ("windows-1252"), but it forbids encoding into these legacy encodings. So if you use the browser-implemented window.TextEncoder function, it offers only UTF encoding. BUT the polyfill offers a legacy mode, which DOES allow encoding into legacy encodings like latin1.
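The asymmetry in the native API can be seen in a short sketch. It assumes a runtime whose TextDecoder knows the "windows-1252" label (current browsers, or Node built with full ICU); the variable names are only for illustration.

```javascript
// Sketch: the native API decodes legacy encodings but only encodes UTF-8.
// Assumes TextDecoder supports "windows-1252" in this runtime.
const decoded = new TextDecoder("windows-1252").decode(new Uint8Array([0xe4, 0xf6, 0xfc]));
console.log(decoded); // "äöü"

// The native constructor takes no encoding label; an extra argument is ignored.
const nativeEncoder = new TextEncoder("windows-1252");
console.log(nativeEncoder.encoding);           // "utf-8"
console.log(nativeEncoder.encode("ä").length); // 2
```

This is why the built-in encoder has to be moved out of the way before the polyfill is loaded: otherwise the native UTF-8-only encoder would be used.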
I use it like this:
<!DOCTYPE html>
<script>
    // Keep a reference to the browser's built-in TextEncoder as TextEncoderOrg
    // (it can NOT encode windows-1252, but it stays usable as TextEncoderOrg()).
    var TextEncoderOrg = window.TextEncoder;
    // ... and deactivate it, to make sure only the polyfill encoder that follows is used.
    window.TextEncoder = null;
</script>
<script src="lib/encoding-indexes.js"></script> <!-- needed to support encoding to legacy encodings -->
<script src="lib/encoding.js"></script> <!-- encoding polyfill -->
<script>
    function download (content, filename, contentType) {
        if (!contentType) contentType = 'application/octet-stream';
        var a = document.createElement('a');
        var blob = new Blob([content], {'type': contentType});
        a.href = window.URL.createObjectURL(blob);
        a.download = filename;
        a.click();
    }
    var text = "Es wird ein schöner Tag!";
    // Do the encoding
    var encoded = new TextEncoder("windows-1252", { NONSTANDARD_allowLegacyEncoding: true }).encode(text);
    // Download 2 files to see the difference
    download(encoded, "windows-1252-encoded-text.txt");
    download(text, "utf-8-original-text.txt");
</script>
The encoding-indexes.js file is about 500 KB, because it contains all the encoding tables. Because I need only the windows-1252 encoding, I have deleted the other encodings from this file for my use, so now only 632 bytes are left.
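If strict ISO-8859-1 is enough, the table might not be needed at all, because ISO-8859-1 maps the code points 0x00 to 0xFF directly to byte values. The following is only a sketch of that idea: encodeLatin1 is a hypothetical helper, not part of the polyfill, and it does not cover the characters that windows-1252 adds in the range 0x80 to 0x9F (for example the euro sign), which are replaced here.

```javascript
// Hypothetical helper, not part of the text-encoding polyfill.
// Encodes a string as ISO-8859-1; anything outside 0x00-0xFF becomes `replacement`.
function encodeLatin1(text, replacement = "?") {
  const fallback = replacement.charCodeAt(0);
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    // Iterates UTF-16 code units, so an astral character yields two replacements.
    const code = text.charCodeAt(i);
    // ISO-8859-1 maps code points 0x00-0xFF directly to byte values.
    bytes[i] = code <= 0xff ? code : fallback;
  }
  return bytes;
}

const latin1Bytes = encodeLatin1("Es wird ein schöner Tag!");
console.log(latin1Bytes.length);           // 24, one byte per character
console.log(latin1Bytes[15].toString(16)); // "f6", the Latin-1 byte for "ö"
```

The resulting Uint8Array could be passed to the download function above in place of the polyfill output.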