Parse Remote CSV File using Nodejs / Papa Parse?

Necevil · Dec 14, 2017 · Viewed 15.2k times

I am currently working on parsing a remote csv product feed from a Node app and would like to use Papa Parse to do that (as I have had success with it in the browser in the past).

Papa Parse Github: https://github.com/mholt/PapaParse

My initial attempts and web searching haven't turned up exactly how this would be done. The Papa readme says that Papa Parse is now compatible with Node, and as such Baby Parse (which used to serve some of the Node parsing functionality) has been deprecated.

Here's a link to the Node section of the docs for anyone stumbling on this issue in the future: https://github.com/mholt/PapaParse#papa-parse-for-node

From that doc paragraph it looks like Papa Parse in Node can parse a readable stream instead of a File. My question is:

Is there any way to use Readable Streams to have Papa download and parse a remote CSV in Node, somewhat similar to how Papa in the browser uses XMLHttpRequest to accomplish the same goal?

For future visibility, for those searching on the topic (and to avoid repeating a similar question): attempting to use the remote file parsing functionality described at http://papaparse.com/docs#remote-files from Node will result in the following error in your console:

"Unhandled rejection ReferenceError: XMLHttpRequest is not defined"

I have opened an issue on the official repository and will update this Question as I learn more about the problems that need to be solved.

Answer

TheDuke · Mar 9, 2018

OK, so I think I have an answer to this. But I guess only time will tell. Note that my file is .txt with tab delimiters.

var fs = require('fs');
var Papa = require('papaparse');
var file = './rawData/myfile.txt';
// For a local file, read its contents into a string first, since Papa.parse accepts a string in Node.
// This step may not be necessary when the file is uploaded via a UI.
var content = fs.readFileSync(file, "utf8");

var rows;
Papa.parse(content, {
    header: false,
    delimiter: "\t",
    complete: function(results) {
        // console.log("Finished:", results.data);
        rows = results.data;
    }
});