I am new to node.js (and to request.js). I'd like to get the body of a website back from a specific url with different paths (in the example below http://www.example.com/path1, http://www.example.com/path2, etc.) and log this data in an object with a key/value mapping (siteData[path] below).
```javascript
var request = require('request'),
    paths = ['path1', 'path2', 'path3'],
    siteData = {},
    pathLength = paths.length,
    pathIndex = 0;

paths.forEach((path) => {
  var url = 'http://www.example.com/' + path;
  request(url, function (error, response, html) {
    if (!error) {
      siteData[path] = response.body;
      pathIndex++;
      if (pathIndex === pathLength) {
        someFunction(siteData);
      }
    }
  });
});

function someFunction(data) {
  // manipulate data
}
```
My question: when I run this, I get the following warning:

```
(node) warning: possible EventEmitter memory leak detected. 11 unpipe listeners added. Use emitter.setMaxListeners() to increase limit.
```

I tried chaining `request(url, function(...){}).setMaxListeners(100);`, but that didn't work. Thanks for your help!
Looks like Promises are the right tool for the job here. Instead of passing a callback, we create a new Promise object that resolves when the work is done. We can say "once you're done, do some more stuff" with the `.then` method:
```javascript
var rp = require('request-promise');

rp('http://www.google.com')
  .then((htmlString) => {
    // Process html...
  });
```
(If anything goes wrong, the promise rejects and control goes straight to `.catch`.)
```javascript
someFunctionThatErrors('Yikes!')
  .then((data) => {
    // Won't be called
  })
  .catch((err) => {
    // Will be called; we handle the error here
  });
```
We've got lots of async tasks to do, so just one promise won't work. One option is to string them all together in series, like so:
```javascript
rp('http://www.google.com')
  .then((htmlString) => rp('http://someOtherUrl.com'))
  .then((otherHtmlString) => {
    // ...and so forth
  });
```
But that loses some of the awesome of async - we can do all of these tasks in parallel.
```javascript
var myRequests = [];
myRequests.push(rp('http://www.google.com').then(processStuff).catch(handleErr));
myRequests.push(rp('http://someOtherUrl.com').then(processStuff).catch(handleErr));
```
...boy, does that look ugly. There's a better way: `Promise.all()` (you're using arrow functions, so I assume native `Promise` will work for you too). It takes an array of promises and returns a promise that resolves when all of the array's promises have resolved (if any of them reject, it immediately rejects). The `.then` function will be given an array of the values each promise resolved to:
```javascript
var myRequests = [];
myRequests.push(rp('http://www.google.com'));
myRequests.push(rp('http://someOtherUrl.com'));

Promise.all(myRequests)
  .then((arrayOfHtml) => {
    // arrayOfHtml[0] is the results from google,
    // arrayOfHtml[1] is the results from someOtherUrl,
    // ...etc.
    arrayOfHtml.forEach(processStuff);
  })
  .catch(/* handle error */);
```
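One caveat worth knowing: `Promise.all` is all-or-nothing, so a single failed request loses every result. If you would rather keep whatever succeeded, attach a `.catch` to each individual promise so a failure resolves to a placeholder instead of rejecting the batch. A small sketch (`fetchPage`, the example URLs, and the `null` placeholder are my own stand-ins so this runs offline; in real code `fetchPage` would be `rp`):

```javascript
// fetchPage stands in for rp(url): one URL succeeds, one fails.
const fetchPage = (url) =>
  url.includes('bad')
    ? Promise.reject(new Error('boom'))
    : Promise.resolve(`<html>${url}</html>`);

const urls = ['http://good.example', 'http://bad.example'];

// Per-promise .catch: a failed request becomes null instead of
// rejecting the whole Promise.all.
const settled = urls.map((url) => fetchPage(url).catch(() => null));

Promise.all(settled).then((results) => {
  // results[0] is the html string, results[1] is null
  console.log(results.filter((r) => r !== null).length); // logs 1
});
```

This trades "fail fast" for "collect what you can"; pick whichever fits your use case.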
Still, we have to manually call `.push` for every link we want to hit. That won't do! Let's pull a nifty trick using `Array.prototype.map`, which iterates over our array, transforming each value in turn, and returns a new array of the new values:
```javascript
var arrayOfPromises = paths.map((path) => rp(`http://www.example.com/${path}`));

Promise.all(arrayOfPromises)
  .then((arrayOfHtml) => arrayOfHtml.forEach(processStuff))
  .catch(function (err) { console.log('agh!'); });
```
Much cleaner and easier error handling.
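Applied back to the original question, the same `map` trick can also preserve the `siteData` key/value mapping by having each request resolve to a `{ path, html }` pair. A sketch under stated assumptions: `buildSiteData` and `fakeFetch` are hypothetical names I'm introducing, and `fakeFetch` simulates a request so the example runs offline; in real code you would pass `(path) => rp('http://www.example.com/' + path)` instead:

```javascript
// Resolve each path to { path, html }, then fold the results
// into one object keyed by path, matching the OP's siteData shape.
function buildSiteData(paths, fetchHtml) {
  const requests = paths.map((path) =>
    fetchHtml(path).then((html) => ({ path, html }))
  );
  return Promise.all(requests).then((results) => {
    const siteData = {};
    results.forEach(({ path, html }) => { siteData[path] = html; });
    return siteData;
  });
}

// Usage with a fake fetcher; swap in rp for real requests:
const fakeFetch = (path) => Promise.resolve(`<html>${path}</html>`);

buildSiteData(['path1', 'path2', 'path3'], fakeFetch)
  .then((siteData) => someFunction(siteData))
  .catch((err) => console.error(err));
```

The resolved `siteData` object replaces the `pathIndex` counter entirely: `Promise.all` is what tells you all the requests have finished.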