Puppeteer: How to get the contents of each element of a nodelist?

i.brod · Oct 16, 2018

I'm trying to achieve something very trivial: Get a list of elements, and then do something with the innerText of each element.

const tweets = await page.$$('.tweet');

From what I can tell, this returns a NodeList, just like the document.querySelectorAll() method in the browser.

How do I loop over it and get what I need? I tried various things, like:

[...tweets].forEach(tweet => {
  console.log(tweet.innerText)
});

Answer

Grant Miller · Oct 16, 2018

page.$$():

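page.$$() resolves to an array of ElementHandle objects, not a NodeList. An ElementHandle is a reference to an element in the page, not the element itself, so tweet.innerText is undefined in Node.js. You can use a combination of elementHandle.getProperty() and jsHandle.jsonValue() to obtain the innerText of each ElementHandle: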

const tweets = await page.$$('.tweet');

// Each iteration awaits the text before moving on, so the output stays in page order.
for (const tweet of tweets) {
  const text = await (await tweet.getProperty('innerText')).jsonValue();
  console.log(text);
}
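As a variation on the loop above, recent Puppeteer versions also expose elementHandle.evaluate(), which runs a function in the page with the element as its argument. The following is a minimal sketch under that assumption; collectInnerTexts is a hypothetical helper name, and handles is assumed to be the array returned by page.$$().

```javascript
// Hypothetical helper (not part of Puppeteer): reads innerText from each
// ElementHandle in turn and returns the strings in the same order.
async function collectInnerTexts(handles) {
  const texts = [];
  for (const handle of handles) {
    // The arrow function is executed in the browser; `el` is the real DOM element.
    texts.push(await handle.evaluate((el) => el.innerText));
  }
  return texts;
}
```

Usage would look like: const texts = await collectInnerTexts(await page.$$('.tweet'));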

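If you are set on using the forEach() method, keep in mind that forEach() does not wait for async callbacks. You can wrap the loop in a promise that resolves once every callback has finished (the callbacks run concurrently, so the last element is not necessarily the last to finish):

const tweets = await page.$$('.tweet');

await new Promise((resolve, reject) => {
  let pending = tweets.length;
  if (pending === 0) resolve(); // nothing to wait for
  tweets.forEach(async (tweet) => {
    try {
      const text = await (await tweet.getProperty('innerText')).jsonValue();
      console.log(text);
      if (--pending === 0) resolve(); // the last callback to finish, whichever it is
    } catch (err) {
      reject(err); // surface errors instead of leaving the promise pending
    }
  });
});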
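Another way to wait for every callback, without building the promise by hand, is Promise.all() over tweets.map(). This is a minimal sketch, not the only way to do it; getInnerTexts is a hypothetical helper name, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper (not part of Puppeteer): resolves to the innerText of
// every element matching `selector`.
async function getInnerTexts(page, selector) {
  const handles = await page.$$(selector); // array of ElementHandle
  // map() starts every lookup at once; Promise.all() waits for all of them
  // and keeps the results in the same order as `handles`.
  return Promise.all(
    handles.map(async (handle) => {
      const property = await handle.getProperty('innerText');
      return property.jsonValue();
    })
  );
}
```

Usage would look like: const texts = await getInnerTexts(page, '.tweet'); followed by an ordinary texts.forEach(text => console.log(text)). Promise.all() rejects as soon as any lookup fails, so errors are not silently lost.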

page.evaluate():

Alternatively, you can skip page.$$() entirely and use page.evaluate(), which runs the function inside the page and returns its serializable result (here, an array of strings):

const tweets = await page.evaluate(() => Array.from(document.getElementsByClassName('tweet'), e => e.innerText));

tweets.forEach(tweet => {
  console.log(tweet);
});
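A closely related option is page.$$eval(), which combines the selector lookup and the in-page function in one call. The sketch below assumes page is a Puppeteer Page; getInnerTextsWithEval is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of Puppeteer).
async function getInnerTextsWithEval(page, selector) {
  // The callback runs in the browser and receives an array of the matched
  // elements; only the returned array of strings is sent back to Node.js.
  return page.$$eval(selector, (elements) => elements.map((el) => el.innerText));
}
```

Usage would look like: const tweets = await getInnerTextsWithEval(page, '.tweet');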