Nick

Reputation: 628

How do you return an object from the browser environment to the Node environment in Puppeteer?

I have the following code that attempts to scrape all the 'Add to basket' button elements from the page, put them in an array and return that array to the Node environment.

const puppeteer = require('puppeteer');

let getArrayofButtons = async () => {
  const browser = await puppeteer.launch({
    devtools: true,
  });

  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 1800 });

  await page.goto('http://books.toscrape.com/', {
    waitUntil: 'domcontentloaded',
  });

  await page.waitForSelector('.product_pod');
  let buttons = [];

  await page.evaluate(() => {
    buttons = [...document.querySelectorAll('*')].filter(e =>
      [...e.childNodes].find(n => n.nodeValue?.match('basket'))
    );
    console.log(buttons);
  });
  // browser.close();
};
getArrayofButtons().then(returnedButtons => {
  console.log(returnedButtons);
});

When I console.log(buttons); I can see the array of button elements in the browser environment, but when I try to return that array to the Node environment I get undefined.

My understanding is that page.evaluate() will return the value of the function passed to it, so if I replace:

buttons = [...document.querySelectorAll('*')].filter(e => [...e.childNodes].find(n => n.nodeValue?.match('basket')) );

with:

return [...document.querySelectorAll('*')].filter(e => [...e.childNodes].find(n => n.nodeValue?.match('basket')) );

it seems like it should work. Am I not resolving the Promise correctly?

Upvotes: 2

Views: 243

Answers (2)

hardkoded

Reputation: 21695

You can call evaluateHandle to get a pointer to that result.

const arrayHandle = await page.evaluateHandle(() => {
  // Declare the variable locally; this callback runs in the browser context.
  const buttons = [...document.querySelectorAll('*')].filter(e =>
    [...e.childNodes].find(n => n.nodeValue?.match('basket'))
  );
  return buttons;
});

Notice that arrayHandle is not an array. It is a JSHandle pointing to the array in the browser.

If you want to process each button on the Node side, you need to unpack that handle by calling its getProperties() method.

const properties = await arrayHandle.getProperties();
await arrayHandle.dispose();
const buttons = [];
for (const property of properties.values()) {
  const elementHandle = property.asElement();
  if (elementHandle)
    buttons.push(elementHandle);
}

Yes, that is quite a bit of boilerplate. Alternatively, you can skip the unpacking and pass the handle itself (before disposing of it) to an evaluate function.

await page.evaluate((elements) => elements[0].click(), arrayHandle);
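
If all you need on the Node side is data about the elements, the same idea can be used to map them to plain objects inside the browser. The following is only a sketch: describeHandledElements is a hypothetical helper name, page is assumed to be a Puppeteer Page, and arrayHandle is assumed to be the still-undisposed handle returned by evaluateHandle.

```javascript
// Hypothetical helper (not part of Puppeteer): `page` is a Puppeteer Page and
// `arrayHandle` is the JSHandle returned by page.evaluateHandle().
async function describeHandledElements(page, arrayHandle) {
  // The callback runs in the browser, where the handle resolves to the real
  // array of DOM elements. Only the plain objects it returns travel to Node.
  return page.evaluate(
    elements =>
      elements.map(el => ({
        tag: el.tagName,
        text: el.textContent.trim(),
      })),
    arrayHandle
  );
}
```

The elements themselves never leave the page; only their tag names and trimmed text are serialized back.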

Upvotes: 1

vsemozhebuty

Reputation: 13812

Unfortunately, page.evaluate() can only transfer serializable data (roughly, the data JSON can handle). DOM elements are not serializable. Consider returning an array of strings or something similar (HTML markup, attributes, text content, etc.).

Also, buttons is declared in the Puppeteer (Node.js) context and is not available in the browser context (that is, inside the function passed to page.evaluate()). So you need to return the value from that function and assign the result on the Node side: const buttons = await page.evaluate(...).
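
A minimal sketch of both points together, reusing the URL and selector from the question. The function names collectBasketButtonData and getBasketButtonData are made up for illustration, and the root parameter is an assumption added only so the browser-side function can be tried outside a browser; inside page.evaluate() it falls back to document.

```javascript
// Runs inside the page when passed to page.evaluate(). `root` defaults to the
// browser's `document`; the parameter is an assumption of this sketch.
function collectBasketButtonData(root = document) {
  return [...root.querySelectorAll('*')]
    .filter(el => [...el.childNodes].some(n => n.nodeValue?.match('basket')))
    // Return plain, serializable data instead of the DOM elements themselves.
    .map(el => ({
      tag: el.tagName,
      text: el.textContent.trim(),
      html: el.outerHTML,
    }));
}

async function getBasketButtonData() {
  const puppeteer = require('puppeteer'); // assumes puppeteer is installed
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('http://books.toscrape.com/', {
      waitUntil: 'domcontentloaded',
    });
    await page.waitForSelector('.product_pod');
    // evaluate() resolves to the callback's serialized return value,
    // so the result is assigned here, on the Node side.
    return await page.evaluate(collectBasketButtonData);
  } finally {
    await browser.close();
  }
}
```

Calling getBasketButtonData().then(data => console.log(data)) should then log an array of plain objects in Node rather than undefined.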

Upvotes: 0
