Reputation: 520
I have some trouble using the newest version of puppeteer.
I'm using puppeteer version 0.13.0.
I have a site with this element:
<div class="header">hey there</div>
I'm trying to run this code:
const headerHandle = await page.evaluateHandle(() => {
  const element = document.getElementsByClassName('header');
  return element;
});
Now the headerHandle is a JSHandle with a description: 'HTMLCollection(0)'.
If I try to run headerHandle.getProperties() and console.log the result, I get Promise { <pending> }.
If I just try to get the element like this:
const result = await page.evaluate(() => {
  const element = document.getElementsByClassName('header');
  return Promise.resolve(element);
});
I get an empty object.
How do I get the actual element or the value of the element?
Upvotes: 7
Views: 12271
Reputation: 56885
Fabio's approach is good to have for working with arrays, but in many cases you don't need the nodes themselves, just their serializable contents or properties. In OP's case, there's only one element being selected, so the following works more directly (with less straightforward approaches shown for comparison):
const puppeteer = require("puppeteer"); // ^19.1.0
const html = `<!DOCTYPE html><html><body>
<div class="header">hey there</div>
</body></html>`;
let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html);
  const text = await page.$eval(".header", el => el.textContent);
  console.log(text); // => hey there
  // or, less directly:
  const text2 = await page.evaluate(() => {
    // const el = document.getElementsByClassName("header")[0] // take the 0th element (class name only, no leading dot)
    const el = document.querySelector(".header"); // ... better still
    return el.textContent;
  });
  console.log(text2); // => hey there
  // even less directly, similar to OP:
  const handle = await page.evaluateHandle(() =>
    document.querySelector(".header")
  );
  const text3 = await handle.evaluate(el => el.textContent);
  console.log(text3); // => hey there
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
Getting the text from multiple elements is also straightforward, not requiring handles:
const puppeteer = require("puppeteer");
const html = `<!DOCTYPE html><html><body>
<div class="header">foo</div>
<div class="header">bar</div>
<div class="header">baz</div>
</body></html>`;
let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html);
  // $$eval runs the callback in the browser with an array of all matches
  const text = await page.$$eval(
    ".header",
    els => els.map(el => el.textContent)
  );
  console.log(text); // => [ 'foo', 'bar', 'baz' ]
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());
As Fabio's approach attests, things get trickier when there are multiple elements and you want to work with their handles in Puppeteer. Unlike the ElementHandle[] return of page.$$, page.evaluateHandle's JSHandle return isn't iterable, even if the handle points to an array. It can only be expanded into an array back in the browser.
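To make the contrast concrete, here is a minimal sketch of the page.$$ side of that comparison. The helper name textsViaHandles is hypothetical, and page is assumed to be an already-open Puppeteer Page. It only illustrates that the result of page.$$ is an ordinary array you can loop over in Node.

```javascript
// Sketch: collect text from every match by iterating the ElementHandle[]
// that page.$$ returns. `page` is assumed to be a Puppeteer Page;
// `textsViaHandles` is a made-up helper name.
async function textsViaHandles(page, selector) {
  const handles = await page.$$(selector); // plain array of ElementHandles
  const texts = [];
  for (const handle of handles) {
    // evaluate runs the callback in the browser with the underlying node
    texts.push(await handle.evaluate(el => el.textContent));
    await handle.dispose(); // release the handle once it's no longer needed
  }
  return texts;
}
```

With the three-div page from the earlier snippet, calling await textsViaHandles(page, ".header") should give the same texts as the $$eval version, at the cost of one round trip per element.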
One workaround is to return the length, optionally attach the array of selected elements to the window (or re-query it each time), then run a loop and call evaluateHandle to return each ElementHandle:
// ...
await page.setContent(html);
const length = await page.$$eval(".header", els => {
  window.els = els;
  return els.length;
});
const nodes = [];
for (let i = 0; i < length; i++) {
  nodes.push(await page.evaluateHandle(i => window.els[i], i));
}
// now you can loop:
for (const el of nodes) {
  console.log(await el.evaluate(el => el.textContent));
}
// ...
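Another option, sketched here under some assumptions, is the getProperties method that OP tried. It returns a promise (hence the Promise { <pending> } in the question when it isn't awaited) that resolves to a Map of property names to JSHandles, and asElement converts each value to an ElementHandle, or null when the value isn't an element. The helper name elementHandlesFrom is hypothetical. I'd expect the entries to come back in index order for a real array, but I haven't verified that across Puppeteer versions, so treat the ordering as an assumption.

```javascript
// Sketch: expand a JSHandle that points at an array of DOM nodes into a
// Node-side array of ElementHandles. `elementHandlesFrom` is a made-up name.
async function elementHandlesFrom(arrayHandle) {
  // getProperties must be awaited; it resolves to a Map<string, JSHandle>
  const properties = await arrayHandle.getProperties();
  const elements = [];
  for (const property of properties.values()) {
    const element = property.asElement(); // null for non-element values
    if (element) {
      elements.push(element);
    }
  }
  return elements;
}
```

A call might look like elementHandlesFrom(await page.evaluateHandle(() => [...document.querySelectorAll(".header")])). Spreading into a true array is a deliberate choice here, since I'm not certain a live HTMLCollection exposes its items the same way.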
See also Puppeteer find list of shadowed elements and get list of ElementHandles which, in spite of the shadow DOM in the title, is mostly about working with arrays of handles.
Upvotes: 0
Reputation: 22862
Puppeteer has changed the way evaluate works. The safest way to retrieve DOM elements is to create a JSHandle and pass that handle to the evaluate function:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });
  const jsHandle = await page.evaluateHandle(() => {
    const elements = document.getElementsByTagName('h1');
    return elements;
  });
  console.log(jsHandle); // JSHandle
  const result = await page.evaluate(els => els[0].innerHTML, jsHandle);
  console.log(result); // it will log the string 'Example Domain'
  await browser.close();
})();
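The same pattern can be wrapped in a small helper that also releases the handle when it is done. This is only a sketch: firstInnerHTML is a hypothetical name, page is assumed to be an open Puppeteer Page, and returning null when nothing matches is my own choice rather than anything Puppeteer prescribes.

```javascript
// Sketch: create a JSHandle, pass it to page.evaluate, then dispose of it.
// `firstInnerHTML` is a made-up helper name.
async function firstInnerHTML(page, tagName) {
  // Extra arguments to evaluateHandle are forwarded to the browser-side callback
  const listHandle = await page.evaluateHandle(
    tag => document.getElementsByTagName(tag),
    tagName
  );
  try {
    // In the browser, the handle resolves back to the live HTMLCollection
    return await page.evaluate(
      els => (els[0] ? els[0].innerHTML : null),
      listHandle
    );
  } finally {
    await listHandle.dispose(); // don't keep the remote object alive
  }
}
```

Against the example.com page above, await firstInnerHTML(page, 'h1') should return the same string as result.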
For reference: evaluate docs, issue #1590, issue #1003 and PR #1098
Upvotes: 21