Igor Savinkin

Reputation: 6267

Get an attribute of a page element in Puppeteer/Apify

I can fetch the textContent of an HTML element in Puppeteer:

var website_element = await page.$('a[itemprop="url"]');
var website = await (await website_element.getProperty('textContent')).jsonValue();

Yet sometimes the textContent is not enough. See the following HTML:

<a itemprop="url" href="https://www.4-b.ch/de/4b-fenster-fassaden/home/">
https://www.4-b.ch/de/4b-fenster-fassad...</a>

The result is truncated: "https://www.4-b.ch/de/4b-fenster-fassad...", with an ellipsis at the end.

So I would rather get the href attribute.

But when I try:

var website_element = await page.$('a[itemprop="url"]');
var website = await (await website_element.getAttribute('href')).jsonValue();

The result is TypeError: website_element.getAttribute is not a function

Any suggestions?

Upvotes: 0

Views: 1116

Answers (2)

Ondra Urban

Reputation: 677

There's an easy and fast way to do this using the page.$eval function:

var website = await page.$eval('a[itemprop="url"]', el => el.href);

page.$eval first finds an element in the DOM using the provided selector (first argument), then invokes the callback (second argument) in the browser with the found element as its only argument. The return value of the callback becomes the return value of page.$eval() itself.
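To make the difference between the href attribute, the href property and the text visible, here is a minimal sketch that returns all three in one page.$eval call. The helper name readLink is hypothetical, and page is assumed to be an already opened Puppeteer Page.

```javascript
// Sketch only: `readLink` is a hypothetical helper, `page` is assumed
// to be a Puppeteer Page that has already navigated to the target URL.
async function readLink(page, selector) {
  // page.$eval runs the callback in the browser against the first match
  // and rejects if no element matches the selector.
  return page.$eval(selector, (el) => ({
    attribute: el.getAttribute('href'), // value exactly as written in the HTML
    property: el.href,                  // absolute URL as resolved by the browser
    text: el.textContent.trim(),        // visible label, possibly shortened
  }));
}
```

With the markup from the question, attribute and property should both hold the full URL, while text holds the shortened label. The two differ only when the HTML contains a relative link.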

Upvotes: 1

Igor Savinkin

Reputation: 6267

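This works. An ElementHandle has no getAttribute method, but getProperty can read the element's href property: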

var website_element = await page.$('a[itemprop="url"]');
var website = await (await website_element.getProperty('href')).jsonValue();
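Since page.$ resolves to null when nothing matches, the snippet above throws on pages without such a link. A slightly more defensive sketch follows. The helper name hrefFromHandle is hypothetical, and page is again assumed to be a Puppeteer Page.

```javascript
// Sketch only: `hrefFromHandle` is a hypothetical helper, `page` is assumed
// to be a Puppeteer Page.
async function hrefFromHandle(page, selector) {
  const handle = await page.$(selector); // null when no element matches
  if (handle === null) {
    return null; // let the caller decide how to treat a missing link
  }
  // getProperty returns a JSHandle; jsonValue() copies its value out of the browser.
  const property = await handle.getProperty('href');
  return property.jsonValue();
}
```

Returning null instead of throwing is a design choice for scrapers that visit many pages, where some are expected to lack the element.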

Upvotes: 0
