Jak

Reputation: 291

Odd page selector behavior in Puppeteer

I am trying to grab an element from the DOM of the page I am scraping with Puppeteer.

After the page has loaded, I call page.$('.class-name'). It returns an odd object of the form { ClickTale: {} }.

When I call document.querySelector('.class-name') from the Chrome console, I get a completely different object, one that corresponds to the element I am looking for.

My goal is to access the href property of that element in Puppeteer. Thanks

Upvotes: 0

Views: 2561

Answers (2)

Hellonearthis

Reputation: 1762

I'm new to Puppeteer too and ran into the same thing, Jak. I would get an ElementHandle (a kind of JSHandle) in the response. I managed to extract the links using getProperty, but I could be doing it wrong. A better explanation of this is here

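// Assumes `page` is a single Puppeteer Page; $$ resolves to an array of ElementHandles.
const te_response = await page.$$('div[class="supergrid-bucket"] > a')
console.log(`Number of entries ${te_response.length}`)

for (const handle of te_response) {
  // getProperty resolves to a JSHandle; jsonValue() converts it to a plain string.
  const href = await (await handle.getProperty('href')).jsonValue()
  console.log(`link ${href}`)
}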
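
The loop above can probably be collapsed into a single page.$$eval call, which runs querySelectorAll in the browser and hands the matched elements to a callback. This is only a minimal sketch: collectLinks is a hypothetical helper name, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: gather the href of every element matching `selector`.
// Assumes `page` is a Puppeteer Page (any object exposing $$eval would work).
async function collectLinks(page, selector) {
  // The callback runs in the browser; only the returned strings come back to Node.js.
  return page.$$eval(selector, (anchors) => anchors.map((a) => a.href));
}

// Possible usage inside an async function:
//   const links = await collectLinks(page, 'div[class="supergrid-bucket"] > a');
//   console.log(`Number of entries ${links.length}`);
```

Since only plain strings are returned, there are no handles left to dispose of afterwards.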

Upvotes: 0

Md. Abu Taher

Reputation: 18826

page.$(selector) is different from document.querySelector:

  • querySelector runs inside the browser, whereas page.$ is called from Node.js.
  • page.$ returns a Promise that resolves to an ElementHandle, or to null if nothing matches. querySelector returns a DOM element.
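
To illustrate the second point, here is a rough sketch of reading href through the handle that page.$ gives back, including the null case. getHref is a hypothetical helper name, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: read the href of the first element matching `selector`,
// or return null when nothing matches. Assumes `page` is a Puppeteer Page.
async function getHref(page, selector) {
  const handle = await page.$(selector); // ElementHandle, or null if no match
  if (handle === null) {
    return null;
  }
  // The element itself stays in the browser; getProperty returns a JSHandle
  // and jsonValue() copies its value over to Node.js.
  const property = await handle.getProperty('href');
  const href = await property.jsonValue();
  await handle.dispose(); // release the browser-side reference
  return href;
}

// Possible usage inside an async function:
//   const href = await getHref(page, '.class-name');
```

The handle is just a reference to the element living in the page, which is why logging it in Node.js does not look like the element you see in the browser console.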

You can grab the href of that element using page.$eval, which runs document.querySelector in the page and passes the matched element to your function.

const href = await page.$eval('.class-name', elem => elem.href)

Upvotes: 1
