gianni

Reputation: 113

Using Puppeteer to extract text from span

I'm using Puppeteer to extract the text of a span by its class name, but nothing is returned. I don't know whether that's because the page isn't loading in time.

This is my current code:

async function Reload() {
    Page.reload()

    Price = await Page.evaluate(() => document.getElementsByClassName("text-robux-lg wait-for-i18n-format-render"))
    console.log(Price)
}
Reload()

HTML

<div class="icon-text-wrapper clearfix icon-robux-price-container">
     <span class="icon-robux-16x16 wait-for-i18n-format-render"></span>
     <span class="text-robux-lg wait-for-i18n-format-render">689</span>
</div>

Upvotes: 4

Views: 1798

Answers (1)

Amjed Omar

Reputation: 1014

This happens because the function you passed to Page.evaluate() returns a non-serializable value.

From the official Puppeteer documentation:

If the function passed to the page.evaluate returns a non-Serializable value, then page.evaluate resolves to undefined

So the function passed to Page.evaluate() has to return the text of the span element rather than the Element object itself.

For example:

const puppeteer = require('puppeteer');

const htmlCode = `
  <div class="icon-text-wrapper clearfix icon-robux-price-container">
     <span class="icon-robux-16x16 wait-for-i18n-format-render"></span>
     <span class="text-robux-lg wait-for-i18n-format-render">689</span>
  </div>
`;

(async () => {
  const browser = await puppeteer.launch();

  const page = await browser.newPage();
  await page.setContent(htmlCode);

  const price = await page.evaluate(() => {
    const elements = document.getElementsByClassName('text-robux-lg wait-for-i18n-format-render');
    // Return an array of strings (serializable) instead of an array of elements.
    return Array.from(elements).map(element => element.innerText);
  });

  console.log(price); // logs the texts of all elements that have the classes above
  console.log(price[0]); // logs the text of the first element that has the classes above

  // other actions...
  await browser.close();
})();

NOTE: to get the HTML from another site by its URL, use page.goto() instead of page.setContent().
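On a live page the span may not exist yet when your code runs, and the question's code also calls Page.reload() without awaiting it. Here is a minimal sketch of one way to handle that, assuming `page` is a Puppeteer Page that is already open on the target URL. The helper name `readSpanText`, its parameters, and the 10-second default timeout are made up for this example, not part of Puppeteer.

```javascript
// Hypothetical helper: the name, parameters and default timeout are assumptions.
async function readSpanText(page, selector, timeoutMs = 10000) {
  // Wait for the reload navigation to finish before touching the DOM.
  await page.reload({ waitUntil: 'domcontentloaded' });

  // Wait until an element matching the selector exists.
  // This rejects if nothing appears within timeoutMs.
  const handle = await page.waitForSelector(selector, { timeout: timeoutMs });

  // Run the callback inside the page and return a plain string (serializable).
  return handle.evaluate(el => el.textContent.trim());
}
```

You could call it as `const price = await readSpanText(page, 'span.text-robux-lg');` from inside an async function. If the site fills in the number later through its own scripts, waiting for the element alone may not be enough, so treat this as a starting point.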

NOTE: document.getElementsByClassName() returns a collection of elements, so the function passed to page.evaluate() in the code above returns an array of texts, not a single text as it would if you looked up one element with document.getElementById().
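If you prefer CSS selectors, Puppeteer's page.$eval() and page.$$eval() make the same single-value versus array distinction. A rough sketch, assuming `page` is a Puppeteer Page. The helper names `readFirstText` and `readAllTexts` are made up for this example.

```javascript
// Hypothetical helpers: the names are assumptions made for this sketch.

// Resolves to the text of the first matching element, as a single string.
// page.$eval() throws if no element matches the selector.
function readFirstText(page, selector) {
  return page.$eval(selector, el => el.textContent.trim());
}

// Resolves to the texts of all matching elements, as an array of strings.
function readAllTexts(page, selector) {
  return page.$$eval(selector, els => els.map(el => el.textContent.trim()));
}
```

With the HTML from the question, `await readFirstText(page, 'span.text-robux-lg')` should give the price as a string, while `readAllTexts` gives an array with one entry per matching span.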

NOTE: to learn the difference between serializable and non-serializable objects, read the answers to this question on Stack Overflow.

Upvotes: 3
