xpt

Reputation: 22994

Puppeteer waitForSelector and a "non-existent" element

If you try the following simple example, you will find that it works without a problem:

const puppeteer = require('puppeteer');

(async() => {

  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const response = await page.goto(`https://finance.yahoo.com/chart/TSLA`, {waitUntil: 'networkidle2'});
  console.log(await response.text());

  let selector = `div#chart-toolbar ul > li`
  page.waitForSelector(selector)
  selector = `div#chart-toolbar ul > li:nth-child( 7 ) > button` // > span > span
  page.click(selector)

  const inputElement = await page.$('div#fin-chartiq')
  await inputElement.screenshot({path: 'yahoo-finance.png'})
  await browser.close();

})();

The yahoo-finance.png file looks like this:

[image: screenshot of the chart, saved as yahoo-finance.png]

However, if you take a closer look at the page source printed by response.text(), you'll find that the keyword chart-toolbar is simply not in it.

How can the above code work with such a "non-existent" element?

The reason I'm asking is that this is a stripped-down version of my more complicated scraping program, in which puppeteer, browser, page, and selector all come from different levels of a class hierarchy, and in which I'm getting:

TimeoutError: waiting for selector "div#chart-toolbar ul > li" failed: timeout 30000ms exceeded

The code is literally the same, just spread across different class files. I've run out of ideas as to why one version mysteriously works while the other doesn't. Please help. Thanks.

Upvotes: 1

Views: 1956

Answers (1)

mbit

Reputation: 3013

There is no such thing as a "non-existent" element here. The element is in the final DOM; response.text() only gives you the original HTML content as the server sent it, before any scripts modified the page. For the final DOM, use page.content():

const response = await page.goto(`https://finance.yahoo.com/chart/TSLA`, {waitUntil: 'networkidle0'});
console.log(await page.content());
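To see the difference for yourself, you could compare both sources for the keyword. This is only a minimal sketch: locateKeyword is a hypothetical helper name (not part of Puppeteer), and it assumes `response` is what page.goto() resolved to and `page` is the same page object.

```javascript
// Hypothetical helper: reports where a keyword shows up.
// `response` is the value page.goto() resolved to; `page` is the Puppeteer page.
async function locateKeyword(page, response, keyword) {
  const originalHtml = await response.text(); // HTML as sent by the server
  const renderedDom = await page.content();   // serialized DOM after scripts ran
  return {
    inOriginalHtml: originalHtml.includes(keyword),
    inRenderedDom: renderedDom.includes(keyword),
  };
}
```

Called as `await locateKeyword(page, response, 'chart-toolbar')` after navigation, a result where only inRenderedDom is true would indicate that the toolbar is built by JavaScript after the initial HTML arrives.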

Regarding the timeout, it might be a race condition between navigation and waiting for the element to appear. This can happen for many reasons, and it's very difficult to pinpoint the problem without looking at the code. I'd suggest setting headless to false, navigating to the page, and checking whether the navigation completes and whether you can find the element you're looking for.
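If watching the browser isn't practical (for example on a server), a rough alternative is to capture some evidence at the moment the wait fails. The following is only a sketch: waitOrDiagnose and the file name timeout-debug.png are made up for illustration, and it assumes `page` is an ordinary Puppeteer page.

```javascript
// Hypothetical helper: wait for a selector and, on failure, capture evidence.
async function waitOrDiagnose(page, selector, timeout = 30000) {
  try {
    return await page.waitForSelector(selector, { timeout });
  } catch (err) {
    // page.url() shows where the page actually is when the wait failed.
    console.error(`"${selector}" not found on ${page.url()}: ${err.message}`);
    // 'timeout-debug.png' is an arbitrary file name chosen for this sketch.
    await page.screenshot({ path: 'timeout-debug.png', fullPage: true });
    throw err; // rethrow so the caller still sees the original error
  }
}
```

If the logged URL or the screenshot shows a consent page, a redirect, or a half-loaded chart, that would point to navigation timing rather than to the selector itself.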


I also noticed that you're not awaiting waitForSelector and click. I'm getting the expected results with the following:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // HERE: change networkidle2 to networkidle0
    const response = await page.goto(`https://finance.yahoo.com/chart/TSLA`, {waitUntil: 'networkidle0'});

    let selector = `div#chart-toolbar ul > li`
    await page.waitForSelector(selector)
    selector = `div#chart-toolbar ul > li:nth-child( 7 ) > button` // > span > span
    await page.click(selector)

    const inputElement = await page.$('div#fin-chartiq')
    await inputElement.screenshot({path: 'yahoo-finance.png'})
  } catch (err) {
    console.log(err.message);
  } finally {
    await browser.close();
  }
})()
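The awaits matter because both calls return promises: without await, click can fire before the element exists, and any rejection goes unhandled instead of reaching your catch block. If the same wait-then-click pattern is repeated across several classes, one option is to wrap it once. This is just a sketch; waitAndClick is a hypothetical name, not a Puppeteer API.

```javascript
// Hypothetical helper: click only after the element is known to be in the DOM.
async function waitAndClick(page, selector, options = {}) {
  // Resolves once the selector matches; rejects with a TimeoutError otherwise.
  await page.waitForSelector(selector, options);
  // Runs strictly after the wait has resolved.
  await page.click(selector);
}
```

Usage would be `await waitAndClick(page, selector)`; the caller still has to await it, otherwise the same race comes back one level up.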

Upvotes: 2
