pesky_programmer

Reputation: 149

timeout error with navigation and waitForSelector() in puppeteer irrespective of timeout value

I want my program to do this:

  1. open a web page
  2. click on a button to go to a new page
  3. take a screenshot of the new page.

Steps 1 and 2 work fine, but I'm running into a timeout error at step 3. Based on responses to similar questions on StackOverflow, I used waitForNavigation() with longer timeouts (up to 2 min), but I still get the same error. Using waitForSelector() instead of waitForNavigation() gives the same error. If I remove both, puppeteer takes a screenshot of the web page from step 1. I have also tried different waitUntil options, such as "domcontentloaded", "load", "networkidle0" and "networkidle2", but nothing works. This is my first program in puppeteer and I've been stuck on this problem for a long time.

Here's my code:

 await page.waitForSelector('#featured > c-wiz > div.OXo54d > div > div > div > span > span > span.veMtCf');
         
 // await navigation;
 await page.screenshot({path: 'learnmore.png'});
 console.log('GOT THIS FAR:)');
 //await page.close();
 await browser.close();
 return 0;

Here's the complete program:

const puppeteer = require('puppeteer');
(async () => {
    try{
        const browser = await puppeteer.launch({headless: false});
        const page = await browser.newPage();
       // const navigationPromise = page.waitForNavigation({waitUntil: "load"});

        //google.com
        await page.goto('https://google.com');
        await page.type('input.gLFyf.gsfi',"hotels in london");
        await page.keyboard.press('Enter');
        //search results
       // await navigationPromise;
        await page.waitForSelector('#rso > div:nth-child(2) > div > div > div > g-more-link > a > div');
        await page.click('#rso > div:nth-child(2) > div > div > div > g-more-link > a > div'); 
        //list of hotels
       // await navigationPromise;
        await page.waitForSelector('#yDmH0d > c-wiz.zQTmif.SSPGKf > div > div.lteUWc > div > c-wiz > div > div.gpcwnc > div.cGQUT > main > div > div.Hkwcrd.Sy8xcb.XBQ4u > c-wiz > div.J6e2Vc > div > div > span > span');
        await page.click("#yDmH0d > c-wiz.zQTmif.SSPGKf > div > div.lteUWc > div > c-wiz > div > div.gpcwnc > div.cGQUT > main > div > div.Hkwcrd.Sy8xcb.XBQ4u > c-wiz > div.l5cSPd > c-wiz:nth-child(3) > div > div > div > div.kCsInf.ZJqrAd.qiy8jf.G9g6o > div > div.TPQEac.qg10C.RCpQOe > a > button > span");
        //"learn more"
       // await navigationPromise;   
       
        //This is where timeout error occurs:
        await page.waitForSelector('#featured > c-wiz > div.OXo54d > div > div > div > span > span > span.veMtCf');             
       // await navigation;
        await page.screenshot({path: 'learnmore.png'});
        console.log('GOT THIS FAR:)');

        //await page.close();
        await browser.close();
        return 0;
    }
    catch(err){
       console.error(err);
    }
})()
.then(resolvedValue => {
    console.log(resolvedValue);
})
.catch(rejectedValue => {
    console.log(rejectedValue);
})

Upvotes: 0

Views: 3349

Answers (1)

theDavidBarton

Reputation: 8841

Your timeout occurs because the selector you are waiting for does not exist on the page. (If you open the browser console at the point where the script gets stuck and run $(selector), it returns null.)
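
The same check can be done from the script itself. This is a minimal sketch, assuming `page` is a Puppeteer Page; `findMissingSelectors` is a hypothetical helper name, and it relies only on `page.$()`, which resolves to null when nothing matches:

```javascript
// Hypothetical helper: return the selectors that match nothing on the page.
// `page` is assumed to be a Puppeteer Page; only page.$() is used.
async function findMissingSelectors(page, selectors) {
  const missing = [];
  for (const selector of selectors) {
    // page.$() resolves to null when no element matches the selector.
    const handle = await page.$(selector);
    if (handle === null) {
      missing.push(selector);
    }
  }
  return missing;
}
```

Calling it right before the failing waitForSelector() and logging the result shows whether the selector is simply absent, in which case no timeout value will help.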

Google uses dynamic class and id values precisely to prevent scripts from retrieving data (or at least to make it harder), so the selectors can have different values every time you visit the page.

If you really need to scrape its content you can use XPath selectors which are less fragile compared to dynamically changing selector names:

E.g.:

await page.waitForXPath('//h3[contains(text(), "The Best Hotels in London")]')

const link = await page.$x('//h3[contains(text(), "The Best Hotels in London")]')
await link[0].click()
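
If you need this for several elements, the XPath can be built from the visible text. This is only a sketch: `toXPathLiteral`, `textContainsXPath` and `clickByText` are hypothetical helper names, and it assumes a Puppeteer version that still provides `page.waitForXPath()`. Newer releases, as far as I know, drop `waitForXPath()` and `$x()` in favour of passing an `xpath/`-prefixed string to `waitForSelector()`, so check the version you have installed.

```javascript
// Turn arbitrary text into an XPath string literal.
// XPath 1.0 has no escape character, so text that contains both quote
// types has to be assembled with concat().
function toXPathLiteral(text) {
  if (!text.includes('"')) return `"${text}"`;
  if (!text.includes("'")) return `'${text}'`;
  const parts = text.split('"').map((part) => `"${part}"`);
  return `concat(${parts.join(`, '"', `)})`;
}

// Build an expression that matches <tag> elements whose text contains `text`.
function textContainsXPath(tag, text) {
  return `//${tag}[contains(text(), ${toXPathLiteral(text)})]`;
}

// Wait for the element and click it.
// Assumption: `page` is a Puppeteer Page that still has waitForXPath().
async function clickByText(page, tag, text, timeout = 30000) {
  const xpath = textContainsXPath(tag, text);
  const handle = await page.waitForXPath(xpath, { timeout });
  await handle.click();
  return xpath;
}
```

For example, `await clickByText(page, 'h3', 'The Best Hotels in London')` waits for the same heading as above and clicks it. Matching on visible text still breaks if the wording changes, but it does not depend on generated class names.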


Upvotes: 1
