bp123

Reputation: 3417

Looping through page links: Puppeteer doesn't return values from the newly loaded page

Most of this works: it loads the first page and then loops over the list, opening each link in turn. The problem I'm having is with const name = await page.evaluate(() => document.querySelector('li.inline.t-24.t-black.t-normal.break-words').innerText);. I don't understand why it references the first loaded page and not the page that was opened by link.click();. Can someone please explain?

const puppeteer = require('puppeteer');

(async function main() {
  try {
    // launch puppeteer
    const browser = await puppeteer.launch({headless: false});
    // open browser
    const page = await browser.newPage();
    await page.setUserAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3844.0 Safari/537.36');


    // load first page to generate list of links to scrape
    await page.goto("https://www.websitepage1.come");
    // wait for page to load
    await page.waitForSelector('h4 > span')
    // get list of buttons on page
    const lists = await page.$$('ul.search-results__list.list-style-none > li.search-result.search-result__occluded-item.ember-view');

    // loop through list of links
    for (let i = 0; i < lists.length; i++) {
      const list = lists[i]
      const link = await list.$('a.search-result__result-link.ember-view');
      const linkName = await list.$('a.search-result__result-link.ember-view span.name.actor-name');
      const linkNameText = await page.evaluate(linkName => linkName.innerText, linkName);

      // open the link
      link.click();
      await page.waitForSelector('h4 > span')
      await page.waitForSelector('li.inline.t-24.t-black.t-normal.break-words')

      // THIS IS WHERE THE ERROR IS OCCURRING. IT RETURNS AN ELEMENT FROM THE FIRST PAGE: await page.goto("https://www.websitepage1.come");
      const name = await page.evaluate(() => document.querySelector('li.inline.t-24.t-black.t-normal.break-words').innerText);
      console.log('name', name);
      console.log('\n\n');
    }
  } catch (e) {
    console.log('our error', e);

  }
})();

Upvotes: 2

Views: 5507

Answers (1)

Vaviloff

Reputation: 16856

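After clicking the link you need to wait for the new page to load; otherwise the rest of the script keeps working with the first page. The click in the question is not awaited, and the waitForSelector calls that follow it can resolve immediately if those selectors already match elements on the first page. Wait for the navigation together with the click: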

// open and process the first page

await Promise.all([
  page.waitForNavigation(), // The promise resolves after navigation has finished
  link.click(), // Clicking the link will indirectly cause a navigation
]);

// now process the second page
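
To make the pattern above a little more concrete, here is a minimal sketch that wraps it in a helper and then reads a value from the newly loaded page. The helper name clickAndRead and its parameters are hypothetical (they are not part of Puppeteer); it only assumes that page is a Puppeteer Page, that link is an ElementHandle from that page, and that the click navigates the same tab.

```javascript
// Hypothetical helper, not part of Puppeteer: click a link, wait for the
// navigation it triggers, then read the text of one element on the new page.
async function clickAndRead(page, link, selector) {
  // Start waiting for the navigation before the click is dispatched,
  // so a fast navigation is not missed.
  await Promise.all([
    page.waitForNavigation(),
    link.click(),
  ]);

  // The new document is loaded now; wait for the element, then read its text.
  await page.waitForSelector(selector);
  return page.$eval(selector, el => el.innerText);
}
```

Inside the loop from the question this could be called as const name = await clickAndRead(page, link, 'li.inline.t-24.t-black.t-normal.break-words');. One caveat: element handles collected before a navigation generally stop being usable once the page has been replaced, so after the first click the remaining entries in lists would probably need to be re-queried (for example after await page.goBack()).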

Upvotes: 3
