Reputation: 3417
This all works. It loads the first page and creates a loop to open each of the links within the list. The problem I'm having is with const name = await page.evaluate(() => document.querySelector('li.inline.t-24.t-black.t-normal.break-words').innerText); I don't understand why it is referencing the first loaded page and not the page that has been opened using link.click(). Can someone please explain?
const puppeteer = require('puppeteer');

(async function main() {
  try {
    // launch puppeteer
    const browser = await puppeteer.launch({headless: false});
    // open a new tab
    const page = await browser.newPage();
    await page.setUserAgent('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3844.0 Safari/537.36');
    // load first page to generate list of links to scrape
    await page.goto("https://www.websitepage1.come");
    // wait for page to load
    await page.waitForSelector('h4 > span');
    // get list of result items on page
    const lists = await page.$$('ul.search-results__list.list-style-none > li.search-result.search-result__occluded-item.ember-view');
    // loop through list of links
    for (let i = 0; i < lists.length; i++) {
      const list = lists[i];
      const link = await list.$('a.search-result__result-link.ember-view');
      const linkName = await list.$('a.search-result__result-link.ember-view span.name.actor-name');
      const linkNameText = await page.evaluate(linkName => linkName.innerText, linkName);
      // open the link
      link.click();
      await page.waitForSelector('h4 > span');
      await page.waitForSelector('li.inline.t-24.t-black.t-normal.break-words');
      // THIS IS WHERE THE ERROR IS OCCURRING. IT RETURNS AN ELEMENT FROM
      // THE PAGE LOADED BY await page.goto("https://www.websitepage1.come");
      const name = await page.evaluate(() => document.querySelector('li.inline.t-24.t-black.t-normal.break-words').innerText);
      console.log('name', name);
      console.log('\n\n');
    }
  } catch (e) {
    console.log('our error', e);
  }
})();
Upvotes: 2
Views: 5507
Reputation: 16856
After clicking that link you need to wait for the new page to load; otherwise the rest of the script will keep working with the first loaded page. You need to wait for the navigation:
// open and process the first page
await Promise.all([
page.waitForNavigation(), // The promise resolves after navigation has finished
link.click(), // Clicking the link will indirectly cause a navigation
]);
// now process the second page
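If you click links in several places, the pattern above could be wrapped in a small helper. This is only a sketch: clickAndWaitForNavigation is a hypothetical name of my own, not part of Puppeteer, and it assumes page is a Puppeteer Page and link is an ElementHandle whose click triggers a navigation.

```javascript
// Hypothetical helper (not a Puppeteer API): click a link and wait for the
// navigation that the click triggers.
// Assumes `page` exposes waitForNavigation() and `link` exposes click(),
// as Puppeteer's Page and ElementHandle do.
async function clickAndWaitForNavigation(page, link, options = {}) {
  // Start waiting for the navigation before the click is issued, so a fast
  // navigation cannot complete before we begin listening for it.
  const [response] = await Promise.all([
    page.waitForNavigation(options), // resolves once navigation has finished
    link.click(),                    // triggers the navigation
  ]);
  return response;
}
```

Inside the loop from the question, link.click(); would then become await clickAndWaitForNavigation(page, link);. Note that once the page has really navigated, element handles collected on the first page (the lists array) may no longer be usable, so you may need to go back and query them again, or collect the link URLs up front.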
Upvotes: 3