Reputation: 255
I'm using Puppeteer and jsDOM to scrape this site: https://www.lcfc.com/matches/results.
I want the names of the teams of every match, so on the console I use this:
document.querySelectorAll('.match-item__team-container span')
.forEach(element => console.log(element.textContent));
In the browser console the names print fine, but when I use the same selector in my code it returns nothing.
This is my code:
const puppeteer = require('puppeteer');
const jsdom = require('jsdom');
(async () => {
  try {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    const response = await page.goto('https://www.lcfc.com/matches/results');
    const body = await response.text();
    const { window: { document } } = new jsdom.JSDOM(body);
    document.querySelectorAll('.match-item__team-container span')
      .forEach(element => console.log(element.textContent));
    await browser.close();
  } catch (error) {
    console.error(error);
  }
})();
I don't get any errors. Any suggestions? Thank you.
I have now tried the code below, but it still doesn't work. Here is the code and a picture of the console:
const puppeteer = require('puppeteer');
(async () => {
  try {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.waitForSelector('.match-item__team-container span');
    const data = await page.evaluate(() => {
      document.querySelectorAll('.match-item__team-container span')
        .forEach(element => console.log(element.textContent));
    });
    //listen to console events in the chrome tab and log it in nodejs process
    page.on('console', consoleObj => console.log(consoleObj.text()));
    await browser.close();
  } catch (error) {
    console.log(error);
  }
})();
Upvotes: 1
Views: 526
Reputation: 2904
Do it the Puppeteer way: wait for the selector to appear with waitForSelector, then use evaluate to run your code inside the page:
// Listen to console events in the Chrome tab and log them in the Node.js process.
// Register the listener before calling evaluate, otherwise the messages are missed.
page.on('console', consoleObj => console.log(consoleObj.text()));
await page.waitForSelector('.match-item__team-container span');
const data = await page.evaluate(() => {
  const elements = document.querySelectorAll('.match-item__team-container span');
  elements.forEach(element => console.log(element.textContent));
  // or return the values of the selected items to Node.js
  return Array.from(elements, element => element.textContent);
});
evaluate runs your code inside the page in the active Chrome tab, so you don't need jsdom to parse the response.
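To make the "return the values" variant concrete, here is a minimal sketch. The names getTeamNames and main are hypothetical helpers made up for this illustration; only the selector and the URL come from the question. It assumes the page has already been opened with page.goto before getTeamNames is called.

```javascript
// Selector taken from the question; the helper names below are illustrative.
const TEAM_SELECTOR = '.match-item__team-container span';

// Hypothetical helper: collects the team names from an already-navigated page.
async function getTeamNames(page, selector = TEAM_SELECTOR) {
  await page.waitForSelector(selector);
  // The callback runs in the browser tab; only serializable values
  // (here, an array of strings) are sent back to the Node.js process.
  return page.evaluate((sel) => {
    return Array.from(document.querySelectorAll(sel), (el) => el.textContent.trim());
  }, selector);
}

// Hypothetical entry point; needs the puppeteer package and network access.
async function main() {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://www.lcfc.com/matches/results');
    const names = await getTeamNames(page);
    names.forEach((name) => console.log(name));
  } finally {
    // Close the browser even if scraping fails.
    await browser.close();
  }
}
```

To run it, add main().catch(console.error); at the bottom of the script. Returning the strings from evaluate means you don't depend on the page's console events at all, which is usually easier to work with than logging inside the tab.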
UPDATE
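The new timeout error most likely comes from waitForSelector giving up before the elements appear (by default it waits 30 seconds). First make sure the page is actually loaded with page.goto; the updated code in the question never navigates anywhere. If the page is simply slow, raise the limit with the timeout option, or pass {timeout: 0} to disable it. Note that timeout is an option of waitForSelector and goto, not of evaluate:
await page.goto('https://www.lcfc.com/matches/results');
await page.waitForSelector('.match-item__team-container span', {timeout: 60000});
const data = await page.evaluate(() => {
  const elements = document.querySelectorAll('.match-item__team-container span');
  elements.forEach(element => console.log(element.textContent));
  // or return the values of the selected items
  return Array.from(elements, element => element.textContent);
});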
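If you would rather handle a slow page than let the script crash, something along these lines might help. This is only a sketch: openAndWait is a hypothetical helper name, the 60 second default is an arbitrary choice, and it assumes that Puppeteer's timeout errors have the name TimeoutError, so check that against the version you have installed.

```javascript
// Hypothetical helper: navigates to the URL, then waits for the selector.
// Resolves to true if the selector appeared, false if the wait timed out.
async function openAndWait(page, url, selector, timeoutMs = 60000) {
  // Navigate first; waiting for a selector on a blank page can only time out.
  await page.goto(url, { timeout: timeoutMs });
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
    return true;
  } catch (error) {
    // Assumption: timeout errors are identified by their name.
    if (error && error.name === 'TimeoutError') {
      return false;
    }
    // Anything else is unexpected, so let the caller see it.
    throw error;
  }
}
```

You would call it as const found = await openAndWait(page, 'https://www.lcfc.com/matches/results', '.match-item__team-container span'); and only run evaluate when found is true.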
Upvotes: 1