Reputation: 279
I wrote this piece of code, but I'm not able to get the links:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const countries = ['us', 'gb', 'ca', 'au', 'de', 'nz', 'albania', 'nl', 'is'];
  const pia = 'https://www.privateinternetaccess.com/pages/network/';
  await page.goto(pia);
  for (let i = 0; i < countries.length; i++) {
    let el = document.querySelectorAll(`#${countries[i]} > div > div > div.modal-body > div > .subregion > center > .hostname`);
    for (let j = 0; j < el.length; j++) {
      let url = `http://${el[j].innerText}:8888/speedtest`;
      console.log(url);
    }
  }
  await browser.close();
})();
When I paste the "countries" array and the for loop into the browser's console, it works fine. When I run the script from Node, it throws the error below, even though "await page.content()" prints the whole page:
(node:16300) UnhandledPromiseRejectionWarning: ReferenceError: document is not defined
at C:\Users\jason\Desktop\pptr\script.js:15:17
at processTicksAndRejections (internal/process/task_queues.js:97:5)
(node:16300) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:16300) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
I thought it could be the way I'm targeting the elements, but again, it works fine in the browser's console. What am I missing? Any help is welcome. Thanks!
Upvotes: 0
Views: 1028
Reputation: 13822
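Puppeteer scripts run in the Node.js context, which has no direct access to the browser context (window, document, Web APIs). That is why "document" is not defined in your loop. You need to use page.evaluate() to run code in the browser context and return the data from the document. The return value has to be serializable, so the code below returns an array of strings rather than the DOM elements themselves: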
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  const countries = ['us', 'gb', 'ca', 'au', 'de', 'nz', 'albania', 'nl', 'is'];
  const pia = 'https://www.privateinternetaccess.com/pages/network/';
  await page.goto(pia);
  for (let i = 0; i < countries.length; i++) {
    // The callback runs in the browser context; countries[i] is passed in as `country`.
    const el = await page.evaluate(country => Array.from(
      document.querySelectorAll(`#${country} > div > div > div.modal-body > div > .subregion > center > .hostname`),
      element => element.innerText,
    ), countries[i]);
    // Back in Node.js, `el` is a plain array of hostname strings.
    for (let j = 0; j < el.length; j++) {
      const url = `http://${el[j]}:8888/speedtest`;
      console.log(url);
    }
  }
  await browser.close();
})();
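As a variation on the code above, here is a minimal sketch that uses page.$$eval(), which runs querySelectorAll in the browser and hands the matched elements to a callback in one step. The helper names collectHostnames and buildSpeedtestUrls and the HOSTNAME_PATH constant are made up for illustration, and the trimming and empty-string filtering are my own assumptions about the page's text, not something the site guarantees. The selector is the one from the question.

```javascript
// Selector tail copied from the question; the country id is prepended per call.
const HOSTNAME_PATH = '> div > div > div.modal-body > div > .subregion > center > .hostname';

// Hypothetical pure helper: turns hostname strings into speed-test URLs.
// Runs entirely in Node.js, so it needs no browser.
function buildSpeedtestUrls(hostnames) {
  return hostnames
    .map(name => name.trim())              // assumption: innerText may carry whitespace
    .filter(name => name.length > 0)       // assumption: skip empty entries
    .map(name => `http://${name}:8888/speedtest`);
}

// Hypothetical helper: `page` is a Puppeteer Page that has already navigated.
async function collectHostnames(page, country) {
  // The callback runs in the browser context; only the returned
  // array of strings is sent back to Node.js.
  return page.$$eval(
    `#${country} ${HOSTNAME_PATH}`,
    elements => elements.map(element => element.innerText),
  );
}
```

Inside the async function from the answer, after page.goto(pia), this could be used as: const urls = buildSpeedtestUrls(await collectHostnames(page, 'us')); Keeping the URL building in a separate function means it can be checked without launching a browser.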
Upvotes: 1