Reputation: 647
I'm having trouble scraping data from the web with Puppeteer and querySelector.
I have a Node.js web server that handles a POST request and then calls a function to scrape the data. I'm sending two parameters (postBlogUrl and postDomValue).
postDomValue contains, as a string, the selector I'm trying to fetch data from, for example: [itemprop='articleBody'].
If I hard-code the selector ([itemprop='articleBody']), everything works and I can retrieve the data, but if I use the postDomValue variable, nothing is returned.
I already tried escaping the variable with CSS.escape(postDomValue), but no luck.
fetchBlogContent: async function(postBlogUrl, postDomValue) {
  try {
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(postBlogUrl, {
      waitUntil: 'load'
    })
    let description = await page.evaluate(() => {
      // This works:
      // return document.querySelector("[itemprop='articleBody']").innerHTML;
      // This doesn't:
      return document.querySelector(postDomValue).innerHTML;
    })
    return description
  } catch (err) {
    // handle err
    return err;
  }
}
Upvotes: 1
Views: 428
Reputation: 4332
const description = await page.evaluate((value) =>
document.querySelector(value).innerHTML, postDomValue);
See the Puppeteer docs on how to pass arguments to page.evaluate().
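One thing worth checking when passing the selector along is whether it gets wrapped in `JSON.stringify` on the way. Puppeteer serializes the extra arguments of `page.evaluate()` itself, so a plain string can be handed over unchanged. The small sketch below (plain Node.js, no browser needed, with the selector value taken from the question) shows what an extra `JSON.stringify` would do to it:

```javascript
// The selector as it would arrive in the POST request (value from the question).
const postDomValue = "[itemprop='articleBody']";

// JSON.stringify on a string wraps the whole value in double quotes.
const stringified = JSON.stringify(postDomValue);

console.log(postDomValue); // [itemprop='articleBody']
console.log(stringified);  // "[itemprop='articleBody']"
```

The second form starts with a quote character, which is not a valid CSS selector, so `document.querySelector` would reject it rather than match the element.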
Upvotes: 3
Reputation: 13812
If I understand correctly, the issue may be that you are using a variable declared in the Node.js context inside the function passed to page.evaluate(), which is executed in the browser context. In such cases, you need to pass the variable's value as an additional argument:
let description = await page.evaluate((selector) => {
return document.querySelector(selector).innerHTML;
}, postDomValue);
See more in the page.evaluate() documentation.
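To make the two-context idea a bit more concrete, here is a rough simulation using Node's built-in `vm` module. It is only a sketch: `evaluateInOtherContext` is a hypothetical stand-in for `page.evaluate()`, not Puppeteer's actual implementation. It mimics the relevant behaviour, namely that the function travels as source text and runs somewhere that cannot see the caller's variables:

```javascript
const vm = require('vm');

// Hypothetical stand-in for page.evaluate(): the function is turned into
// source text and run in a separate context, so variables from the calling
// scope do not travel with it. Arguments are serialized as JSON.
function evaluateInOtherContext(fn, ...args) {
  const source = `(${fn.toString()})(...${JSON.stringify(args)})`;
  return vm.runInNewContext(source, {});
}

const postDomValue = "[itemprop='articleBody']";

// Closure access: postDomValue does not exist on the other side.
let closureError = null;
try {
  evaluateInOtherContext(() => postDomValue);
} catch (err) {
  closureError = err.name;
}

// Explicit argument: the value is serialized and handed to the function.
const received = evaluateInOtherContext((selector) => selector, postDomValue);

console.log(closureError); // ReferenceError
console.log(received);     // [itemprop='articleBody']
```

With real Puppeteer the same pattern applies: anything the in-page function needs has to arrive through its parameters, and it has to be serializable.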
Upvotes: 2