f1rstsurf

Reputation: 647

Use post variable with querySelector

I'm facing an issue trying to scrape data from the web with Puppeteer and querySelector.

I have a Node.js web server that handles a POST request and then calls a function to scrape the data. I'm sending two parameters (postBlogUrl and postDomValue).

postDomValue contains, as a string, the selector I'm trying to fetch data from, for example: [itemprop='articleBody'].

If I hard-code the selector ([itemprop='articleBody']), everything works and I'm able to retrieve the data, but if I use the postDomValue variable, nothing is returned.

I already tried to escape the variable using CSS.escape(postDomValue), but no luck.

fetchBlogContent: async function(postBlogUrl, postDomValue) {
  try {
    const puppeteer = require('puppeteer');
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto(postBlogUrl, {
      waitUntil: 'load'
    });
    let description = await page.evaluate(() => {
      // This works:
      return document.querySelector("[itemprop='articleBody']").innerHTML;
      // This doesn't:
      // return document.querySelector(postDomValue).innerHTML;
    });
    return description;
  } catch (err) {
    // handle err
    return err;
  }
}

Upvotes: 1

Views: 428

Answers (2)

fortunee

Reputation: 4332


const description = await page.evaluate((value) =>
    document.querySelector(value).innerHTML, postDomValue);

See the Puppeteer docs on how to pass arguments to page.evaluate().
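
A slightly more defensive variant, offered only as a sketch: querySelector returns null when nothing matches, so reading innerHTML directly would throw inside the page. The helper name readInnerHtml is hypothetical, and `page` is assumed to be a Puppeteer Page (or any object with a compatible evaluate(fn, ...args) method).

```javascript
// Hypothetical helper, not part of Puppeteer. `page` is assumed to expose
// evaluate(fn, ...args) the way a Puppeteer Page does.
async function readInnerHtml(page, selector) {
  if (typeof selector !== 'string' || selector.trim() === '') {
    throw new TypeError('selector must be a non-empty string');
  }
  // The selector travels to the page as an argument, not as a closure variable.
  return page.evaluate((sel) => {
    const element = document.querySelector(sel);
    // Return null instead of throwing when the selector matches nothing.
    return element ? element.innerHTML : null;
  }, selector);
}
```

The caller can then tell "no match" (null) apart from an empty element (an empty string).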

Upvotes: 3

vsemozhebuty

Reputation: 13812

If I understand correctly, the issue may be that you are trying to use a variable declared in the Node.js context inside the function passed to page.evaluate(), which is executed in the browser context. In such cases, you need to pass the value of the variable as an additional argument:

  let description = await page.evaluate((selector) => {
    return document.querySelector(selector).innerHTML;
  }, postDomValue);

See more in page.evaluate().
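
To make the two contexts concrete, here is a rough simulation that runs without a browser. It assumes, as a simplification, that the function is sent to the other side as source text and its arguments as JSON. fakeEvaluate and the fake document object are hypothetical stand-ins, not Puppeteer APIs.

```javascript
const vm = require('node:vm');

// Hypothetical stand-in for page.evaluate(): the function is converted to
// source text and run in a separate context, so it cannot see variables
// from the calling scope. Arguments are sent along as JSON.
function fakeEvaluate(fn, ...args) {
  const sandbox = {
    // Fake "browser" global; a real page would provide the actual document.
    document: {
      querySelector: (selector) =>
        selector === "[itemprop='articleBody']" ? { innerHTML: '<p>Hello</p>' } : null,
    },
  };
  const source = `(${fn.toString()})(...${JSON.stringify(args)})`;
  return vm.runInNewContext(source, sandbox);
}

const postDomValue = "[itemprop='articleBody']";

// Closure variable: postDomValue does not exist in the other context.
let closureError = null;
try {
  fakeEvaluate(() => document.querySelector(postDomValue).innerHTML);
} catch (err) {
  closureError = err.name;
}

// Passed as an argument: the value is copied into the other context.
const html = fakeEvaluate(
  (selector) => document.querySelector(selector).innerHTML,
  postDomValue
);

console.log(closureError, html);
```

In this simulation the first call fails with a ReferenceError because postDomValue is undefined on the "page" side, while the second call receives the selector as a parameter and succeeds.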

Upvotes: 2
