user1897151

Reputation: 503

puppeteer to cheerio scraping from dynamic website for specific data

I want to scrape certain data from a mutual fund website so that I can track only selected funds instead of all of them.

So I tried using Puppeteer to scrape the dynamic table generated by the website. I managed to get the table, but when I try to parse it with cheerio, nothing seems to happen:

const puppeteer = require('puppeteer');
const cheerio = require('cheerio');

const scrapeImages = async (username) => {
  console.log("test");
  const browser = await puppeteer.launch({
    args: ['--no-sandbox']
  });
  const page = await browser.newPage();

  await page.goto('https://www.publicmutual.com.my/Our-Products/UT-Fund-Prices');
  // Give the page time to render the dynamic table
  await page.waitFor(5000);

  // Collect the inner HTML of every element matching the table container
  const data = await page.evaluate(() => {
    const tds = Array.from(document.querySelectorAll('div.form-group:nth-child(4) > div:nth-child(1) > div:nth-child(1)'));
    return tds.map(td => td.innerHTML);
  });

  await browser.close();

  console.log(data);

  let $ = cheerio.load(data);

  $('table > tbody > tr > td').each((index, element) => {
    console.log($(element).text());
  });
};

scrapeImages("test");

Ultimately, I am not sure how to do this directly with Puppeteer alone, without handing the HTML over to cheerio for the scraping. I would also like to scrape only selected funds. For instance, if you visit the page at https://www.publicmutual.com.my/Our-Products/UT-Fund-Prices

I would like to get only the funds selected by their abbreviation instead of all of them. How can I do this with Puppeteer alone?

Upvotes: 0

Views: 705

Answers (1)

pguardiario

Reputation: 54984

That page already loads jQuery, which is even better than cheerio because you can query the table directly inside the page:

const rows = await page.evaluate(() => {
  // One array of cell texts per table row
  return $('.fundtable tr').get().map(tr => $(tr).find('td').get().map(td => $(td).text()))
});
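
To keep only selected funds, the `rows` array returned by that `page.evaluate` call could be filtered in Node afterwards. Below is a minimal sketch, not a tested solution for this site: `filterFundRows` is a hypothetical helper, the `abbrColumn` default of 0 assumes the abbreviation sits in the first column (check the actual table), and the sample rows are made up for illustration.

```javascript
// Keep only the rows whose abbreviation cell matches one of the wanted funds.
// `rows` is an array of string arrays, one per table row.
// `abbrColumn` is an assumption: the index of the column holding the abbreviation.
function filterFundRows(rows, wanted, abbrColumn = 0) {
  // Normalize once so that matching ignores case and surrounding whitespace
  const wantedSet = new Set(wanted.map(a => a.trim().toUpperCase()));
  return rows.filter(cells => {
    // Header rows contain <th> cells only, so their <td> list is empty
    const abbr = (cells[abbrColumn] || '').trim().toUpperCase();
    return wantedSet.has(abbr);
  });
}

// Made-up sample rows, not real data from the site
const sampleRows = [
  [],
  ['AAA', 'Fund A', '0.1000'],
  ['BBB', 'Fund B', '0.2000'],
  [' ccc ', 'Fund C', '0.3000'],
];

const selected = filterFundRows(sampleRows, ['AAA', 'CCC']);
console.log(selected);
```

In the scraper, this would be called as `filterFundRows(rows, [...])` right after the `page.evaluate` call, with the list of abbreviations you want to track.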

Upvotes: 1
