hassan-badat

Reputation: 63

How do I scrape a site that loads data asynchronously using node.js?

I am trying to scrape a site using Axios to make the request and Cheerio to parse the response. The problem is that the site loads its data asynchronously after the initial page load, so the HTML I get back contains "Loading..." rather than the actual data. Is there a way to configure the Axios request to wait for the data to finish loading, or is there a different library I should use to make the request?

Upvotes: 3

Views: 939

Answers (1)

Marcos Casagrande

Reputation: 40424

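Axios only fetches the initial HTML and does not execute the page's JavaScript, so it cannot wait for the data. You'll need to check which XHR calls the page makes (for example, in the Network tab of your browser's developer tools) and request those URLs directly, since the content you want does not come from the main URL but from other API calls.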
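
As a rough sketch of requesting the data endpoint directly: the URL `https://example.com/api/items` is a made-up placeholder for whatever call shows up in the Network tab, and `fetchData` is a hypothetical helper name. It assumes the endpoint returns JSON and that you are on Node.js 18 or later, where `fetch` is built in; the same idea works with Axios.

```javascript
// Hypothetical endpoint: replace it with the URL you see in the Network tab.
const API_URL = 'https://example.com/api/items'

// fetchImpl defaults to the global fetch built into Node.js 18+.
// It is a parameter so a different HTTP client (or a stub) can be passed in.
async function fetchData(url, fetchImpl = fetch) {
  const response = await fetchImpl(url, {
    headers: { Accept: 'application/json' },
  })
  // fetch does not reject on HTTP error statuses, so check explicitly.
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`)
  }
  // Parse the body as JSON; no HTML parsing with Cheerio is needed here.
  return response.json()
}
```

Calling `fetchData(API_URL).then(console.log)` would then print the parsed data. Some sites also expect cookies or extra headers on these calls, in which case you would copy them from the request shown in the Network tab.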

But the easiest way to scrape content that's loaded dynamically with JavaScript is to use Puppeteer, which drives a real headless browser.

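 const puppeteer = require('puppeteer')

 async function main() {
   const browser = await puppeteer.launch({ headless: true })
   const page = await browser.newPage()
   await page.goto('https://example.com')

   // wait until the element rendered by JavaScript is in the DOM
   await page.waitForSelector('.someSelectorThatsLoadedWithJavascript')
   // get whatever value you want now.
   await browser.close()
 }

 main().catch(console.error)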
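
To illustrate the "get whatever value you want" step, here is a minimal sketch. `extractText` is a hypothetical helper, not part of Puppeteer, and it assumes you pass in a Puppeteer `page` that has already navigated to the target URL. The 10-second timeout is an arbitrary choice.

```javascript
// Hypothetical helper: waits for a selector, then returns its trimmed text.
// `page` is assumed to be a Puppeteer Page that has already called goto().
async function extractText(page, selector, timeoutMs = 10000) {
  // Rejects if the element does not appear within timeoutMs.
  await page.waitForSelector(selector, { timeout: timeoutMs })
  // The callback runs inside the browser, against the first matching element.
  return page.$eval(selector, (el) => el.textContent.trim())
}
```

It could be called as `const text = await extractText(page, '.someSelectorThatsLoadedWithJavascript')`. If you would rather keep your existing Cheerio parsing, `await page.content()` returns the rendered HTML as a string that can be passed to `cheerio.load`.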

Upvotes: 3
