Reputation: 139
I'm trying to extract a table from a website and want to get all the column names first. After the request is made, I load the HTML into cheerio, but when I try to print the content matched by my selector, nothing appears on the console. What confuses me is that the same selector works when I run it directly in the browser console on that page, where it returns all of the columns.
Here is the URL I'm scraping: https://www.fundsexplorer.com.br/ranking
Here is the cheerio selector I'm using to return the columns. The content I want is in the th tags with the class 'sorting'.
$('.sorting').each(function (index, element) {
  const $element = $(element);
  console.log($element.text());
});
And here is the full code.
const request = require('request');
const cheerio = require('cheerio');

const fundsExplorerUrl = 'https://www.fundsexplorer.com.br/ranking';

request(fundsExplorerUrl, function (error, response, body) {
  if (!error && response.statusCode === 200) {
    const $ = cheerio.load(body);
    $('.sorting').each(function (index, element) {
      const $element = $(element);
      console.log($element.text());
    });
  }
});
Thanks for helping!
Upvotes: 1
Views: 592
Reputation: 58
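The raw HTML has no class called sorting, because JavaScript adds that class to the DOM dynamically after the page loads. cheerio only parses the HTML returned by the request and does not run the page's scripts, so the selector matches nothing, while the browser console sees the DOM after the scripts have run. In this specific case, you can use the following code to gather the content of all th tags inside the thead tag of the table tag.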
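const request = require('request-promise');
const cheerio = require('cheerio');

const url = 'https://www.fundsexplorer.com.br/ranking';

async function crawl() {
  // Fetch the raw HTML; no page scripts are executed at this point.
  const rawHtml = await request(url);
  const $ = cheerio.load(rawHtml);

  // Select the header cells by structure instead of by the script-added class.
  $('table thead tr th').each((index, element) => {
    console.log($(element).text());
  });
}

// Log request or parsing failures instead of leaving the rejection unhandled.
crawl().catch(console.error);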
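If you want to check for yourself whether a class is present in the raw response before writing a selector, a small dependency-free helper like the one below can do it. This is only a sketch: rawHtmlHasClass is a hypothetical helper name, and sampleRawHtml is made-up markup standing in for the real response body, not the actual source of the page.

```javascript
// Hypothetical sample of what a server might send before any scripts run.
// This markup is an assumption for illustration, not the real page source.
const sampleRawHtml = `
<table id="table-ranking">
  <thead>
    <tr><th>Codigo</th><th>Setor</th></tr>
  </thead>
</table>
<script>/* a table plugin adds the sorting class here at runtime */</script>
`;

// Returns true if any class attribute in the HTML string lists className.
// Hypothetical helper: a rough regex check, not a full HTML parser.
function rawHtmlHasClass(html, className) {
  const classAttr = /class\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  for (const match of html.matchAll(classAttr)) {
    const value = match[1] !== undefined ? match[1] : match[2];
    if (value.split(/\s+/).includes(className)) {
      return true;
    }
  }
  return false;
}

console.log(rawHtmlHasClass(sampleRawHtml, 'sorting')); // false
```

In a real run you would pass the rawHtml string from the request instead of the sample. If the check returns false while the browser console still finds the class, the class is most likely being added by a script, and a structural selector such as 'table thead tr th' is the safer choice.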
Upvotes: 2