Reputation: 437
I need to make a simple web scraper to grab some basic info about the Athens Stock Exchange in real time. My weapon of choice is Node.js and more specifically the 'cheerio' module.
The info I want to grab is represented in the website as the text inside some elements. These elements are nested inside another one. An example is this:
<span id="tickerGeneralIndex" class="style3red">
    <span class="percentagedelta">
        -0,50%
    </span>
</span>
In this case, the data I want to extract is '-0,50%'.
The code I have written is this:
var request = require('request'),
    cheerio = require('cheerio');
request('http://www.euro2day.gr/AseRealTime.aspx', function (error, response, html) {
    if (!error && response.statusCode == 200) {
        var $ = cheerio.load(html);
        var span = $('span.percentagedelta').text();
        console.log(span);
    }
});
This code does not produce the desired output. When run, it logs a single empty line to the console.
I have tried to modify my code like this for testing purposes:
var request = require('request'),
    cheerio = require('cheerio');
request('http://www.euro2day.gr/AseRealTime.aspx', function (error, response, html) {
    if (!error && response.statusCode == 200) {
        var $ = cheerio.load(html);
        var span = $('span.percentagedelta').attr('class');
        console.log(span);
    }
});
This way I get 'percentagedelta' in the console. This is correct, as I have asked to get the class of the element. Of course this is not what I wanted. I merely did this to find out if the 'span' variable is loaded correctly.
I am beginning to suspect this has something to do with the characters in the text. Is it possible that some encoding issue is to blame? And if yes, how can I fix that?
Upvotes: 0
Views: 3647
Reputation: 2767
The original HTML of http://www.euro2day.gr/AseRealTime.aspx has no data in 'percentagedelta'. You can check this by looking through your html variable.
The data is filled in later by JavaScript running on the page:
$("#tickerGeneralIndex .percentagedelta").html(data.percentageDelta);
It may be simpler to fetch http://www.euro2day.gr/handlers/data.ashx?type=3 directly, which is the URL the page loads via AJAX.
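A minimal sketch of that approach, using only Node's built-in http module. It assumes the handler responds with JSON containing a percentageDelta field (inferred from data.percentageDelta in the page script above; the actual response format may differ, so inspect it first). The function names extractPercentageDelta and fetchPercentageDelta are made up for illustration.

```javascript
var http = require('http');

// Assumption: the response body is JSON with a `percentageDelta` field,
// as suggested by `data.percentageDelta` in the page's own script.
function extractPercentageDelta(body) {
    var data = JSON.parse(body);
    return data.percentageDelta;
}

// Hypothetical helper: fetches the URL and passes the extracted value
// (or an error) to a Node-style callback.
function fetchPercentageDelta(url, callback) {
    http.get(url, function (res) {
        if (res.statusCode !== 200) {
            res.resume(); // discard the body so the socket is released
            return callback(new Error('Unexpected status code: ' + res.statusCode));
        }
        var body = '';
        res.setEncoding('utf8');
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
            var value;
            try {
                value = extractPercentageDelta(body);
            } catch (err) {
                return callback(err); // body was not valid JSON
            }
            callback(null, value);
        });
    }).on('error', callback);
}

// Usage (performs a real network request, so it is left commented out):
// fetchPercentageDelta('http://www.euro2day.gr/handlers/data.ashx?type=3', function (err, delta) {
//     if (err) { return console.error(err); }
//     console.log(delta);
// });
```

If the handler turns out to return something other than JSON, only extractPercentageDelta needs to change; cheerio is not needed for this route, since there is no HTML to parse.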
Upvotes: 3