dlyk1988

Reputation: 437

Web scraping with node.js/cheerio - cannot get <span> text

I need to make a simple web scraper to grab some basic info about the Athens Stock Exchange in real time. My weapon of choice is Node.js and more specifically the 'cheerio' module.

The info I want to grab is represented in the website as the text inside some elements. These elements are nested inside another one. An example is this:

<span id="tickerGeneralIndex" class="style3red">
  <span class="percentagedelta">
    -0,50%
  </span>
</span>

In this case, the data I want to extract is '-0,50%'.

The code I have written is this:

var request = require('request'),
    cheerio = require('cheerio');

request('http://www.euro2day.gr/AseRealTime.aspx', function (error, response, html) {
    if (!error && response.statusCode == 200) {
        var $ = cheerio.load(html);
        var span = $('span.percentagedelta').text();
        console.log(span);
    }
});

This code does not produce the desired output. When run, it logs a single empty line to the console.

I have tried to modify my code like this for testing purposes:

var request = require('request'),
    cheerio = require('cheerio');

request('http://www.euro2day.gr/AseRealTime.aspx', function (error, response, html) {
    if (!error && response.statusCode == 200) {
        var $ = cheerio.load(html);
        var span = $('span.percentagedelta').attr('class');
        console.log(span);
    }
});

This way I get 'percentagedelta' in the console. This is correct, as I have asked to get the class of the element. Of course this is not what I wanted. I merely did this to find out if the 'span' variable is loaded correctly.

I am beginning to suspect this has something to do with the characters in the text. Is it possible that some encoding issue is to blame? And if yes, how can I fix that?

Upvotes: 0

Views: 3647

Answers (1)

Rax Wunter

Reputation: 2767

The original HTML of http://www.euro2day.gr/AseRealTime.aspx has no data in 'percentagedelta'. You can confirm this by looking through your html variable.
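One way to look through the html variable is to print the raw markup around the class name, with no parsing involved. The following is only a sketch: contextAround is a hypothetical helper name, and sampleHtml is a made-up stand-in for the downloaded page, not the real response.

```javascript
// Return the raw markup surrounding the first occurrence of `needle`,
// or null if the needle is absent. `radius` is the number of characters
// of context kept on each side.
function contextAround(html, needle, radius) {
    var index = html.indexOf(needle);
    if (index === -1) {
        return null;
    }
    var start = Math.max(0, index - radius);
    var end = Math.min(html.length, index + needle.length + radius);
    return html.slice(start, end);
}

// ASSUMPTION: hypothetical stand-in for the downloaded page, in which the
// span exists but has no text inside it.
var sampleHtml = '<span id="tickerGeneralIndex" class="style3red">' +
    '<span class="percentagedelta"></span></span>';

console.log(contextAround(sampleHtml, 'percentagedelta', 20));
```

Inside the request callback you would call contextAround(html, 'percentagedelta', 100) instead. If the span shows up with nothing between its tags, the empty output is not an encoding problem.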

The data is inserted by JavaScript on the page after it loads:

$("#tickerGeneralIndex .percentagedelta").html(data.percentageDelta);

It may be simpler to fetch http://www.euro2day.gr/handlers/data.ashx?type=3 directly, which is what the page loads via AJAX.
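A rough sketch of that approach, using only Node's built-in http module. It assumes the handler returns UTF-8 JSON with a top-level percentageDelta field; that is a guess based on the data.percentageDelta reference in the page script and has not been verified against the real response. fetchBody and extractPercentageDelta are hypothetical helper names.

```javascript
var http = require('http');

// Download a URL and pass the full response body to the callback.
function fetchBody(url, callback) {
    http.get(url, function (response) {
        if (response.statusCode !== 200) {
            response.resume(); // discard the body so the socket is released
            callback(new Error('Unexpected status code: ' + response.statusCode));
            return;
        }
        var chunks = [];
        response.on('data', function (chunk) {
            chunks.push(chunk);
        });
        response.on('end', function () {
            // ASSUMPTION: the handler responds in UTF-8.
            callback(null, Buffer.concat(chunks).toString('utf8'));
        });
    }).on('error', callback);
}

// ASSUMPTION: the body is JSON with a top-level `percentageDelta` field.
function extractPercentageDelta(body) {
    var data = JSON.parse(body);
    return data.percentageDelta;
}

// Usage (left commented out so that loading this file makes no network call):
// fetchBody('http://www.euro2day.gr/handlers/data.ashx?type=3', function (error, body) {
//     if (error) {
//         return console.error(error);
//     }
//     console.log(extractPercentageDelta(body));
// });
```

If the handler turns out to return something other than JSON, or nests the value differently, only extractPercentageDelta needs to change.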

Upvotes: 3
