Edison Lo

Reputation: 476

PhantomJS unable to get refreshed content from an ASPX website

I want to get real-time updates of a value displayed on a website.

Website: http://www.aastocks.com/en/stocks/market/bmpfutures.aspx

Target HTML element classes: font26 bold cls ff-arial

I have been using the following PhantomJS code:

var page = require('webpage').create();
page.open('http://www.aastocks.com/en/stocks/market/bmpfutures.aspx', function(status) {
  var last_value = -1

  setInterval(function() {
    var value = page.evaluate(function() {
      return document.getElementsByClassName('font26 bold cls ff-arial')[0].innerText
    })

    if (value != last_value) {
      console.log("Value has been updated to " + value)
      last_value = value
    }
  }, 1000)
//  phantom.exit()
})


The problem is that when the code first runs it gets the value, but after that the value appears to be cached and never updates.

I also tried needle and cheerio in Node:

var needle = require('needle');
const cheerio = require('cheerio')
needle.get('http://www.aastocks.com/en/stocks/market/bmpfutures.aspx', 
function(error, response) {
  if (!error && response.statusCode == 200){
    const $ = cheerio.load(response.body)
    var value = $('#font26 bold cls ff-arial').html()
    console.log(value)
  }

});

Upvotes: 2

Views: 141

Answers (1)

Vaviloff

Reputation: 16838

Unfortunately, the value on the target page does not update in real time once the page has loaded, so we have to move the interval out of the page.open callback into the main scope and reload the page as often as necessary:

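var page = require('webpage').create();

var last_value = -1;

// Reload the page on every tick; the value only changes on a fresh load.
setInterval(function() {

    page.open('http://www.aastocks.com/en/stocks/market/bmpfutures.aspx', function(status) {

        // Skip this tick if the page failed to load.
        if (status !== 'success') {
            return;
        }

        var value = page.evaluate(function() {
            // All four classes sit on one element, so they can be passed together.
            var el = document.getElementsByClassName('font26 bold cls ff-arial')[0];
            return el ? el.innerText : null;
        });

        if (value !== null && value != last_value) {
            console.log("Value has been updated to " + value);
            last_value = value;
        }
    });

}, 3000);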

It is better not to hit the target site too often. You should also set a valid user agent, use a realistic viewport resolution, and rotate IPs.
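One way to keep the request rate under control is to schedule the next read only after the previous one has finished, instead of firing on a fixed setInterval. The sketch below is only an illustration of that idea: makeChangeTracker, pollSequentially and the fetchValue(callback) adapter are hypothetical names, not part of PhantomJS or needle, and the code is kept in ES5 style so it could be adapted to either environment.

```javascript
// Returns a function that reports whether a freshly read value differs
// from the one seen on the previous call.
function makeChangeTracker() {
  var last; // undefined until the first value arrives
  return function hasChanged(value) {
    if (value === last) return false;
    last = value;
    return true;
  };
}

// Polls by chaining setTimeout: the next read is scheduled only after the
// current one has called back, so slow responses never pile up.
// fetchValue is a hypothetical adapter that takes a Node-style
// callback(error, value); onChange is called with each new value.
function pollSequentially(fetchValue, onChange, delayMs) {
  var hasChanged = makeChangeTracker();
  var stopped = false;
  var timer = null;

  function tick() {
    fetchValue(function (error, value) {
      if (stopped) return;
      // Errors are ignored here; the next tick simply tries again.
      if (!error && hasChanged(value)) onChange(value);
      timer = setTimeout(tick, delayMs);
    });
  }

  tick();

  // Call the returned function to stop polling.
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

Here fetchValue would wrap whichever page load you use (page.open or an HTTP request) and pass the extracted text to its callback; a delay of several seconds or more is a reasonable starting point.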

P.S.

I just looked at the source of the page, and it turns out you don't even need PhantomJS: <div class="font26 bold cls ff-arial">26,696</div> is right there in the HTML. You can get it with any server-side scripting language.
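To illustrate that the value can be read from the raw HTML, here is a rough, dependency-free sketch. It assumes the markup looks exactly like the div quoted above, with the class attribute in double quotes and only text inside the element. extractByClassAttr and toNumber are hypothetical helper names, and a regular expression is a shortcut here; a real HTML parser is more robust.

```javascript
// Pulls the text of the first <div> whose class attribute equals classAttr.
// Returns null when no such element is found.
function extractByClassAttr(html, classAttr) {
  // Escape regex metacharacters so the class list is matched literally.
  var escaped = classAttr.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var pattern = new RegExp(
    '<div[^>]*class="' + escaped + '"[^>]*>([^<]*)</div>'
  );
  var match = pattern.exec(html);
  return match ? match[1].trim() : null;
}

// Converts a display string with thousands separators into a number.
function toNumber(text) {
  return Number(text.replace(/,/g, ''));
}

var sample = '<body><div class="font26 bold cls ff-arial">26,696</div></body>';
var text = extractByClassAttr(sample, 'font26 bold cls ff-arial');
console.log(text, toNumber(text));
```

Converting the text to a number is optional, but it makes comparisons between successive reads less sensitive to formatting.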

UPDATE on node migration

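You almost had it right! The catch is in how the selector is composed. In a CSS selector, # matches an id and a space means "descendant of", so '#font26 bold cls ff-arial' does not match the element. Since all those classes belong to one element, you need to chain them with dots and no spaces, like this: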

const needle = require('needle');
const cheerio = require('cheerio');

setInterval(function(){
    needle.get('http://www.aastocks.com/en/stocks/market/bmpfutures.aspx',
    function(error, response) {
      if (!error && response.statusCode == 200){
        const $ = cheerio.load(response.body);
        // Compound class selector: one element carrying all four classes.
        const el = $('.font26.bold.cls.ff-arial').first();
        // Guard against a missing element so the interval keeps running.
        if (el.length) {
          console.log(el.text().trim());
        }
      }
    });
}, 1000);
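The conversion from a class attribute to a selector can also be done mechanically. The helper below is a small sketch with a hypothetical name (classAttrToSelector); it assumes plain class names that need no CSS escaping.

```javascript
// Turns the value of a class attribute ("a b c") into a compound
// CSS selector (".a.b.c") that matches one element with all classes.
function classAttrToSelector(classAttr) {
  return classAttr
    .trim()
    .split(/\s+/)
    .filter(Boolean) // drops the empty entry produced by an empty string
    .map(function (name) { return '.' + name; })
    .join('');
}

console.log(classAttrToSelector('font26 bold cls ff-arial'));
// prints ".font26.bold.cls.ff-arial"
```

The result can be passed straight to cheerio's $() or to document.querySelector.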

Upvotes: 3
