outluch

Reputation: 901

Meteor app renders unreliably with spiderable

Situation:

I want to run an app on my Ubuntu VPS for crawl-testing purposes. The app uses meteor-router from Atmosphere, installed with the mrt package manager. On my local Mac OS X 10.8 machine, with PhantomJS installed via brew, everything works fine: I get a proper snapshot of my page by adding

http://sample.com/?_escaped_fragment_=

to the URL.

Problem:

Let's try the same on my Ubuntu VPS. I tried two ways:

1) Copy the unbundled app to the server and run it with the mrt run command. This is unstable: sometimes it renders fine, but sometimes my dynamic content is blank, as if my DB were empty.

2) Copy the unbundled app to the server, bundle it with mrt bundle fname.tgz, then unpack the .tgz and run its main.js with node. This way spiderable never works: I get blank output instead of dynamic data every time I try.

My ideas:

My Ubuntu machine has far less memory and CPU than my local machine. I suspect it therefore takes longer to generate the dynamic content, so PhantomJS thinks the page is finished and takes the snapshot before Meteor has rendered.

Any suggestions?

Upvotes: 3

Views: 513

Answers (2)

SanD

Reputation: 839

I believe that the proper way to do this is to pass a callback to page.open, like so (see the docs):

page.open(url, function (status) {
    ...
});

Also, if you want to rely on a timeout for the snapshot, I would shorten the timeout and retry in a loop, which is both faster and more reliable:

page.open(url, function (status) {
    // Give up immediately if the page failed to load.
    if(status !== 'success') {
        phantom.exit();
        return;
    }

    // True once Meteor is connected and all subscriptions are ready.
    function isReady() {
        return page.evaluate(function () {
            if('undefined' === typeof Meteor
            || 'undefined' === typeof(Meteor.status)
            || !Meteor.status().connected)
                return false;
            Meteor.flush();
            return Meteor._LivedataConnection._allSubscriptionsReady();
        });
    }

    // Re-check every 100 ms until the page is ready, then print the snapshot.
    function trySnapshot() {
        if(!isReady()) {
            setTimeout(trySnapshot, 100);
            return;
        }
        // Escapes are doubled as in spiderable.js, where this code
        // lives inside a string literal.
        console.log(page.content
            .replace(/<script[^>]+>(.|\\n|\\r)*?<\\/script\\s*>/ig, '')
            .replace('<meta name=\"fragment\" content=\"!\">', '')
        );
        phantom.exit();
    }
    trySnapshot();
});

I also think the last snippet will often run without hitting the timeout at all, because the page.open callback is called at the proper time.
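One case the retry loop above does not cover is a page that never becomes ready, which would leave PhantomJS polling forever. A minimal sketch of a bounded poll follows. The names pollUntil, intervalMs, maxTries and schedule are hypothetical and not part of PhantomJS or spiderable; schedule defaults to setTimeout and is injectable only so the logic can be exercised without real timers.

```javascript
// Hypothetical helper (not a PhantomJS or Meteor API): call `check` every
// `intervalMs` until it returns true or `maxTries` attempts have been made.
// `onDone(err, tries)` receives null on success, an Error on giving up.
function pollUntil(check, onDone, options) {
  var opts = options || {};
  var intervalMs = opts.intervalMs || 100;
  var maxTries = opts.maxTries || 50;
  var schedule = opts.schedule || setTimeout; // injectable for testing
  var tries = 0;

  function attempt() {
    tries += 1;
    if (check()) {
      onDone(null, tries);
      return;
    }
    if (tries >= maxTries) {
      onDone(new Error('not ready after ' + tries + ' tries'), tries);
      return;
    }
    schedule(attempt, intervalMs);
  }

  attempt();
}
```

Inside the page.open callback, check would be isReady, and onDone would either print the snapshot or call phantom.exit() when it receives an error.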

Upvotes: 0

outluch

Reputation: 901

I think I solved this issue. The problem really is in the spiderable.js file. This module runs PhantomJS in REPL mode and feeds it the following code via stdin:

var url = '" + url + "';
var page = require('webpage').create();
page.open(url);

setInterval(function() {
  var ready = page.evaluate(function () {
    if (typeof Meteor !== 'undefined'
        && typeof(Meteor.status) !== 'undefined'
        && Meteor.status().connected) {
      Meteor.flush();
      return Meteor._LivedataConnection._allSubscriptionsReady();
    }
    return false;
  });

  if (ready) {
    var out = page.content;
    out = out.replace(/<script[^>]+>(.|\\n|\\r)*?<\\/script\\s*>/ig, '');
    out = out.replace('<meta name=\"fragment\" content=\"!\">', '');

    console.log(out);
    phantom.exit();
  }
}, 100);

The problem is that once all the Meteor conditions pass, the code assumes page.content is 100% up to date, but it is not. The solution I found and tested is to wrap the body of the if block in a setTimeout (500 ms worked just fine for me):

  if (ready) {
    setTimeout(function () {
      var out = page.content;
      out = out.replace(/<script[^>]+>(.|\\n|\\r)*?<\\/script\\s*>/ig, '');
      out = out.replace('<meta name=\"fragment\" content=\"!\">', '');

      console.log(out);
      phantom.exit();
    }, 500);
  }
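A fixed 500 ms delay is a guess that may still be too short on a slower machine. One possible variation, sketched here and not tested against spiderable, is to wait until the content stops changing between polls. The names waitForStableContent, stableTicks, intervalMs and schedule are hypothetical; getContent would be something like function () { return page.content; }.

```javascript
// Hypothetical helper (not part of spiderable or PhantomJS): poll
// `getContent()` and call `onStable(content)` once the value has stayed
// identical across `stableTicks` consecutive re-reads.
function waitForStableContent(getContent, onStable, options) {
  var opts = options || {};
  var intervalMs = opts.intervalMs || 100;
  var stableTicks = opts.stableTicks || 3;
  var schedule = opts.schedule || setTimeout; // injectable for testing
  var last = null;
  var unchanged = 0;

  function tick() {
    var current = getContent();
    if (current === last) {
      unchanged += 1;          // same as the previous read
    } else {
      unchanged = 0;           // content changed, start counting again
      last = current;
    }
    if (unchanged >= stableTicks) {
      onStable(current);
      return;
    }
    schedule(tick, intervalMs);
  }

  tick();
}
```

This sketch has no upper bound on how long it waits, so a page that keeps changing would never produce a snapshot; a real version would probably want a maximum number of polls as well.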

Upvotes: 2
