diegoaguilar

Reputation: 8376

How to properly use PhantomJS for any website?

I'm trying to capture a website using PhantomJS; in particular, I'm using Pageres.

This website has got:

So, I'm testing locally and I'm not getting the expected results: sometimes the screenshot is produced with errors, rendering only part of the contents, and sometimes it doesn't render the complete contents.

It really looks like Pageres doesn't get enough time to take the screenshot once the site has fully loaded. I already added the delay option but it fails anyway; actually, I could say it has worked better without the delay option.

This is what should be rendered:

[screenshot: expected rendering]

And when it has worked best, this is what I get:

[screenshot: best result obtained so far]

This is my code for rendering:

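  var Pageres = require('pageres');

  var pageres = new Pageres({})
      .src('fantastica.a2015.mediotiempo.com', ['1440x900'], {delay: 3, crop: false});

  // Log non-fatal warnings emitted while capturing.
  pageres.on('warn', function (err, obj) { console.log(err, obj); });
  pageres.run(function (err, screenshot) {
      if (err) { console.log(err); return; }
      // `response` comes from the surrounding code (not shown here).
      screenshot[0].pipe(response);
  });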

And (I know it would be a LOT of code to explain) this is the JS code being rendered.

Any particular advice?

Upvotes: 3

Views: 163

Answers (1)

Darren Cook

Reputation: 28928

  • Be aware of the differences between Phantom versions.

Phantom 1.9.x (which Pageres is using) is a browser engine from quite a few years ago (Chrome 13 is the closest equivalent) and will not render many HTML5 features.

Phantom 2.x is a much more modern WebKit engine. But because a) they have not produced a ready-made Linux binary, and b) there are some small API changes, projects like CasperJS and Pageres are holding back on supporting it. According to a comment in https://github.com/sindresorhus/pageres/issues/77, if you make your own binary and symlink to it, it works.

Also be aware that SlimerJS is an alternative to PhantomJS, based on Firefox rather than WebKit. There is no similar project based on Blink (to get screenshots showing how a modern Chrome would render them), but there is TrifleJS for IE. (However, the Pageres pages say it is not the goal of the project to support other engines.)

  • Wait for DOM elements to appear, rather than using a delay.

Ajax calls, delayed loading, etc. make things very hard to predict. So enter a polling loop, and don't take your screenshot until the DOM element you want in your screenshot is visible. CasperJS has waitForSelector() for this case. PhantomJS has the slightly lower-level waitFor(), provided as a helper in its waitfor.js example rather than as a built-in API.

I think pageres would need a bit of hacking to add this functionality.
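
The polling approach described above can be sketched roughly as follows. This is only an illustration, assuming a plain PhantomJS script rather than Pageres: the waitFor() helper is hand-written in the spirit of PhantomJS's waitfor.js example, and the default timings, the `#main-content` selector, the `http://` scheme and the output filename are all assumptions, not taken from the site in question.

```javascript
// Generic polling helper, modelled on the waitfor.js example shipped with PhantomJS.
// check:     function returning true once the condition holds
// onReady:   called once, the first time check() returns true
// onTimeout: called once if the condition never holds within timeoutMs
function waitFor(check, onReady, onTimeout, timeoutMs, intervalMs) {
    timeoutMs = timeoutMs || 10000;  // assumed default: give up after 10 s
    intervalMs = intervalMs || 250;  // assumed default: poll four times per second
    var start = Date.now();

    // Already satisfied: no need to start polling.
    if (check()) { onReady(); return; }

    var timer = setInterval(function () {
        if (check()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
}

// Usage inside a PhantomJS script (PhantomJS only, so left commented out here).
// '#main-content' is a hypothetical selector: use an element that only
// exists once the part of the page you care about has been rendered.
//
// var page = require('webpage').create();
// page.viewportSize = { width: 1440, height: 900 };
// page.open('http://fantastica.a2015.mediotiempo.com', function (status) {
//     if (status !== 'success') { phantom.exit(1); return; }
//     waitFor(function () {
//         return page.evaluate(function () {
//             return document.querySelector('#main-content') !== null;
//         });
//     }, function () {
//         page.render('screenshot.png');
//         phantom.exit();
//     }, function () {
//         console.log('Timed out waiting for content');
//         phantom.exit(1);
//     });
// });
```

The helper checks the condition once up front and then on a timer, so the screenshot is taken as soon as the element exists instead of after a fixed delay that may be too short or needlessly long.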

Upvotes: 4
