Mahesh.D

Reputation: 1689

Generating PDF of a Web Page

I'm trying to generate a PDF of a web page and save it to local disk so I can email it later.

I tried this approach, but it does not work for pages like this. A PDF is generated, but its content does not match the web page.

It seems clear that the PDF is generated before document.ready fires, or something else is going wrong; I can't figure out the exact issue. I'm just looking for an approach that saves the rendered web page as a PDF.

I assume generating a PDF of a web page is better suited to Node than PHP? A PHP solution would be a big help, but a Node implementation is fine too.

Upvotes: 1

Views: 2481

Answers (3)

K J

Reputation: 11737

When saving HTML to PDF, a page whose content is built by scripts over time needs a suitable delay before printing. The results for this answer were produced at half a second (500 ms) and 1 second (1000 ms); increase the delay if the page is more complex or your connection or PC is slower.

Using Chrome or Edge, call the browser with a larger time allowance:

"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe" --headless --print-to-pdf=C:/data/output2.pdf --no-pdf-header-footer --virtual-time-budget=1000 "https://www.chartjs.org/docs/latest/samples/other-charts/pie.html" && timeout 3 && C:/data/output2.pdf
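If you would rather drive the same command from Node, a minimal sketch could look like the following. It only reuses the flags from the command above; the function names `buildPrintArgs` and `printToPdf`, the default budget, and the paths in the usage comment are illustrative assumptions, not part of any library.

```javascript
const { execFile } = require('child_process');

// Build the argument list for a headless Chrome/Edge PDF print.
// Uses only the flags shown in the command above.
function buildPrintArgs(url, outFile, budgetMs) {
  return [
    '--headless',
    `--print-to-pdf=${outFile}`,
    '--no-pdf-header-footer',
    `--virtual-time-budget=${budgetMs}`,
    url,
  ];
}

// Run the browser and resolve with the output path once the process exits.
// `browserPath` is the full path to msedge.exe or chrome.exe (assumed to exist).
function printToPdf(browserPath, url, outFile, budgetMs = 1000) {
  return new Promise((resolve, reject) => {
    execFile(browserPath, buildPrintArgs(url, outFile, budgetMs), err => {
      if (err) reject(err);
      else resolve(outFile);
    });
  });
}

// Example usage (hypothetical paths):
// printToPdf(
//   'C:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe',
//   'https://www.chartjs.org/docs/latest/samples/other-charts/pie.html',
//   'C:/data/output2.pdf',
//   1000
// ).then(file => console.log('Saved', file));
```

Passing the arguments as an array through execFile avoids shell quoting problems with URLs and Windows paths.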


Upvotes: 0

Fernando Paz

Reputation: 623

I did something similar using the html-pdf package.

The code is simple; you can use it like this:

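const pdf = require('html-pdf');
// Example inputs: replace with your own HTML string and options
const html = '<h1>Hello</h1>';
const options = { format: 'A4' };

pdf.create(html, options).toFile('./YourPDFName.pdf', function(err, res) {
  if (err) {
    console.log(err);
  }
});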

See more about it on the package page here.

Hope it helps you.

Upvotes: 0

Vaviloff

Reputation: 16838

Its very clear that pdf is generated before document ready

Very true, so it is necessary to wait until after scripts are loaded and executed.


You linked to an answer that uses phantom node module.

The module has been upgraded since then and now supports async/await, which makes scripts much more readable.

Here is a solution that uses the async/await version (version 4.x, requires Node 8+):

const phantom = require('phantom');

const timeout = ms => new Promise(resolve => setTimeout(resolve, ms));

(async function() {
  const instance = await phantom.create();
  const page = await instance.createPage();

  await page.property('viewportSize', { width: 1920, height: 1024 });

  const status = await page.open('http://www.chartjs.org/samples/latest/charts/pie.html');

  // If a page has no set background color, it will have gray bg in PhantomJS
  // so we'll set white background ourselves
  await page.evaluate(function(){
      document.querySelector('body').style.background = '#fff';
  });

  // Let's benchmark
  console.time('wait');

  // Wait until the script creates the canvas with the charts
  while (0 == await page.evaluate(function(){ return document.querySelectorAll("canvas").length }) )  {
      await timeout(250);
  }

  // Make sure animation of the chart has played
  await timeout(500);

  console.timeEnd('wait');

  await page.render('screen.pdf');

  await instance.exit();
})();

On my dev machine it takes about 600 ms for the chart to be ready, which is much better than an await timeout(3000) or any other arbitrary number of seconds.
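One caveat with the polling loop above is that it never gives up if the canvas is never created. As a rough sketch, the same idea can be wrapped in a helper with an upper bound; `waitFor`, `sleep`, and the default interval and timeout values are my own illustrative names and numbers, not part of the phantom module.

```javascript
// Resolve after `ms` milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Poll `check` (sync or async) until it returns a truthy value, then
// resolve with that value. Rejects if the deadline passes first.
// The defaults are arbitrary illustrative values.
async function waitFor(check, { intervalMs = 250, timeoutMs = 10000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Possible usage inside the script above, replacing the while loop:
// await waitFor(() => page.evaluate(function () {
//   return document.querySelectorAll('canvas').length;
// }));
```

The script still finishes as soon as the chart appears, but a page that never renders a canvas now fails with an error instead of hanging.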

Upvotes: 3
