Reputation: 101

Save PDF to file using Puppeteer

I'm using Puppeteer to get a PDF with the Fetch API and save the file to disk.

I'm trying to save the PDF to a file on disk, but when I open the PDF I see only a white screen.

Edit:

Found a solution here: https://github.com/GoogleChrome/puppeteer/issues/299#issuecomment-340199753

Upvotes: 6

Answers (3)

Badal Saibo

Reputation: 3665

Backend

If you add the path option to page.pdf(), Puppeteer saves the PDF directly to disk on the machine where the server runs.

Frontend

If you instead return the PDF buffer from page.pdf() on the server and send it to the client (front end), the client has to process the PDF itself.

...
 const pdfBuffer = await page.pdf({
   printBackground: true,
   format: 'A4',
 });
 res.send(pdfBuffer);
...
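
A minimal sketch of how the sending step could look, assuming an Express-style response object. The helper name sendPdf and the file name invoice.pdf are hypothetical. Depending on the Puppeteer version, page.pdf() may resolve to a Node Buffer or to a plain Uint8Array, so the sketch wraps the result in a Buffer and sets the content type explicitly before sending.

```javascript
// Hypothetical helper: send PDF bytes produced by page.pdf() through an
// Express-style response object (anything exposing set() and send()).
function sendPdf(res, pdfData, fileName = 'invoice.pdf') {
  // Normalise to a Buffer so the body goes out as raw bytes.
  const body = Buffer.isBuffer(pdfData) ? pdfData : Buffer.from(pdfData);
  res.set({
    'Content-Type': 'application/pdf',
    'Content-Disposition': `attachment; filename="${fileName}"`,
  });
  res.send(body);
  return body.length; // number of bytes handed to the response
}

// Usage inside a route, assuming `app` (Express) and `page` (Puppeteer) exist:
//   app.get('/download', async (req, res) => {
//     const pdfBuffer = await page.pdf({ printBackground: true, format: 'A4' });
//     sendPdf(res, pdfBuffer);
//   });
```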

Say you have a route /download that returns the PDF buffer from Puppeteer's page.pdf() call, and on the front end you have a button with the id 'download' that triggers the download. Here's how you would do it:

1. Process the buffer.
2. Create an object URL from that buffer.
3. Create an a tag whose href points to the object URL.
4. Add the download attribute and simulate a click on that link.
5. Bind the handler function to the download button's 'click' event.

Code

function handleClick() {
    fetch('/download')
        .then((res) => res.blob()) // --- 1.
        .then((pdfBlob) => {
            // res.blob() already yields a Blob; re-wrapping it pins the MIME type.
            const blob = new Blob([pdfBlob], { type: 'application/pdf' }); // --- 1.
            blobToSaveAs('invoice', blob); // --- 2.
        })
        .catch((e) => console.error(e));
}

function blobToSaveAs(fileName, blob) {
    try {
        const url = window.URL.createObjectURL(blob); // --- 2.
        const link = document.createElement('a'); // --- 3.
        if (link.download !== undefined) {
            link.setAttribute('href', url); // --- 3.
            link.setAttribute('download', fileName); // --- 4.
            link.style.visibility = 'hidden';
            document.body.appendChild(link);
            link.click(); // --- 4.
            document.body.removeChild(link);
        }
    } catch (e) {
        console.error('BlobToSaveAs error', e);
    }
}

document.getElementById('download').addEventListener('click', handleClick); // --- 5.

Upvotes: 3

ifelse.codes

Reputation: 2389

await page.pdf({ path: 'path/to/save/pdf', format: 'A4' });

This saves the PDF to disk at the given path.
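
For context, a fuller sketch around the same call. The helper name savePageAsPdf, the URL and the output path are placeholders, and the function takes an already-launched browser so that launch options stay with the caller.

```javascript
// Hypothetical helper: open `url` in a new tab and let page.pdf() write the
// file itself through its `path` option.
async function savePageAsPdf(browser, url, outputPath) {
  const page = await browser.newPage();
  try {
    // Wait until network activity settles so late-loading content is rendered.
    await page.goto(url, { waitUntil: 'networkidle0' });
    await page.pdf({ path: outputPath, format: 'A4' });
  } finally {
    // Close the tab even if navigation or rendering throws.
    await page.close();
  }
  return outputPath;
}

// Usage, assuming puppeteer is installed (placeholder URL and path):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
//   await savePageAsPdf(browser, 'https://example.com', 'page.pdf');
//   await browser.close();
```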

Upvotes: 13

Ondra Urban

Reputation: 677

Since you're already using Puppeteer, the best way to save a webpage as a PDF is to open it with Puppeteer and then use the Puppeteer API to save the PDF.

The page.pdf() function does just that. See the Puppeteer docs.

I'm assuming that, by using fetch(), you're only downloading getPdf.asp, which by itself doesn't produce a valid PDF response stream. Perhaps it only responds with client-side HTML that includes a script which fetches the PDF from some remote resource.

I would thus try:

await page.goto(PDF_PAGE_URL);
const pdfBuffer = await page.pdf();
// process the buffer
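
For the original goal, one way to "process the buffer" is to write it to disk with Node's fs module. The sketch below first checks that the data starts with the %PDF- signature that PDF files begin with, which catches the case where the response was an HTML page rather than a PDF. The helper names looksLikePdf and writePdf are made up for illustration.

```javascript
const fs = require('node:fs/promises');

// Hypothetical check: PDF files begin with the ASCII signature "%PDF-".
function looksLikePdf(data) {
  return Buffer.from(data).subarray(0, 5).toString('latin1') === '%PDF-';
}

// Hypothetical helper: write the bytes returned by page.pdf() (or any other
// PDF bytes) to disk as binary data, without a text encoding.
async function writePdf(outputPath, data) {
  if (!looksLikePdf(data)) {
    throw new Error('Data does not start with %PDF-; is it really a PDF?');
  }
  await fs.writeFile(outputPath, Buffer.from(data));
  return outputPath;
}

// Usage, continuing from the snippet above (placeholder file name):
//   await writePdf('page.pdf', pdfBuffer);
```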

Hope it helps!