Reputation: 101
I'm using Puppeteer to get a PDF with the Fetch API and save the file to disk.
The file gets written, but when I open the PDF I see only a white screen.
!!Edited!!
Found a solution here https://github.com/GoogleChrome/puppeteer/issues/299#issuecomment-340199753
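For anyone who cannot follow the link: the sketch below does not reproduce that comment. It only illustrates one common cause of a blank PDF, which is an assumption about this case rather than a diagnosis: the binary bytes get decoded as text somewhere between the fetch and the disk write. The sample bytes and both function names are made up for illustration.

```javascript
// A few bytes that start like a PDF header and then contain values
// that are not valid UTF-8, as binary PDF streams usually do.
const original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0x2d, 0xff, 0xfe, 0x80, 0x00]);

// Lossy: decoding as UTF-8 text replaces invalid sequences with U+FFFD,
// so the bytes written back out no longer match the input.
function viaUtf8Text(bytes) {
  return Buffer.from(bytes.toString('utf8'), 'utf8');
}

// Lossless: base64 turns any byte sequence into plain ASCII and back.
function viaBase64(bytes) {
  const text = bytes.toString('base64');
  return Buffer.from(text, 'base64');
}

const lossy = viaUtf8Text(original);
const safe = viaBase64(original);

console.log('utf8 round trip intact:', lossy.equals(original));
console.log('base64 round trip intact:', safe.equals(original));
```

If the bytes have to cross a boundary that only carries strings, encoding them as base64 on one side and decoding with Buffer.from(text, 'base64') on the other keeps the file intact.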
Upvotes: 6
Views: 22143
Reputation: 3665
If you add the path option to page.pdf(), it will save the file directly to disk on the machine where the server is hosted.
If you instead return the PDF buffer from page.pdf() on the server and send it to the client / front-end, you will have to process the PDF there.
...
const pdfBuffer = await page.pdf({
printBackground: true,
format: 'A4',
});
res.send(pdfBuffer);
...
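The snippet above sends the raw buffer. In Express, res.send() with a Buffer defaults the Content-Type to application/octet-stream unless you set one, so it can help to label the response as a PDF explicitly. Here is a rough sketch using only the writeHead/end methods of a Node response object; handleDownload and renderPdf are hypothetical names, and renderPdf merely stands in for the page.pdf() call.

```javascript
// Hypothetical stand-in for `await page.pdf({ printBackground: true, format: 'A4' })`.
// It returns placeholder bytes, not a real document.
async function renderPdf() {
  return Buffer.from('%PDF-1.4\n% placeholder bytes\n');
}

// Sends the buffer with headers that tell the browser it is a PDF.
// `res` is a Node http.ServerResponse (an Express response also has these methods).
async function handleDownload(res, render) {
  const pdfBuffer = await render();
  res.writeHead(200, {
    'Content-Type': 'application/pdf',
    'Content-Length': pdfBuffer.length,
  });
  res.end(pdfBuffer);
}

// Possible wiring with the built-in http module (port 3000 is an arbitrary choice):
// require('http')
//   .createServer((req, res) => handleDownload(res, renderPdf))
//   .listen(3000);
```

With the header set on the server, the blob the client gets from res.blob() should already carry the application/pdf type.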
Say you have a route /download which returns the PDF buffer from Puppeteer's page.pdf() call, and on the front-end you have a button with id 'download' to process the stream. Here's how you would do it, using an a tag whose href points to the object URL.
function handleClick() {
  fetch('/download')
    .then((res) => res.blob()) // --- 1.
    .then((rawBlob) => {
      // res.blob() already yields a Blob; re-wrap it to set the PDF MIME type.
      const blob = new Blob([rawBlob], { type: 'application/pdf' }); // --- 1.
      blobToSaveAs('invoice', blob); // --- 2.
    })
    .catch((e) => console.error(e));
}
function blobToSaveAs(fileName, blob) {
  try {
    const url = window.URL.createObjectURL(blob); // --- 2.
    const link = document.createElement('a'); // --- 3.
    // Only proceed if the browser supports the download attribute.
    if (link.download !== undefined) {
      link.setAttribute('href', url); // --- 3.
      link.setAttribute('download', fileName); // --- 4.
      link.style.visibility = 'hidden';
      document.body.appendChild(link);
      link.click(); // --- 4.
      document.body.removeChild(link);
    }
  } catch (e) {
    console.error('BlobToSaveAs error', e);
  }
}
document.getElementById('download').addEventListener('click', handleClick); // --- 5.
Upvotes: 3
Reputation: 2389
await page.pdf({ path: 'path/to/save/pdf', format: 'A4' });
This will save the PDF to disk.
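A small wrapper around that call might look like the sketch below. It assumes the target directory may not exist yet and creates it first; savePdfTo is a hypothetical helper name, and page is expected to be a Puppeteer Page (only page.pdf() is used).

```javascript
const fs = require('fs');
const path = require('path');

// Saves the current page as an A4 PDF at outFile and returns the file size in bytes.
async function savePdfTo(page, outFile) {
  // Assumption: the output directory might be missing, so create it up front.
  fs.mkdirSync(path.dirname(outFile), { recursive: true });
  await page.pdf({ path: outFile, format: 'A4' });
  return fs.statSync(outFile).size;
}

// Usage, inside an async function that already has a Puppeteer page:
//   const bytes = await savePdfTo(page, 'path/to/save/pdf');
```

Returning the size gives a quick sanity check that something was actually written.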
Upvotes: 13
Reputation: 677
Since you're already using Puppeteer, the best way to save a webpage as a PDF is to open it with Puppeteer and then use the Puppeteer API to save the PDF.
The page.pdf() function does just that. See docs.
I'm assuming that by using fetch(), you're only downloading getPdf.asp, which doesn't by itself produce a valid PDF response stream. Perhaps it only responds with client-side HTML that includes a script which fetches the PDF from some remote resource.
I would thus try:
await page.goto(PDF_PAGE_URL);
const pdfBuffer = await page.pdf();
// process the buffer
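One way to "process the buffer", and to test the guess above about an HTML response, is to check the first bytes before writing anything. A PDF file starts with the ASCII marker %PDF-, so an HTML page will fail the check. This is only a sketch; looksLikePdf and saveIfPdf are hypothetical helper names.

```javascript
const fs = require('fs');

// True if the data begins with the "%PDF-" marker that opens a PDF file.
// Buffer.from() accepts both a Buffer and a plain Uint8Array.
function looksLikePdf(data) {
  const bytes = Buffer.from(data);
  return bytes.length >= 5 && bytes.subarray(0, 5).toString('latin1') === '%PDF-';
}

// Writes the data to outFile only if it looks like a PDF; otherwise throws
// with the first few bytes so you can see what actually came back.
function saveIfPdf(data, outFile) {
  if (!looksLikePdf(data)) {
    const head = Buffer.from(data).subarray(0, 20).toString('latin1');
    throw new Error('Not a PDF; response starts with: ' + JSON.stringify(head));
  }
  fs.writeFileSync(outFile, Buffer.from(data));
}

// Usage with the snippet above:
//   saveIfPdf(pdfBuffer, 'out.pdf');
```

If the error message shows something like "<!DOCTYPE html", the URL is returning a page rather than the PDF itself.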
Hope it helps!
Upvotes: 3