Misha Moroshko

Reputation: 171479

How to download PDF blob using puppeteer?

When the download button is clicked, a new tab is opened where the user can view a PDF statement.

This new tab has a URL starting with blob:, e.g.: blob:https://some-domain.com/statement-id.

How could I download this PDF statement to the file system?

Note: I'm using { headless: false } mode.

Upvotes: 0

Views: 5763

Answers (1)

vsemozhebuty

Reputation: 13822

Here is an attempt to simulate the case and save the PDF to disk:

import puppeteer from 'puppeteer';
import { writeFileSync } from 'fs';

// Minimal PDF from https://github.com/mathiasbynens/small#documents
const minimalPdf = `%PDF-1.
1 0 obj<</Pages 2 0 R>>endobj
2 0 obj<</Kids[3 0 R]/Count 1>>endobj
3 0 obj<</Parent 2 0 R>>endobj
trailer <</Root 1 0 R>>`;

const browser = await puppeteer.launch({ headless: false, defaultViewport: null });

try {
  const [page] = await browser.pages();
  await page.goto('http://example.com/');

  await page.evaluate((pdf) => {
    const url = URL.createObjectURL(new Blob([pdf], {type: 'application/pdf'}));
    window.open(url);
  }, minimalPdf);

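  // Wait for the tab opened by window.open(); its URL starts with "blob:".
  const newTarget = await page.browserContext().waitForTarget(
    target => target.url().startsWith('blob:')
  );
  const newPage = await newTarget.page();
  const blobUrl = newPage.url();

  // Register the listener before triggering the fetch below.
  page.once('response', async (response) => {
    console.log(response.url());
    const pdfBuffer = await response.buffer();
    console.log(pdfBuffer.toString());
    console.log('same:', pdfBuffer.toString() === minimalPdf);
    // Save the PDF bytes to the file system.
    writeFileSync('minimal.pdf', pdfBuffer);
  });
  // Re-request the blob from the page that created it, so the response event fires.
  await page.evaluate((url) => { fetch(url); }, blobUrl);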

} catch(err) { console.error(err); } finally { /* await browser.close(); */ }
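
If relying on the 'response' event turns out to be fragile, a possible alternative is to read the blob inside the page and hand the bytes back to Node as base64. This is only a sketch: `readBlobAsBuffer` is a hypothetical helper name, not a Puppeteer API, and it assumes the page that created the blob is still open and is allowed to `fetch` its own blob: URL.

```javascript
// Hypothetical helper (not part of Puppeteer): fetch a blob: URL inside the
// page that owns it and return the bytes to Node as a Buffer.
async function readBlobAsBuffer(page, blobUrl) {
  // The callback runs in the browser. page.evaluate can only return
  // serializable values, so the bytes are encoded as a base64 string.
  const base64 = await page.evaluate(async (url) => {
    const response = await fetch(url);
    const bytes = new Uint8Array(await response.arrayBuffer());
    let binary = '';
    for (const byte of bytes) binary += String.fromCharCode(byte);
    return btoa(binary);
  }, blobUrl);

  // Back in Node: decode the base64 string into raw bytes.
  return Buffer.from(base64, 'base64');
}
```

With the names from the snippet above, `writeFileSync('minimal.pdf', await readBlobAsBuffer(page, blobUrl));` should write the same file without depending on a response listener.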

Upvotes: 4
