Timotronadon

Reputation: 365

Puppeteer with very large PDF not waiting until loaded

Problem: Puppeteer generates the PDF when only about 5% of my data is there.

I'm passing about 3000 lines of text to a Handlebars HTML template, which I'm then trying to print to a PDF with Puppeteer. I had this working earlier today, but a Git fiasco made me roll back, and now I can't seem to generate a PDF longer than 3.5 pages (earlier this week it was up to about 90).

I'm thinking this has to do with the following:

const browser = await puppeteer.launch({
    args: ['--no-sandbox'],
    headless: true
});

const page = await browser.newPage();

await page.goto(`data:text/html;charset=UTF-8,${html}`, {
    waitUntil: 'load' // I've also tried networkidle0 and networkidle2
});

await page.pdf(options);
await browser.close();

Here's the template.html:

<!DOCTYPE html>
<html>

<head>
    <meta charset="utf-8">
    <title>PDF</title>
    <style type="text/css">

    </style>
</head>

<body>
    <ul id="script">
        {{#each this}}
        <li class="{{category}}">{{text}}</li>
        {{/each}}

    </ul>
</body>

</html>

My data is an array of 3300 objects, and I know it's getting where it needs to. Is there any way to set a static timeout for Puppeteer? I realize this is a lot of data, but am I doing something wrong here?

Upvotes: 1

Views: 1390

Answers (1)

theDavidBarton

Reputation: 8851

waitUntil: 'load' is the default for page.goto, so you don't need to set it. The networkidle0 and networkidle2 options wait for network connections to finish; your page is plain HTML markup with no network requests, so they won't help you wait until it is populated with your data either. If you want to use waitUntil, I would suggest domcontentloaded instead. The docs describe the exact differences between these options.
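
As a minimal sketch of the domcontentloaded suggestion: the helper names toDataUrl and loadHtml are hypothetical, and the encodeURIComponent step is a precaution added here that is not in the question's code (characters such as # have special meaning inside a URL). Whether that is related to the truncation in the question is not established.

```javascript
// Hypothetical helper: build a data URL from an HTML string.
// encodeURIComponent is an assumption added here so that characters such as
// '#' or '%' in the markup are not interpreted as URL syntax.
function toDataUrl(html) {
  return `data:text/html;charset=UTF-8,${encodeURIComponent(html)}`;
}

// Hypothetical helper: navigate an existing Puppeteer page to the markup and
// resolve once the DOMContentLoaded event has fired.
async function loadHtml(page, html) {
  await page.goto(toDataUrl(html), { waitUntil: 'domcontentloaded' });
}
```

With the question's variables this would be called as `await loadHtml(page, html);` before `page.pdf(options)`.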

I.) Your problem can be solved with a static timeout, using page.waitFor. If you are sure all the data will be on the page within a certain time, you can wait a fixed interval, e.g. 3000 milliseconds (3 seconds), before generating the PDF.

await page.waitFor(3000);
await page.pdf(options);
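
If page.waitFor is not available in your Puppeteer version (it was deprecated in later releases), the same fixed pause can be sketched with a plain Promise, which does not depend on any Puppeteer API. The names delay and pdfAfterDelay are hypothetical helpers for illustration only.

```javascript
// Hypothetical helper: resolve after `ms` milliseconds using a plain timer.
function delay(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Hypothetical helper: pause for a fixed interval, then print the page.
// `page` is an existing Puppeteer page and `options` the page.pdf options.
async function pdfAfterDelay(page, options, ms = 3000) {
  await delay(ms);
  return page.pdf(options);
}
```

A fixed pause is simple, but it either wastes time or is too short when the data size changes, so it is best treated as a diagnostic step.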

II.) If you can access the text value of the very last object, you could also wait for that content to appear on the page. This will only work if each <li> element has unique content.

const veryLastItemText = data[data.length - 1].text; // assuming "data" is the array of objects with "category" and "text" properties that was passed to the template

await page.waitForXPath(`//li[contains(text(), "${veryLastItemText}")]`);
await page.pdf(options);
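
One caveat with interpolating the text directly: if the last item's text contains a double quote, the XPath string literal breaks. A minimal sketch of building the expression safely follows; xpathLiteral and lastItemXPath are hypothetical helper names, and the items argument is assumed to be the array of objects with a text property from the question.

```javascript
// Hypothetical helper: turn arbitrary text into a valid XPath 1.0 string
// expression. XPath 1.0 has no escape character, so text containing both
// quote types has to be assembled with concat().
function xpathLiteral(text) {
  if (!text.includes('"')) return `"${text}"`;
  if (!text.includes("'")) return `'${text}'`;
  // Split on double quotes and rejoin the pieces with a quoted '"' between them.
  const parts = text.split('"').map(part => `"${part}"`);
  return `concat(${parts.join(`, '"', `)})`;
}

// Hypothetical helper: XPath matching an <li> that contains the text of the
// last item in the data array.
function lastItemXPath(items) {
  const lastText = items[items.length - 1].text;
  return `//li[contains(text(), ${xpathLiteral(lastText)})]`;
}
```

The result can be passed to the same call as above, e.g. `await page.waitForXPath(lastItemXPath(data));`.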

Upvotes: 1
