LioRz

Reputation: 1035

Puppeteer - Protocol error (Page.navigate): Target closed

As the sample code below shows, I'm using Puppeteer with a cluster of Node workers to handle multiple requests for a screenshot of a website at a given URL:

const cluster = require('cluster');
const express = require('express');
const bodyParser = require('body-parser');
const puppeteer = require('puppeteer');

async function getScreenshot(domain) {
    let screenshot;
    const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'] });
    const page = await browser.newPage();

    try {
        await page.goto('http://' + domain + '/', { timeout: 60000, waitUntil: 'networkidle2' });
    } catch (error) {
        try {
            await page.goto('http://' + domain + '/', { timeout: 120000, waitUntil: 'networkidle2' });
            screenshot = await page.screenshot({ type: 'png', encoding: 'base64' });
        } catch (error) {
            console.error('Connecting to: ' + domain + ' failed due to: ' + error);
        }
    }

    await page.close();
    await browser.close();

    return screenshot;
}

if (cluster.isMaster) {
    const numOfWorkers = require('os').cpus().length;
    for (let worker = 0; worker < numOfWorkers; worker++) {
        cluster.fork();
    }

    cluster.on('exit', function (worker, code, signal) {
        console.debug('Worker ' + worker.process.pid + ' died with code: ' + code + ', and signal: ' + signal);
        cluster.fork();
    });

    cluster.on('message', function (handler, msg) {
        console.debug('Worker: ' + handler.process.pid + ' has finished working on ' + msg.domain + '. Exiting...');
        if (cluster.workers[handler.id]) {
            cluster.workers[handler.id].kill('SIGTERM');
        }
    });
} else {
    const app = express();
    app.use(bodyParser.json());
    app.listen(80, function() {
        console.debug('Worker ' + process.pid + ' is listening to incoming messages');
    });

    app.post('/screenshot', (req, res) => {
        const domain = req.body.domain;

        getScreenshot(domain)
            .then((screenshot) => {
                try {
                    process.send({ domain: domain });
                } catch (error) {
                    console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
                }

                res.status(200).json({ screenshot: screenshot });
            })
            .catch((error) => {
                try {
                    process.send({ domain: domain });
                } catch (error) {
                    console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
                }

                res.status(500).json({ error: error });
            });
    });
}

Some explanation:

  1. Each time a request arrives, a worker processes it and kills itself at the end.
  2. Each worker creates a new browser instance with a single page. If the page takes more than 60 seconds to load, the worker retries in the same page (because some resources may already have been loaded) with a timeout of 120 seconds.
  3. Once finished, both the page and the browser are closed.

My problem is that some legitimate domains get errors that I can't explain:

Error: Protocol error (Page.navigate): Target closed.
Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.

I read in some Git issue (that I can't find now) that this can happen when the page redirects and adds 'www' at the start, but I'm hoping that's false... Is there something I'm missing?

Upvotes: 75

Views: 134753

Answers (9)

ggorlen

Reputation: 56855

I've wound up at this thread a few times, and the typical culprit is that I forgot to await a Puppeteer page call that returned a promise, causing a race condition.

Here's a minimal example of what this can look like:

const puppeteer = require("puppeteer"); // ^17.0.0

let browser;
(async () => {
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  page.goto("https://www.stackoverflow.com"); // whoops, forgot await!
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

Output is:

node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:298
                error: new Errors_js_1.ProtocolError(),
                       ^

ProtocolError: Protocol error (Page.navigate): Target closed.

In this case it seems like an unmissable error, but in a larger chunk of code, where the promise is nested or inside a condition, it's easy to overlook.

You'll get a similar error for forgetting to await a page.click() or other promise call, for example, Error: Protocol error (Runtime.callFunctionOn): Target closed., which can be seen in the question UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Target closed. (Puppeteer)

In Puppeteer ^19.3.0, the same code gives a different error:

node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/LifecycleWatcher.js:83
            (0, util_js_1.addEventListener)(frameManager.client, Connection_js_1.CDPSessionEmittedEvents.Disconnected, __classPrivateFieldGet(this, _LifecycleWatcher_instances, "m", _LifecycleWatcher_terminate).bind(this, new Error('Navigation failed because browser has disconnected!'))),
                                                                                                                                                                                                                              ^

Error: Navigation failed because browser has disconnected!

In Puppeteer ^23.3.0, the error is slightly different:

node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/LifecycleWatcher.js:103
            this.#terminationDeferred.resolve(new Error('Navigating frame was detached'));
                                              ^

Error: Navigating frame was detached

This is a contribution to the thread as a canonical resource for the error and may not be the solution to OP's problem, although the fundamental race condition seems to be a likely cause.
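
For comparison, here is a sketch of the minimal example above with the missing await restored, so the browser is only closed after the navigation has settled. The helper name getTitle, and passing the puppeteer module in as a parameter, are my own choices for illustration rather than anything Puppeteer prescribes:

```javascript
// Sketch: same flow as the minimal example, with the navigation awaited.
// `getTitle` is a hypothetical helper; `puppeteer` is passed in as a parameter
// so the function does not depend on how the module was loaded.
async function getTitle(puppeteer, url) {
  const browser = await puppeteer.launch({headless: true});
  try {
    const [page] = await browser.pages();
    await page.goto(url); // awaited: the browser stays open until navigation settles
    return await page.title();
  } finally {
    // Runs only after the awaited page calls above have finished (or thrown).
    await browser.close();
  }
}

// Usage (assumes puppeteer is installed):
// getTitle(require("puppeteer"), "https://www.stackoverflow.com").then(console.log);
```

Putting browser.close() in a finally block inside the same async function keeps the cleanup ordered after the page calls, which is harder to get wrong than a detached .finally() on the outer promise.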

Upvotes: 15

Miles

Reputation: 125

For me, updating @sparticuz/chromium to the current version (123.0.1) fixed the issue.

Upvotes: 7

Patrick Anjos

Reputation: 41

The solution below works for me:

puppeteer: {
  headless: true,
  args: [
    "--no-sandbox",
    "--disable-setuid-sandbox",
    "--disable-dev-shm-usage",
    "--disable-accelerated-2d-canvas",
    "--no-first-run",
    "--no-zygote",
    // "--single-process",
    "--disable-gpu",
  ],
},

Upvotes: 4

destroyer22719

Reputation: 404

After hours of frustration, I realized that this happens when the browser navigates to a new page. After I press a button or perform any other action that causes a redirect, I need to await page.waitForNavigation() before doing anything else with the page.
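
A commonly used variant of this, sketched here on the assumption that the click triggers a full page navigation, is to start waiting before the click so the navigation cannot be missed, and then await both together. The helper name clickAndWait and the selector in the usage comment are hypothetical:

```javascript
// Sketch: start listening for the navigation *before* triggering it, then await both.
// `clickAndWait` is a hypothetical helper name; `page` is a Puppeteer Page.
async function clickAndWait(page, selector) {
  const [response] = await Promise.all([
    page.waitForNavigation({waitUntil: "networkidle2"}), // settles once the new page has loaded
    page.click(selector), // the action that causes the redirect
  ]);
  return response;
}

// Usage, inside an async function:
// await clickAndWait(page, "#submit");
// await page.screenshot({path: "after-redirect.png"});
```

If the click does not actually cause a navigation, waitForNavigation will eventually time out, so this only fits actions that are known to redirect.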

Upvotes: 3

Shaig Khaligli

Reputation: 5485

For me removing '--single-process' from args fixed the issue.

puppeteerOptions: {
    headless: true,
    args: [
        '--disable-gpu',
        '--disable-dev-shm-usage',
        '--disable-setuid-sandbox',
        '--no-first-run',
        '--no-sandbox',
        '--no-zygote',
        '--deterministic-fetch',
        '--disable-features=IsolateOrigins',
        '--disable-site-isolation-trials',
        // '--single-process',
    ],
}

Upvotes: 29

Igor Kurkov

Reputation: 5040

In 2021 I received a very similar error: Error: Error pdf creationError: Protocol error (Target.setDiscoverTargets): Target closed. I solved it by experimenting with different launch options. In my case, having pipe: true in the object passed to puppeteer.launch on the production server produced the errors.

Adding the --disable-dev-shm-usage flag also did the trick.

The solution below works for me:

const browser = await puppeteer.launch({
  headless: true,
  // pipe: true, <-- delete this property
  args: [
    '--no-sandbox',
    '--disable-dev-shm-usage', // <-- add this one
  ],
});

Upvotes: 7

Timo

Reputation: 197

I was just experiencing the same issue every time I tried running my puppeteer script*. The above did not resolve this issue for me.

I got it to work by removing and reinstalling the puppeteer package:

npm remove puppeteer
npm i puppeteer

*I only experienced this issue when setting the headless option to false.

Upvotes: 10

Shashank Shukla

Reputation: 41

Check your jest-puppeteer.config.js file. I made the mistake below, putting browserContext inside launch:

module.exports = {
    launch: {
        headless: false,
        browserContext: "default",
    },
};

and after correcting it as below

module.exports = {
    launch: {
        headless: false
    },
    browserContext: "default",
};

everything worked just fine!!!

Upvotes: 1

Thomas Dondorf

Reputation: 25230

What "Target closed" means

When you launch a browser via puppeteer.launch, Puppeteer starts a browser and connects to it. From then on, any function you execute on the opened browser (like page.goto) is sent to the browser via the Chrome DevTools Protocol. A target means a tab in this context.

The Target closed exception is thrown when you are trying to run a function, but the target (tab) was already closed.

Similar error messages

The error message was recently changed to give more meaningful information. It now gives the following message:

Error: Protocol error (Target.activateTarget): Session closed. Most likely the page has been closed.


Why does it happen

There are multiple reasons why this could happen.

  • You used a resource that was already closed

    Most likely, you are seeing this message because you closed the tab/browser and are still trying to use the resource. To give a simple example:

    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    
    await browser.close();
    await page.goto('http://www.google.com');
    

    In this case the browser was closed and after that, a page.goto was called resulting in the error message. Most of the time, it will not be that obvious. Maybe an error handler already closed the page during a cleanup task, while your script is still crawling.

  • The browser crashed or was unable to initialize

    I also experience this every few hundred requests. There is an issue about this on the puppeteer repository as well. It seems to happen when you are using a lot of memory or CPU power. Maybe you are spawning a lot of browsers? In these cases the browser might crash or disconnect.

    I found no "silver bullet" solution to this problem. But you might want to check out the library puppeteer-cluster (disclaimer: I'm the author), which handles these kinds of error cases and lets you retry the URL when the error happens. It can also manage a pool of browser instances and would simplify your code.
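
To illustrate both points above without any library, here is a minimal sketch of a retry loop. It is not how puppeteer-cluster is implemented; the helper name screenshotWithRetry, its parameters, and the default of three attempts are assumptions made for illustration. Each attempt launches a fresh browser and closes it only after all page calls have settled, so a crashed or already-closed target in one attempt cannot leak into the next:

```javascript
// Sketch: retry a screenshot with a fresh browser per attempt.
// `screenshotWithRetry` is a hypothetical helper; `puppeteer` is passed in as a
// parameter, and the default of 3 attempts is an arbitrary choice.
async function screenshotWithRetry(puppeteer, url, attempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    let browser;
    try {
      browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(url, {waitUntil: "networkidle2"});
      return await page.screenshot({type: "png", encoding: "base64"});
    } catch (error) {
      lastError = error; // e.g. "Target closed" after a crash, or a failed launch
    } finally {
      // Close only after every page call above has settled.
      // Errors from closing an already-dead browser are ignored.
      if (browser) await browser.close().catch(() => {});
    }
  }
  throw lastError; // every attempt failed
}

// Usage (assumes puppeteer is installed):
// screenshotWithRetry(require("puppeteer"), "http://www.google.com").then(console.log);
```

Launching a new browser for every attempt is the simplest option but also the most expensive one; reusing a pool of browsers, as described above, trades that cost for more bookkeeping.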

Upvotes: 86
