Reputation: 1035
As you can see with the sample code below, I'm using Puppeteer with a cluster of workers in Node to run multiple requests of websites screenshots by a given URL:
const cluster = require('cluster');
const express = require('express');
const bodyParser = require('body-parser');
const puppeteer = require('puppeteer');
async function getScreenshot(domain) {
let screenshot;
const browser = await puppeteer.launch({ args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'] });
const page = await browser.newPage();
try {
await page.goto('http://' + domain + '/', { timeout: 60000, waitUntil: 'networkidle2' });
} catch (error) {
try {
await page.goto('http://' + domain + '/', { timeout: 120000, waitUntil: 'networkidle2' });
screenshot = await page.screenshot({ type: 'png', encoding: 'base64' });
} catch (error) {
console.error('Connecting to: ' + domain + ' failed due to: ' + error);
}
await page.close();
await browser.close();
return screenshot;
}
if (cluster.isMaster) {
const numOfWorkers = require('os').cpus().length;
for (let worker = 0; worker < numOfWorkers; worker++) {
cluster.fork();
}
cluster.on('exit', function (worker, code, signal) {
console.debug('Worker ' + worker.process.pid + ' died with code: ' + code + ', and signal: ' + signal);
Cluster.fork();
});
cluster.on('message', function (handler, msg) {
console.debug('Worker: ' + handler.process.pid + ' has finished working on ' + msg.domain + '. Exiting...');
if (Cluster.workers[handler.id]) {
Cluster.workers[handler.id].kill('SIGTERM');
}
});
} else {
const app = express();
app.use(bodyParser.json());
app.listen(80, function() {
console.debug('Worker ' + process.pid + ' is listening to incoming messages');
});
app.post('/screenshot', (req, res) => {
const domain = req.body.domain;
getScreenshot(domain)
.then((screenshot) =>
try {
process.send({ domain: domain });
} catch (error) {
console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
}
res.status(200).json({ screenshot: screenshot });
})
.catch((error) => {
try {
process.send({ domain: domain });
} catch (error) {
console.error('Error while exiting worker ' + process.pid + ' due to: ' + error);
}
res.status(500).json({ error: error });
});
});
}
Some explanation:
My problem is that some legitimate domains get errors that I can't explain:
Error: Protocol error (Page.navigate): Target closed.
Error: Protocol error (Runtime.callFunctionOn): Session closed. Most likely the page has been closed.
I read at some git issue (that I can't find now) that it can happen when the page redirects and adds 'www' at the start, but I'm hoping it's false... Is there something I'm missing?
Upvotes: 75
Views: 134753
Reputation: 56855
I've wound up at this thread a few times, and the typical culprit is that I forgot to await
a Puppeteer page
call that returned a promise, causing a race condition.
Here's a minimal example of what this can look like:
const puppeteer = require("puppeteer"); // ^17.0.0
let browser;
(async () => {
browser = await puppeteer.launch({headless: true});
const [page] = await browser.pages();
page.goto("https://www.stackoverflow.com"); // whoops, forgot await!
})()
.catch(err => console.error(err))
.finally(() => browser?.close());
Output is:
node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:298
error: new Errors_js_1.ProtocolError(),
^
ProtocolError: Protocol error (Page.navigate): Target closed.
In this case, it seems like an unmissable error, but in a larger chunk of code and the promise is nested or in a condition, it's easy to overlook.
You'll get a similar error for forgetting to await
a page.click()
or other promise call, for example, Error: Protocol error (Runtime.callFunctionOn): Target closed.
, which can be seen in the question UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Target closed. (Puppeteer)
In Puppeteer ^19.3.0, the same code gives a different error:
node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/LifecycleWatcher.js:83
(0, util_js_1.addEventListener)(frameManager.client, Connection_js_1.CDPSessionEmittedEvents.Disconnected, __classPrivateFieldGet(this, _LifecycleWatcher_instances, "m", _LifecycleWatcher_terminate).bind(this, new Error('Navigation failed because browser has disconnected!'))),
^
Error: Navigation failed because browser has disconnected!
In Puppeteer ^23.3.0, the error is slightly different:
node_modules/puppeteer-core/lib/cjs/puppeteer/cdp/LifecycleWatcher.js:103
this.#terminationDeferred.resolve(new Error('Navigating frame was detached'));
^
Error: Navigating frame was detached
This is a contribution to the thread as a canonical resource for the error and may not be the solution to OP's problem, although the fundamental race condition seems to be a likely cause.
Upvotes: 15
Reputation: 125
For me, updating the version of @sparticuz/chromium to the current version (123.0.1) fixed the issue
Upvotes: 7
Reputation: 41
The solution below works for me:
puppeteer: {
headless: true,
args: [
"--no-sandbox",
"--disable-setuid-sandbox",
"--disable-dev-shm-usage",
"--disable-accelerated-2d-canvas",
"--no-first-run",
"--no-zygote",
// "--single-process",
"--disable-gpu",
],
},
Upvotes: 4
Reputation: 404
After hours of frustrations I realized that this happens when it goes to a new page and I need to be using await page.waitForNavigation()
before I do anything and after I press a button or do any action that will cause it to redirect.
Upvotes: 3
Reputation: 5485
For me removing '--single-process'
from args
fixed the issue.
puppeteerOptions: {
headless: true,
args: [
'--disable-gpu',
'--disable-dev-shm-usage',
'--disable-setuid-sandbox',
'--no-first-run',
'--no-sandbox',
'--no-zygote',
'--deterministic-fetch',
'--disable-features=IsolateOrigins',
'--disable-site-isolation-trials',
// '--single-process',
],
}
Upvotes: 29
Reputation: 5040
In 2021 I'm receiving the very similar following error Error: Error pdf creationError: Protocol error (Target.setDiscoverTargets): Target closed.
, I solved it by playing with different args, so if your production server has a pipe:true
flag in puppeteer.launch
obj it will produce errors.
Also --disable-dev-shm-usage
flag do the trick
The solution below works for me:
const browser = await puppeteer.launch({
headless: true,
// pipe: true, <-- delete this property
args: [
'--no-sandbox',
'--disable-dev-shm-usage', // <-- add this one
],
});
Upvotes: 7
Reputation: 197
I was just experiencing the same issue every time I tried running my puppeteer script*. The above did not resolve this issue for me.
I got it to work by removing and reinstalling the puppeteer package:
npm remove puppeteer
npm i puppeteer
*I only experienced this issue when setting the headless option to 'false`
Upvotes: 10
Reputation: 41
Check your jest-puppeteer.config.js file. I made the below mistake
module.exports = {
launch: {
headless: false,
browserContext: "default",
},
};
and after correcting it as below
module.exports = {
launch: {
headless: false
},
browserContext: "default",
};
everything worked just fine!!!
Upvotes: 1
Reputation: 25230
When you launch a browser via puppeteer.launch
it will start a browser and connect to it. From there on any function you execute on your opened browser (like page.goto
) will be send via the Chrome DevTools Protocol to the browser. A target means a tab in this context.
The Target closed exception is thrown when you are trying to run a function, but the target (tab) was already closed.
The error message was recently changed to give more meaningful information. It now gives the following message:
Error: Protocol error (Target.activateTarget): Session closed. Most likely the page has been closed.
There are multiple reasons why this could happen.
You used a resource that was already closed
Most likely, you are seeing this message because you closed the tab/browser and are still trying to use the resource. To give an simple example:
const browser = await puppeteer.launch();
const page = await browser.newPage();
await browser.close();
await page.goto('http://www.google.com');
In this case the browser was closed and after that, a page.goto
was called resulting in the error message. Most of the time, it will not be that obvious. Maybe an error handler already closed the page during a cleanup task, while your script is still crawling.
The browser crashed or was unable to initialize
I also experience this every few hundred requests. There is an issue about this on the puppeteer repository as well. It seems to be the case, when you are using a lot of memory or CPU power. Maybe you are spawning a lot of browser? In these cases the browser might crash or disconnect.
I found no "silver bullet" solution to this problem. But you might want to check out the library puppeteer-cluster (disclaimer: I'm the author) which handles these kind of error cases and let's you retry the URL when the error happens. It can also manage a pool of browser instances and would also simplify your code.
Upvotes: 86