Reputation: 11
I'm developing a Next.js project that uses Puppeteer in an API endpoint to load a webpage so it can be analysed. I'm using Vercel to deploy and host the project.
I've implemented the following, which runs successfully locally:
const browser = await puppeteer.launch({
  headless: "new",
  args: ["--disable-application-cache", "--no-sandbox"],
});
However, when the same code runs on Vercel, I get "Error: Could not find Chrome (ver. 121.0.6167.85). This can occur if either 1. you did not perform an installation before running the script (e.g. npx puppeteer browsers install chrome) or 2. your cache path is incorrectly"
I've tried the fixes currently suggested on Stack Overflow, for example:
const browser = await puppeteer.launch({
  headless: chromium.headless,
  args: [
    ...chromium.args,
    "--disable-web-security",
    "--hide-scrollbars",
    // "--disable-application-cache",
    // "--no-sandbox",
  ],
  defaultViewport: chromium.defaultViewport,
  executablePath: await chromium.executablePath(
    `https://github.com/Sparticuz/chromium/releases/download/v121.0.0/chromium-v121.0.0-pack.tar`
  ),
  ignoreHTTPSErrors: true,
});
None have worked so far. I'm wondering if there's a better way for me to do this, other than using puppeteer.
The end goal of the function is to get the total transfer size of the page (including resources). This is the current implementation through puppeteer:
async function calculateTotalDataTransferWithResources(url) {
  try {
    const browser = await puppeteer.launch({
      headless: "new",
      args: ["--disable-application-cache", "--no-sandbox"],
    });
    // const browser = await puppeteer.launch({
    //   headless: chromium.headless,
    //   args: [
    //     ...chromium.args,
    //     "--disable-web-security",
    //     "--hide-scrollbars",
    //     // "--disable-application-cache",
    //     // "--no-sandbox",
    //   ],
    //   defaultViewport: chromium.defaultViewport,
    //   executablePath: await chromium.executablePath(
    //     `https://github.com/Sparticuz/chromium/releases/download/v116.0.0/chromium-v116.0.0-pack.tar`
    //   ),
    //   ignoreHTTPSErrors: true,
    // });
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle0" });
    const networkData = await page.evaluate(() => {
      return window.performance.getEntriesByType("resource").map((resource) => {
        return {
          name: resource.name,
          transferSize: resource.transferSize,
        };
      });
    });
    const totalData = networkData.reduce((total, resource) => {
      return total + resource.transferSize;
    }, 0);
    await browser.close();
    return { totalData, networkData };
  } catch (error) {
    console.error("Error:", error.message);
    return { totalData: 0, networkData: [] };
  }
}
Any help would be very much appreciated!! I've spent 2 days trying different solutions.
Upvotes: 1
Views: 3337
Reputation: 51
It's hard to pin down the exact issue you are experiencing without seeing the full error logs. However...
I managed to get puppeteer running on Vercel using puppeteer-core and @sparticuz/chromium-min, hosting the tar file myself. You can also use the provided link, as you tried.
Note: I am using Vercel Pro. The free plan is limited on maxDuration and memory, which might not meet the requirements to run chromium/puppeteer (e.g. the route implementation below sometimes takes over 10 s to run, which is currently above the free limit; see Vercel's documented limits).
Another hosting option is to use AWS Lambda directly instead of going through Vercel (Vercel uses AWS behind the scenes, as far as I know). In that case you might not have to pay for the Pro plan. Perhaps serverless is a good option if you want to go that route.
Here is the relevant code from my project.
My package.json:
Note that the @sparticuz/chromium-min version (and the tar file version) should match the chromium version required by the puppeteer-core version; see the install step.
{
  "name": "next-puppeteer",
  "version": "0.0.1",
  "private": true,
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  },
  "dependencies": {
    "@sparticuz/chromium-min": "121.0.0",
    "next": "14.1.0",
    "puppeteer-core": "^21.9.0",
    "react": "^18",
    "react-dom": "^18"
  },
  "devDependencies": {
    "typescript": "^5",
    "@types/node": "^20",
    "@types/react": "^18",
    "@types/react-dom": "^18",
    "eslint": "^8",
    "eslint-config-next": "14.1.0"
  }
}
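One way to keep the tar file URL and the package version from drifting apart, as the note above warns, is to derive the URL from a single version string. This is only a minimal sketch: chromiumPackUrl and packMatchesPackage are hypothetical helper names (not part of @sparticuz/chromium-min), and the URL pattern is simply the one from the release link used in this thread. It also assumes the package version is pinned exactly (e.g. "121.0.0"), not given as a range.

```typescript
// Hypothetical helper: builds the release URL for a given chromium pack version,
// following the pattern of the GitHub release link quoted in this thread.
function chromiumPackUrl(version: string): string {
  return `https://github.com/Sparticuz/chromium/releases/download/v${version}/chromium-v${version}-pack.tar`;
}

// Hypothetical helper: true when the pack URL names the same version as the
// exact version pinned for @sparticuz/chromium-min in package.json.
function packMatchesPackage(packUrl: string, packageVersion: string): boolean {
  const match = packUrl.match(/chromium-v(\d+\.\d+\.\d+)-pack\.tar$/);
  return match !== null && match[1] === packageVersion;
}
```

For example, a route could fail fast at startup when packMatchesPackage(chromiumPack, "121.0.0") is false, instead of failing later with a "Could not find Chrome" style error.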
My route.ts file in the Next.js api folder, located at /api/puppeteer:
import { NextResponse } from "next/server";
import chromium from "@sparticuz/chromium-min";
import puppeteer from "puppeteer-core";

// Host the tar-file yourself
// Or use https://github.com/Sparticuz/chromium/releases/download/v121.0.0/chromium-v121.0.0-pack.tar
const chromiumPack = "https://my-domain/chromium-v121.0.0-pack.tar";

const handler = async () => {
  const browser = await puppeteer.launch({
    args: chromium.args,
    // See https://www.npmjs.com/package/@sparticuz/chromium#running-locally--headlessheadful-mode for local executable path
    executablePath: await chromium.executablePath(chromiumPack),
    headless: true,
  });

  try {
    const page = await browser.newPage();
    await page.goto("https://google.com", { waitUntil: "networkidle0" });
    const title = await page.evaluate(() => {
      return document.title;
    });
    return NextResponse.json({ title });
  } finally {
    // Always close the browser so the Chromium process does not outlive the request
    await browser.close();
  }
};

// Uncomment if needed, only applicable if your plan allows it
// export const maxDuration = 300; // Seconds

export { handler as POST };
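To get from this route to the total transfer size asked about in the question, the page title lookup can be swapped for the question's performance-entry approach. Below is a minimal sketch of the summing step only; ResourceSize and sumTransferSize are names made up for illustration. Two caveats worth knowing: getEntriesByType("resource") does not include the main document (that one is reported as a "navigation" entry), and cross-origin resources served without a Timing-Allow-Origin header report a transferSize of 0, so the total is a lower bound rather than an exact figure.

```typescript
// Illustrative shape of the entries collected by page.evaluate in the question.
interface ResourceSize {
  name: string;
  transferSize: number;
}

// Sums transferSize over all entries, treating missing or zero values as 0.
function sumTransferSize(entries: ResourceSize[]): number {
  return entries.reduce((total, entry) => total + (entry.transferSize || 0), 0);
}

// Possible use inside the handler, after page.goto(...) (untested sketch):
// const entries = await page.evaluate(() =>
//   [
//     ...performance.getEntriesByType("navigation"),
//     ...performance.getEntriesByType("resource"),
//   ].map((e) => ({
//     name: e.name,
//     transferSize: (e as PerformanceResourceTiming).transferSize,
//   }))
// );
// return NextResponse.json({ totalData: sumTransferSize(entries), networkData: entries });
```

Keeping the sum outside page.evaluate means it can be unit-tested without launching a browser.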
I also had to make the change below to next.config.js in order to mark puppeteer-core as external, so it's not included in the bundle during the build step (see the Next.js reference for serverComponentsExternalPackages):
/** @type {import('next').NextConfig} */
const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ["puppeteer-core"],
  },
};

module.exports = nextConfig;
So far this has worked for me. Hope it helps!
Upvotes: 5