Reputation: 203
I'm trying to extract text from PDF files and return it. Here's the server-side code, running in Astro:
import * as pdfjsLib from "pdfjs-dist";
pdfjsLib.GlobalWorkerOptions.workerSrc = "../../node_modules/pdfjs-dist/build/pdf.worker.mjs";
export const contentExtractor = async (arrayBufferPDF: ArrayBuffer): Promise<string> => {
    const pdf = pdfjsLib.getDocument(arrayBufferPDF);
    return pdf.promise.then(async (pdf) => {
        let totalContent = ""
        const maxPages = pdf._pdfInfo.numPages;
        for (let pageNumber = 1; pageNumber <= maxPages; pageNumber++) {
            const page = await pdf.getPage(pageNumber);
            const pageContent = await page.getTextContent();
            const content = pageContent.items.map((s: any) => s.str).join(" ")
            totalContent = totalContent + content
        }
        return totalContent
    })
}
and the error is
12:44:40 [ERROR] Promise.withResolvers is not a function
Stack trace:
at /Users/some-user/Documents/Projects/Github/pdf-extractor/app/node_modules/pdfjs-dist/build/pdf.mjs:3026:32
[...] See full stack trace in the browser, or rerun with --verbose.
I don't understand where the problem is. Could someone help me with it?
Upvotes: 20
Views: 24172
Reputation: 11
I got the Promise.withResolvers issue fixed by using the legacy version.
In your .ts file:
import * as pdfjs from 'pdfjs-dist/legacy/build/pdf.min.mjs';
But it triggered:
Could not find a declaration file for module 'pdfjs-dist/legacy/build/pdf.min.mjs'. '(....)/node_modules/pdfjs-dist/legacy/build/pdf.min.mjs' implicitly has an 'any' type.
If the 'pdfjs-dist' package actually exposes this module, try adding a new declaration (.d.ts) file containing `declare module 'pdfjs-dist/legacy/build/pdf.min.mjs';`
So I did the following.
In types.d.ts (create the file if it does not exist):
declare module 'pdfjs-dist/legacy/build/pdf.min.mjs';
In tsconfig.json, add typeRoots to compilerOptions:
"compilerOptions": {
    ....
    "typeRoots": [
        "./node_modules/@types",
        "./@types"
    ],
}
It should remove the TypeScript error message.
Upvotes: 1
Reputation: 39488
The build of PDF.js you are using does not support running in Node.js; it is meant for the browser only. The error comes from a call to Promise.withResolvers, which Node.js versions before v22 do not support. You can try upgrading your Node.js version.
But the recommended way to run PDF.js under Node.js is to use the legacy build: pdfjs-dist/legacy/build/pdf.mjs in pdfjs-dist 4.x (which matches the pdf.mjs path in your stack trace), or pdfjs-dist/legacy/build/pdf.js in older releases.
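Once the legacy build is loaded, the extraction loop itself can stay almost as it is. The sketch below is only an illustration: the interface names (DocumentLike, PageLike, TextContentLike) are hypothetical minimal shapes covering the calls made in the question, not types exported by pdfjs-dist. It reads the public numPages property instead of the private _pdfInfo field, and it skips items that carry no str value.

```typescript
// Hypothetical minimal shapes: just enough of a loaded PDF document for this loop.
interface TextContentLike {
  items: Array<{ str?: string }>;
}
interface PageLike {
  getTextContent(): Promise<TextContentLike>;
}
interface DocumentLike {
  numPages: number;
  getPage(pageNumber: number): Promise<PageLike>;
}

// Collects the text of every page; pages are numbered from 1.
async function extractText(doc: DocumentLike): Promise<string> {
  const pages: string[] = [];
  for (let pageNumber = 1; pageNumber <= doc.numPages; pageNumber++) {
    const page = await doc.getPage(pageNumber);
    const content = await page.getTextContent();
    const text = content.items
      // Keep only items that actually hold a string.
      .filter((item): item is { str: string } => typeof item.str === "string")
      .map((item) => item.str)
      .join(" ");
    pages.push(text);
  }
  // One line per page, so words from adjacent pages do not run together.
  return pages.join("\n");
}
```

Assuming pdfjs is the namespace imported from the legacy build, a call along the lines of extractText(await pdfjs.getDocument({ data: new Uint8Array(arrayBufferPDF) }).promise) should fit this signature.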
Upvotes: 18
Reputation: 61
Here is another answer, taken from https://github.com/wojtekmaj/react-pdf/issues/1811#issuecomment-2157866061
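// Polyfill Promise.withResolvers where the runtime lacks it.
// Patching Promise directly (rather than window.Promise) also works in Node.js,
// where `window` is not defined and `if (window)` would throw a ReferenceError.
// @ts-expect-error This does not exist outside of polyfill which this is doing
if (typeof Promise.withResolvers === 'undefined') {
    // @ts-expect-error This does not exist outside of polyfill which this is doing
    Promise.withResolvers = function () {
        let resolve, reject;
        const promise = new Promise((res, rej) => {
            resolve = res;
            reject = rej;
        });
        return { promise, resolve, reject };
    };
}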
// point workerSrc at the URL of your `/legacy/build/pdf.worker.min.mjs` file
pdfjs.GlobalWorkerOptions.workerSrc = new URL(
'pdfjs-dist/legacy/build/pdf.worker.min.mjs',
import.meta.url
).toString();
// or you can load the worker from a CDN (replace <version> with your installed pdfjs-dist version)
// pdfjs.GlobalWorkerOptions.workerSrc="https://unpkg.com/pdfjs-dist@<version>/legacy/build/pdf.worker.min.mjs"
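If the @ts-expect-error comments get in the way, the same polyfill idea can be written with explicit types. This is only a sketch: Resolvers, withResolvers and installWithResolvers are hypothetical names, not part of pdfjs-dist or react-pdf. The install step takes its target as a parameter (defaulting to the global Promise constructor) so it can be tried on a plain object first.

```typescript
// Shape of the value returned by Promise.withResolvers, spelled out so that
// no ES2024 lib typings are required.
interface Resolvers<T> {
  promise: Promise<T>;
  resolve: (value: T | PromiseLike<T>) => void;
  reject: (reason?: unknown) => void;
}

// Builds the { promise, resolve, reject } triple.
function withResolvers<T>(): Resolvers<T> {
  let resolve!: (value: T | PromiseLike<T>) => void;
  let reject!: (reason?: unknown) => void;
  // The executor runs synchronously, so both variables are set before returning.
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Installs the helper on `target` only when it is missing.
// Returns true if it patched, false if a function was already there.
function installWithResolvers(
  target: { withResolvers?: unknown } = Promise as unknown as { withResolvers?: unknown },
): boolean {
  if (typeof target.withResolvers === "function") return false;
  target.withResolvers = withResolvers;
  return true;
}
```

Calling installWithResolvers() with no argument before pdfjs is first used would patch the global Promise on runtimes that lack the method and leave newer runtimes untouched.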
Upvotes: 3