Reputation: 23
I am trying to use the 'pdfjs-dist' package to extract the text from a PDF file that I retrieve from my AWS S3 bucket. When I run the code, I receive this error:
Error: Setting up fake worker failed: "Cannot find module './pdf.worker.js'
I am confused about what this means and not sure how to resolve it. My code looks like this:
import { NextResponse } from "next/server";
import * as PDFJS from 'pdfjs-dist';
import { TextItem } from 'pdfjs-dist/types/src/display/api';

export async function POST() {
  let myFiledata = await fetch("url to my S3 bucket")
  if (myFiledata.ok) {
    let pdfDoc = await PDFJS.getDocument(await myFiledata.arrayBuffer()).promise
    const numPages = pdfDoc.numPages
    for (let i = 0; i < numPages; i++) {
      let page = await pdfDoc.getPage(i + 1)
      let textContent = await page.getTextContent()
      const text = textContent.items.map((item) => (item as TextItem).str).join('');
      console.log(text)
    }
    return new NextResponse("Success", { status: 200 });
  } else {
    return new NextResponse("Internal Error", { status: 500 });
  }
}
How do I go about solving this issue? Do I need to add a file somewhere in my project, or is it isolated to the code?
I tried various ways of initializing pdf.worker.js, such as setting some variation of:
PDFJS.GlobalWorkerOptions.workerSrc
to various URLs, such as:
pdfjs-dist/legacy/build/pdf.worker.entry.js;
pdfjs-dist/legacy/build/pdf.worker.entry.entry;
pdfjs-dist/legacy/build/pdf.worker
pdfjs-dist/build/pdf.worker.min.js
I even tried using the pdfjs version to build a CDN URL:
https://cdnjs.cloudflare.com/ajax/libs/pdf.js/${pdfVersion}/pdf.worker.js
Each time, it says it does not recognize these URLs, so either I'm setting it incorrectly or I'm way off from the actual solution.
Upvotes: 1
Views: 524
Reputation: 23
I solved the error by moving the pdf.worker.js file, which was nested in my node_modules at node_modules/pdfjs-dist/build/pdf.worker.js, into my public folder. I then copied its new path and set that path in this line of code:
PDFJS.GlobalWorkerOptions.workerSrc = "copied path to pdf.worker.js"
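For anyone who would rather script that step than move the file by hand, here is a minimal sketch of how it might look. It uses only Node's built-in fs and path modules. The helper name copyPdfWorker and the projectRoot parameter are my own (hypothetical), and it assumes the worker lives at node_modules/pdfjs-dist/build/pdf.worker.js as it did in my install; other pdfjs-dist versions may name or place the file differently. It copies the file rather than moving it, so node_modules stays intact.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: copies the pdf.js worker from node_modules into public/.
// Assumes the worker file is named pdf.worker.js and sits in pdfjs-dist/build,
// as described in the answer above; adjust the path for your installed version.
export function copyPdfWorker(projectRoot: string): string {
  const source = path.join(
    projectRoot,
    "node_modules",
    "pdfjs-dist",
    "build",
    "pdf.worker.js"
  );
  const publicDir = path.join(projectRoot, "public");
  const destination = path.join(publicDir, "pdf.worker.js");

  // Fail with a clear message if the worker is not where we expect it.
  if (!fs.existsSync(source)) {
    throw new Error(`Worker file not found at ${source}`);
  }

  // Create public/ if it does not exist yet, then copy the worker into it.
  fs.mkdirSync(publicDir, { recursive: true });
  fs.copyFileSync(source, destination);

  return destination;
}
```

The returned path is the "copied path to pdf.worker.js" that gets assigned to PDFJS.GlobalWorkerOptions.workerSrc in the line above. Calling something like copyPdfWorker(process.cwd()) from a build or postinstall script is one option, so the copy does not go stale when the package is updated.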
Upvotes: 1