Reputation: 1377
In my web application, users may only upload images and PDFs. These arrive as base64 strings sent from the frontend to the backend. There, on the Node.js 8.9 server, I want to do some sanity checking, i.e. test whether the base64 strings I receive really are images or PDFs and nothing else.
For images, that was easy. Using the sharp npm module with failOnError set to true gave me exactly what I wanted: a single wrong character in the base64 string causes a failure, and the input is rejected.
However, for PDFs I cannot find a similar solution. I've tried pdf2json (which seemed like overkill for my requirement anyway), but I could not get it to accept the base64 string after converting it to a buffer.
Upvotes: 1
Views: 4854
Reputation: 1377
I finally found an npm module that does exactly what I expect: HummusJS. The code below works as far as my tests go: valid PDFs are accepted, while invalid strings are rejected. I haven't noticed any performance impact so far.
    const hummus = require('hummus');

    const pdfBase64String = '<<base64 string here>>';

    try {
        // Decode the base64 string into a binary buffer.
        const bufferPdf = Buffer.from(pdfBase64String, 'base64');
        // Parsing throws if the buffer cannot be read as a PDF.
        const pdfReader = hummus.createReader(new hummus.PDFRStreamForBuffer(bufferPdf));
        const pages = pdfReader.getPagesCount();
        if (pages > 0) {
            console.log("Parsable with Hummus and more than 0 pages. Seems to be a valid PDF!");
        } else {
            console.log("Unexpected outcome for number of pages: '" + pages + "'");
        }
    } catch (err) {
        console.log("ERROR while handling buffer of pdfBase64 and/or trying to parse PDF: " + err);
    }
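One caveat worth keeping in mind: Buffer.from(str, 'base64') does not throw on characters outside the base64 alphabet, it silently skips them. So a cheap pre-check before the full parse might be useful. The following is only a sketch under my own assumptions, not part of hummus: the helper name looksLikePdf is made up, it accepts only the standard (non-URL-safe) base64 alphabet with padding, and it expects the "%PDF-" header at the very first byte, although some PDF readers tolerate a few junk bytes before it.

```javascript
// Hypothetical helper (name and strictness rules are assumptions):
// a cheap pre-check to run before the full PDF parse.
function looksLikePdf(base64String) {
    if (typeof base64String !== 'string' || base64String.length === 0) {
        return false;
    }
    // Strict check: standard base64 alphabet, groups of 4 chars, optional '=' padding.
    // Needed because Buffer.from(..., 'base64') ignores invalid characters.
    var strictBase64 = /^(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?$/;
    if (!strictBase64.test(base64String)) {
        return false;
    }
    var buffer = Buffer.from(base64String, 'base64');
    // A PDF file starts with the header "%PDF-" followed by the version number.
    return buffer.slice(0, 5).toString('latin1') === '%PDF-';
}
```

This only filters out obvious garbage early. A string that passes can still be a broken PDF, so the hummus parse above remains the actual validity test.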
Upvotes: 3