Reputation: 1648
I am developing application where user can upload some drawings in pdf format. Uploaded files are stored on S3. After uploading, files has to be converted to images. For this purpose I have created lambda function which downloads file from S3 to /tmp folder in lambda execution environment and then I call ‘convert’ command from imagemagick.
convert sourceFile.pdf targetFile.png
Lambda runtime environment is nodejs 4.3. Memory is set to 128MB, timeout 30 sec.
Now the problem is that some files are converted successfully while others are failing with the following error:
{ [Error: Command failed: /bin/sh -c convert /tmp/sourceFile.pdf /tmp/targetFile.png convert:
%s' (%d) "gs" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pngalpha" -dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72" "-sOutputFile=/tmp/magick-QRH6nVLV--0000001" "-f/tmp/magick-B610L5uo" "-f/tmp/magick-tIe1MjeR" @ error/utility.c/SystemCommand/1890. convert: Postscript delegate failed
/tmp/sourceFile.pdf': No such file or directory @ error/pdf.c/ReadPDFImage/678. convert: no images defined `/tmp/targetFile.png' @ error/convert.c/ConvertImageCommand/3046. ] killed: false, code: 1, signal: null, cmd: '/bin/sh -c convert /tmp/sourceFile.pdf /tmp/targetFile.png' }
At first I did not understand why this happens, then I tried to convert problematic files on my local Ubuntu machine with the same command. This is the output from terminal:
**** Warning: considering '0000000000 XXXXX n' as a free entry.
**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> Mac OS X 10.10.5 Quartz PDFContext <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
So the message was very clear, but the file gets converted to png anyway. If I try to do convert source.pdf target.pdf
and after that convert target.pdf image.png
, file is repaired and converted without any errors. This doesn’t work with lambda.
Since the same thing works on one environment but not on the other, my best guess is that the version of Ghostscript is the problem. Installed version on AMI is 8.70. On my local machine Ghostsript version is 9.18.
My questions are:
UPDATE Relevant part of lambda code:
var exec = require('child_process').exec;
var AWS = require('aws-sdk');
var fs = require('fs');
...
var localSourceFile = '/tmp/sourceFile.pdf';
var localTargetFile = '/tmp/targetFile.png';
var writeStream = fs.createWriteStream(localSourceFile);
writeStream.write(body);
writeStream.end();
writeStream.on('error', function (err) {
console.log("Error writing data from s3 to tmp folder.");
context.fail(err);
});
writeStream.on('finish', function () {
var cmd = 'convert ' + localSourceFile + ' ' + localTargetFile;
exec(cmd, function (err, stdout, stderr ) {
if (err) {
console.log("Error executing convert command.");
context.fail(err);
}
if (stderr) {
console.log("Command executed successfully but returned error.");
context.fail(stderr);
}else{
//file converted successfully - do something...
}
});
});
Upvotes: 4
Views: 14400
Reputation: 69
The version of GS on aws is an old version with known bugs. We can get around this by uploading an x64 GS file, compiled specifically for Linux. Then upload that using new AWS lambda layers. I have written a node function that does just this here:
https://github.com/rcastoro/PDFImagine
Make sure you have that GS layer for your lambda, however!
Upvotes: 0
Reputation: 1556
You can find a compiled version of Ghostscript for Lambda in the following repository. You should add the files to the zip file that you are uploading as the source code to AWS Lambda.
https://github.com/sina-masnadi/lambda-ghostscript
This is an npm package to call Ghostscript functions:
https://github.com/sina-masnadi/node-gs
After copying the compiled Ghostscript files to your project and adding the npm package, you can use the executablePath('path to ghostscript')
function to point the package to the compiled Ghostscript files that you added earlier.
Upvotes: 4
Reputation: 31199
Its almost certainly a bug, or perhaps limitation, with the older version of Ghostscript.
Many PDF producers create PDF files which do not conform to the specification, and yet will open without complain in Adobe Acrobat. Ghostscript endeavours to do the same, but obviously we can't know what Acrobat is going to allow, so we are continually chasing this nebulous target. (FWIW that warning is a legitimate out-of-spec PDF file).
There's nothing you can do with the old version other than replace it.
Yes you can build Ghostscript from source, I have no idea about a nodejs module, not sure why that's relevant.
There are numerous other applications which will render a PDF file, MuPDF is another one I know of. And, of course, you can use Ghostscript directly without using ImageMagick. Of course, if you can load another application, then you should simply be able to replace your Ghostscript installation too.
Upvotes: 0