Reputation: 26874
I need to convert any multipage PDF file into a set of JPGs.
Since the PDF files are supposed to come from a scanner, we can assume each page just contains a graphic object to extract, but I cannot be 100% sure of that.
So, I need to convert any renderable content from each page into a single JPEG file.
How can I do this with iText?
If I can't do this with iText, what Java library can achieve this?
Thanks.
Upvotes: 0
Views: 7401
Reputation: 4512
With Apache PDFBox you could do the following:
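// PDFBox 1.x API; pdffile is a java.io.File pointing at the source PDF
PDDocument document = PDDocument.load(pdffile);
try {
    List<PDPage> pages = document.getDocumentCatalog().getAllPages();
    for (int i = 0; i < pages.size(); i++) {
        PDPage page = pages.get(i);
        // Renders the whole page, not just embedded images; 72 dpi is screen resolution, raise it for sharper output
        BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 72);
        ImageIO.write(image, "jpg", new File(pdffile.getAbsolutePath() + "_" + i + ".jpg"));
    }
} finally {
    document.close(); // release the file handle even if rendering fails
}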
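ImageIO.write uses the default JPEG compression, which may be too lossy or too large for scanned pages. Here is a minimal sketch, using only the JDK, of writing a rendered page with an explicit quality setting. The class name JpegPageWriter, the helper pageFile and the zero-padded naming scheme are my own choices for illustration and are not part of PDFBox.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.Iterator;
import java.util.Locale;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

final class JpegPageWriter {

    /** Writes one rendered page as a JPEG; quality runs from 0.0 (smallest file) to 1.0 (best quality). */
    static void writeJpeg(BufferedImage image, File target, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG writer available");
        }
        ImageWriter writer = writers.next();
        // Files.newOutputStream truncates an existing file, so a shorter rewrite leaves no stale bytes
        try (OutputStream fileOut = Files.newOutputStream(target.toPath());
             ImageOutputStream out = ImageIO.createImageOutputStream(fileOut)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            writer.dispose();
        }
    }

    /** Hypothetical naming helper: scan.pdf, page index 2 becomes scan_003.jpg so files sort in page order. */
    static File pageFile(File pdf, int pageIndex) {
        String base = pdf.getName().replaceFirst("(?i)\\.pdf$", "");
        String name = String.format(Locale.ROOT, "%s_%03d.jpg", base, pageIndex + 1);
        return new File(pdf.getParentFile(), name);
    }
}
```

In the loop above you would then replace the ImageIO.write call with JpegPageWriter.writeJpeg(image, JpegPageWriter.pageFile(pdffile, i), 0.9f). The 0.9 quality value is just a starting point to tune against your scans.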
Upvotes: 1
Reputation: 90213
Ghostscript (available for Windows, Linux, MacOS X, Solaris, AIX,...) can convert...
(The ImageMagick mentioned above doesn't do the conversion on its own -- it uses Ghostscript under the hood, as do many other tools.)
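Since the question asks for something usable from Java, here is a rough sketch of driving Ghostscript through ProcessBuilder. It assumes a Ghostscript executable is installed and reachable (typically gs on Unix-like systems, gswin64c on 64-bit Windows). The class name GhostscriptJpeg, the page-%03d.jpg output pattern and the choice of resolution are my own assumptions, not something prescribed by Ghostscript.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class GhostscriptJpeg {

    /** Builds the Ghostscript argument list for rendering every page of a PDF to its own JPEG. */
    static List<String> buildCommand(String gsExecutable, File pdf, File outputDir, int dpi) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsExecutable);
        cmd.add("-dSAFER");         // restrict file access from within the PDF/PostScript
        cmd.add("-dBATCH");         // exit after the last file instead of entering the prompt
        cmd.add("-dNOPAUSE");       // do not wait for a key press between pages
        cmd.add("-sDEVICE=jpeg");   // render with the JPEG output device
        cmd.add("-r" + dpi);        // rendering resolution in dpi
        // %03d is expanded by Ghostscript to the page number; outputDir itself must not contain '%'
        cmd.add("-sOutputFile=" + new File(outputDir, "page-%03d.jpg").getPath());
        cmd.add(pdf.getPath());
        return cmd;
    }

    /** Runs Ghostscript and returns its exit code (0 means success). */
    static int convert(String gsExecutable, File pdf, File outputDir, int dpi)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsExecutable, pdf, outputDir, dpi))
                .inheritIO() // show Ghostscript's own messages on the console
                .start();
        return process.waitFor();
    }
}
```

A call such as GhostscriptJpeg.convert("gs", new File("scan.pdf"), new File("out"), 150) should leave one JPEG per page in the out directory, provided that directory already exists.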
Upvotes: 2
Reputation: 3184
You can also use Sun's PDF-Renderer, and JPedal does PDF-to-image conversion (low and high resolution).
Upvotes: 1
Reputation: 75376
ICEpdf - http://www.icepdf.org/ - has an open source entry version which should do what you need.
I believe the primary difference between the open source version and the pay-for version is that the pay-for has much better font support.
Upvotes: 1