NasMed

Reputation: 15

Extracting the first page of multiple PDFs & saving them as images

I have about 400 ebooks, all in PDF format. My task is to extract the cover of each one (the first page of each PDF) and export every cover as a separate image file (PNG or JPEG).

So I will end up with 400 ebooks and 400 images of their covers.

I am on Windows.

Any advice greatly appreciated.

Upvotes: 1

Views: 1851

Answers (2)

thst

Reputation: 4602

Use Ghostscript to render TIFF or JPEG images from the PDF. It gives you fine-grained control over the result.

If this is a commercial application, you need a commercial license. If you only use the application inside your organisation, even commercially, you are allowed to use the open-source (AGPL) version of Ghostscript.

The PDF interpreter in many open-source packages relies on the Ghostscript PDF interpreter. ImageMagick, for example, requires Ghostscript to read PDFs.

Download GS here: http://ghostscript.com/download/gsdnld.html

Use the C# Process class to execute Ghostscript. There is a Stack Overflow topic on this: "How to run a C# console application with the console hidden".

The command line for TIFF will be:

D:\gs\gs9.20>bin\gswin64c.exe -sOutputFile=d:\some%02d.tiff -dBATCH -dNOPAUSE -sDEVICE=tiff24nc -sCompression=lzw -r150 -sPageList=1 d:\PDFReference.pdf

This will create a single file, some01.tiff, on d:\ at 150 dpi resolution.
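For 400 files you will want to loop over the PDFs rather than type the command each time. Below is a minimal sketch in Python (instead of C#) that builds the same kind of command for each PDF and writes a PNG cover. The Ghostscript path is taken from the prompt shown above and is an assumption about your install, and the function names build_gs_command and export_covers are made up for this example. It uses the png16m device because the question asks for PNG or JPEG. Replace it with jpeg (and change the file extension) for JPEG output.

```python
import subprocess
from pathlib import Path

# Assumed install location (copied from the prompt above); adjust to your setup.
GS_EXE = r"D:\gs\gs9.20\bin\gswin64c.exe"


def build_gs_command(pdf_path, out_path, gs_exe=GS_EXE, dpi=150):
    """Return the Ghostscript argument list that renders page 1 of pdf_path."""
    return [
        gs_exe,
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=png16m",           # 24-bit colour PNG
        f"-r{dpi}",                  # resolution in dpi
        "-sPageList=1",              # first page only, as in the command above
        f"-sOutputFile={out_path}",
        str(pdf_path),
    ]


def export_covers(pdf_dir, out_dir):
    """Render the first page of every PDF in pdf_dir to a PNG in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = out_dir / (pdf.stem + ".png")
        # check=True raises CalledProcessError if Ghostscript fails on a file.
        subprocess.run(build_gs_command(pdf, target), check=True)
```

Calling export_covers(r"D:\ebooks", r"D:\ebooks\covers") (both folder names are placeholders) would leave one PNG per PDF, named after the PDF, in the covers folder.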

Upvotes: 2

Artur Hovhannisyan

Reputation: 104

The following thread covers your request: "converting pdf file to an jpeg image".

One solution is to use a third-party library. ImageMagick is a very popular, freely available tool. You can get a .NET wrapper for it here. The original ImageMagick download page is here.

http://www.codeproject.com/KB/library/pdftoimages.aspx Convert PDF pages to image files using the Solid Framework
http://www.print-driver.com/howto/convert_pdf_to_jpeg.html Universal Document Converter
http://www.makeuseof.com/tag/6-ways-to-convert-a-pdf-file-to-a-jpg-image/ 6 Ways To Convert A PDF To A JPG Image

You can also take a look at this thread: "how to open a page from a pdf file in pictureBox in C#".

If you use this process to convert a PDF to TIFF, you can use this class to retrieve the bitmap from the TIFF.

Upvotes: 1
