Reputation: 51
I'm trying to convert a multi-page PDF file to an image with PyMuPDF:
pdffile = "input.pdf"
doc = fitz.open(pdffile)
page = doc.loadPage() # number of page
pix = page.getPixmap()
output = "output.tif"
pix.writePNG(output)
However, I need to convert all pages of the PDF file into a single multi-page TIFF image. When I pass a page range as the page argument, only one page is converted. Does anyone know how I can do this?
Upvotes: 4
Views: 15243
Reputation: 3140
PyMuPDF supports a limited set of image types for output. TIFF is not among them.
However, there is an easy way to interface with Pillow, which supports multiframe TIFF output.
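As a rough illustration of the Pillow side only, here is a minimal sketch of writing several images into one multi-frame TIFF. The helper name `save_multipage_tiff` and the solid-colour placeholder images are made up for this example; in practice the frames would be built from the PyMuPDF pixmaps of each page.

```python
import io
from PIL import Image


def save_multipage_tiff(frames, target):
    """Write a list of PIL images to `target` as one multi-frame TIFF.

    `target` may be a filename or a binary file-like object.
    (Hypothetical helper, not part of Pillow or PyMuPDF.)
    """
    if not frames:
        raise ValueError("need at least one frame")
    first, rest = frames[0], frames[1:]
    # save_all=True tells Pillow to write every frame, not just the first;
    # append_images supplies the frames that follow the first one.
    first.save(target, format="TIFF", save_all=True, append_images=rest)


# Three placeholder "pages" stand in for rendered PDF pages.
pages = [Image.new("RGB", (64, 48), colour) for colour in ("red", "green", "blue")]

buffer = io.BytesIO()
save_multipage_tiff(pages, buffer)

# Re-open the result to check how many frames were written.
buffer.seek(0)
frame_count = Image.open(buffer).n_frames
```

Passing a filename such as "output.tif" instead of the in-memory buffer would write the file to disk.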
Upvotes: 1
Reputation: 63
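import fitz

pdffile = "input.pdf"
doc = fitz.open(pdffile)
# Write one single-page TIFF per PDF page, numbered from 1.
for i, page in enumerate(doc, start=1):
    pix = page.get_pixmap()
    output = "output_" + str(i) + ".tif"
    # PyMuPDF cannot write TIFF itself; pil_save() hands the pixmap to Pillow.
    pix.pil_save(output)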
Upvotes: 1
Reputation: 342
import fitz
from PIL import Image

input_pdf = "input.pdf"
output_name = "output.tif"
# Pillow names: "tiff_deflate", "tiff_lzw", "group4" (group4 needs a binarized image)
compression = "tiff_deflate"
zoom = 2  # to increase the resolution (2 x 72 dpi = 144 dpi)
mat = fitz.Matrix(zoom, zoom)

doc = fitz.open(input_pdf)
image_list = []
for page in doc:
    # Render the page, then wrap the raw RGB samples in a Pillow image.
    pix = page.get_pixmap(matrix=mat)
    img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
    image_list.append(img)

if image_list:
    # The first image is saved with all remaining images appended as frames.
    image_list[0].save(
        output_name,
        save_all=True,
        append_images=image_list[1:],
        compression=compression,
        dpi=(300, 300),  # metadata only; it does not resample the pixels
    )
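The zoom factor and the dpi value written to the file are related: PDF page sizes are measured in points (1/72 inch), so a zoom of 1 corresponds to 72 dpi. The sketch below shows that arithmetic; the helper names `zoom_for_dpi` and `pixel_size` are made up for illustration and are not part of PyMuPDF, and the pixel sizes a real pixmap reports may differ by a pixel because of rounding.

```python
PDF_POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch


def zoom_for_dpi(dpi):
    """Return the zoom factor that renders a page at `dpi` dots per inch."""
    return dpi / PDF_POINTS_PER_INCH


def pixel_size(width_pt, height_pt, dpi):
    """Approximate pixel dimensions of a page of the given size in points."""
    width_px = round(width_pt * dpi / PDF_POINTS_PER_INCH)
    height_px = round(height_pt * dpi / PDF_POINTS_PER_INCH)
    return width_px, height_px


# A zoom of 2 corresponds to 144 dpi, not 300 dpi.
zoom_144 = zoom_for_dpi(144)

# A US Letter page (612 x 792 points) rendered at 300 dpi.
letter_at_300 = pixel_size(612, 792, 300)
```

To make the rendered pixels match a dpi of 300, one option would be `zoom = zoom_for_dpi(300)` before building `fitz.Matrix(zoom, zoom)`.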
Upvotes: 7
Reputation: 196
To convert every page of a PDF, you need a for loop. Also, when you call .get_pixmap(), pass an argument such as matrix=mat to increase the resolution. Here is a code snippet (I'm not sure this is exactly what you wanted, but it converts every page of the PDF to an image):
import os
import fitz

pdf_file = "input.pdf"  # path to your PDF
doc = fitz.open(pdf_file)
zoom = 2  # to increase the resolution
mat = fitz.Matrix(zoom, zoom)
image_folder = '/path/to/where/to/save/your/images'
for page_no in range(doc.page_count):
    page = doc.load_page(page_no)  # page numbers start at 0
    pix = page.get_pixmap(matrix=mat)
    # the extension selects the format; you could change it accordingly
    output = os.path.join(image_folder, str(page_no) + '.png')
    pix.save(output)
    print('Converting PDF page to image ... ' + output)
# do your things afterwards
For more on resolution, there is a good example on GitHub that demonstrates what it means and how to use it in a case like yours, if needed.
Upvotes: 4