Reputation: 4572
Is there any practical way to create a PDF from a list of image files, using Python?
In Perl I know the PDF::FromImage module. With it I can create a PDF in just 3 lines:
use PDF::FromImage;
...
my $pdf = PDF::FromImage->new;
$pdf->load_images(@allPagesDir);
$pdf->write_file($bookName . '.pdf');
I need to do something very similar to this, but in Python. I know the pyPdf module, but I would like something simple.
Upvotes: 151
Views: 266963
Reputation: 1
This script allows users to convert one or multiple images into a PDF file. It utilizes the FPDF library for PDF generation and Pillow (PIL) for image handling.
#!/usr/bin/env python3
from fpdf import FPDF
import os
from PIL import Image

def image_to_pdf_single(image_path, pdf_path):
    pdf = FPDF()
    with Image.open(image_path) as img:
        img_width_px, img_height_px = img.size
        img_dpi = img.info.get('dpi', (300, 300))[0]
    img_width_mm = img_width_px / img_dpi * 25.4
    img_height_mm = img_height_px / img_dpi * 25.4
    print(img_width_mm)
    page_width = 210
    page_height = 297
    scale_width = page_width / img_width_mm
    scale_height = page_height / img_height_mm
    scale = min(scale_width, scale_height)
    new_width_mm = img_width_mm * scale
    new_height_mm = img_height_mm * scale
    x = (page_width - new_width_mm) / 2
    y = (page_height - new_height_mm) / 2
    pdf.add_page()
    pdf.image(image_path, x=x, y=y, w=new_width_mm, h=new_height_mm)
    pdf.output(pdf_path)

def image_to_pdf_multiple(mult_img, pdf_path):
    pdf = FPDF()
    for image_path in mult_img:
        with Image.open(image_path) as img:
            img_width_px, img_height_px = img.size
            img_dpi = img.info.get('dpi', (300, 300))[0]
        img_width_mm = img_width_px / img_dpi * 25.4
        img_height_mm = img_height_px / img_dpi * 25.4
        print(img_width_mm)
        page_width = 210
        page_height = 297
        scale_width = page_width / img_width_mm
        scale_height = page_height / img_height_mm
        scale = min(scale_width, scale_height)
        new_width_mm = img_width_mm * scale
        new_height_mm = img_height_mm * scale
        x = (page_width - new_width_mm) / 2
        y = (page_height - new_height_mm) / 2
        pdf.add_page()
        pdf.image(image_path, x=x, y=y, w=new_width_mm, h=new_height_mm)
    pdf.output(pdf_path)

if __name__ == "__main__":
    option = input("1 for single image, 2 for multiple images: ")
    if option == '1':
        image_path = input("Enter image path/name pls -> ")
        pdf_path = input("Enter pdf save loc/name pls -> ")
        if os.path.exists(image_path):
            image_to_pdf_single(image_path, pdf_path)
            print(f"Converted {image_path} to {pdf_path}.")
        else:
            print("Invalid path/file name.")
    elif option == '2':
        mult_img = []
        i = 1
        while True:
            image_name = input(f"Enter {i} image name/path pls (or type done) -> ")
            if image_name.lower() == 'done':
                break
            else:
                if os.path.exists(image_name):
                    mult_img.append(image_name)
                    i += 1
                else:
                    print("Invalid image name/path.")
        pdf_path = input("Enter pdf save loc/name pls -> ")
        image_to_pdf_multiple(mult_img, pdf_path)
        print(f"Successfully saved pdf to {pdf_path}.")
Upvotes: 0
Reputation: 1285
Install fpdf2 for Python:
pip install fpdf2
Now you can use the same logic:
from fpdf import FPDF
pdf = FPDF()
# imagelist is the list with all image filenames
for image in imagelist:
    pdf.add_page()
    # x, y: position of the image; w, h: its size (in mm by default)
    pdf.image(image, x, y, w, h)
pdf.output("yourfile.pdf")
You can find more info at the tutorial page or the official documentation.
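The x, y, w and h values above are left open. One way to pick them, sketched here under the assumption of an A4 portrait page (210 × 297 mm, the default for a plain FPDF() object), is to scale each image to fit the page while keeping its aspect ratio and then center it. The helper name fit_to_page is hypothetical; the pixel size would come from something like Pillow's Image.size.

```python
def fit_to_page(img_w, img_h, page_w=210.0, page_h=297.0):
    """Return (x, y, w, h) that centers an image on the page without distortion.

    img_w, img_h: image size in any unit (only the ratio matters).
    page_w, page_h: page size in the PDF's unit (mm for a default FPDF object).
    """
    scale = min(page_w / img_w, page_h / img_h)  # largest scale that still fits
    w, h = img_w * scale, img_h * scale
    x, y = (page_w - w) / 2, (page_h - h) / 2    # center the scaled image
    return x, y, w, h


# Example: a 1920x1080 image fills the page width and is centered vertically.
x, y, w, h = fit_to_page(1920, 1080)
print(x, y, w, h)
```

The result can be passed straight to pdf.image(image, x, y, w, h) inside the loop.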
Upvotes: 118
Reputation: 187
You can use pdfme. It's a powerful Python library for creating PDF documents.
from pdfme import build_pdf
...
pdf_image_list = [{"image": img} for img in images]
with open('images.pdf', 'wb') as f:
    # build_pdf takes the document definition and the file object to write to
    build_pdf({"sections": [{"content": pdf_image_list}]}, f)
Check the docs here
Upvotes: 1
Reputation: 99
First run pip install pillow in a terminal.
Images can be in JPG or PNG format. The code below combines two or more images into one PDF file.
Code:
from PIL import Image
image1 = Image.open(r'locationOfImage1\\Image1.png')
image2 = Image.open(r'locationOfImage2\\Image2.png')
image3 = Image.open(r'locationOfImage3\\Image3.png')
im1 = image1.convert('RGB')
im2 = image2.convert('RGB')
im3 = image3.convert('RGB')
imagelist = [im2,im3]
im1.save(r'locationWherePDFWillBeSaved\\CombinedPDF.pdf',save_all=True, append_images=imagelist)
Upvotes: 6
Reputation: 3680
This answer seemed legit but I couldn't get it to work due to the error "a bytes-like object is required, not str". After reading the img2pdf documentation, this is what worked for me:
import img2pdf
import os
dirname = "/path/to/images"
imgs = []
for fname in os.listdir(dirname):
    if not fname.endswith(".jpg") and not fname.endswith(".png"):
        continue
    path = os.path.join(dirname, fname)
    if os.path.isdir(path):
        continue
    imgs.append(path)
with open("name.pdf","wb") as f:
    f.write(img2pdf.convert(imgs))
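The endswith checks above are case-sensitive, so files such as IMG_0001.JPG are skipped, and os.listdir returns names in arbitrary order. A small sketch of a more forgiving collector follows; the function name collect_images and the default extension set are assumptions for illustration, not part of img2pdf.

```python
from pathlib import Path


def collect_images(dirname, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths (as strings) of image files directly inside dirname."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        str(p)
        for p in Path(dirname).iterdir()
        if p.is_file() and p.suffix.lower() in wanted  # case-insensitive match
    )
```

The returned list can be used in place of imgs in the snippet above.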
Upvotes: 1
Reputation: 4708
The best method I have tried so far for converting multiple images to PDF is to use PIL alone. It's quite simple yet powerful:
from PIL import Image # install by > python3 -m pip install --upgrade Pillow # ref. https://pillow.readthedocs.io/en/latest/installation.html#basic-installation
images = [
    Image.open("/Users/apple/Desktop/" + f)
    for f in ["bbd.jpg", "bbd1.jpg", "bbd2.jpg"]
]
pdf_path = "/Users/apple/Desktop/bbd1.pdf"
images[0].save(
    pdf_path, "PDF", resolution=100.0, save_all=True, append_images=images[1:]
)
Just set save_all to True and append_images to the list of images which you want to add.
You might encounter the AttributeError: 'JpegImageFile' object has no attribute 'encoderinfo'. The solution is here: Error while saving multiple JPEGs as a multi-page PDF
Note: Install the newest PIL to make sure the save_all argument is available for PDF.
P.S. In case you get the error "cannot save mode RGBA", apply this fix:
png = Image.open('/path/to/your/file.png')
png.load()
background = Image.new("RGB", png.size, (255, 255, 255))
background.paste(png, mask=png.split()[3]) # 3 is the alpha channel
Upvotes: 169
Reputation: 12380
In my case I needed to convert more than 100 images in different formats (with and without an alpha channel, and with different extensions).
I tried all the recipes from the answers to this question.
PIL => cannot combine images with and without an alpha channel (need to convert the images first)
fpdf => gets stuck on lots of images
print from HTML in Gotenberg => extremely long processing
My last attempt was reportlab, and it works nicely and fast (but it sometimes produces a corrupted PDF on big input). Here is my code:
import uuid

from PyPDF2 import PdfMerger
from reportlab.lib.pagesizes import letter
from reportlab.lib.units import inch
from reportlab.platypus import Image, PageBreak, Paragraph, SimpleDocTemplate

async def save_report_lab_story_to_pdf(file_name, story):
    doc = SimpleDocTemplate(
        file_name,
        pagesize=letter,
        rightMargin=32,
        leftMargin=32,
        topMargin=18,
        bottomMargin=18,
    )
    doc.build(story)

async def reportlab_pdf_builder(data, images):
    story = []
    width = 7.5 * inch
    height = 9 * inch
    chunk_size = 5 * 70
    pdf_chunks = []
    files_to_clean_up = []
    for trip in data['trips']:
        for invoice in trip['invoices']:
            for page in invoice['pages']:
                if trip['trip_label']:
                    story.append(Paragraph(
                        f"TRIP: {trip['trip_label']} {trip['trip_begin']} - {trip['trip_end']}"
                    ))
                else:
                    story.append(Paragraph("No trip"))
                story.append(Paragraph(
                    f"""Document number: {invoice['invoice_number']}
                    Document date: {invoice['document_date']}
                    Amount: {invoice['invoice_trip_value']} {invoice['currency_code']}
                    """
                ))
                story.append(Paragraph(" "))
                img_name = page['filename']
                img_bytes = images[page['path']]
                # write the image bytes to a temporary file that reportlab can read
                tmp_img_filename = f'/tmp/{uuid.uuid4()}.{img_name}'
                with open(tmp_img_filename, "wb") as tmp_img:
                    tmp_img.write(img_bytes)
                im = Image(tmp_img_filename, width, height)
                story.append(im)
                story.append(PageBreak())
                files_to_clean_up.append(tmp_img_filename)
                # 5 objects per page in story
                if len(story) >= chunk_size:
                    file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
                    await save_report_lab_story_to_pdf(file_name, story)
                    story = []
                    pdf_chunks.append(file_name)
    # save whatever is left after the last full chunk
    if story:
        file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
        await save_report_lab_story_to_pdf(file_name, story)
        pdf_chunks.append(file_name)
    # merge the chunk PDFs into the final document
    merger = PdfMerger()
    for pdf in pdf_chunks:
        merger.append(pdf)
    res_file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
    merger.write(res_file_name)
    merger.close()
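The approach above builds the PDF in pieces and merges them afterwards, which depends on splitting the work into fixed-size batches without losing the last, shorter one. A minimal sketch of just that splitting step, independent of reportlab; the helper name chunked is an assumption:

```python
def chunked(items, size):
    """Yield consecutive lists of at most `size` items, including the final short one."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # do not drop the leftover items
        yield batch


# 7 pages in batches of 3: two full batches and one leftover page.
print(list(chunked(range(7), 3)))  # prints [[0, 1, 2], [3, 4, 5], [6]]
```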
Upvotes: 2
Reputation: 319
Adding to @ilovecomputer's answer, if you want to keep pdf in memory rather than disk, then you can do this:
import io
from pdf2image import convert_from_bytes
pil_images = convert_from_bytes(original_pdf_bytes, dpi=100) # (OPTIONAL) do this if you're converting a normal pdf to images first and then back to only image pdf
pdf_output = io.BytesIO()
pil_images[0].save(pdf_output, "PDF", resolution=100.0, save_all=True, append_images=pil_images[1:])
pdf_bytes = pdf_output.getvalue()
Upvotes: 0
Reputation: 1332
If you use Python 3, you can use the Python module img2pdf.
Install it with pip3 install img2pdf and then use it in a script with import img2pdf.
Sample code:
import os
import img2pdf
with open("output.pdf", "wb") as f:
    # os.listdir returns bare file names, so run this from inside the image directory
    f.write(img2pdf.convert([i for i in os.listdir('path/to/imageDir') if i.endswith(".jpg")]))
Or, if you get an error with the previous approach due to a path issue:
# convert all files matching a glob
import glob
with open("name.pdf","wb") as f:
    f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))
Upvotes: 69
Reputation: 94
What worked for me with Python 3.7 and img2pdf 0.4.0 was something similar to the code given by Syed Shamikh Shabbir, but changing the current working directory with os.chdir, as Stu suggested in his comment on Syed's solution:
import os
import img2pdf
path = './path/to/folder'
os.chdir(path)
images = [i for i in os.listdir(os.getcwd()) if i.endswith(".jpg")]
for image in images:
    with open(image[:-4] + ".pdf", "wb") as f:
        f.write(img2pdf.convert(image))
Note that the solution above saves each .jpg to its own PDF. If you want all your .jpg files together in a single .pdf, you could do:
import os
import img2pdf
path = './path/to/folder'
os.chdir(path)
images = [i for i in os.listdir(os.getcwd()) if i.endswith(".jpg")]
with open("output.pdf", "wb") as f:
    f.write(img2pdf.convert(images))
Upvotes: 2
Reputation: 515
I know this is an old question. In my case I use ReportLab.
Sheet dimensions are expressed in points, not pixels, with a point equal to 1/72 inch. An A4 sheet is about 595.3 points wide and 841.9 points high. The origin of the coordinates (0, 0) is the lower-left corner. When creating an instance of canvas.Canvas, you can specify the sheet size with the pagesize parameter, passing a tuple whose first element is the width in points and whose second is the height.
The c.showPage() method tells ReportLab that work on the current sheet is finished and moves on to the next one. Although a second sheet has not yet been worked on (and will not appear in the document as long as nothing has been drawn on it), it is good practice to call it before invoking c.save().
To insert images into a PDF document, ReportLab uses the Pillow library. The drawImage() method takes the path of an image (it supports multiple formats such as PNG, JPEG and GIF) and the position (x, y) at which you want to insert it. The image can be reduced or enlarged by giving its dimensions via the width and height arguments.
The following function takes the PDF file name, a list of PNG files, the coordinates at which to insert the images, and the size they should have to fit on portrait letter pages.
def pntopd(file, figs, x, y, wi, he):
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import A4, letter, landscape, portrait
    w, h = letter
    c = canvas.Canvas(str(file), pagesize=portrait(letter))
    for png in figs:
        c.drawImage(png, x, h - y, width=wi, height=he)
        c.showPage()
    c.save()
from datetime import date
from pathlib import Path
ruta = "C:/SQLite"
today = date.today()
dat_dir = Path(ruta)
tit = today.strftime("%y%m%d") + '_ParameterAudit'
pdf_file = tit + ".pdf"
pdf_path = dat_dir / pdf_file
pnglist = ['C0.png', 'C4387.png', 'C9712.png', 'C9685.png', 'C4364.png']
pntopd(pdf_path, pnglist, 50, 550, 500, 500)
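Because ReportLab's origin is the lower-left corner and drawImage positions an image by its lower-left corner, a position measured from the top of the page has to be flipped. A small sketch of that conversion, assuming a letter page of 612 × 792 points; the helper name top_left_to_reportlab is hypothetical:

```python
LETTER = (612.0, 792.0)  # width, height in points (1 point = 1/72 inch)


def top_left_to_reportlab(x, y_from_top, img_h, page_h=LETTER[1]):
    """Convert a top-left-based position to ReportLab's bottom-left-based one.

    x, y_from_top: where the image's top-left corner should sit, measured
                   from the page's top-left corner, in points.
    img_h:         drawn height of the image in points.
    Returns (x, y) for the image's lower-left corner.
    """
    return x, page_h - y_from_top - img_h


# An image 500 pt tall placed 50 pt from the top and left edges.
print(top_left_to_reportlab(50, 50, 500))  # prints (50, 242.0)
```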
Upvotes: 1
Reputation: 445
If your images are in landscape mode, you can do it like this:
from fpdf import FPDF
import os, sys, glob
from tqdm import tqdm
pdf = FPDF('L', 'mm', 'A4')
im_width = 1920
im_height = 1080
aspect_ratio = im_height/im_width
page_width = 297
# page_height = aspect_ratio * page_width
page_height = 200
left_margin = 0
right_margin = 0
# imagelist is the list with all image filenames
for image in tqdm(sorted(glob.glob('test_images/*.png'))):
    pdf.add_page()
    pdf.image(image, left_margin, right_margin, page_width, page_height)
pdf.output("mypdf.pdf", "F")
print('Conversion completed!')
Here page_width and page_height are the size of the image on the A4 page. In landscape, an A4 page is 297 mm wide and 210 mm high, but I have adjusted the height to suit my image. Alternatively, you can keep the aspect ratio, as in the commented-out line above, so that the width and height of the image are scaled proportionally.
Upvotes: 1
Reputation: 1151
Here is ilovecomputer's answer packed into a function and directly usable. It also lets you reduce image sizes, and it works well.
The code assumes a folder inside input_dir that contains images ordered alphabetically by name. It outputs a PDF named after the folder, optionally with a prefix string.
import os
from PIL import Image
def convert_images_to_pdf(export_dir, input_dir, folder, prefix='', quality=20):
    current_dir = os.path.join(input_dir, folder)
    image_files = sorted(os.listdir(current_dir))  # alphabetical page order
    im_list = [Image.open(os.path.join(current_dir, image_file)) for image_file in image_files]
    pdf_filename = os.path.join(export_dir, prefix + folder + '.pdf')
    im_list[0].save(pdf_filename, "PDF", quality=quality, optimize=True, save_all=True, append_images=im_list[1:])
export_dir = r"D:\pdfs"
input_dir = r"D:\image_folders"
folders = os.listdir(input_dir)
for folder in folders:
    convert_images_to_pdf(export_dir, input_dir, folder, prefix='')
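Plain alphabetical ordering puts page10.jpg before page2.jpg, which scrambles the page order when file names contain unpadded numbers. A sketch of a natural sort key that could be applied to the file list before opening the images; natural_key is a hypothetical helper, not part of Pillow:

```python
import re


def natural_key(name):
    """Split a name into text and integer parts so numbers compare numerically."""
    # re.split with a capturing group puts the digit runs at the odd indices
    parts = re.split(r"(\d+)", name)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]


files = ["page10.jpg", "page2.jpg", "page1.jpg"]
print(sorted(files, key=natural_key))  # prints ['page1.jpg', 'page2.jpg', 'page10.jpg']
```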
Upvotes: 2
Reputation: 46463
Ready-to-use solution that converts all PNG in the current folder to a PDF, inspired by @ilovecomputer's answer:
import glob, PIL.Image
L = [PIL.Image.open(f) for f in glob.glob('*.png')]
L[0].save('out.pdf', "PDF" ,resolution=100.0, save_all=True, append_images=L[1:])
Nothing else than PIL is needed :)
Upvotes: 2
Reputation: 618
The best answer already exists! I am just improving it a little. Here's the code:
from fpdf import FPDF
pdf = FPDF()
# imagelist is the list with all image filenames; you can build it with the os module
# by iterating over all the files in a folder, or by specifying their names
for image in imagelist:
    pdf.add_page()
    pdf.image(image,x=0,y=0,w=210,h=297) # for A4 size because some people said that every other page is blank
pdf.output("yourfile.pdf", "F")
You'll need to install FPDF for this purpose.
pip install FPDF
Upvotes: 0
Reputation: 3879
If your images are plots you created with matplotlib, you can use matplotlib.backends.backend_pdf.PdfPages (see the documentation).
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
# generate a list with dummy plots
figs = []
for i in [-1, 1]:
    fig = plt.figure()
    plt.plot([1, 2, 3], [i*1, i*2, i*3])
    figs.append(fig)
# generate a multipage pdf:
with PdfPages('multipage_pdf.pdf') as pdf:
    for fig in figs:
        pdf.savefig(fig)
        plt.close(fig)
Upvotes: 8
Reputation: 777
It's not a truly new answer, but when I used img2pdf the page size didn't come out right. Here's what I did to use the image size instead; I hope it helps someone.
Assumptions: 1) all images are the same size, 2) one image per page, 3) the image fills the whole page.
from PIL import Image
import img2pdf
with open( 'output.pdf', 'wb' ) as f:
    img = Image.open( '1.jpg' )
    my_layout_fun = img2pdf.get_layout_fun(
        pagesize = ( img2pdf.px_to_pt( img.width, 96 ), img2pdf.px_to_pt( img.height, 96 ) ), # this is where image size is used; 96 is dpi value
        fit = img2pdf.FitMode.into # I didn't have to specify this, but just in case...
    )
    f.write( img2pdf.convert( [ '1.jpg', '2.jpg', '3.jpg' ], layout_fun = my_layout_fun ))
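The page size above comes from converting pixels to PDF points at an assumed resolution. The arithmetic behind that conversion is points = pixels × 72 / dpi, since a point is 1/72 inch. A sketch, with pixels_to_points as a stand-in name for illustration (the answer itself uses img2pdf's px_to_pt):

```python
def pixels_to_points(length_px, dpi):
    """Convert a pixel length to PDF points (1 point = 1/72 inch)."""
    return 72.0 * length_px / dpi


# A 1920x1080 image at 96 dpi becomes a 1440 x 810 pt page.
page_size = (pixels_to_points(1920, 96), pixels_to_points(1080, 96))
print(page_size)  # prints (1440.0, 810.0)
```

A different dpi value changes the physical page size but not the pixel content of the image.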
Upvotes: 2
Reputation: 69
How about this??
from fpdf import FPDF
from PIL import Image
import glob
import os
# set here
image_directory = '/path/to/imageDir'
extensions = ('*.jpg','*.png','*.gif') # add your image extensions
# set 0 if you want to fit pdf to image
# unit : pt
margin = 10
imagelist=[]
for ext in extensions:
    imagelist.extend(glob.glob(os.path.join(image_directory,ext)))
# write one PDF per image, next to the source file
for imagePath in imagelist:
    cover = Image.open(imagePath)
    width, height = cover.size
    pdf = FPDF(unit="pt", format=[width + 2*margin, height + 2*margin])
    pdf.add_page()
    pdf.image(imagePath, margin, margin)
    destination = os.path.splitext(imagePath)[0]
    pdf.output(destination + ".pdf", "F")
Upvotes: 3
Reputation: 139
I know the question has been answered, but one more way to solve this is to use the Pillow library. To convert a whole directory of images:
from PIL import Image
import os
def makePdf(imageDir, SaveToDir):
    '''
    imageDir: Directory of your images
    SaveToDir: Location Directory for your pdfs
    '''
    os.chdir(imageDir)
    try:
        for j in os.listdir(os.getcwd()):
            os.chdir(imageDir)
            fname, fext = os.path.splitext(j)
            newfilename = fname + ".pdf"
            im = Image.open(fname + fext)
            if im.mode == "RGBA":
                im = im.convert("RGB")
            os.chdir(SaveToDir)
            if not os.path.exists(newfilename):
                im.save(newfilename, "PDF", resolution=100.0)
    except Exception as e:
        print(e)
imageDir = r'____' # your image directory path
SaveToDir = r'____' # directory in which you want to save the pdfs
makePdf(imageDir, SaveToDir)
To use it on a single image:
from PIL import Image
import os
filename = r"/Desktop/document/dog.png"
im = Image.open(filename)
if im.mode == "RGBA":
    im = im.convert("RGB")
new_filename = r"/Desktop/document/dog.pdf"
if not os.path.exists(new_filename):
    im.save(new_filename,"PDF",resolution=100.0)
Upvotes: 2
Reputation: 219
Convert image files to a PDF file:
from os import listdir
from fpdf import FPDF
path = "/home/bunny/images/" # get the path of images
imagelist = listdir(path) # get list of all images
pdf = FPDF('P','mm','A4') # create an A4-size pdf document
x,y,w,h = 0,0,200,250
for image in imagelist:
    pdf.add_page()
    pdf.image(path+image,x,y,w,h)
pdf.output("images.pdf","F")
Upvotes: 4
Reputation: 149
I had the same problem, so I created a Python function to unite multiple pictures in one PDF. The code, available from my GitHub page, uses reportlab and is based on answers found elsewhere.
Here is an example of how to unite images into a PDF:
We have a folder "D:\pictures" with PNG and JPG pictures, and we want to create the file pdf_with_pictures.pdf out of them and save it in the same folder.
outputPdfName = "pdf_with_pictures"
pathToSavePdfTo = "D:\\pictures"
pathToPictures = "D:\\pictures"
splitType = "none"
numberOfEntitiesInOnePdf = 1
listWithImagesExtensions = ["png", "jpg"]
picturesAreInRootFolder = True
nameOfPart = "volume"
unite_pictures_into_pdf(outputPdfName, pathToSavePdfTo, pathToPictures, splitType, numberOfEntitiesInOnePdf, listWithImagesExtensions, picturesAreInRootFolder, nameOfPart)
Upvotes: 1
Reputation: 23443
I took the code and made some slight changes to make it usable as it is.
from fpdf import FPDF
from PIL import Image
import os # I added this and the code at the end
def makePdf(pdfFileName, listPages, dir=''):
    if (dir):
        dir += "/"
    cover = Image.open(dir + str(listPages[0]))
    width, height = cover.size
    pdf = FPDF(unit="pt", format=[width, height])
    for page in listPages:
        pdf.add_page()
        pdf.image(dir + str(page), 0, 0)
    pdf.output(dir + pdfFileName + ".pdf", "F")
# this is what I added
x = [f for f in os.listdir() if f.endswith(".jpg")]
y = len(x)
makePdf("file", x)
Upvotes: 3
Reputation: 5275
pgmagick is a GraphicsMagick (Magick++) binding for Python.
It is a Python wrapper for ImageMagick (or GraphicsMagick).
import os
from os import listdir
from os.path import isfile, join
from pgmagick import Image
mypath = r"\Images" # path to your Image directory
for each_file in listdir(mypath):
    if isfile(join(mypath,each_file)):
        image_path = os.path.join(mypath,each_file)
        pdf_path = os.path.join(mypath,each_file.rsplit('.', 1)[0]+'.pdf')
        img = Image(image_path)
        img.write(pdf_path)
pgmagick installation instructions for Windows:
1) Download the precompiled binary package from the Unofficial Windows Binaries for Python Extension Packages (as mentioned on the pgmagick web page) and install it.
Note: Download the version that matches the Python version installed on your machine and whether it is a 32-bit or 64-bit installation.
You can check whether you have 32-bit or 64-bit Python by typing python at your terminal and pressing Enter:
D:\>python
ActivePython 2.7.2.5 (ActiveState Software Inc.) based on
Python 2.7.2 (default, Jun 24 2011, 12:21:10) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
So it is Python version 2.7, 32 bit (Intel) on win32, so you have to download and install pgmagick‑0.5.8.win32‑py2.7.exe.
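The interpreter banner is one way to tell; the same information can also be read programmatically from the standard library. A sketch (the function name python_bitness is an assumption):

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


print(f"Python {sys.version_info.major}.{sys.version_info.minor}, {python_bitness()}-bit")
```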
These are the following available Python Extension Packages for pgmagick:
2) Then you can follow the installation instructions from here.
pip install pgmagick
And then try to import it.
>>> from pgmagick import gminfo
>>> gminfo.version
'1.3.x'
>>> gminfo.library
'GraphicsMagick'
>>>
Upvotes: 5