macabeus

Reputation: 4572

Create PDF from a list of images

Is there any practical way to create a PDF from a list of images files, using Python?

In Perl I know the PDF::FromImage module. With it I can create a PDF in just 3 lines:

use PDF::FromImage;
...
my $pdf = PDF::FromImage->new;
$pdf->load_images(@allPagesDir);
$pdf->write_file($bookName . '.pdf');

I need to do something very similar to this, but in Python. I know the pyPdf module, but I would like something simple.

Upvotes: 151

Views: 266963

Answers (23)

man44

Reputation: 1

This script allows users to convert one or multiple images into a PDF file. It utilizes the FPDF library for PDF generation and Pillow (PIL) for image handling.


#!/usr/bin/env python3

from fpdf import FPDF
import os
from PIL import Image

def image_to_pdf_single(image_path, pdf_path):
    pdf = FPDF()
    
    with Image.open(image_path) as img:
        img_width_px, img_height_px = img.size
        img_dpi = img.info.get('dpi', (300, 300))[0]

    img_width_mm = img_width_px / img_dpi * 25.4
    img_height_mm = img_height_px / img_dpi * 25.4
    
    print(img_width_mm)
    
    page_width = 210
    page_height = 297
    
    scale_width = page_width / img_width_mm
    scale_height = page_height / img_height_mm
    scale = min(scale_width, scale_height)
    
    new_width_mm = img_width_mm * scale
    new_height_mm = img_height_mm * scale
    
    x = (page_width - new_width_mm) / 2
    y = (page_height - new_height_mm) / 2
    
    pdf.add_page()
    pdf.image(image_path, x=x, y=y, w=new_width_mm, h=new_height_mm)
    
    pdf.output(pdf_path)

def image_to_pdf_multiple(mult_img, pdf_path):
    pdf = FPDF()

    for image_path in mult_img:
        with Image.open(image_path) as img:
            img_width_px, img_height_px = img.size
            img_dpi = img.info.get('dpi', (300, 300))[0]
    
        img_width_mm = img_width_px / img_dpi * 25.4
        img_height_mm = img_height_px / img_dpi * 25.4
    
        print(img_width_mm)
    
        page_width = 210
        page_height = 297
    
        scale_width = page_width / img_width_mm
        scale_height = page_height / img_height_mm
        scale = min(scale_width, scale_height)
    
        new_width_mm = img_width_mm * scale
        new_height_mm = img_height_mm * scale
    
        x = (page_width - new_width_mm) / 2
        y = (page_height - new_height_mm) / 2
    
        pdf.add_page()
        pdf.image(image_path, x=x, y=y, w=new_width_mm, h=new_height_mm)
    
    pdf.output(pdf_path)    

if __name__ == "__main__":
    option = input("1 for single image, 2 for multiple images: ")

    if option == '1':
        image_path = input("Enter image path/name pls -> ")
        pdf_path = input("Enter pdf save loc/name pls -> ")

        if os.path.exists(image_path):
            image_to_pdf_single(image_path, pdf_path)
            print(f"Converted {image_path} to {pdf_path}.")
        else:
            print("Invalid path/file name.")
    
    elif option == '2':
        mult_img = []
        i = 1

        while True:
            image_name = input(f"Enter {i} image name/path pls (or type done) -> ")
            if image_name.lower() == 'done':
                break
            else:
                if os.path.exists(image_name):
                    mult_img.append(image_name)
                    i += 1
                else:
                    print("Invalid image name/path.")

        pdf_path = input("Enter pdf save loc/name pls -> ")

        image_to_pdf_multiple(mult_img, pdf_path)
        print(f"Successfully saved pdf to {pdf_path}.")
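Both functions above repeat the same scale-and-centre arithmetic. As a rough sketch (the helper names `fit_into_page` and `px_to_mm` are made up for this illustration and are not part of FPDF or Pillow), that step could be pulled out into one small function that returns the x, y, w, h values passed to pdf.image():

```python
def px_to_mm(pixels, dpi=300):
    """Convert a pixel length to millimetres at the given resolution."""
    return pixels / dpi * 25.4


def fit_into_page(img_w_mm, img_h_mm, page_w_mm=210.0, page_h_mm=297.0):
    """Scale an image to fit inside the page, keeping its aspect ratio, and centre it.

    Returns (x, y, w, h) in millimetres. The defaults are portrait A4.
    """
    scale = min(page_w_mm / img_w_mm, page_h_mm / img_h_mm)
    w = img_w_mm * scale
    h = img_h_mm * scale
    x = (page_w_mm - w) / 2
    y = (page_h_mm - h) / 2
    return x, y, w, h


# Example: a 3000 x 2000 px landscape photo at 300 dpi on a portrait A4 page.
x, y, w, h = fit_into_page(px_to_mm(3000), px_to_mm(2000))
print(round(x, 2), round(y, 2), round(w, 2), round(h, 2))
```

Like the original code, this also enlarges images that are smaller than the page; cap the scale at 1 if you do not want that.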


Upvotes: 0

Ilya Vinnichenko

Reputation: 1285

Install fpdf2 for Python:

pip install fpdf2

Now you can use the same logic:

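from fpdf import FPDF
pdf = FPDF()
# imagelist is the list with all image filenames
# x, y, w, h are the position and size of the image on the page (millimetres by default)
for image in imagelist:
    pdf.add_page()
    pdf.image(image, x, y, w, h)
pdf.output("yourfile.pdf")  # the old second argument "F" is deprecated in fpdf2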

You can find more info at the tutorial page or the official documentation.

Upvotes: 118

Felipe Sierra

Reputation: 187

You can use pdfme. It's a powerful Python library for creating PDF documents.

from pdfme import build_pdf

...

pdf_image_list = [{"image": img} for img in images]

with open('images.pdf', 'wb') as f:
    # build_pdf writes the document into the open file object
    build_pdf({"sections": [{"content": pdf_image_list}]}, f)

Check the docs here

Upvotes: 1

swati bohidar

Reputation: 99

First, run pip install pillow in a terminal. Images can be in JPG or PNG format. This is useful if you have two or more images and want to combine them into one PDF file.

Code:

from PIL import Image

image1 = Image.open(r'locationOfImage1\Image1.png')
image2 = Image.open(r'locationOfImage2\Image2.png')
image3 = Image.open(r'locationOfImage3\Image3.png')

# PDF output needs RGB, so drop any alpha channel
im1 = image1.convert('RGB')
im2 = image2.convert('RGB')
im3 = image3.convert('RGB')

# every image after the first one
imagelist = [im2, im3]

im1.save(r'locationWherePDFWillBeSaved\CombinedPDF.pdf', save_all=True, append_images=imagelist)

Upvotes: 6

Hagbard

Reputation: 3680

This answer seemed legit but I couldn't get it to work due to the error "a bytes-like object is required, not str". After reading the img2pdf documentation, this is what worked for me:

import img2pdf
import os

dirname = "/path/to/images"
imgs = []
for fname in os.listdir(dirname):
    if not fname.endswith(".jpg") and not fname.endswith(".png"):
        continue
    path = os.path.join(dirname, fname)
    if os.path.isdir(path):
        continue
    imgs.append(path)
with open("name.pdf","wb") as f:
    f.write(img2pdf.convert(imgs))

Upvotes: 1

ilovecomputer

Reputation: 4708

The best method to convert multiple images to PDF I have tried so far is to use PIL purely. It's quite simple yet powerful:

from PIL import Image  # install by > python3 -m pip install --upgrade Pillow  # ref. https://pillow.readthedocs.io/en/latest/installation.html#basic-installation

images = [
    Image.open("/Users/apple/Desktop/" + f)
    for f in ["bbd.jpg", "bbd1.jpg", "bbd2.jpg"]
]

pdf_path = "/Users/apple/Desktop/bbd1.pdf"
    
images[0].save(
    pdf_path, "PDF" ,resolution=100.0, save_all=True, append_images=images[1:]
)

Just set save_all to True and append_images to the list of images which you want to add.

You might encounter the AttributeError: 'JpegImageFile' object has no attribute 'encoderinfo'. The solution is here: Error while saving multiple JPEGs as a multi-page PDF

Note: Install the latest version of Pillow to make sure the save_all argument is available for PDF.

p.s.

In case you get this error

cannot save mode RGBA

apply this fix

png = Image.open('/path/to/your/file.png')
png.load()
background = Image.new("RGB", png.size, (255, 255, 255))
background.paste(png, mask=png.split()[3]) # 3 is the alpha channel
# then save (or append) background instead of png
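To tie the RGBA fix to the multi-page save, here is a minimal sketch that flattens every image before writing the PDF. The helper names `flatten_to_rgb` and `save_as_pdf` are hypothetical and only for illustration, the white background is an assumption, and the demo uses generated images written to an in-memory buffer so that it does not depend on any files:

```python
import io

from PIL import Image


def flatten_to_rgb(img, background=(255, 255, 255)):
    """Return an RGB copy of img, compositing any alpha channel onto a solid colour."""
    has_alpha = img.mode in ("RGBA", "LA") or (
        img.mode == "P" and "transparency" in img.info
    )
    if has_alpha:
        rgba = img.convert("RGBA")
        canvas = Image.new("RGB", rgba.size, background)
        canvas.paste(rgba, mask=rgba.split()[3])  # band 3 is the alpha channel
        return canvas
    return img.convert("RGB")


def save_as_pdf(images, target, resolution=100.0):
    """Write PIL images to target (a path or binary file object) as one multi-page PDF."""
    pages = [flatten_to_rgb(img) for img in images]
    if not pages:
        raise ValueError("need at least one image")
    pages[0].save(target, "PDF", resolution=resolution,
                  save_all=True, append_images=pages[1:])


# Demo with two generated images, one with and one without an alpha channel.
demo_images = [
    Image.new("RGBA", (40, 30), (255, 0, 0, 128)),
    Image.new("RGB", (40, 30), (0, 0, 255)),
]
buffer = io.BytesIO()
save_as_pdf(demo_images, buffer)
pdf_bytes = buffer.getvalue()
```

With real files you would pass `[Image.open(p) for p in paths]` and a file name such as "out.pdf" instead of the buffer.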

Upvotes: 169

Ryabchenko Alexander

Reputation: 12380

In my case I needed to convert more than 100 images in different formats (with and without an alpha channel, and with different extensions).

I tried all the recipes from the answers to this question.

PIL => cannot combine images with and without an alpha channel (the images need to be converted first)

fpdf => gets stuck on lots of images

print from html in gotenberg => extremely long processing

My last attempt was reportlab, and it works nicely and fast (but it sometimes produces a corrupted PDF on big input). Here is my code:

import uuid

from PyPDF2 import PdfMerger
from reportlab.lib.pagesizes import letter
from reportlab.lib.units import inch
from reportlab.platypus import Image, PageBreak, Paragraph, SimpleDocTemplate

async def save_report_lab_story_to_pdf(file_name, story):
    doc = SimpleDocTemplate(
        file_name,
        pagesize=letter,
        rightMargin=32,
        leftMargin=32,
        topMargin=18,
        bottomMargin=18,
    )
    doc.build(story)


async def reportlab_pdf_builder(data, images):
    story = []
    width = 7.5 * inch
    height = 9 * inch

    chunk_size = 5 * 70
    pdf_chunks = []

    files_to_clean_up = []
    for trip in data['trips']:
        for invoice in trip['invoices']:
            for page in invoice['pages']:
                if trip['trip_label']:
                    story.append(Paragraph(
                        f"TRIP: {trip['trip_label']} {trip['trip_begin']} - {trip['trip_end']}"
                    ))
                else:
                    story.append(Paragraph("No trip"))

                story.append(Paragraph(
                    f"""Document number: {invoice['invoice_number']}
                        Document date: {invoice['document_date']}
                        Amount: {invoice['invoice_trip_value']} {invoice['currency_code']}
                    """
                ))
                story.append(Paragraph(" "))
                img_name = page['filename']
                img_bytes = images[page['path']]
                tmp_img_filename = f'/tmp/{uuid.uuid4()}.{img_name}'
                with open(tmp_img_filename, "wb") as tmp_img:
                    tmp_img.write(img_bytes)
                im = Image(tmp_img_filename, width, height)
                story.append(im)
                story.append(PageBreak())
                files_to_clean_up.append(tmp_img_filename)
                # 5 objects per page in story

                if len(story) >= chunk_size:
                    file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
                    await save_report_lab_story_to_pdf(file_name, story)
                    story = []
                    pdf_chunks.append(file_name)

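    # Write whatever is left in the story after the last full chunk,
    # otherwise those pages never reach the merged PDF.
    if story:
        file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
        await save_report_lab_story_to_pdf(file_name, story)
        pdf_chunks.append(file_name)

    merger = PdfMerger()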
    for pdf in pdf_chunks:
        merger.append(pdf)

    res_file_name = f"/tmp/{uuid.uuid4()}_{data['tail_number']}.pdf"
    merger.write(res_file_name)
    merger.close()
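The core idea in the code above is to build the PDF in fixed-size chunks and merge them afterwards. Stripped of reportlab, the batching step can be sketched like this; `chunked` is a hypothetical helper written for this illustration, and the point worth noticing is that the final, partial batch has to be processed too:

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items, including the final partial one."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Seven pages split into batches of three: the last batch holds a single page.
pages = [f"page-{n}" for n in range(1, 8)]
batches = list(chunked(pages, 3))
print([len(batch) for batch in batches])
```

Each batch would then be written to its own temporary PDF and appended to the merger.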

Upvotes: 2

Prince Mathur

Reputation: 319

Adding to @ilovecomputer's answer, if you want to keep pdf in memory rather than disk, then you can do this:

import io
from pdf2image import convert_from_bytes
 
pil_images = convert_from_bytes(original_pdf_bytes, dpi=100) # (OPTIONAL) do this if you're converting a normal pdf to images first and then back to only image pdf
pdf_output = io.BytesIO()
pil_images[0].save(pdf_output, "PDF", resolution=100.0, save_all=True, append_images=pil_images[1:])
pdf_bytes = pdf_output.getvalue()

Upvotes: 0

Syed Shamikh Shabbir

Reputation: 1332

If you use Python 3, you can use the img2pdf module.

Install it with pip3 install img2pdf, then use it in a script with import img2pdf.

Sample code:

import os
import img2pdf

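image_dir = 'path/to/imageDir'
# os.listdir returns bare file names, so join them with the directory
images = [os.path.join(image_dir, i) for i in os.listdir(image_dir) if i.endswith(".jpg")]
with open("output.pdf", "wb") as f:
    f.write(img2pdf.convert(images))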

Or, if you get an error with the previous approach because of a path issue:

# convert all files matching a glob
import glob
with open("name.pdf","wb") as f:
    f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))
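One thing to watch with both snippets: os.listdir and glob.glob do not guarantee any particular order, and a plain sorted() puts "page10.jpg" before "page2.jpg". If the page order matters, a natural sort key is one option. This is only a sketch; `natural_key` is a made-up helper, not part of img2pdf:

```python
import re


def natural_key(name):
    """Sort key that compares runs of digits numerically, so 'page2' < 'page10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


names = ["page10.jpg", "page2.jpg", "page1.jpg"]
ordered = sorted(names, key=natural_key)
print(ordered)
```

The sorted list can then be passed to img2pdf.convert in place of the raw glob result.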

Upvotes: 69

BrunoSE

Reputation: 94

What worked for me with Python 3.7 and img2pdf 0.4.0 was code similar to Syed Shamikh Shabbir's, but changing the current working directory with os.chdir, as Stu suggested in his comment on Syed's solution:

import os
import img2pdf

path = './path/to/folder'
os.chdir(path)
images = [i for i in os.listdir(os.getcwd()) if i.endswith(".jpg")]

for image in images:
    with open(image[:-4] + ".pdf", "wb") as f:
        f.write(img2pdf.convert(image))

It is worth mentioning that the solution above saves each .jpg as its own PDF. If you want all your .jpg files together in a single .pdf, you could do:

import os
import img2pdf

path = './path/to/folder'
os.chdir(path)
images = [i for i in os.listdir(os.getcwd()) if i.endswith(".jpg")]

with open("output.pdf", "wb") as f:
    f.write(img2pdf.convert(images))
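If you would rather not change the working directory, you can collect full paths instead. A minimal sketch with pathlib follows; `list_images` is a hypothetical helper and the extension list is an assumption. The demo runs on a throwaway directory with empty placeholder files, so it only shows which names are picked up:

```python
import tempfile
from pathlib import Path


def list_images(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return the sorted full paths (as strings) of image files directly inside folder."""
    folder = Path(folder)
    return sorted(
        str(p) for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


# Demo: empty placeholder files, just to show the filtering and ordering.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.jpg", "a.PNG", "notes.txt"):
        Path(tmp, name).touch()
    found = [Path(p).name for p in list_images(tmp)]

print(found)
```

The returned list of paths can be handed to img2pdf.convert directly, with no os.chdir call.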

Upvotes: 2

GERMAN RODRIGUEZ

Reputation: 515

I know this is an old question. In my case I use Reportlab.

Sheet dimensions are expressed in points, not pixels, with a point equal to 1/72 inch. An A4 sheet is 595.2 points wide and 841.8 points high. The origin of the coordinates (0, 0) is the lower left corner. When creating an instance of canvas.Canvas, you can specify the sheet size with the pagesize parameter, passing a tuple whose first element is the width in points and whose second is the height.

The c.showPage() method tells ReportLab that work on the current sheet is finished and moves on to the next one. Although a second sheet has not yet been worked on (and will not appear in the document as long as nothing has been drawn on it), it is good practice to call it before invoking c.save().

To insert images into a PDF document, ReportLab uses the Pillow library. The drawImage() method takes as its arguments the path of an image (multiple formats such as PNG, JPEG and GIF are supported) and the position (x, y) at which you want to insert it. The image can be reduced or enlarged by specifying its dimensions via the width and height arguments.
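The unit and origin conventions are easy to get wrong, so here is a small sketch in plain Python, with no ReportLab required. The helper names are made up for this illustration. It converts millimetres to points and turns a distance measured from the top of the page into the bottom-left-origin y value that drawImage expects:

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def mm_to_pt(mm):
    """Convert millimetres to PDF points (1 pt = 1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


def top_left_to_bottom_left(y_from_top, item_height, page_height):
    """Return the y of an item's lower left corner, given the distance of its top edge from the page top."""
    return page_height - y_from_top - item_height


a4_width_pt = mm_to_pt(210)
a4_height_pt = mm_to_pt(297)
print(round(a4_width_pt, 1), round(a4_height_pt, 1))

# A 500 pt tall image whose top edge sits 50 pt below the top of a letter page (792 pt high).
y = top_left_to_bottom_left(50, 500, 792)
print(y)
```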

The following function takes the PDF file name, a list of PNG files, the coordinates at which to insert the images, and the size needed to fit them on portrait letter pages.

def pntopd(file, figs, x, y, wi, he):
    from reportlab.pdfgen import canvas
    from reportlab.lib.pagesizes import A4, letter, landscape, portrait
    w, h = letter
    c = canvas.Canvas(str(file), pagesize=portrait(letter))
    for png in figs:
        c.drawImage(png, x, h - y, width=wi, height=he)
        c.showPage()
    c.save()
    
    
    
from datetime import date
from pathlib import Path
ruta = "C:/SQLite"
today = date.today()
dat_dir = Path(ruta)
tit = today.strftime("%y%m%d") + '_ParameterAudit'
pdf_file = tit + ".pdf"
pdf_path = dat_dir / pdf_file
pnglist = ['C0.png', 'C4387.png', 'C9712.png', 'C9685.png', 'C4364.png']
pntopd(pdf_path, pnglist, 50, 550, 500, 500)

Upvotes: 1

VARAT BOHARA

Reputation: 445

If your images are in landscape mode, you can do it like this:

from fpdf import FPDF
import os, sys, glob
from tqdm import tqdm

pdf = FPDF('L', 'mm', 'A4')
im_width = 1920
im_height = 1080

aspect_ratio = im_height/im_width
page_width = 297
# page_height = aspect_ratio * page_width
page_height = 200
left_margin = 0
right_margin = 0

# glob collects all image filenames; sorted() keeps the pages in order
for image in tqdm(sorted(glob.glob('test_images/*.png'))):
    pdf.add_page()
    # arguments are x, y, w, h; both margins are 0 here
    pdf.image(image, left_margin, right_margin, page_width, page_height)
pdf.output("mypdf.pdf", "F")
print('Conversion completed!')

Here page_width and page_height are the dimensions of an A4 page, which in landscape is 297 mm wide and 210 mm high; I adjusted the height to suit my image. Alternatively, you can keep the aspect ratio, as in the commented-out line above, so that both the width and the height of the image scale properly.
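For reference, the aspect-ratio variant works out as follows for a 1920 x 1080 image stretched across the full 297 mm width. This is just the arithmetic; `height_for_width` is a hypothetical helper name:

```python
def height_for_width(im_width_px, im_height_px, target_width_mm):
    """Height in mm that keeps the image's aspect ratio at the given width."""
    return target_width_mm * im_height_px / im_width_px


page_width = 297
page_height = height_for_width(1920, 1080, page_width)
print(page_height)

# The result must not exceed the 210 mm height of a landscape A4 page.
fits_on_page = page_height <= 210
```

A 16:9 image therefore leaves some blank space at the bottom of the page instead of being stretched.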

Upvotes: 1

faysou

Reputation: 1151

Here is ilovecomputer's answer packed into a function and directly usable. It also lets you reduce the image size, and it works well.

The code assumes a folder inside input_dir that contains images ordered alphabetically by their name and outputs a pdf with the name of the folder and potentially a prefix string for the name.

import os
from PIL import Image

def convert_images_to_pdf(export_dir, input_dir, folder, prefix='', quality=20):
    current_dir = os.path.join(input_dir, folder)
    image_files = sorted(os.listdir(current_dir))  # os.listdir has no guaranteed order
    im_list = [Image.open(os.path.join(current_dir, image_file)) for image_file in image_files]

    pdf_filename = os.path.join(export_dir, prefix + folder + '.pdf')
    im_list[0].save(pdf_filename, "PDF", quality=quality, optimize=True, save_all=True, append_images=im_list[1:])

export_dir = r"D:\pdfs"
input_dir = r"D:\image_folders"
folders = os.listdir(input_dir)
for folder in folders:
    convert_images_to_pdf(export_dir, input_dir, folder, prefix='')

Upvotes: 2

Basj

Reputation: 46463

Ready-to-use solution that converts all PNG in the current folder to a PDF, inspired by @ilovecomputer's answer:

import glob, PIL.Image
L = [PIL.Image.open(f) for f in glob.glob('*.png')]
L[0].save('out.pdf', "PDF" ,resolution=100.0, save_all=True, append_images=L[1:])

Nothing other than PIL is needed :)

Upvotes: 2

shekhar chander

Reputation: 618

The best answer already exists! I am just improving it a little bit. Here's the code:

from fpdf import FPDF
pdf = FPDF()
# imagelist is the list with all image filenames you can create using os module by iterating all the files in a folder or by specifying their name
for image in imagelist:
    pdf.add_page()
    pdf.image(image,x=0,y=0,w=210,h=297) # for A4 size because some people said that every other page is blank
pdf.output("yourfile.pdf", "F")

You'll need to install FPDF for this purpose.

pip install FPDF

Upvotes: 0

Qaswed

Reputation: 3879

If your images are plots you created with matplotlib, you can use matplotlib.backends.backend_pdf.PdfPages (see the documentation).

import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# generate a list with dummy plots   
figs = []
for i in [-1, 1]:
    fig = plt.figure()
    plt.plot([1, 2, 3], [i*1, i*2, i*3])
    figs.append(fig)

# generate a multipage pdf:
with PdfPages('multipage_pdf.pdf') as pdf:
    for fig in figs:
        pdf.savefig(fig)
        plt.close(fig)  # close the figure that was just saved

Upvotes: 8

svinec

Reputation: 777

It's not a truly new answer, but - when using img2pdf the page size didn't come out right. So here's what I did to use the image size, I hope it finds someone well:

assuming 1) all images are the same size, 2) placing one image per page, 3) image fills the whole page

from PIL import Image
import img2pdf

with open( 'output.pdf', 'wb' ) as f:
    img = Image.open( '1.jpg' )
    my_layout_fun = img2pdf.get_layout_fun(
        pagesize = ( img2pdf.px_to_pt( img.width, 96 ), img2pdf.px_to_pt( img.height, 96 ) ), # this is where image size is used; 96 is dpi value
        fit = img2pdf.FitMode.into # I didn't have to specify this, but just in case...
    )
    f.write( img2pdf.convert( [ '1.jpg', '2.jpg', '3.jpg' ], layout_fun = my_layout_fun ))
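To see what page size this produces, here is the pixel-to-point arithmetic on its own. The local `px_to_pt` below is a stand-in written for this sketch and assumes the usual definition of 72 points per inch; it is not img2pdf's own function, and the 1920 x 1080 image size is just an example:

```python
def px_to_pt(pixels, dpi):
    """Convert a pixel length to PDF points, assuming 72 points per inch."""
    return pixels * 72.0 / dpi


def page_size_pt(width_px, height_px, dpi=96):
    """Page size in points that matches the image exactly at the given dpi."""
    return px_to_pt(width_px, dpi), px_to_pt(height_px, dpi)


size = page_size_pt(1920, 1080)
print(size)
```

At 96 dpi every pixel is 0.75 pt, so a larger dpi value gives a physically smaller page for the same image.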

Upvotes: 2

Daisuke Harada

Reputation: 69

How about this??

from fpdf import FPDF
from PIL import Image
import glob
import os


# set here
image_directory = '/path/to/imageDir'
extensions = ('*.jpg','*.png','*.gif') # add your image extensions
# set 0 if you want to fit pdf to image
# unit : pt
margin = 10

imagelist=[]
for ext in extensions:
    imagelist.extend(glob.glob(os.path.join(image_directory,ext)))

# write one PDF per image, with the page sized to the image plus the margin
for imagePath in imagelist:
    cover = Image.open(imagePath)
    width, height = cover.size

    pdf = FPDF(unit="pt", format=[width + 2*margin, height + 2*margin])
    pdf.add_page()

    pdf.image(imagePath, margin, margin)

    destination = os.path.splitext(imagePath)[0]
    pdf.output(destination + ".pdf", "F")

Upvotes: 3

Vaibhav Singh

Reputation: 139

I know the question has been answered but one more way to solve this is using the pillow library. To convert a whole directory of images:

from PIL import Image
import os


def makePdf(imageDir, SaveToDir):
    '''
    imageDir: Directory of your images
    SaveToDir: Location Directory for your pdfs
    '''
    os.chdir(imageDir)
    try:
        for j in os.listdir(os.getcwd()):
            os.chdir(imageDir)
            fname, fext = os.path.splitext(j)
            newfilename = fname + ".pdf"
            im = Image.open(fname + fext)
            if im.mode == "RGBA":
                im = im.convert("RGB")
            os.chdir(SaveToDir)
            if not os.path.exists(newfilename):
                im.save(newfilename, "PDF", resolution=100.0)
    except Exception as e:
        print(e)

imageDir = r'____' # your image directory path
SaveToDir = r'____' # directory in which you want to save the pdfs
makePdf(imageDir, SaveToDir)

To use it on a single image:

from PIL import Image
import os

filename = r"/Desktop/document/dog.png"
im = Image.open(filename)
if im.mode == "RGBA":
    im = im.convert("RGB")
new_filename = r"/Desktop/document/dog.pdf"
if not os.path.exists(new_filename):
    im.save(new_filename,"PDF",resolution=100.0)

Upvotes: 2

user7384403

Reputation: 219

Convert image files to a PDF file:

from os import listdir
from fpdf import FPDF

path = "/home/bunny/images/" # get the path of images

imagelist = listdir(path) # get list of all images

pdf = FPDF('P','mm','A4') # create an A4-size pdf document 

x,y,w,h = 0,0,200,250

for image in imagelist:

    pdf.add_page()
    pdf.image(path+image,x,y,w,h)

pdf.output("images.pdf","F")

Upvotes: 4

Valeriy Ivanov

Reputation: 149

I had the same problem, so I created a Python function to unite multiple pictures into one PDF. The code (available from my GitHub page) uses reportlab and is based on answers from other posts.

Here is an example of how to unite images into a PDF:

We have a folder "D:\pictures" with PNG and JPG pictures, and we want to create the file pdf_with_pictures.pdf out of them and save it in the same folder.

outputPdfName = "pdf_with_pictures"
pathToSavePdfTo = "D:\\pictures"
pathToPictures = "D:\\pictures"
splitType = "none"
numberOfEntitiesInOnePdf = 1
listWithImagesExtensions = ["png", "jpg"]
picturesAreInRootFolder = True
nameOfPart = "volume"

unite_pictures_into_pdf(outputPdfName, pathToSavePdfTo, pathToPictures, splitType, numberOfEntitiesInOnePdf, listWithImagesExtensions, picturesAreInRootFolder, nameOfPart)

Upvotes: 1

PythonProgrammi

Reputation: 23443

Some changes to make a PDF from the directory where the files are

I took the code and made some slight changes to make it usable as is.

from fpdf import FPDF
from PIL import Image
import os # I added this and the code at the end

def makePdf(pdfFileName, listPages, dir=''):
    if (dir):
        dir += "/"

    cover = Image.open(dir + str(listPages[0]))
    width, height = cover.size

    pdf = FPDF(unit="pt", format=[width, height])

    for page in listPages:
        pdf.add_page()
        pdf.image(dir + str(page), 0, 0)

    pdf.output(dir + pdfFileName + ".pdf", "F")


# this is what I added
x = [f for f in os.listdir() if f.endswith(".jpg")]
y = len(x)

makePdf("file", x)

Upvotes: 3

Tanveer Alam

Reputation: 5275

pgmagick is a GraphicsMagick(Magick++) binding for Python.

It is a Python wrapper for ImageMagick (or GraphicsMagick).

import os
from os import listdir
from os.path import isfile, join 
from pgmagick import Image

mypath = r"\Images" # path to your Image directory (raw string, so the backslash is kept as is)

for each_file in listdir(mypath):
    if isfile(join(mypath,each_file)):
        image_path = os.path.join(mypath,each_file)
        pdf_path =  os.path.join(mypath,each_file.rsplit('.', 1)[0]+'.pdf')
        img = Image(image_path)
        img.write(pdf_path)

[Figure: sample input image]

[Figure: resulting PDF]

pgmagick installation instructions for Windows:

1) Download precompiled binary packages from the Unofficial Windows Binaries for Python Extension Packages (as mentioned on the pgmagick web page) and install them.

Note: Download the version that matches the Python version installed on your machine and whether it is a 32-bit or 64-bit installation.

You can check whether you have 32-bit or 64-bit Python by typing python at your terminal and pressing Enter:

D:\>python
ActivePython 2.7.2.5 (ActiveState Software Inc.) based on
Python 2.7.2 (default, Jun 24 2011, 12:21:10) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

So this is Python 2.7, 32 bit ([MSC v.1500 32 bit (Intel)] on win32), and you have to download and install pgmagick‑0.5.8.win32‑py2.7.exe.
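If you prefer to check this from a script rather than reading the interpreter banner, a small sketch using only the standard library looks like this (the function name `python_bitness` is made up for this illustration):

```python
import platform
import struct
import sys


def python_bitness():
    """Return 32 or 64, based on the size of a C pointer in this interpreter."""
    return struct.calcsize("P") * 8


print(f"Python {platform.python_version()}, {python_bitness()} bit")
print("major.minor:", f"{sys.version_info.major}.{sys.version_info.minor}")
```

The major.minor version and the bitness are the two values to match against the installer file names.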

These are the following available Python Extension Packages for pgmagick:

  • pgmagick‑0.5.8.win‑amd64‑py2.6.exe
  • pgmagick‑0.5.8.win‑amd64‑py2.7.exe
  • pgmagick‑0.5.8.win‑amd64‑py3.2.exe
  • pgmagick‑0.5.8.win32‑py2.6.exe
  • pgmagick‑0.5.8.win32‑py2.7.exe
  • pgmagick‑0.5.8.win32‑py3.2.exe

2) Then you can follow the installation instructions from here.

pip install pgmagick

And then try to import it.

>>> from pgmagick import gminfo
>>> gminfo.version
'1.3.x'
>>> gminfo.library
'GraphicsMagick'
>>>

Upvotes: 5
