M K
M K

Reputation: 5

Turning a .xlsx into multiple PDFs

I have a python script that takes an .xlsx file with multiple worksheetes and splits each one of them into a seperate PDF with the worksheet name as the new file name.

import openpyxl
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages
import matplotlib.pyplot as plt

def xlsx_to_pdf(filename):
    wb = openpyxl.load_workbook(filename)
    for ws in wb:
        df = pd.DataFrame(ws.values)
        with PdfPages(f"{ws.title}.pdf") as pdf:
            fig, ax = plt.subplots()
            ax.axis('tight')
            ax.axis('off')
            the_table = ax.table(cellText=df.values,
                                  colLabels=df.columns,
                                  cellLoc='center',
                                  loc='center')
            pdf.savefig(fig, bbox_inches='tight')

xlsx_to_pdf("Provisionsabrechnung.xlsx")

The only problem I have is that the lines on the tabel it creates are really thick and the text is super small. imported worksheet exported PDF

Since I'm a complete Python noob I would really appreciate any tips.

Thanks in advance.

Upvotes: 0

Views: 91

Answers (1)

Timeless
Timeless

Reputation: 37857

You're close, you just need to adjust the fontsize with Table.set_font_size (after turning off the auto-size) :

the_table.auto_set_font_size(False)
the_table.set_fontsize(12) # <- adjust the size here

And for the borders, you can make thin borders with table.Cell.set_linewidth for each cell separatly :

cell.set_linewidth(0.3) # <- adjust the width here

The full code :

def xlsx_to_pdf(filename):
    wb = openpyxl.load_workbook(filename)
    for ws in wb:
        df = pd.DataFrame(ws.values)
        with PdfPages(f"{ws.title}.pdf") as pdf:
            fig, ax = plt.subplots()
            ax.axis("tight")
            ax.axis("off")
            the_table = ax.table(cellText=df.values,
                                  colLabels=df.columns,
                                  cellLoc="center",
                                  loc="center")
            the_table.auto_set_font_size(False)
            the_table.set_fontsize(10)
            for _, cell in the_table._cells.items():
                cell.set_linewidth(0.5)
            pdf.savefig(fig, bbox_inches="tight")

xlsx_to_pdf("file.xlsx")

Output :

enter image description here

Upvotes: 1

Related Questions