galyoss

Reputation: 31

Python export df to pdf - Adjusting Zoom in PDFKit for Dynamic HTML Table Widths Similar to Google Sheets Fit to Width

TL;DR: My goal is to mimic the "Scale: Fit to width" option of Google Sheets' PDF export in Python.

I'm trying to build an automation that exports various DataFrames to a PDF file. Each DataFrame has a different number of columns: some have many (~30-40) and some have far fewer.

I do this by converting the DataFrame to HTML with the to_html() method and then passing the result to the pdfkit library.

pdfkit.from_string(df.to_html(), "table.pdf", options={'zoom': '0.3',})

However, I can't hard-code a single zoom value, because different tables need different zoom values to fit on the page properly.
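To make the problem concrete, here is a rough sketch of how a per-table zoom could be guessed from the table's content. It is only a heuristic, not something pdfkit provides: the helper name estimate_zoom, the usable page width of 718 px (A4 minus 10 mm margins at 96 dpi), the 8 px per character and the 6 px of extra space per cell are all assumptions that would need tuning against real output.

```python
import math


def estimate_zoom(columns, rows, usable_width_px=718.0, char_px=8.0, cell_extra_px=6.0):
    """Guess a zoom factor that makes a table fit the usable page width.

    All three numeric defaults are assumptions, not values taken from pdfkit.
    """
    # Widest text per column, measured in characters (header included)
    widths = [len(str(col)) for col in columns]
    for row in rows:
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(str(value)))

    # Rough rendered width of the whole table in pixels
    table_px = sum(w * char_px + cell_extra_px for w in widths)

    # Never zoom in; only shrink tables that are too wide
    if table_px <= usable_width_px:
        return 1.0

    # Round down to two decimals so the estimate errs on the small side
    return math.floor(usable_width_px / table_px * 100) / 100


narrow = estimate_zoom(["A", "B", "C"], [[1, 2, 3]])
wide = estimate_zoom([f"column_{i}" for i in range(40)], [list(range(40))])
```

With a DataFrame, the inputs would be df.columns and df.values.tolist(), and the result would go into the existing call as options={'zoom': str(zoom)}. Note that to_html() also writes the index column unless index=False is passed, which this sketch does not account for.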

If you export a table as PDF from Google Sheets, you can choose "Scale: Fit to width", which is exactly what I want to do.

I assume the problem isn't at the HTML level, since the raw HTML looks fine when I view it in a browser.

How can I achieve this?

Upvotes: 1

Views: 157

Answers (1)

user21600038

One way to do this is to use ReportLab. Here is some basic code I wrote on this topic:

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle
from reportlab.lib import colors
import pandas as pd

def export_df_to_pdf(df, filename):
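    # Estimate each column's width in characters from its longest cell, header included
    col_chars = [
        max([len(str(col))] + [len(str(value)) for value in df.iloc[:, i]])
        for i, col in enumerate(df.columns)
    ]
    total_chars = max(sum(col_chars), 1)

    # Create the document first so the usable width comes from its actual margins
    doc = SimpleDocTemplate(filename, pagesize=letter, leftMargin=36, rightMargin=36)
    elements = []
    max_table_width = doc.width - 12  # the default frame adds 6 pt of padding per side

    # Pick the largest font size (capped at 10 pt) at which every column fits.
    # Rough assumption: a Helvetica character is about 0.6 em wide on average.
    cell_padding = 2  # points, applied on the left and right of every cell
    text_width = max_table_width - 2 * cell_padding * len(col_chars)
    font_size = max(1, min(10, text_width / (0.6 * total_chars)))
    col_widths = [n * 0.6 * font_size + 2 * cell_padding for n in col_chars]

    # Convert DataFrame to list of lists for ReportLab Table
    data = [df.columns.tolist()] + df.values.tolist()

    # Create Table object with explicit column widths so it fits the page
    table = Table(data, colWidths=col_widths)

    # Apply table style, including the computed font size and tighter cell padding
    style = TableStyle([('BACKGROUND', (0, 0), (-1, 0), colors.grey),
                        ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
                        ('ALIGN', (0, 0), (-1, -1), 'CENTER'),
                        ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
                        ('FONTSIZE', (0, 0), (-1, -1), font_size),
                        ('LEFTPADDING', (0, 0), (-1, -1), cell_padding),
                        ('RIGHTPADDING', (0, 0), (-1, -1), cell_padding),
                        ('BOTTOMPADDING', (0, 0), (-1, 0), 12),
                        ('BACKGROUND', (0, 1), (-1, -1), colors.beige),
                        ('GRID', (0, 0), (-1, -1), 1, colors.black)])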

    table.setStyle(style)

    # Add table to elements
    elements.append(table)

    # Build PDF
    doc.build(elements)

# Example usage
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
export_df_to_pdf(df, 'output.pdf')

Upvotes: 0
