Ritesh Kadmawala

Reputation: 743

Convert HTML into PDF using Python

I am trying to convert HTML into a PDF document in Django and haven't been successful.

I have tried wkhtmltopdf 0.9.9, but when it runs under Apache it fails with an error saying that wkhtmltopdf cannot connect to server. When I run wkhtmltopdf directly, it works fine and converts the HTML into a PDF document.

I have also tried unoconv, but the rendered PDF file doesn't have any CSS applied to it. xhtml2pdf gives me the same issue: the rendered PDF file has no CSS styling applied. I have spent the better part of today and last night on this and I'm still no closer to solving the problem.

Let me know if you need any more information.

Upvotes: 4

Views: 11944

Answers (4)

itto shura

Reputation: 57

You can convert an HTML page to PDF using the pyhtml2pdf module:

import os
from pyhtml2pdf import converter

# If you are converting a website URL
url = 'https://.....'
converter.convert(url, 'sample.pdf')

# If you have the HTML file saved locally
path = os.path.abspath('abcd.html')
converter.convert(f'file:///{path}', 'sample.pdf')

Source for the code

Upvotes: 0

arie

Reputation: 18972

Configuring Pisa for Django shouldn't be too hard.

There are several examples on the net that show how to do it and explain how to link to external resources in your templates.

In your case you should try the link-callback-function mentioned in the first blog post:

import os
from django.conf import settings

def fetch_resources(uri, rel):
    """
    Callback to allow pisa/reportlab to retrieve images, stylesheets, etc.
    `uri` is the href attribute from the HTML link element.
    `rel` gives a relative path, but it's not used here.
    """
    path = os.path.join(settings.MEDIA_ROOT, uri.replace(settings.MEDIA_URL, ""))
    return path

For newer Django versions you should probably use STATIC_ROOT instead of MEDIA_ROOT.
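
If your templates reference both static and media files, the callback can check each prefix in turn. The following is only a sketch of that idea: `resolve_resource` and its parameters are hypothetical names, not part of pisa or Django, and the function is written without Django imports so it can be tried on its own.

```python
import os

def resolve_resource(uri, static_url, static_root, media_url, media_root):
    """Map a URL found in the HTML to a file path on disk (illustrative helper)."""
    for url_prefix, root in ((static_url, static_root), (media_url, media_root)):
        # Only rewrite URIs that start with one of the known URL prefixes.
        if url_prefix and uri.startswith(url_prefix):
            return os.path.join(root, uri[len(url_prefix):])
    # Unknown prefix (for example an absolute http:// URL): return it unchanged.
    return uri
```

In a Django project you would presumably wrap it as `link_callback=lambda uri, rel: resolve_resource(uri, settings.STATIC_URL, settings.STATIC_ROOT, settings.MEDIA_URL, settings.MEDIA_ROOT)`.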

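Then use fetch_resources in your render method:

from io import BytesIO
from xhtml2pdf import pisa

result = BytesIO()  # buffer that receives the PDF bytes
pdf = pisa.pisaDocument(
    BytesIO(html.encode("UTF-8")),  # html is your rendered template string
    result,
    link_callback=fetch_resources,
    encoding="utf-8")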

Upvotes: 4

Johan Dahlin

Reputation: 26496

A possible, but not very elegant, solution is to run a small script that renders the HTML in a headless browser component (WebKit with Xvfb on Linux) and then saves it as a PDF.
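
As a rough sketch of that approach, the script could run wkhtmltopdf (which is WebKit-based) inside a virtual X display via `xvfb-run`. This assumes both `xvfb-run` and `wkhtmltopdf` are installed and on the PATH of the user the web server runs as; the function names are made up for illustration, and I have not verified that this resolves the Apache error from the question.

```python
import subprocess

def build_command(html_path, pdf_path):
    """Build the command line; `xvfb-run -a` picks a free virtual display number."""
    return ["xvfb-run", "-a", "wkhtmltopdf", html_path, pdf_path]

def html_to_pdf(html_path, pdf_path, timeout=60):
    """Run the conversion; raises CalledProcessError if wkhtmltopdf exits non-zero."""
    subprocess.run(build_command(html_path, pdf_path), check=True, timeout=timeout)
```

Passing the arguments as a list avoids shell quoting problems with file names, and the timeout keeps a hung renderer from blocking the request forever.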

Upvotes: 0

Carlos Castellanos

Reputation: 2378

I suggest using a combination of pisa, pypdf, and html5lib; it worked for me.

Upvotes: 0
