Nori

Reputation: 2972

How to solve "wkhtmltopdf reported an error: Exit with code 1 due to network error: ProtocolUnknownError" in python pdfkit

I'm using Django. This code is in views.py.

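import os

import pdfkit
from django.conf import settings
from django.http import FileResponse
from django.template.loader import get_template


def download_as_pdf_view(request, doc_type, pk):
    file_name = 'invoice.pdf'
    pdf_path = os.path.join(settings.BASE_DIR, 'static', 'pdf', file_name)

    # render the template to an HTML string, then convert it to a PDF file
    template = get_template("paypal/card_invoice_detail.html")
    _html = template.render({})
    pdfkit.from_string(_html, pdf_path)

    return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')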

Traceback is below.


[2022-09-05 00:56:35,785] ERROR [django.request.log_response:224] Internal Server Error: /paypal/download_pdf/card_invoice/MTE0Nm1vamlva29zaGkz/
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/usr/local/lib/python3.8/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/project/app/paypal/views.py", line 473, in download_as_pdf_view
    pdfkit.from_string(str(_html), pdf_path)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/api.py", line 75, in from_string
    return r.to_pdf(output_path)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/pdfkit.py", line 201, in to_pdf
    self.handle_error(exit_code, stderr)
  File "/usr/local/lib/python3.8/site-packages/pdfkit/pdfkit.py", line 155, in handle_error
    raise IOError('wkhtmltopdf reported an error:\n' + stderr)
OSError: wkhtmltopdf reported an error:
Exit with code 1 due to network error: ProtocolUnknownError

[2022-09-05 00:56:35,797] ERROR [django.server.log_message:161] "GET /paypal/download_pdf/card_invoice/MTE0Nm1vamlva29zaGkz/ HTTP/1.1" 500 107486

This works fine.

pdfkit.from_url('https://google.com', 'google.pdf')

However, pdfkit.from_string and pdfkit.from_file raise "ProtocolUnknownError".

Please help me!

Update

I tried this code.

    _html = '''<html><body><h1>Hello world</h1></body></html>'''
    pdfkit.from_string(_html, pdf_path)

It worked fine. I saved the above HTML as sample.html, then ran this code.

    _html = render_to_string('path/to/sample.html')
    pdfkit.from_string(str(_html), pdf_path, options={"enable-local-file-access": ""})

It worked fine! And the "ProtocolUnknownError" error is gone thanks to options={"enable-local-file-access": ""}.

So, I changed the HTML file path to the one I really want to use.

    _html = render_to_string('path/to/invoice.html')
    pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})
    return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')

The PDF conversion does not finish. When I run the code line by line, this call does not return:

stdout, stderr = result.communicate(input=input)

It kept processing for a long time.

Upvotes: 27

Views: 45500

Answers (2)

Didier Corbière

Reputation: 161

pdfkit uses wkhtmltopdf, so this issue is useful.

Try removing URL query parameters, if there are any.

Change this

<link rel="stylesheet" type="text/css" href="/path/to/mainsite.css?13452" />

to this

<link rel="stylesheet" type="text/css" href="/path/to/mainsite.css" />
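If the template has many such links, the same change could be applied to the rendered HTML before it is passed to pdfkit. This is only a minimal sketch: `strip_link_queries` is a hypothetical helper name, and it assumes the query string is not needed to locate the file (for example, a cache-busting number like the one above).

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches the href attribute of a <link> tag: prefix, quote character, URL.
_LINK_HREF_RE = re.compile(
    r'''(<link\b[^>]*?\bhref\s*=\s*)(["'])(.*?)\2''',
    re.IGNORECASE | re.DOTALL,
)


def strip_link_queries(html: str) -> str:
    """Return html with the query string removed from every <link href="...">."""

    def _clean(match):
        prefix, quote, url = match.groups()
        parts = urlsplit(url)
        # Rebuild the URL with an empty query component.
        cleaned = urlunsplit(
            (parts.scheme, parts.netloc, parts.path, "", parts.fragment)
        )
        return f"{prefix}{quote}{cleaned}{quote}"

    return _LINK_HREF_RE.sub(_clean, html)
```

The result can then be passed to pdfkit.from_string() in place of the original HTML string.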

Upvotes: 1

Nori

Reputation: 2972

I solved this problem. There are three steps to get past it.

  1. Set the option {"enable-local-file-access": ""}: pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})

  2. pdfkit.from_string() can't load CSS from a URL such as <link rel="stylesheet" href="https://path/to/style.css">. Reference the CSS file by an absolute path, or write the styles in the same HTML file.

  3. If the CSS file loads another file (for example, a font file), the conversion fails with ContentNotFoundError.
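For step 3, it can help to check a stylesheet for references to other files before inlining it. The following is a rough sketch, not part of pdfkit: `find_css_dependencies` is a hypothetical helper, and the regular expressions only cover the common `url(...)` and `@import "..."` forms.

```python
import re

# url(...) with optional single or double quotes around the target.
_URL_RE = re.compile(r'''url\(\s*(["']?)(.*?)\1\s*\)''', re.IGNORECASE)
# @import "file.css" or @import 'file.css' (the url(...) form is caught above).
_IMPORT_RE = re.compile(r'''@import\s+(["'])(.*?)\1''', re.IGNORECASE)


def find_css_dependencies(css_text: str) -> list:
    """List files that css_text refers to, such as fonts, images, or imports."""
    refs = [m.group(2) for m in _URL_RE.finditer(css_text)]
    refs += [m.group(2) for m in _IMPORT_RE.finditer(css_text)]
    # data: URIs are embedded in the stylesheet, so nothing has to be fetched.
    return [ref for ref in refs if not ref.startswith("data:")]
```

An empty result suggests the stylesheet is self-contained, as is the case for a simple file that only sets sizes, borders, and alignment.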

My solution

I used a simple CSS file like this.

body {
    font-size: 18px;
    padding: 55px;
}

h1 {
    font-size: 38px;
}

h2 {
    font-size: 28px;
}

h3 {
    font-size: 24px;
}

h4 {
    font-size: 20px;
}

table, th, td {
    margin: auto;
    text-align: center;
    border: 1px solid;
}

table {
    width: 80%;
}

.text-right {
    text-align: right;
}


.text-left {
    text-align: left;
}

.text-center {
    text-align: center;
}

This code inserts the CSS file above as a <style> block in the same HTML.

import os

import pdfkit
from django.http import FileResponse
from django.template.loader import render_to_string

from paypal.models import Invoice
from website import settings


def download_as_pdf_view(request, pk):
    # create PDF from HTML template file with context.
    invoice = Invoice.objects.get(pk=pk)
    context = {
        # please set your contexts as dict.
    }
    _html = render_to_string('paypal/card_invoice_detail.html', context)
    # remove the original header (everything before <body>)
    _html = _html[_html.find('<body>'):]

    # create new header; the <style> block goes inside <head>
    new_header = '''<!DOCTYPE html>
    <html lang="ja">
    <head>
    <meta charset="utf-8"/>
    <style>
'''
    # add style from css file. please change to your css file path.
    css_path = os.path.join(settings.BASE_DIR, 'paypal', 'static', 'paypal', 'css', 'invoice.css')
    with open(css_path, 'r') as f:
        new_header += f.read()
    new_header += '\n</style>\n</head>\n'

    # add head to html (the old header was already removed above)
    _html = new_header + _html
    with open('paypal/sample.html', 'w') as f:  # for debug
        f.write(_html)

    # convert html to pdf
    file_name = 'invoice.pdf'
    pdf_path = os.path.join(settings.BASE_DIR, 'static', 'pdf', file_name)
    pdfkit.from_string(_html, pdf_path, options={"enable-local-file-access": ""})
    return FileResponse(open(pdf_path, 'rb'), filename=file_name, content_type='application/pdf')
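The view above writes every PDF to the same invoice.pdf path, so two requests at the same time could overwrite each other's file. One possible variation, sketched here under assumptions, is to keep the PDF in memory: pdfkit returns the PDF as bytes when the output path is False. `html_to_pdf_bytes` and its `convert` parameter are hypothetical names; the parameter exists only so the conversion function can be swapped out.

```python
def html_to_pdf_bytes(html, convert=None):
    """Convert an HTML string to PDF bytes without writing a file to disk."""
    if convert is None:
        # Imported lazily so this sketch can be loaded without pdfkit installed.
        import pdfkit
        convert = pdfkit.from_string
    # Passing False as the output path makes pdfkit return the PDF as bytes.
    return convert(html, False, options={"enable-local-file-access": ""})
```

In a Django view, the returned bytes could be wrapped in HttpResponse(pdf_bytes, content_type='application/pdf') instead of opening a file for FileResponse.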

Upvotes: 44
