Reputation: 896
I'm struggling to generate a simple PDF with non-ASCII characters using Python 3.5.2, python-pdfkit and wkhtmltox-0.12.2.
This is the simplest example I could write:
import pdfkit
html_content = u'<p>ö</p>'
pdfkit.from_string(html_content, 'out.pdf')
This is what the output document looks like:
Upvotes: 6
Views: 14675
Reputation: 771
It is also possible to set the charset in the options. This way you don't have to alter the HTML, which is especially useful if you're not the one creating it and you don't want to mess with it.
def get_options():
    return {
        'encoding': 'UTF-8',
        'enable-local-file-access': True
    }

# html and _pdfkit_config come from the surrounding code (not shown here)
pdfkit.from_string(html, verbose=True, options=get_options(), configuration=_pdfkit_config)
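For anyone who wants to try the options approach in isolation, here is a minimal sketch. The helper names `build_options` and `html_to_pdf_bytes` are made up for illustration, and it assumes that passing `False` as the output path makes python-pdfkit return the PDF as bytes, as in its documented usage. The `pdfkit` import is deferred so the options helper can be used on a machine without wkhtmltopdf.

```python
def build_options(extra=None):
    """Return wkhtmltopdf options that force UTF-8 decoding of the input."""
    options = {'encoding': 'UTF-8'}
    if extra:
        # Caller-supplied options are merged in and win on conflicts.
        options.update(extra)
    return options


def html_to_pdf_bytes(html, extra_options=None):
    """Render an HTML string to PDF bytes (needs pdfkit and wkhtmltopdf)."""
    import pdfkit  # imported lazily; the wkhtmltopdf binary is needed at call time
    # Assumption: False as the output path returns the PDF instead of writing a file.
    return pdfkit.from_string(html, False, options=build_options(extra_options))
```

Calling `html_to_pdf_bytes(u'<p>ö</p>')` and writing the result to a file opened in `'wb'` mode should give the same PDF as passing an output path directly.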
Upvotes: 3
Reputation: 896
I found out that I just needed to add a meta tag with a charset attribute to my HTML code:
import pdfkit
html_content = """
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
</head>
<body>
<p>€</p>
<p>áéíóúñö</p>
</body>
</html>
"""
pdfkit.from_string(html_content, 'out.pdf')
I actually spent quite some time following wrong solutions like the one suggested here. In case someone is interested, I wrote a short story on my blog. Sorry for the SPAM :)
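If the HTML arrives as a bare fragment, the same fix can be applied programmatically. The sketch below wraps a fragment in a minimal document that declares UTF-8. `wrap_html` and `fragment_to_pdf` are hypothetical helpers, not part of pdfkit, and the sketch assumes the fragment does not already contain its own `<html>` or `<head>` element.

```python
TEMPLATE = u"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
</head>
<body>
{body}
</body>
</html>
"""


def wrap_html(fragment):
    """Wrap an HTML fragment in a full document that declares UTF-8."""
    return TEMPLATE.format(body=fragment)


def fragment_to_pdf(fragment, output_path):
    """Write the wrapped fragment to a PDF (needs pdfkit and wkhtmltopdf)."""
    import pdfkit  # imported lazily so wrap_html works without pdfkit installed
    pdfkit.from_string(wrap_html(fragment), output_path)
```

With this in place, the original one-liner from the question becomes `fragment_to_pdf(u'<p>ö</p>', 'out.pdf')`.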
Upvotes: 34
Reputation: 26184
There is a relevant issue in the pdfkit project (the Node.js library of the same name), https://github.com/devongovett/pdfkit/issues/470, which says:
"You need to use an embedded font. The built-in fonts have a limited character set available."
An answer to the question "How to: output Euro symbol in pdfkit for nodejs" gives a clue on how to do it.
Upvotes: 1