Reputation: 198228
I have an HTML file that contains some Unicode characters and is saved to disk as UTF-8. When I view it with less, all characters display correctly:
<h1>什么是Action?</h1>
<p>Play程序接收到的大部分请求,都是由<code>Action</code>来处理的。
But when I use wkhtmltopdf to convert it to PDF, the output shows broken characters.
My command is:
wkhtmltopdf --encoding utf-8 book.html book.pdf
How can I fix this?
Upvotes: 5
Views: 8823
Reputation: 942
I was having this problem too. It turned out that the HTML file had a meta tag that was setting the wrong charset.
For example, the HTML file had:
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<style>
The issue was resolved when I switched the charset to utf-8, like so:
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<style>
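A quick way to see which charset a file declares before running wkhtmltopdf is to grep for it. The following is only a minimal sketch: the function name `declared_charset` is hypothetical, it assumes the declaration sits on a single line, and it relies on the `-o` option that GNU and BSD grep provide.

```shell
# declared_charset FILE: print the first charset declared in FILE, lowercased.
# Hypothetical helper; assumes the charset declaration is on one line.
declared_charset() {
    grep -ioE "charset=[\"']?[a-z0-9_-]+" "$1" \
        | head -n 1 \
        | sed "s/^[^=]*=//; s/^[\"']//" \
        | tr '[:upper:]' '[:lower:]'
}
```

Running `declared_charset book.html` should print `utf-8` for a file that is saved as UTF-8. As far as I can tell, `--encoding` only sets the default input encoding, so a conflicting meta tag may still take precedence.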
Upvotes: 3
Reputation: 609
If you are on a MS Windows machine (the above answer is for X Windows font server), the following worked for me:
You can use Microsoft YaHei or SimSun with wkhtmltoimage.
Explicitly set the font-family for any content that contains Chinese characters in your stylesheet:
.smsnotification_chinese {
    font-size: 30px;
    font-family: "Microsoft Yahei", SimSun;
}
This works on stock US Windows machines. For a more thorough description of font fallbacks, see: Chinese Standard Web Fonts: A Guide to CSS Font Family Declarations for Web Design in Simplified Chinese.
Note: The wkhtmltoimage binary does not work on Azure worker machines due to GDI+ sandbox restrictions. You can get around this by writing your own web service wrapper or using this free wrapper: Convert HTML to PDF in .Net on Azure
Upvotes: 0
Reputation: 198228
I finally found the reason: there were no Unicode fonts on my Ubuntu server.
After I uploaded some TrueType fonts from my local Ubuntu machine to the server, everything worked fine.
freewind@freewind:/usr/share/fonts$ cd truetype/
freewind@freewind:/usr/share/fonts/truetype$ ls
arphic ttf-dejavu ttf-lao
freefont ttf-devanagari-fonts ttf-liberation
kochi ttf-gujarati-fonts ttf-malayalam-fonts
msttcorefonts ttf-indic-fonts-core ttf-oriya-fonts
openoffice ttf-japanese-gothic.ttf ttf-punjabi-fonts
sazanami ttf-japanese-mincho.ttf ttf-tamil-fonts
takao ttf-kacst-one ttf-telugu-fonts
thai ttf-kannada-fonts unfonts
ttf-bengali-fonts ttf-khmeros-core wqy
I simply uploaded them all, which fixed the problem, although I don't know which font was the key one.
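To narrow down which fonts matter, fontconfig can list the installed fonts that claim coverage for a given language. This is a rough sketch, assuming fontconfig's `fc-list` is available on the server (wkhtmltopdf on Linux normally locates fonts through fontconfig); the function name `fonts_for_lang` is hypothetical.

```shell
# fonts_for_lang LANG: print the family names of installed fonts that
# cover LANG (for example zh-cn, ja, ko).
# Hypothetical helper around fontconfig's fc-list.
fonts_for_lang() {
    if ! command -v fc-list >/dev/null 2>&1; then
        echo "fc-list not found; install fontconfig first" >&2
        return 2
    fi
    # ":lang=..." is a fontconfig pattern; "family" limits the output
    # to the family name of each matching font.
    fc-list ":lang=$1" family | sort -u
}
```

If `fonts_for_lang zh-cn` prints nothing, no installed font covers Simplified Chinese, which would match the broken output in the question. In the listing above, the `arphic` and `wqy` directories are the most likely candidates for Chinese text, though that is a guess rather than something tested here. After copying fonts to the server, running `fc-cache -f` refreshes the font cache.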
Upvotes: 17