Reputation: 1123
I'm looking for a good, open source PDF generator/library that will convert HTML (with styling etc.) into a PDF file.
Requirement:
Yes, I have tried searching for this myself. I've tried many "solutions" that I've found on Google etc., and none has satisfied me yet. Many seem incomplete or buggy, or don't work well on GAE. So I figured I would appeal to the StackOverflow community for opinions or suggestions.
Upvotes: 14
Views: 3086
Reputation: 1525
For HTML/image to PDF I use the Python library xhtml2pdf (http://www.xhtml2pdf.com/), which builds on Pisa, Reportlab, pyPdf, and html5lib and runs on GAE. I have been using it to generate very nice article PDFs with embedded images, and once I figured out how to get the page size correct I found it to be a very good library.
You will need the xhtml2pdf library and its dependencies: https://github.com/chrisglass/xhtml2pdf
I threw together some example Python code and put it in this pastebin: http://pastebin.com/FFEZjNs3
The pdf_data you get at the end is the binary PDF file data. The html_data you give to pisa is really any string containing an HTML document.
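In case the pastebin goes away, here is a minimal sketch of the same idea. It is not the pastebin code itself; the function name html_to_pdf is my own, and I am assuming a version of xhtml2pdf where pisa.CreatePDF accepts an HTML string plus a file-like destination and returns an object with an err attribute.

```python
from io import BytesIO


def html_to_pdf(html_data):
    """Render an HTML string to PDF bytes (hypothetical helper, not part of xhtml2pdf)."""
    # Imported here so the rest of the module still loads where
    # xhtml2pdf is not installed.
    from xhtml2pdf import pisa

    output = BytesIO()
    # CreatePDF writes the rendered PDF into the file-like object given as dest.
    status = pisa.CreatePDF(html_data, dest=output)
    if status.err:
        raise RuntimeError("xhtml2pdf reported %d error(s)" % status.err)
    # The binary PDF data, ready to write to a response or a file.
    return output.getvalue()
```

Calling html_to_pdf("<html><body><p>Hello</p></body></html>") should give you the pdf_data bytes, which you can then serve with a Content-Type of application/pdf.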
There are some recommended things to include in your HTML to get well-formatted PDF output. Here is an example HTML document similar to the base template I use. Note the author meta field and the @page CSS: http://pastebin.com/q1wRm9nJ
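As a rough illustration of that kind of template (again, not the pastebin contents), here is a sketch that builds the HTML string in Python. The page size and margin values are just assumptions for the example, and build_html is a made-up helper name; the parts that matter are the author meta tag and the @page rule.

```python
from html import escape
from string import Template

# string.Template uses $name placeholders, so the CSS braces need no escaping.
BASE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="author" content="$author">
  <title>$title</title>
  <style type="text/css">
    /* Page size and margins; these values are example assumptions. */
    @page {
      size: a4 portrait;
      margin: 2cm;
    }
    body { font-family: Helvetica, sans-serif; font-size: 11pt; }
  </style>
</head>
<body>
$body
</body>
</html>
""")


def build_html(title, author, body_html):
    """Fill the base template (hypothetical helper for illustration only)."""
    return BASE_TEMPLATE.substitute(
        title=escape(title),
        author=escape(author, quote=True),  # goes inside an attribute value
        body=body_html,                     # assumed to be trusted markup already
    )
```

The string this returns is what you would hand to pisa as html_data.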
Here are the docs about the compatible CSS and HTML: https://github.com/chrisglass/xhtml2pdf/blob/master/doc/usage.rst#supported-css-properties
You can include images either by giving the URL of an external image or by embedding a data URI. xhtml2pdf has a function for creating these: "pisa.makeDataURI()".
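If you are wondering what a data URI actually is, here is a small standard-library sketch that builds one by hand. This is not the xhtml2pdf helper, just an illustration of the format; make_data_uri and its parameters are names I made up.

```python
import base64


def make_data_uri(data, mime_type):
    """Encode raw bytes as a base64 data URI (illustrative helper, stdlib only)."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


# Example: embed image bytes directly in an <img> tag.
# image_bytes would normally come from a file or the datastore.
image_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a complete image
img_tag = '<img src="%s" />' % make_data_uri(image_bytes, "image/png")
```

Embedding images this way means the renderer does not have to fetch anything over the network, which can be handy on GAE.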
Hopefully that helps.
Upvotes: 12