Dewiniaeth

Reputation: 1123

Google App Engine PDF converter

I'm looking for a good, open source PDF generator/library that will convert HTML (with styling etc.) into a PDF file.

Requirement:

Yes, I have tried searching for this myself. I've tried many "solutions" that I found on Google etc., but none has satisfied me yet: many seem incomplete or buggy, or don't work well on GAE. So I figured I would ask the StackOverflow community for opinions or suggestions.

Upvotes: 14

Views: 3086

Answers (1)

Bryce Cutt

Reputation: 1525

For HTML/image to PDF I use the Python library xhtml2pdf (http://www.xhtml2pdf.com/), which uses Pisa, ReportLab, pyPdf, and html5lib, and runs on GAE. I have been using it to generate very nice article PDFs with embedded images, and once I figured out how to get the page size correct I found it to be a very good library.

You will need the xhtml2pdf library and its dependencies: https://github.com/chrisglass/xhtml2pdf

I threw together some example Python code and put it in this pastebin: http://pastebin.com/FFEZjNs3

The pdf_data you get at the end is the binary PDF file data. The html_data you give to pisa is really any string containing an HTML document.
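In case the pastebin goes away, here is a minimal sketch of the same idea. It is not a copy of the pastebin code: it assumes Python 3 and a current xhtml2pdf release where `pisa.CreatePDF` accepts an HTML string and a writable binary `dest`, and the helper name `html_to_pdf` is just made up for illustration.

```python
import io


def html_to_pdf(html_data):
    """Return the PDF bytes rendered from an HTML string (illustrative helper)."""
    # Imported inside the function so this module still loads on a machine
    # where xhtml2pdf is not installed; the call itself needs the library.
    from xhtml2pdf import pisa

    pdf_buffer = io.BytesIO()
    # CreatePDF writes the rendered document into dest and returns a status
    # object whose err attribute counts the errors that occurred.
    status = pisa.CreatePDF(html_data, dest=pdf_buffer)
    if status.err:
        raise RuntimeError("xhtml2pdf reported %d error(s)" % status.err)
    return pdf_buffer.getvalue()
```

Calling `pdf_data = html_to_pdf(html_data)` gives you the bytes you can write to a file or send back in a response with a `application/pdf` content type.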

There are some recommended things to include in your HTML to get well-formatted PDF output. Here is an example HTML document similar to the base template I use. Note the author meta field and the @page CSS: http://pastebin.com/q1wRm9nJ
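Roughly, a base template with an author meta field and an @page rule could be built like this. This is only a sketch and not the pastebin template itself: the function name `build_html`, the margin, the font, and the default page size are all assumptions picked for illustration.

```python
from html import escape

# Hypothetical base template; doubled braces are literal CSS braces
# because the string is filled in with str.format().
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="author" content="{author}">
<title>{title}</title>
<style>
@page {{
    size: {page_size};
    margin: 2cm;
}}
body {{ font-family: Helvetica, sans-serif; }}
</style>
</head>
<body>
{body}
</body>
</html>
"""


def build_html(title, author, body_html, page_size="a4 portrait"):
    """Fill the template; title and author are escaped, body_html is trusted markup."""
    return PAGE_TEMPLATE.format(
        author=escape(author, quote=True),
        title=escape(title),
        page_size=page_size,
        body=body_html,
    )
```

The page size goes in the @page rule, for example `build_html("Report", "Me", "<p>Hi</p>", page_size="letter")`, and the resulting string is what you pass to pisa as the HTML document.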

Here are the docs about the compatible CSS and HTML: https://github.com/chrisglass/xhtml2pdf/blob/master/doc/usage.rst#supported-css-properties

You can include images either by giving the URL of an external image or by embedding a data URI; xhtml2pdf has a function for creating these, "pisa.makeDataURI()".
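If you want to see what a data URI looks like, or build one without the library, a standard-library version is only a few lines. This is a stand-in written for illustration, not xhtml2pdf's own function; the name `make_data_uri` and its parameters are assumptions.

```python
import base64


def make_data_uri(data, mime_type="image/png"):
    """Encode raw bytes as a base64 data URI usable in an <img src="...">."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


def img_tag(data, mime_type="image/png"):
    """Build an <img> element that embeds the image bytes directly."""
    return '<img src="%s">' % make_data_uri(data, mime_type)
```

Embedding the image this way means the converter does not have to fetch anything over the network while it renders.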

Hopefully that helps.

Upvotes: 12
