Goro

Reputation: 10249

Automating PDF generation

What would be a solid tool to use for generating PDF reports? Particularly, we are interested in creating interactive PDFs that have video, like the example found here.

Right now we are using Python and reportlab to generate PDFs, but have not explored the library completely (mostly because the license pricing is a little prohibitive).

We have been looking at Adobe's SDK and the iText libraries, but it's hard to say what the capabilities of either are.

The ability to generate a document from a template PDF would be a plus.

Any pointers or comments will be appreciated.

Thanks,

Upvotes: 3

Views: 3909

Answers (2)

Daniel Naab

Reputation: 23066

Recently, I needed to create PDF reports for a Django application; a ReportLab license was available, but I ended up choosing LaTeX. The benefit of this approach is that we could use Django templates to generate the LaTeX source, without getting bogged down writing lots of code for the many reports we needed to create. Plus, we could take advantage of the much more concise LaTeX syntax (which does have its many quirks and is not suitable for every purpose).

This snippet provides a general overview of the approach. I found it necessary to make some changes, which I have provided at the end of this answer. The main addition is detection of "Rerun LaTeX" messages, which indicate that an additional pass is required. Usage is as simple as:

def my_view(request):
    pdf_stream = process_latex(
        'latex_template.tex',
        context=RequestContext(request, {'context_obj': context_obj})
    )
    return HttpResponse(pdf_stream, content_type='application/pdf')
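
One thing to watch with this approach: Django's automatic escaping targets HTML, so context values that contain characters such as &, % or _ can still break the LaTeX source. Below is a minimal sketch of an escaping helper, assuming plain-text values; latex_escape and LATEX_SPECIALS are hypothetical names, not part of the original snippet.

```python
# Hypothetical helper, not part of the original snippet.
LATEX_SPECIALS = {
    '\\': r'\textbackslash{}',
    '&': r'\&',
    '%': r'\%',
    '$': r'\$',
    '#': r'\#',
    '_': r'\_',
    '{': r'\{',
    '}': r'\}',
    '~': r'\textasciitilde{}',
    '^': r'\textasciicircum{}',
}


def latex_escape(value):
    """Return str(value) with LaTeX special characters escaped."""
    # Map one character at a time so the braces introduced by a replacement
    # are never escaped a second time.
    return ''.join(LATEX_SPECIALS.get(ch, ch) for ch in str(value))


print(latex_escape('50% of R&D_budget'))  # prints: 50\% of R\&D\_budget
```

A helper like this could be applied to values before they go into the context, or registered as a custom template filter.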

It is possible to embed videos in LaTeX-generated PDFs; however, I do not have any experience with it. Here is a top Google result.

This solution does require spawning a new process (pdflatex), so if you want a pure Python solution, keep looking.

import os
from subprocess import Popen, PIPE
from tempfile import NamedTemporaryFile

from django.template import loader, Context


class LaTeXException(Exception):
    pass


def process_latex(template, context=None, type='pdf', outfile=None):
    """
    Processes a template as a LaTeX source file.
    Output is either returned or stored in outfile.
    At the moment only pdf output is supported.
    """
    t = loader.get_template(template)
    c = Context(context)
    r = t.render(c)

    tex = NamedTemporaryFile()
    # The temp file is opened in binary mode, so encode the rendered text.
    tex.write(r.encode('utf-8'))
    tex.flush()
    base = tex.name
    names = dict((x, '%s.%s' % (base, x)) for x in (
        'log', 'aux', 'pdf', 'dvi', 'png'))
    output = names[type]

    stdout = None
    if type == 'pdf' or type == 'dvi':
        stdout = pdflatex(base, type)
    elif type == 'png':
        stdout = pdflatex(base, 'dvi')
        # dvipng takes the background colour as the argument to -bg.
        out, err = Popen(
            ['dvipng', '-bg', 'Transparent', names['dvi'], '-o', names['png']],
            cwd=os.path.dirname(base), stdout=PIPE, stderr=PIPE
        ).communicate()

    os.remove(names['log'])
    os.remove(names['aux'])

    # pdflatex appears to ALWAYS return 1, never returning 0 on success, at
    # least on the version installed from the Ubuntu apt repository.
    # so instead of relying on the return code to determine if it failed,
    # check if it successfully created the pdf on disk.
    if not os.path.exists(output):
        details = '*** pdflatex output: ***\n%s\n*** LaTeX source: ***\n%s' % (
            stdout, r)
        raise LaTeXException(details)

    if not outfile:
        # Read the result as bytes; file() no longer exists in Python 3.
        with open(output, 'rb') as f:
            o = f.read()
        os.remove(output)
        return o
    else:
        os.rename(output, outfile)


def pdflatex(file, type='pdf'):
    out, err = Popen(
        ['pdflatex', '-interaction=nonstopmode', '-output-format', type, file],
        cwd=os.path.dirname(file), stdout=PIPE, stderr=PIPE
    ).communicate()
    # communicate() returns bytes; decode so the check below also works on
    # Python 3.
    out = out.decode('utf-8', 'replace')

    # If the output tells us to rerun, do it by recursing over ourselves.
    if 'Rerun LaTeX.' in out:
        return pdflatex(file, type)
    else:
        return out
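
The recursion above keeps going for as long as the log asks for another pass. A minimal sketch of the same rerun check as a bounded loop is shown here, assuming the compiler call is wrapped in a zero-argument callable; run_until_stable, run_once and max_passes are hypothetical names, and the cap of 5 is an arbitrary choice rather than anything from the original.

```python
def run_until_stable(run_once, max_passes=5, markers=('Rerun LaTeX.',)):
    """Call run_once() until its log stops asking for a rerun.

    run_once is any zero-argument callable that performs one pass and
    returns the log text. max_passes is a safety cap (an assumption).
    """
    log = ''
    for _ in range(max_passes):
        log = run_once()
        if not any(marker in log for marker in markers):
            break
    return log


# Usage with canned logs instead of a real pdflatex process:
fake_logs = iter(['... Rerun LaTeX. ...', 'Output written on report.pdf'])
final_log = run_until_stable(lambda: next(fake_logs))
```

In real use, run_once would wrap a single Popen call to pdflatex and return its decoded output.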

Upvotes: 5

lig

Reputation: 3890

I suggest using https://github.com/mreiferson/py-wkhtmltox to render HTML to PDF.

Use whatever tool you like to render the reports as HTML. I like http://www.makotemplates.org/
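
A rough sketch of that two-step pipeline follows, under some loud assumptions: it calls the wkhtmltopdf command-line tool (which must be installed) rather than the py-wkhtmltox bindings, whose API is not shown here, and it uses the standard library's string.Template as a stand-in for Mako. All function names are hypothetical.

```python
import html
import subprocess
from string import Template

# Stand-in for a Mako template; string.Template keeps the sketch
# standard-library only.
REPORT_TEMPLATE = Template(
    '<html><body><h1>$title</h1><ul>$items</ul></body></html>'
)


def render_report_html(title, items):
    """Render a tiny HTML report, escaping every supplied value."""
    rendered = ''.join(
        '<li>%s</li>' % html.escape(str(item)) for item in items
    )
    return REPORT_TEMPLATE.substitute(
        title=html.escape(title), items=rendered
    )


def wkhtmltopdf_command(html_path, pdf_path):
    """Build the argument list for: wkhtmltopdf <input> <output>."""
    return ['wkhtmltopdf', html_path, pdf_path]


def html_file_to_pdf(html_path, pdf_path):
    """Convert an HTML file to PDF; raises CalledProcessError on failure."""
    subprocess.check_call(wkhtmltopdf_command(html_path, pdf_path))
```

The HTML would be written to a file first and then passed to html_file_to_pdf along with the desired output path.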

Upvotes: 0
