pwaring

Reputation: 3074

Populating PDF fields from a database

I have a PDF file (not created by me - I have no control over the design etc.) which allows users to fill in some form fields in Adobe Reader and save the result. I want to automate the process of populating the fields, using the following steps:

  1. Fetch data from database.
  2. Open PDF template.
  3. Populate form fields with data.
  4. Save modified file to a separate location on disk.
  5. Lock modified file so that the form fields can no longer be edited.
  6. Send file to user.

I'm happy to use PHP, Perl, Python or Java to do steps 2-5 (in descending order of preference), but whatever I use has to work under Linux (i.e. it mustn't rely on libraries which are only available on Windows for example).

The end result should be a PDF which the average user can open and print, but not modify (I'm sure advanced users could find a way to do so, but I accept that I can't guarantee complete security against modification). I don't want to change the structure of the PDF, merely populate the form fields.

Is there a standard piece of software for doing this? I've seen mentions of FDF Toolkit, but I'm not entirely sure if that's what I want and whether it will allow me to lock the file afterwards, and whether what I want to do fits in with the EULA.

Edit: Final answer is to use iText (as suggested by Mark Storer) but to implement it as a web service which allows you to pass in an array of form field names and values and the PDF file 'template'. The web service will be open source (and available on GitHub once I've written it), as per the AGPL, but anything connecting to it won't have to be.

Upvotes: 3

Views: 14299

Answers (1)

Mark Storer

Reputation: 15870

Filling

Any number of different libraries can fill in field values. I'm partial to iText (Java) or iTextSharp (C#). (I wrote one in Java a number of years ago. It's not that hard.) There are lots. Search SO, you'll find 'em.
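For the "fetch data, then populate" part, one library-neutral option is to write the values out as FDF (the format behind the FDF Toolkit mentioned in the question) and hand that file to whichever filling tool you settle on. What follows is only a minimal sketch under assumptions: the function names `escape_pdf_string` and `build_fdf` are made up for illustration, the field names are hypothetical, values are assumed to be plain ASCII, and whether your chosen library accepts FDF input is something to check in its documentation.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return (
        text.replace("\\", "\\\\")  # backslash first, so later escapes are not doubled
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def build_fdf(fields: dict) -> str:
    """Return a minimal FDF document with one /T (name) /V (value) pair per field.

    Assumption: names and values are plain ASCII. Other text would need
    PDFDocEncoding or UTF-16BE handling, which this sketch leaves out.
    """
    entries = "\n".join(
        f"<< /T ({escape_pdf_string(name)}) /V ({escape_pdf_string(str(value))}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{entries}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


if __name__ == "__main__":
    # Hypothetical row fetched from the database; real field names must match
    # the names defined in the PDF template.
    row = {"customer_name": "Ann (Admin)", "invoice_total": 42}
    print(build_fdf(row))
```

The field names have to match the ones in the template exactly, so it is worth listing the template's fields with your library of choice before wiring this up to the database.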

Locking

There are a couple of different levels of "lock the fields".

  • Each field has a "read only" flag. This is pretty much a courtesy as far as other libraries capable of setting field values are concerned. In fact, it's generally considered to mean "the UI cannot make changes". Form script can, regardless.
  • Form flattening: Draw the fields directly into the page and remove all the interactivity.
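To show why the flag is only a courtesy: in the PDF specification the read-only marker is bit position 1 of a field's integer flags entry (/Ff), so any tool that can rewrite that integer can clear it again. A small sketch of the bit arithmetic only, not of any particular library's API; the helper names are made up for illustration.

```python
READ_ONLY = 1 << 0   # bit position 1 of /Ff: field is read only
MULTILINE = 1 << 12  # bit position 13 of /Ff: multi-line text field


def set_read_only(ff: int) -> int:
    """Return the flags with the read-only bit switched on."""
    return ff | READ_ONLY


def clear_read_only(ff: int) -> int:
    """Return the flags with the read-only bit switched off again."""
    return ff & ~READ_ONLY


def is_read_only(ff: int) -> bool:
    """Report whether the read-only bit is set."""
    return bool(ff & READ_ONLY)


if __name__ == "__main__":
    flags = MULTILINE                   # an editable multi-line text field
    locked = set_read_only(flags)       # what "locking" via the flag amounts to
    unlocked = clear_read_only(locked)  # and how easily it is undone
    print(is_read_only(locked), is_read_only(unlocked))
```

Nothing about the field's value or appearance changes when the bit flips, which is the difference from flattening.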

Each one has pros and cons.

Flag: None too secure. Form data still easily accessible. Scrolling fields still scroll.
Flattening: Pretty much the exact opposite. It's harder to modify (though far from impossible). The form data can only be extracted via text extraction (which is hard, but becoming increasingly common). List & text fields that contain more stuff than is visible will no longer scroll.

The ability to flatten forms is relatively rare. Again, iText can do it (as can iTextSharp), but I'm not aware of any other third-party libraries that can... I'm sure they exist, I just can't name them off the top of my head.

Upvotes: 6
