Syd

Reputation: 1546

How to process PDF documents?

I need some recommendations on processing PDF documents. These documents are annual statements and contain amounts and dollar figures that I need to reconcile.

I have seen recommendations for:

1) iTextSharp
2) PDFBox (IKVM)
3) PDFSharp
4) PDFEdit API (from Adobe)

Which of these would you recommend, and are there any limitations that I should be aware of? Besides open source, I do not mind paying for a commercial product as long as it is well supported and fully featured.

**Other information:** The PDFs are all generated by the same third-party vendor. Not all the PDFs have the same structure; there are about 10 different structures (templates).

I only need to read the PDFs; I have no requirement to write them.

Many thanks in advance.

Upvotes: 2

Views: 2141

Answers (3)

unclepaul84

Reputation: 1404

Check out http://www.pdftron.com/. We use it to both read and write PDF documents, and it has been very reliable.

Upvotes: 1

Douglas Anderson

Reputation: 4690

You could also look at PDFText. We use it in many cases for extracting raw data from PDF files. Its author also offers other inexpensive libraries that help with other aspects of PDF manipulation.

This assumes that the documents are not scanned images and contain text that can be extracted.
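
Once a library has given you the raw text, the reconciliation itself is mostly pattern matching. Here is a rough sketch in Python of what that step might look like. It is not tied to PDFText or any other library mentioned here: the function names `parse_amounts` and `reconcile` are made up for illustration, and it assumes US-style figures such as `$1,234.56`, with negatives written as `-$12.00` or `($12.00)`. Your vendor's templates may well need different patterns.

```python
import re
from decimal import Decimal

# Hypothetical pattern: optional "(" or "-", a dollar sign, digits with
# optional thousands separators, optional cents, optional closing ")".
AMOUNT_RE = re.compile(
    r"(\()?(-)?\$\s?(\d{1,3}(?:,\d{3})+|\d+)(\.\d{2})?(\))?"
)


def parse_amounts(text):
    """Return every dollar figure found in text as a Decimal."""
    amounts = []
    for match in AMOUNT_RE.finditer(text):
        open_paren, minus, whole, cents, close_paren = match.groups()
        value = Decimal(whole.replace(",", "") + (cents or ""))
        # Treat "-$12.00" and the accounting form "($12.00)" as negative.
        if minus or (open_paren and close_paren):
            value = -value
        amounts.append(value)
    return amounts


def reconcile(line_items_text, stated_total_text):
    """Return sum(line items) minus the first figure in stated_total_text.

    A result of zero means the statement reconciles.
    """
    totals = parse_amounts(stated_total_text)
    if not totals:
        raise ValueError("no dollar figure found in the stated total")
    items = parse_amounts(line_items_text)
    return sum(items, Decimal("0")) - totals[0]


if __name__ == "__main__":
    # Made-up sample text standing in for what a PDF library extracted.
    sample = "Opening balance $1,234.56\nFees ($45.10)\nInterest $10.54"
    print(parse_amounts(sample))
    print(reconcile(sample, "Closing balance $1,200.00"))
```

Using `Decimal` rather than `float` avoids rounding surprises when summing currency. With about 10 templates, you would probably end up with one set of patterns per template rather than a single regular expression.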

Upvotes: 1

Tim Jarvis

Reputation: 18815

My vote would be PDFSharp for the following reasons...

  • Easier to use than iTextSharp (subjective opinion)
  • Permissive licence (X11 licence)
  • I had never heard of PDFBox before ;-)

Upvotes: 2
