Bhuvan

Reputation: 1573

How to read PDF form data using iTextSharp?

Is it possible to read PDF form data (values that were filled in and saved with the form) using iTextSharp? If so, how can I do this?

Upvotes: 20

Views: 47565

Answers (6)

Serg

Reputation: 57

This worked for me! Note the two extra arguments when constructing the stamper: '\0' keeps the original PDF version and true opens the document in append mode.

string TempFilename = Path.GetTempFileName();

PdfReader pdfReader = new PdfReader(FileName);
//PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(TempFilename, FileMode.Create));
PdfStamper stamper = new PdfStamper(pdfReader, new FileStream(TempFilename, FileMode.Create), '\0', true);

AcroFields fields = stamper.AcroFields;
AcroFields pdfFormFields = pdfReader.AcroFields;

foreach (KeyValuePair<string, AcroFields.Item> kvp in fields.Fields)
{
    // GetXMLNode and XMLFile are defined elsewhere (not shown here).
    string FieldValue = GetXMLNode(XMLFile, kvp.Key);
    if (FieldValue != "")
    {
        fields.SetField(kvp.Key, FieldValue);
    }
}

stamper.FormFlattening = false;
stamper.Close();
pdfReader.Close();

Upvotes: 3

Misguided Chunk

Reputation: 397

If anybody is still wondering about this, here is how I extracted the text of a field (provided you know the field name):

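PdfReader reader = new("filepath");
PdfDocument doc = new(reader);
// Pass the PdfDocument created above; false means "do not create a form if none exists".
PdfAcroForm form = PdfAcroForm.GetAcroForm(doc, false);

// Read the field's current value as a string.
string value = form.GetField("FieldNameHere").GetValueAsString();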

Works with iText 7.1.16.

Upvotes: 0

EIV

Reputation: 389

The PDF is named "report.pdf". The form field to be read into TextBox1 is named "TextField25" in the PDF.

        Dim pdf As String = "report.pdf"
        Dim reader As New PdfReader(pdf)
        Dim fields As AcroFields = reader.AcroFields
        TextBox1.Text = fields.GetField("TextField25")

Important note: this works only if the PDF was not flattened when it was created with iTextSharp (that is, the fields must still be editable), i.e. the stamper was closed with:

       pdfStamper.FormFlattening = False

This is very simple, and it works like a charm. :)

Upvotes: 2

Adam Jones

Reputation: 2460

Maybe the iTextSharp library has changed recently but I wasn't able to get the accepted answer to work. Here is my solution:

var pdf_filename = "pdf2read.pdf";
using (var reader = new PdfReader(pdf_filename))
{
    var fields = reader.AcroFields.Fields;

    foreach (var key in fields.Keys)
    {
        var value = reader.AcroFields.GetField(key);
        Console.WriteLine(key + " : " + value);
    }
}

The difference is subtle: reader.AcroFields.Fields returns an IDictionary rather than an AcroFields object, so the values have to be read through reader.AcroFields.GetField.
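
If you want the values as data rather than console output, the same loop can be wrapped in a small helper. This is only a sketch, assuming iTextSharp 5.x (where PdfReader is disposable and AcroFields.Fields is a dictionary keyed by field name); PdfFormDump and ReadFormValues are hypothetical names of my own, not part of the library, and the PDF path is taken from the first command-line argument.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;

static class PdfFormDump
{
    // Hypothetical helper (not an iTextSharp API): maps each form field name to its value.
    public static Dictionary<string, string> ReadFormValues(string path)
    {
        var values = new Dictionary<string, string>();
        using (var reader = new PdfReader(path))
        {
            AcroFields form = reader.AcroFields;
            foreach (string name in form.Fields.Keys)
            {
                // GetField returns the field's value as a string.
                values[name] = form.GetField(name);
            }
        }
        return values;
    }

    static void Main(string[] args)
    {
        Dictionary<string, string> values = ReadFormValues(args[0]);
        if (values.Count == 0)
        {
            // An empty result means the PDF exposes no form fields,
            // for example because the form was flattened.
            Console.WriteLine("No form fields found.");
        }
        foreach (KeyValuePair<string, string> pair in values)
        {
            Console.WriteLine(pair.Key + " : " + pair.Value);
        }
    }
}
```

Returning a dictionary keeps the reader's lifetime inside the helper, so callers do not have to remember to dispose it.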

Upvotes: 17

The Powershell Ninja

Reputation: 779

If you are using PowerShell, the discovery code for fields is:

    Add-Type -Path C:\Users\Micah\Desktop\PDF_Test\itextsharp.dll
    $MyPDF = "C:\Users\Micah\Desktop\PDF_Test\something_important.pdf"
    $PDFDoc = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $MyPDF
    $PDFDoc.AcroFields.Fields

That code will give you the names of all the fields on the PDF Document, "something_important.pdf".

This is how you access each field once you know the name of the field:

    $PDFDoc.AcroFields.GetField("Name of the field here")

Upvotes: 3

cecilphillip

Reputation: 11586

You would have to find out the field names in the PDF form. Get the fields and then read their value.

string pdfTemplate = "my.pdf";
PdfReader pdfReader = new PdfReader(pdfTemplate);
// GetField lives on AcroFields; its Fields property is only the name-to-item dictionary.
AcroFields fields = pdfReader.AcroFields;
string val = fields.GetField("fieldname");

In the code above, "fieldname" is the name of the PDF form field, and the GetField method returns a string representation of its value. Here is an article with example code that you could probably use. It shows how you can both read and write form fields using iTextSharp.

Upvotes: 23
