iTextSharp: extracting text from a PDF is not working

I'm having trouble getting the text from the page.

I get the error "Object reference not set to an instance of an object" on this line:

String extractText = PdfTextExtractor.GetTextFromPage(pdfReader, i);

The full code follows:

 var pdfText = new StringBuilder();
 using (var pdfReader = new PdfReader(cbPdf.SelectedValue + ""))
 {
      for (var i = 0; i <= pdfReader.NumberOfPages; i++)
      {
         String extractText = PdfTextExtractor.GetTextFromPage(pdfReader, i);
         extractText = Encoding.UTF8.GetString(Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(extractText)));
         pdfText.Append(extractText);
      }
 }
 rtxtTexto.Text = pdfText.ToString();

Upvotes: 0

Views: 372

Answers (1)

mkl

Reputation: 96064

iText numbers pages 1-based, i.e. the first page has number 1.

You already took that into account at the end of your loop (by comparing with <=), but not at the start (where you begin at 0).

Thus, the loop header should be:

for (var i = 1; i <= pdfReader.NumberOfPages; i++)
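As a minimal sketch of the 1-based iteration, the loop can be isolated in a small helper. The names PageText, ExtractAll and getPageText are hypothetical and not part of iTextSharp; getPageText merely stands in for the call to PdfTextExtractor.GetTextFromPage.

```csharp
using System;
using System.Text;

public static class PageText
{
    // Hypothetical helper: concatenates the text of pages 1..pageCount.
    // getPageText stands in for a call such as
    // PdfTextExtractor.GetTextFromPage(pdfReader, pageNumber).
    public static string ExtractAll(int pageCount, Func<int, string> getPageText)
    {
        var text = new StringBuilder();
        // Page numbers are 1-based: start at 1 and include pageCount.
        for (var page = 1; page <= pageCount; page++)
        {
            text.Append(getPageText(page));
        }
        return text.ToString();
    }
}
```

Inside the using block from the question, this could presumably be called as PageText.ExtractAll(pdfReader.NumberOfPages, i => PdfTextExtractor.GetTextFromPage(pdfReader, i)).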

That being said, as far as I know, your line

extractText = Encoding.UTF8.GetString(Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(extractText)));

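is nonsense: it round-trips the string through the default code page and back, which cannot repair anything and can only lose characters that code page cannot represent.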
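To see what such a round trip does, here is a minimal sketch that applies the same chain of calls with the legacy encoding passed in as a parameter. The names EncodingRoundTrip and RoundTrip are hypothetical, and Encoding.ASCII is used in the usage note only as a stand-in for Encoding.Default so that the result does not depend on the machine's code page.

```csharp
using System;
using System.Text;

public static class EncodingRoundTrip
{
    // Same chain of calls as in the question, with the legacy encoding
    // passed in instead of hard-coding Encoding.Default.
    public static string RoundTrip(string s, Encoding legacy)
    {
        // string -> bytes in the legacy encoding (unrepresentable chars are replaced)
        byte[] legacyBytes = legacy.GetBytes(s);
        // legacy bytes -> UTF-8 bytes
        byte[] utf8Bytes = Encoding.Convert(legacy, Encoding.UTF8, legacyBytes);
        // UTF-8 bytes -> string again
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

With Encoding.ASCII as the stand-in, RoundTrip("abc", Encoding.ASCII) returns "abc" unchanged, while RoundTrip("\u00E9", Encoding.ASCII) returns "?", because the character is lost when the string is first encoded.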

Upvotes: 1
