Sunil

Reputation: 4303

Read pdf using iText

I am having trouble reading PDF files with iText in Java. I can read only one page; when I move on to the second page, it throws an exception. I want to read all the pages of any PDF file.

PdfTextExtractor parser = new PdfTextExtractor(new PdfReader("C:/Text.pdf"));
parser.getTextFromPage(3);

I am using these lines, and the second line throws the exception.

Upvotes: 7

Views: 35692

Answers (4)

KIBOU Hassan

Reputation: 389

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

/**
 * This class is used to read an existing
 *  pdf file using iText jar.
 * @author javawithease
 */
public class PDFReadExample {
  public static void main(String[] args) {
    try {
      // Create the PdfReader instance.
      PdfReader pdfReader = new PdfReader("D:\\testFile.pdf");

      // Get the number of pages in the PDF.
      int pages = pdfReader.getNumberOfPages();

      // Iterate over the pages; page numbers start at 1.
      for (int i = 1; i <= pages; i++) {
        // Extract the page content using PdfTextExtractor.
        String pageContent =
            PdfTextExtractor.getTextFromPage(pdfReader, i);

        // Print the page content to the console.
        System.out.println("Content on Page "
            + i + ": " + pageContent);
      }

      // Close the PdfReader.
      pdfReader.close();
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Upvotes: 0

Kevin Day

Reputation: 16413

Are you re-constructing the parser and reader for each operation? You can do that, but it's not very efficient (there is a lot of overhead with creating a new PdfReader).

Upvotes: 0

Kushal Paudyal

Reputation: 3811

  1. Try changing the file location. Sometimes the OS does not allow other applications to read files from certain system drives. Put the file somewhere on D:, for example. I ran into this problem on Vista when reading files from the desktop.

  2. I ran the same two lines of code on one of my own PDFs, and they did print the text. Also make sure the PDF has enough pages (3 or more, since you request page 3), or try parser.getTextFromPage(1) etc. to get the content of other pages.
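
A minimal sketch of the page-count check from point 2, written in plain Java so it runs without the iText jar. PageBounds and both of its methods are hypothetical names made up for illustration; they are not part of iText. The sketch assumes page numbers start at 1.

```java
/** Hypothetical helper, not part of iText: checks a 1-based page number against a page count. */
final class PageBounds {
    private PageBounds() {}

    /** Returns true when pageNumber lies in the range 1..pageCount. */
    static boolean isValidPage(int pageNumber, int pageCount) {
        return pageNumber >= 1 && pageNumber <= pageCount;
    }

    /** Returns pageNumber unchanged, or throws with a readable message if it is out of range. */
    static int requireValidPage(int pageNumber, int pageCount) {
        if (!isValidPage(pageNumber, pageCount)) {
            throw new IllegalArgumentException(
                "Page " + pageNumber + " is out of range; the document has "
                    + pageCount + " page(s)");
        }
        return pageNumber;
    }
}
```

In the asker's case the page count would come from the reader's getNumberOfPages(), so a call such as PageBounds.requireValidPage(3, reader.getNumberOfPages()) before getTextFromPage(3) should show whether the exception is simply a page number past the end of the file.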

Upvotes: 2

Mark Redman

Reputation: 24535

When you say one page, do you mean the first page? You might be indexing the pages incorrectly. Without more information, it could be anything.

Upvotes: 0
