Ali.A

Reputation: 11

Android: How can I define a PdfReader for only one page in iText?

PdfReader reader = new PdfReader(new FileInputStream(fpath));

A reader can be defined like this, but it then covers the whole PDF file. I need a reader for a single page only, for example just page 10 of the PDF file.

We can read page by page to extract text, as in the loop below, but I need to do the same thing to extract images.

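// Requires iText 5: com.itextpdf.text.pdf.parser.PdfReaderContentParser,
// SimpleTextExtractionStrategy and TextExtractionStrategy
PdfReaderContentParser parser = new PdfReaderContentParser(reader);
StringBuilder sb = new StringBuilder();

for (int i = 1; i <= reader.getNumberOfPages(); i++)
{
    // Extract the text of page i and append it to the overall result
    TextExtractionStrategy strategy = parser.processContent(i, new SimpleTextExtractionStrategy());
    sb.append(strategy.getResultantText());
}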

Upvotes: 0

Views: 157

Answers (1)

Bruno Lowagie

Reputation: 77528

Please consult the official documentation and search for selectPages. The selectPages() method reduces the PdfReader instance to the pages listed in your selection.

For instance, if you want to limit the PdfReader instance to page 10, you could use this line:

reader.selectPages("10");

Update

You claim that the above doesn't answer your question. If that is true, then please rephrase your question because I can't think of another interpretation of your question.

Maybe there's a language problem as Amedee indicates in his comment, but I think that the problem is related to some misconceptions about PDF.

A PDF file is a series of objects. These objects are listed in a cross-reference table. Any software that reads a PDF needs to start at the end of the file, where it will find the trailer dictionary. This trailer dictionary refers to the root dictionary (the catalog) by object number. The software looks up that object in the cross-reference table and then looks for the pages dictionary in the catalog.

The pages dictionary contains a tree structure: the page tree. A PDF reader will move through the page tree and find a page dictionary for each page. The page dictionary contains references to all the resources needed to render the page: the content stream(s), references to fonts, images, and so on.

These objects (page dictionaries, streams, font dictionaries, etc.) can be found throughout the file (at the start, in the middle, at the end). They aren't stored in the same order as the pages. The cross-reference table knows the byte offset of each of these objects.
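To make this concrete, here is a toy model in plain Java. It is not iText code and not a real PDF parser: the class name `XrefSketch`, the object numbers, and the byte offsets are all invented for illustration. It only shows a cross-reference table as a map from object number to byte offset, and checks whether the objects one page needs form a single uninterrupted stretch of the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: invented object numbers and byte offsets, not a real PDF parser.
public class XrefSketch {

    // Hypothetical cross-reference table: object number -> byte offset in the file.
    static Map<Integer, Long> xref() {
        Map<Integer, Long> table = new LinkedHashMap<>();
        table.put(1, 15L);    // catalog (root dictionary)
        table.put(2, 9800L);  // pages dictionary (root of the page tree)
        table.put(3, 120L);   // page dictionary of the page we care about
        table.put(4, 7400L);  // content stream of that page
        table.put(5, 560L);   // font dictionary used by that page
        table.put(6, 3100L);  // image stream used by that page
        table.put(7, 2000L);  // page dictionary of some other page
        return table;
    }

    // Byte offsets of the given objects, sorted in file order.
    static List<Long> offsetsFor(List<Integer> objectNumbers) {
        Map<Integer, Long> table = xref();
        List<Long> offsets = new ArrayList<>();
        for (int number : objectNumbers) {
            offsets.add(table.get(number));
        }
        Collections.sort(offsets);
        return offsets;
    }

    // True if an object that does NOT belong to the page lies between
    // the page's first and last object in the file.
    static boolean interleavedWithOtherObjects(List<Integer> pageObjects) {
        List<Long> offsets = offsetsFor(pageObjects);
        long first = offsets.get(0);
        long last = offsets.get(offsets.size() - 1);
        for (Map.Entry<Integer, Long> entry : xref().entrySet()) {
            boolean foreign = !pageObjects.contains(entry.getKey());
            if (foreign && entry.getValue() > first && entry.getValue() < last) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Page dictionary, content stream, font dictionary, image stream
        List<Integer> pageObjects = Arrays.asList(3, 4, 5, 6);
        System.out.println("Offsets needed for the page: " + offsetsFor(pageObjects));
        System.out.println("Interleaved with other objects: "
                + interleavedWithOtherObjects(pageObjects));
    }
}
```

Even in this tiny made-up file, the dictionary of another page sits between the objects that the first page needs, so there is no single byte range that could be handed to a reader as "the page".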

If you know this, you know that any question asking "I want to read only one page of a PDF by isolating a specific number of contiguous bytes" is a question that reveals a deep lack of understanding of PDF.

Upvotes: 1
