gupta07

Reputation: 1

PyPDF2's .getNumPages() method returns 0 as the total number of pages in a PDF

I want to retrieve the text from PDF files, but with this code I get the total number of pages as 0. What should I change to get the correct page count for a PDF?

Upvotes: 0

Views: 2146

Answers (3)

user20662222

Reputation:

To get the number of pages with pypdf (PyPDF2 is deprecated):

from pypdf import PdfReader

reader = PdfReader("example.pdf")
number_of_pages = len(reader.pages)
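If the count still comes back as 0, it may be worth ruling out a bad input file before blaming the library. The following is only a sketch under that assumption (an empty or non-PDF file is one possible cause, not a confirmed one); `count_pages` is a hypothetical helper name, not part of pypdf.

```python
from pathlib import Path


def count_pages(path):
    """Return the page count, after checking the file looks like a PDF."""
    data = Path(path).read_bytes()
    if not data:
        raise ValueError(f"{path} is empty")
    # A PDF normally starts with the "%PDF-" marker; some files carry a few
    # stray bytes in front of it, so search the first 1 KiB only.
    if b"%PDF-" not in data[:1024]:
        raise ValueError(f"{path} does not look like a PDF")

    # Imported here so the checks above still work without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    return len(reader.pages)
```

Calling `count_pages("example.pdf")` either returns the number of pages or raises a `ValueError` that says why the file was rejected.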

Upvotes: 2

Mounesh

Reputation: 744

The .pages attribute of the reader gives you the pages, so its length is the page count:

from PyPDF2 import PdfReader

# Read the PDF
reader = PdfReader("US_Declaration.pdf")

# Find the total number of pages
number_of_pages = len(reader.pages)
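Since the question is ultimately about retrieving text, the same `.pages` sequence can be looped over. This is a minimal sketch, assuming a recent PyPDF2 (or pypdf) where each page object has an `extract_text()` method; `extract_text_by_page` is a hypothetical helper name.

```python
def extract_text_by_page(reader):
    """Return a list with the extracted text of each page, in order."""
    texts = []
    for page in reader.pages:
        # extract_text() may yield nothing for scanned (image-only) pages,
        # so fall back to an empty string.
        texts.append(page.extract_text() or "")
    return texts


# Usage (assumes the library is installed and the file exists):
#   from PyPDF2 import PdfReader
#   texts = extract_text_by_page(PdfReader("US_Declaration.pdf"))
#   print(len(texts), "pages")
```

The helper only relies on `reader.pages` and `page.extract_text()`, so it should work the same way with a pypdf reader.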

Upvotes: 0

Joris Schellekens

Reputation: 9012

(disclaimer: I am the author of pText, the library used in this answer.)

As an alternative to PyPDF2, you could also try pText.

1. Load the Document

with open("input.pdf", "rb") as pdf_file_handle:
    doc = PDF.loads(pdf_file_handle)

2. Get the DocumentInfo

doc_info = doc.get_document_info()
number_of_pages = doc_info.get_number_of_pages()

You can obtain pText either on GitHub or from PyPI. There are a ton more examples; check them out to find out more about working with images.

Upvotes: 0
