Nitin Bhojwani

Reputation: 712

PdfFileReader: PdfReadError: Could not find xref table at specified location

I am trying to read a PDF file in Python with:

from PyPDF2 import PdfFileReader, PdfFileWriter
test_reader = PdfFileReader(open("test.pdf", "rb"))

The line above throws this error:

PyPDF2.utils.PdfReadError: Could not find xref table at specified location

Any help would be highly appreciated.

Upvotes: 6

Views: 17147

Answers (3)

Cherise

Reputation: 77

You can fix this issue by opening each PDF in Adobe Acrobat Reader and saving it again under the same name. Re-saving rewrites the file and fixes the corruption, so PyPDF2 can read it.

Upvotes: 4

Wesley - Synio

Reputation: 684

You could use qpdf to repair a corrupted PDF, or you could simply use pikepdf instead of PyPDF2. Because pikepdf is based on qpdf, it copes well with corrupted PDFs.

Example:

import pikepdf
# Pdf.open accepts a path; "test.pdf" is the file from the question
pdf = pikepdf.Pdf.open("test.pdf")

Pikepdf docs: https://pikepdf.readthedocs.io/en/latest/

Upvotes: 3

Nitin Bhojwani

Reputation: 712

It's fixed. Actually, there wasn't any problem with the code. It seems the PDF I was using for testing was corrupted (even though the content was there when I opened it, which is why I couldn't figure it out at first).

I replaced it with another one and it worked as expected.
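
For anyone who wants to check whether a file has this particular kind of damage before swapping it out, here is a rough sketch using only the standard library. It relies on one fact about the PDF format: the file ends with a `startxref` keyword followed by a byte offset, and that offset should point either at the `xref` keyword or, in PDF 1.5 and later, at an object that holds a cross-reference stream. The function name `check_startxref` and the returned messages are my own, and this is not what PyPDF2 does internally; it is only a quick diagnostic. Viewers often rebuild the table silently, which would explain why the file still opens and shows its content.

```python
import re


def check_startxref(data: bytes) -> str:
    """Report whether the startxref offset in a PDF points at an xref section.

    Hypothetical helper for illustration; not part of PyPDF2 or pikepdf.
    """
    # The startxref keyword normally sits near the very end of the file.
    tail = data[-2048:]
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        return "no startxref keyword found"

    # Use the last occurrence, since incrementally updated files have several.
    offset = int(matches[-1])
    if offset >= len(data):
        return "startxref offset is past the end of the file"

    target = data[offset:offset + 32]
    if target.startswith(b"xref"):
        return "ok: classic xref table found"
    # A cross-reference stream is an ordinary object, e.g. "12 0 obj".
    if re.match(rb"\d+\s+\d+\s+obj", target):
        return "ok: offset points at an object (possibly an xref stream)"
    return "offset does not point at an xref section"
```

To try it on the file from the question, read the file in binary mode and pass the bytes in, for example `print(check_startxref(open("test.pdf", "rb").read()))`. If it reports that the offset does not point at an xref section, re-saving the file or repairing it with qpdf or pikepdf, as suggested in the other answers, is the next thing to try.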

Upvotes: 5
