Getting PDF Version using Python

Question

I need to extract the PDF version from a PDF document. I tried pdfminer, but it only provides the following info:

PDF Producer
Created
Modified
Application

Below is the code I tried:

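from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument

# Open in binary mode; the with-block closes the file when done
with open("ibs.servlets.pdf", "rb") as fp:
    parser = PDFParser(fp)
    doc = PDFDocument(parser)
    parser.set_document(doc)
    # doc.info is a list of document information dictionaries
    if len(doc.info) > 0:
        info = doc.info[0]
        print(info)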

Are there any other libraries apart from pdfminer that I can use?

Frodon · Accepted Answer

The PDF version is stored as a comment on the first line of the PDF file (the file header, for example %PDF-1.5). I couldn't find how to get this information using pdfminer, but with PyPDF2 I could retrieve it manually:

from PyPDF2.pdf import PdfFileReader
doc = PdfFileReader('ibs.servlets.pdf')
doc.stream.seek(0)  # Rewind to the start; the header comment is ignored during PDF analysis
print(doc.stream.readline().decode())

Output:

%PDF-1.5
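
Since the version sits in the header line, it can also be read without any PDF library. Below is a minimal sketch using only the standard library; the helper names parse_pdf_version and read_pdf_version and the 1024-byte probe size are my own assumptions for illustration, not part of pdfminer or PyPDF2. It searches the first bytes for the %PDF-x.y marker rather than assuming it starts exactly at offset 0.

```python
import re
from typing import Optional

# Matches the header marker, e.g. b"%PDF-1.5", and captures "1.5"
_HEADER_RE = re.compile(rb"%PDF-(\d+\.\d+)")


def parse_pdf_version(head: bytes) -> Optional[str]:
    """Return the version from the leading bytes of a PDF, or None if absent."""
    match = _HEADER_RE.search(head)
    return match.group(1).decode("ascii") if match else None


def read_pdf_version(path: str, probe_size: int = 1024) -> Optional[str]:
    """Read the first probe_size bytes of the file and parse the header version."""
    with open(path, "rb") as fp:
        return parse_pdf_version(fp.read(probe_size))
```

Calling read_pdf_version("ibs.servlets.pdf") should return "1.5" for the file above, going by the header shown in the output. Note that this only reports the header value: for PDF 1.4 and later, a Version entry in the document catalog can override it, and this sketch does not check for that.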
