x89

Reputation: 3460

Extract metadata of a PDF file (dimensions or orientation)

Given a PDF file, is there any way to find its page dimensions and orientation (horizontal or vertical)? The PyPDF2 library provides a function for the number of pages, but how can I extract other information? Is it possible to use the library documented at the link below to find details about the file, such as the creation date, number of pages, or title, or anything else that is available?

from PyPDF2 import PdfFileReader

input1 = PdfFileReader(open("document1.pdf", "rb"))

# print how many pages input1 has:
print("document1.pdf has %d pages." % input1.getNumPages())

https://pythonhosted.org/PyPDF2/

Upvotes: 0

Views: 756

Answers (1)

rawrex

Reputation: 4064

You can read a page's /Rotate entry to get its rotation.

import PyPDF2

pdf = PyPDF2.PdfFileReader(open('document1.pdf', 'rb'))
pagenumber = 0  # zero-based index of the page to inspect
orientation = pdf.getPage(pagenumber).get('/Rotate')

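This yields a value in degrees, or None if the page has no /Rotate entry of its own. It may be useful for some documents, but note that the page rotation by itself does not determine the orientation, as @mkl pointed out in the comments: the displayed orientation also depends on the page dimensions, which the page's media box provides.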
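
To illustrate how rotation and dimensions combine, here is a minimal sketch, not a definitive implementation. The function names classify_orientation and page_orientation are hypothetical. The second helper assumes a legacy PyPDF2 page object exposing mediaBox.getWidth(), mediaBox.getHeight() and get('/Rotate'), and assumes /Rotate is stored directly on the page rather than inherited from a parent node.

```python
def classify_orientation(width, height, rotation=0):
    """Return 'landscape', 'portrait', or 'square' for a displayed page.

    width and height are the unrotated page box dimensions; rotation is
    the /Rotate value in degrees (a multiple of 90, or None if absent).
    """
    rotation = (rotation or 0) % 360
    # A rotation of 90 or 270 degrees swaps the displayed width and height.
    if rotation in (90, 270):
        width, height = height, width
    if width > height:
        return "landscape"
    if width < height:
        return "portrait"
    return "square"


def page_orientation(page):
    """Classify a page object (assumed legacy PyPDF2 interface)."""
    box = page.mediaBox
    return classify_orientation(
        float(box.getWidth()), float(box.getHeight()), page.get('/Rotate')
    )
```

With the reader from the snippet above, page_orientation(pdf.getPage(0)) would classify the first page. A US Letter page of 612 x 792 points with a /Rotate of 90 comes out as landscape, even though its media box alone looks like portrait.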

As for other metadata, there are many things you can pull out. Calling pdf.getDocumentInfo() returns a PyPDF2.pdf.DocumentInformation object; look at its properties (title, author and so on) for everything that is available.
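
For the creation date specifically, the document information dictionary stores a PDF date string of the form D:YYYYMMDDHHmmSS followed by an optional timezone offset, usually under the /CreationDate key. Below is a rough sketch of turning such a string into a datetime. The helper name parse_pdf_date is hypothetical, and it assumes you have already fetched the raw string yourself, for example via pdf.getDocumentInfo().get('/CreationDate') with the legacy API.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY is required; every later field and the timezone part are optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+-])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def parse_pdf_date(value):
    """Parse a PDF date string; return None if it does not match the format."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        return None
    # Missing fields fall back to the first month/day and to midnight.
    year, month, day, hour, minute, second = (
        int(group) if group is not None else default
        for group, default in zip(match.groups()[:6], (0, 1, 1, 0, 0, 0))
    )
    sign, off_hours, off_minutes = match.group(7), match.group(8), match.group(9)
    if sign is None:
        tz = None  # no timezone given: return a naive datetime
    elif sign in "Zz":
        tz = timezone.utc
    else:
        delta = timedelta(hours=int(off_hours or 0), minutes=int(off_minutes or 0))
        tz = timezone(-delta if sign == "-" else delta)
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)
```

Strings that do not follow the format return None, while out-of-range values such as month 13 raise ValueError from datetime. Real files sometimes contain nonconforming dates, so treat the result as best effort.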

Upvotes: 1
