Reputation: 1779
Using Python, is it possible to crop a PDF page to its content, as shown in the image below, where the task was done in Inkscape? The bounding box of the content should be found automatically.
Using PyPDF2 I can crop the page, but the coordinates have to be found manually, which is tedious for a large number of files. In Inkscape, the coordinates are found automatically.
The code I'm using is shown below and an example input file is available here.
# Python 3.7.0
import PyPDF2  # version 1.26.0

with open('document-1.pdf', 'rb') as fin:
    pdf = PyPDF2.PdfFileReader(fin)
    page = pdf.getPage(0)

    # Coordinates found by inspection.
    # Can these coordinates be found automatically?
    page.cropBox.lowerLeft = (88, 322)
    page.cropBox.upperRight = (508, 602)

    output = PyPDF2.PdfFileWriter()
    output.addPage(page)

    # Write while the input file is still open; the page data is read from it.
    with open('cropped-1.pdf', 'wb') as fo:
        output.write(fo)
Upvotes: 2
Views: 4590
Reputation: 1087
I was able to do this with the pip-installable CLI https://pypi.org/project/pdfCropMargins/
Since I originally answered, a Python interface has been added: https://github.com/abarker/pdfCropMargins#python-interface (h/t @Paul)
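For reference, here is a minimal sketch of what calling that Python interface might look like. As far as I understand the README, `crop` takes a list of the same arguments you would pass on the command line. The helper names `crop_args` and `crop_to_content` are my own and not part of the package, and I have only mirrored the `-o` and `-p` flags used in my original answer.

```python
def crop_args(src, dst, percent_retain=0):
    """Build the argument list equivalent to: SRC -o DST -p PERCENT.

    Hypothetical helper, not part of pdfCropMargins.
    """
    return [str(src), "-o", str(dst), "-p", str(percent_retain)]


def crop_to_content(src, dst, percent_retain=0):
    """Crop one PDF to its content via the package's Python interface.

    Assumes pdfCropMargins is installed (pip install pdfCropMargins).
    The import is done lazily so this sketch loads without the package.
    """
    from pdfCropMargins import crop

    return crop(crop_args(src, dst, percent_retain))


# Example call (requires the package and an existing input file):
# crop_to_content("document.pdf", "output.pdf")
```

With `-p 0` none of the original margin is kept, so the page is cropped tight to the detected content.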
My original answer, which calls it from the command line, is below.
Unfortunately, I don't believe there's a great way to call it directly from a script, so for now I'm using os.system.
$ python -m pip install pdfCropMargins --user
$ pdf-crop-margins document.pdf -o output.pdf -p 0
import os
os.system('pdf-crop-margins document.pdf -o output.pdf -p 0')
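Since the question mentions a large number of files, here is a rough sketch of the same command applied to every PDF in a folder. It uses `subprocess.run` with an argument list instead of `os.system`, which avoids shell quoting problems with file names that contain spaces. The function names, the folder layout, and the `run` parameter (there only so the command runner can be swapped out) are my own assumptions; the command and flags are the ones shown above.

```python
import subprocess
from pathlib import Path


def crop_command(src, dst, percent_retain=0):
    """Argument list for: pdf-crop-margins SRC -o DST -p PERCENT."""
    return ["pdf-crop-margins", str(src), "-o", str(dst), "-p", str(percent_retain)]


def crop_folder(in_dir, out_dir, percent_retain=0, run=subprocess.run):
    """Crop every *.pdf in in_dir and write the results to out_dir.

    Hypothetical helper; assumes the pdf-crop-margins CLI is on PATH.
    Returns the list of output paths, in sorted order of the input names.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob("*.pdf")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError if the tool exits with an error.
        run(crop_command(src, dst, percent_retain), check=True)
        written.append(dst)
    return written


# Example call (requires the CLI to be installed):
# crop_folder("pdfs", "cropped")
```

Writing the results to a separate output folder keeps the original files untouched, so a bad crop can simply be rerun.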
Upvotes: 2