Reputation: 8052
I found this code for splitting PDF pages:
#!/usr/bin/env python
import copy, sys
from pyPdf import PdfFileWriter, PdfFileReader
input = PdfFileReader(sys.stdin)
output = PdfFileWriter()
for p in [input.getPage(i) for i in range(0,input.getNumPages())]:
    q = copy.copy(p)
    (w, h) = p.mediaBox.upperRight
    p.mediaBox.upperRight = (w/2, h)
    q.mediaBox.upperLeft = (w/2, h)
    output.addPage(p)
    output.addPage(q)
output.write(sys.stdout)
If one page contains four smaller pages laid out like this:
+-------+-------+
| 1 | 2 |
|-------+-------|
| 3 | 4 |
+-------+-------+
Then the code splits it into two pages (in this order), each containing two of the smaller pages:
+-------+-------+
| 3 | 4 |
+-------+-------+
+-------+-------+
| 1 | 2 |
+-------+-------+
You can test it e.g. on the following document. If I understand the upperRight, upperLeft (and other) variables mentioned in the code correctly, then this is the document representation as seen by pyPdf:
UL(0,10) UR(10,10)
+-------+-------+
| 1 | 2 |
|-------+-------|
| 3 | 4 |
+-------+-------+
LL(0,0) LR(10,0)
UL(x,y) = UpperLeft
UR(x,y) = UpperRight
LL(x,y) = LowerLeft
LR(x,y) = LowerRight
Given this part of the code:
(w, h) = p.mediaBox.upperRight
p.mediaBox.upperRight = (w/2, h)
q.mediaBox.upperLeft = (w/2, h)
I was expecting this output:
p:
+-------+
| 1 |
|-------+
| 3 |
+-------+
q:
+-------+
| 2 |
|-------+
| 4 |
+-------+
What am I missing here?
Upvotes: 2
Views: 1939
Reputation: 4871
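In PDF there are 2 ways to get a landscape page:
1. Specify a page size where the width is greater than the height.
2. Specify a portrait page size (width less than height) together with a page rotation of 90 or 270 degrees.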
Your sample PDF uses the second way: all the pages are 595x842 with a rotation of 270 degrees. Not taking the rotation into account causes vertical to be interpreted as horizontal and vice versa.
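To make the effect of the rotation concrete, here is a minimal sketch in plain Python (no PDF library involved) of how the two halves could be chosen once the rotation is known. The helper name split_boxes and its tuple-based box format are made up for this illustration and are not part of pyPdf. It assumes the page's /Rotate value is the clockwise display rotation in degrees, so on a page rotated by 90 or 270 the cut has to be made along the unrotated y axis rather than the x axis.

```python
def split_boxes(lower_left, upper_right, rotate=0):
    """Return (left_half, right_half) of the page *as displayed*.

    Each half is ((llx, lly), (urx, ury)) in unrotated page coordinates.
    split_boxes is a hypothetical helper for illustration only.
    """
    (x0, y0), (x1, y1) = lower_left, upper_right
    rotate %= 360  # normalizes e.g. -90 to 270

    if rotate in (0, 180):
        # Displayed left/right corresponds to the unrotated x axis.
        mid = (x0 + x1) / 2.0
        low_x = ((x0, y0), (mid, y1))
        high_x = ((mid, y0), (x1, y1))
        # At 180 degrees the unrotated right half is shown on the left.
        return (low_x, high_x) if rotate == 0 else (high_x, low_x)

    if rotate in (90, 270):
        # Displayed left/right corresponds to the unrotated y axis.
        mid = (y0 + y1) / 2.0
        low_y = ((x0, y0), (x1, mid))
        high_y = ((x0, mid), (x1, y1))
        # Clockwise 90 moves the unrotated bottom edge to the left;
        # 270 moves the unrotated top edge to the left.
        return (low_y, high_y) if rotate == 90 else (high_y, low_y)

    raise ValueError("rotation must be a multiple of 90 degrees")


# The sample document: 595x842 pages with a rotation of 270 degrees.
left, right = split_boxes((0, 0), (595, 842), rotate=270)
print(left)   # box for the displayed left half (pages 1 and 3)
print(right)  # box for the displayed right half (pages 2 and 4)
```

With pyPdf, the resulting corners would presumably be assigned to p.mediaBox.lowerLeft / p.mediaBox.upperRight (and likewise for q), and the rotation could likely be read with something like p.get('/Rotate', 0). Treat that lookup as an assumption: the value may be inherited from a parent node of the page tree rather than stored on the page itself.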
Upvotes: 5