vasco_t

Reputation: 133

PDF coordinate system conversion using Python

I was using a PDF miner to extract data to an XML file when I noticed that the objects' coordinates were represented in the following way:

<textline bbox="187.098,693.242,288.642,709.202">

How can I convert these coordinates to a pixel system (x, y) using Python, given that I have already parsed the bbox data into a Python variable?

Upvotes: 2

Views: 6108

Answers (2)

I wrote my own function based on the other answer, because that one didn't work for me. Its scaling is correct, but it doesn't flip the y coordinates. You have to get the height of the PDF page, subtract the PDF y coordinates from it, and only then scale. My function takes three arguments: src is a tuple of 4 floats (x0, y0, x1, y1), the bounding box coordinates; pdf is the object read by the pdfrw library:

    from pdfrw import PdfReader
    pdf = PdfReader(<path>)

and im is the PIL object obtained from the image:

    from PIL import Image
    im = Image.open(<path>)

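The function:

    def TranslatePoints(src, pdf, im):
        # src is (x0, y0, x1, y1) in PDF units, origin at the bottom left
        sx0, sy0, sx1, sy1 = src
        # pdf.pages is zero-indexed, so pages[1] is the second page;
        # use the index of the page the bbox came from.
        # float() rather than int(): MediaBox values can be fractional.
        media_box = pdf.pages[1].MediaBox
        ssx, ssy = float(media_box[2]), float(media_box[3])
        dsx, dsy = im.size
        # Flip y, because image coordinates start at the top left
        sy01 = ssy - sy1
        sy11 = ssy - sy0
        # Scale from page size to image size
        x0 = sx0 / ssx * dsx
        x1 = sx1 / ssx * dsx
        y0 = sy01 / ssy * dsy
        y1 = sy11 / ssy * dsy
        return (x0, y0, x1, y1)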

To check, I cropped the image with the result, and it works fine:

    im.crop(TranslatePoints(src, pdf, im))
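
One caveat with reading the page size from MediaBox[2] and MediaBox[3]: those values only equal the width and height when the box starts at (0, 0), and int() raises an error on a fractional string such as "595.276". Below is a minimal sketch of a more defensive helper, assuming the MediaBox is a 4-item sequence [llx, lly, urx, ury] of numbers or numeric strings. The name page_size is made up for illustration and is not part of pdfrw.

```python
def page_size(media_box):
    """Return (width, height) in PDF units from a MediaBox.

    Assumes media_box is a 4-item sequence [llx, lly, urx, ury] whose
    items are numbers or numeric strings (hypothetical helper).
    """
    llx, lly, urx, ury = (float(v) for v in media_box)
    # Width and height are differences, so a non-zero origin is handled.
    return urx - llx, ury - lly


# Example with an A4-like box given as strings
width, height = page_size(["0", "0", "595.276", "841.89"])
print(width, height)
```

The result could then replace the two MediaBox lookups in the function above.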

Upvotes: 2

Myonaiz

Reputation: 365

First, let's understand what the bounding box numbers generated by pdfminer mean. The bounding box is a list of 4 numbers organized as (x0, y0, x1, y1). A very important point here is that the origin of a PDF page lies at the bottom left, not at the top left as in images (see https://github.com/euske/pdfminer/issues/19), so (x0, y0) is the bottom-left corner and (x1, y1) is the top-right corner. We can use those two points to draw a rectangle around the LT object.

The default resolution of a PDF page is 72 units per inch (see https://github.com/euske/pdfminer/issues/74), so the coordinates already match the pixels of an image rendered at 72 dpi. Knowing this, you can implement a function that translates a point or a rectangle to a new coordinate system (for example, one produced by rendering at a different dpi) as follows:

    def TranslatePoints(src, srcSize, dstSize):
        sx0, sy0, sx1, sy1 = src
        ssx, ssy = srcSize
        dsx, dsy = dstSize

        dx0 = sx0 / ssx * dsx
        dx1 = sx1 / ssx * dsx
        dy0 = sy0 / ssy * dsy
        dy1 = sy1 / ssy * dsy
        return dx0, dy0, dx1, dy1

src: the source rectangle, a tuple of 4 floats (x0, y0, x1, y1) holding the bounding box coordinates.

srcSize: the size of the PDF page as a tuple (width, height), which you can get through my answer https://stackoverflow.com/a/48886525/3022413 (never mind the downvotes; it works and I tested it).

dstSize: the size of the target coordinate system as a tuple (width, height), for example the size of the image you are plotting the rectangles on.
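
Note that TranslatePoints only scales; the result still has its origin at the bottom left. For image libraries that expect a top-left origin, the y values also need to be flipped against the page height. Here is a rough sketch of that combined step, assuming the bbox is measured in default PDF units (72 per inch) from the page's bottom-left corner. The function name bbox_to_pixels and the 792-unit page height (US Letter) are assumptions made for the example.

```python
def bbox_to_pixels(bbox, page_height, dpi=72):
    """Convert a PDF bbox to an image box (hypothetical helper).

    bbox: (x0, y0, x1, y1) in PDF units, origin at the bottom left.
    page_height: height of the page in PDF units.
    dpi: resolution of the target image; 72 means one pixel per unit.
    Returns (left, top, right, bottom) with the origin at the top left.
    """
    x0, y0, x1, y1 = bbox
    scale = dpi / 72.0  # PDF default: 72 units per inch
    left = x0 * scale
    right = x1 * scale
    # Flip y: the PDF top edge (y1) becomes the smaller image y value.
    top = (page_height - y1) * scale
    bottom = (page_height - y0) * scale
    return left, top, right, bottom


# bbox from the question, on an assumed 792-unit-high page
bbox = (187.098, 693.242, 288.642, 709.202)
print(bbox_to_pixels(bbox, page_height=792))           # 72 dpi
print(bbox_to_pixels(bbox, page_height=792, dpi=144))  # twice the size
```

The returned tuple is in the (left, top, right, bottom) order that PIL's Image.crop expects.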

Upvotes: 1
