Mapping highlights/annotations to text in PDF

Question

So I have this sample PDF file with three words on separate lines:

"
hello
there
world
"

I have highlighted the word "there" on the second line. Internally, within the PDF, I'm trying to map the highlight/annotation structure to the text (BT) area.

The section corresponding to the word "there" looks like so:

BT
/F0 14.6599998 Tf
1 0 0 -1 0 130 Tm
96 0 Td <0057> Tj
4.0719757 0 Td <004B> Tj
8.1511078 0 Td <0048> Tj
8.1511078 0 Td <0055> Tj
4.8806458 0 Td <0048> Tj
ET

I also have an annotation section where I have my highlight which has the following rect dimensions:

18 0 19 15 20 694 21 786 22 853 23 1058 24 1331 [19 0 R 20 0 R]<>
...
(I left the top part of the annotation out on purpose because it is long. I extracted what I thought were the most important parts.)
Rect[68.0024 690.459 101.054 706.37]

I'm kind of confused about how my text is mapped to this one highlight that I have. The coordinates do not seem to match (130 y vs 690 y)? Am I looking in the right place and interpreting my text and/or highlight annotation coordinates correctly?

Update:

I want to add more info on how I created this test PDF.

It's pretty simple to recreate the PDF. I went to Google Docs and created an empty document. On three lines I wrote my text as described above. I downloaded that as a PDF and then opened it in Adobe Acrobat Reader DC (the newest one, I think). I then used Adobe Acrobat Reader to highlight the specified line and re-saved the file. After that I used some Python to decompress the PDF streams:

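import re
import zlib

# Read the whole file as bytes; PDF streams are binary data.
with open("helloworld.pdf", "rb") as f:
    pdf = f.read()

# Capture the body of every stream whose dictionary mentions FlateDecode.
stream = re.compile(rb'.*?FlateDecode.*?stream(.*?)endstream', re.S)

for s in stream.findall(pdf):
    # Drop the end-of-line bytes surrounding the stream data.
    s = s.strip(b'\r\n')
    try:
        # latin-1 maps every byte to a character, so decoding cannot fail.
        print(zlib.decompress(s).decode("latin-1"))
        print("")
    except zlib.error:
        # Not a plain zlib stream (for example, another filter is applied).
        pass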

mkl · Accepted Answer

Unfortunately the OP only explained how he created his document and did not share the document itself. I followed his instructions, but the coordinates of the annotation in my copy differ. As I only have my own copy to explain with, the OP will have to mentally adapt the following to the precise numbers in his document.

The starting coordinate system

The starting (default) user coordinate system in the document is implied by the crop box. In the document at hand the crop box is defined as

/CropBox [0 0 596 843]

i.e. the visible page is 596 units wide and 843 units high (given the default user unit of 1/72" this is an A4 format) and the origin is in the lower left corner. x coordinates increase to the right, y coordinates increase upwards. Thus, this is the same kind of coordinate system one usually starts with in math.
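
As a quick sanity check of the A4 claim, the crop box size can be converted to millimetres. This is only a sketch; the variable names are made up for illustration, and it assumes the default user unit of 1/72".

```python
# Crop box from the document at hand: [llx lly urx ury].
crop_box = (0, 0, 596, 843)

MM_PER_INCH = 25.4
UNITS_PER_INCH = 72  # default user unit of 1/72 inch

width_mm = (crop_box[2] - crop_box[0]) / UNITS_PER_INCH * MM_PER_INCH
height_mm = (crop_box[3] - crop_box[1]) / UNITS_PER_INCH * MM_PER_INCH

# A4 is 210 mm x 297 mm; both values should land within a millimetre of that.
print(round(width_mm, 1), round(height_mm, 1))
```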

The annotation rectangle

This also is the coordinate system of the annotation rectangle coordinates.

In the case at hand they are

/Rect [68.0595 741.373 101.138 757.298]

i.e. the rectangle with the lower left corner at (68.0595, 741.373) and the upper right at (101.138, 757.298).

Transformations of the coordinate system

In the page content stream up to the text object already identified by the OP the coordinate system gets transformed a number of times.

Mirroring, translation

The very first line of the page content is

1 0 0 -1 0 843 cm

This transformation moves the origin up by 843 units and mirrors (multiplies by -1) the y coordinate.

Thus, we now have a coordinate system with the origin in the upper left and y coordinates increasing downwards.
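
To see the effect of this matrix on individual points, here is a minimal sketch. The helper name apply_matrix is made up for illustration; it applies the six operands a b c d e f in the usual PDF way, x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
def apply_matrix(m, x, y):
    """Map the point (x, y) through the PDF matrix m = (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)

# Operands of the first cm operator in the page content stream.
flip = (1, 0, 0, -1, 0, 843)

# The new origin, expressed in the default user coordinate system.
origin = apply_matrix(flip, 0, 0)
# A point 100 units along the new y axis.
moved = apply_matrix(flip, 0, 100)

print(origin)  # (0, 843): the top left corner of the page
print(moved)   # (0, 743): larger y now means further down the page
```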

Scaling

A bit later in the content stream the coordinate system is scaled:

.75062972 0 0 .75062972 0 0 cm

Thus, the coordinate units are compressed to about 3/4 of their original width and height, i.e. each unit along the x or y axis is now only about 1/96" wide/high.
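
The combined effect of the two cm operations can be sketched by multiplying the matrices. The helper concat is a made-up name, and the sketch assumes that no other transformation is in effect at this point of the content stream. Each cm operator puts its matrix on the left of the current transformation matrix.

```python
def concat(m1, m2):
    """Return the product m1 x m2 of two PDF matrices (a, b, c, d, e, f)."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )

flip = (1, 0, 0, -1, 0, 843)                   # 1 0 0 -1 0 843 cm
scale = (0.75062972, 0, 0, 0.75062972, 0, 0)   # .75062972 0 0 .75062972 0 0 cm

# The later matrix goes on the left of the matrix already in effect.
ctm = concat(scale, flip)

# How many of the new units fit into one inch (72 default units).
units_per_inch = 72 / scale[0]

print(ctm)
print(round(units_per_inch, 1))
```

The resulting matrix keeps the translation of 843 and the flipped y axis, with both axes scaled by about 3/4.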

The text "there"

Only after these transformations have been applied to the coordinate system is the text object identified by the OP drawn. It starts by setting and then changing the text matrix:

1 0 0 -1 0 130 Tm

This sets the text matrix to translate by 130 units in the y direction and to mirror the y coordinates once again. (Mirroring back again is necessary as otherwise the text would be drawn upside down.)

96 0 Td

This changes the text matrix by moving 96 units along the x axis.

The starting point of the text is the origin of the resulting coordinate system, i.e. of the default coordinate system changed first by the mirroring and translation and then by the scaling of the current transformation matrix, and finally by the mirroring and translation of the text matrix.

Does it match?

Which coordinate would this point be in the default user coordinate system?

x = (0 + 96) * .75062972 = 72 (approximately)
y = (((0 * (-1)) + 130) * .75062972) * (-1) + 843 = 745.4 (approximately)

This point lies inside the annotation rectangle (see above), whose x coordinates range from 68.0595 to 101.138 and whose y coordinates range from 741.373 to 757.298.
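
The same arithmetic can be sketched in Python by sending the text space origin through each transformation in turn, innermost first. The helper apply_matrix and the variable names are made up for illustration, and the numbers are those of my copy of the document, so they need to be adapted to the OP's file.

```python
def apply_matrix(m, x, y):
    """Map the point (x, y) through the PDF matrix m = (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)

# Transformations from innermost (text) to outermost (page).
chain = [
    (1, 0, 0, 1, 96, 0),                    # 96 0 Td, written as a translation
    (1, 0, 0, -1, 0, 130),                  # 1 0 0 -1 0 130 Tm
    (0.75062972, 0, 0, 0.75062972, 0, 0),   # scaling cm
    (1, 0, 0, -1, 0, 843),                  # mirroring and translation cm
]

# Start at the origin of text space and work outwards.
x, y = 0, 0
for matrix in chain:
    x, y = apply_matrix(matrix, x, y)

# Annotation rectangle: lower left x, lower left y, upper right x, upper right y.
rect = (68.0595, 741.373, 101.138, 757.298)
inside = rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]

print(round(x, 2), round(y, 2), inside)
```

This only locates the starting point of the first glyph; the glyph widths and the later Td moves are not taken into account.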

So