I'm trying to extract data from invoices, and for that I'm using ABBYY Cloud OCR. I get the output as an XML file. What I want to do is look for a piece of text, take its rectangle coordinates, then find the closest rectangle and take its value. To do that I need the rectangle coordinates. The XML file does return coordinates, but I can't understand them.
Here is an example of the XML output (I've replaced the unneeded text with '....'):
<line baseline="2062" l="2037" t="2033" r="2206" b="2064">....</line>
<line baseline="2101" l="295" t="2070" r="588" b="2097">....</line>
These are two different rectangles. I checked the documentation, and this is what it says:
baseline — the distance from the base line to the top edge of the page
l — the coordinate of the left border of the surrounding rectangle,
t — the coordinate of the top border of the surrounding rectangle
r — the coordinate of the right border of the surrounding rectangle
b — the coordinate of the bottom border of the surrounding rectangle
What does "the coordinate of the left border of the surrounding rectangle" mean?
Aren't rectangle coordinates normally given in the format [[x1,y1],[x2,y2],[x3,y3],[x4,y4]]?
Can you explain what these coordinates mean and how I can use them?
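To make the question concrete, here is a rough sketch of how I currently read these attributes. It rests on assumptions I have not confirmed: that the origin is the top-left corner of the page with y growing downward (this matches the baseline definition and the fact that t < b in both samples), so l and r are x values and t and b are y values. The `<page>` wrapper, the `LABEL` and `VALUE` placeholders standing in for the elided text, and the helper names are all hypothetical. It also assumes the text sits directly inside `<line>` and that the file has no XML namespace, which may not hold for the real output.

```python
import math
import xml.etree.ElementTree as ET

# The two <line> elements from the post, wrapped in a hypothetical <page> root.
# "LABEL" and "VALUE" are placeholders for the text elided as '....'.
SAMPLE = """
<page>
  <line baseline="2062" l="2037" t="2033" r="2206" b="2064">LABEL</line>
  <line baseline="2101" l="295" t="2070" r="588" b="2097">VALUE</line>
</page>
"""


def parse_lines(xml_text):
    """Return a list of (text, (l, t, r, b)) tuples, one per <line>."""
    root = ET.fromstring(xml_text)
    result = []
    for line in root.iter("line"):
        box = tuple(int(line.get(key)) for key in ("l", "t", "r", "b"))
        result.append((line.text, box))
    return result


def corners(box):
    """Expand (l, t, r, b) into four [x, y] corners, clockwise from top-left."""
    l, t, r, b = box
    return [[l, t], [r, t], [r, b], [l, b]]


def center(box):
    """Center point of the rectangle."""
    l, t, r, b = box
    return ((l + r) / 2, (t + b) / 2)


def closest(lines, target_text):
    """Return the (text, box) whose center is nearest to the target's center."""
    target = next(box for text, box in lines if text == target_text)
    others = [item for item in lines if item[1] != target]
    return min(others, key=lambda item: math.dist(center(target), center(item[1])))


lines = parse_lines(SAMPLE)
print(corners(lines[0][1]))
print(closest(lines, "LABEL"))
```

Under this reading, l, t, r and b already carry everything the four-corner format does, because the rectangle is axis-aligned. Center-to-center distance is just one way to define "closest"; on an invoice it might work better to restrict the search to boxes to the right of or below the label.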