Reputation: 25
This is about an image-to-text (OCR) converter using Tesseract. Referring to a working CodePen demo here, I managed to extract the text using data.text in my code. How can I extract the numbers (highlighted in green), which are 936 and 385 in my case? I have tried using data.html, but it does not work.
I would appreciate any help I can get. You will have to upload an image with words for it to work, since it is an OCR reader.
Image with text: https://i.ibb.co/gZLWbjC/dog.jpg
function result(data) {
    var r = $(".result");
    console.log(data);
    r.append(
        "<div class='sixteen wide column output'>success" +
        "<div class='ui message'><pre>" + data.text + "</pre></div>" +
        "</div>"
    );
}
Upvotes: 0
Views: 90
Reputation: 10824
You could use DOMParser to parse the HTML in data.html, get the page_1 element, and read its title attribute. The title contains a bbox entry, so you can take the numbers between bbox and the following ; and then pick the third and fourth of them.
Modify the function result(data) to this:
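function result(data) {
    var r = $(".result");
    // Parse the hOCR markup returned in data.html
    const parser = new DOMParser();
    const parsed = parser.parseFromString(data.html, 'text/html');
    // The page element's title attribute holds the bbox values
    const firstOccurrence = parsed.getElementById('page_1').getAttribute('title');
    // Take the text between "bbox " and the next ";" and split it into numbers
    const numbers = firstOccurrence.split('bbox ')[1].split(';')[0].split(' ');
    // The third and fourth values are the ones highlighted in green
    console.log("green numbers:", numbers[2], numbers[3]);
    r.append(
        "<div class='sixteen wide column output'>success" +
        "<div class='ui message'><pre>" + data.text + "</pre></div>" +
        "</div>"
    );
}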
Here is the working forked CodePen.
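If you want the string-splitting step to be a bit more defensive, here is a minimal sketch that pulls the four bbox values out of a title string with a regular expression. The helper name parseBbox and the sample title string are assumptions made up for illustration (the sample only reuses the numbers from the question); the actual title produced for your image may contain other entries.

```javascript
// Hypothetical helper (not part of Tesseract): extracts the four bbox
// integers from a title string of the form "...; bbox x0 y0 x1 y1; ...".
function parseBbox(title) {
  // Match "bbox" followed by four whitespace-separated integers.
  const match = /bbox\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)/.exec(title || "");
  if (!match) {
    return null; // no bbox entry found in the title
  }
  const [x0, y0, x1, y1] = match.slice(1).map(Number);
  return { x0, y0, x1, y1 };
}

// Sample title, assumed for illustration only.
const sampleTitle = 'image ""; bbox 0 0 936 385; ppageno 0';
const box = parseBbox(sampleTitle);
console.log("green numbers:", box.x1, box.y1); // third and fourth bbox values
```

Inside result(data) you could pass it the title read from the page_1 element and check for null before using the values, so a missing bbox entry does not throw.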
Upvotes: 1