Reputation: 2742
Is there any way we can get the text from a scanned document in jpg jpeg or any other format ? I am using ruby as my programming language . But I guess if I can get the texts with some help from other programming languages , it will not be much of a problem to integrate.
Thanks.
Upvotes: 1
Views: 2550
Reputation: 13004
This technology is called optical character recognition (OCR).
For programming, check out this question, which recommends tesseract-ocr.
OCR for ruby? check out this question.
If it's just a couple images, here's a site that supposedly does it for free.
Upvotes: 1
Reputation: 333
OCR Terminal http://www.ocrterminal.com has been the best (most accurate) free tool out of at least a dozen that I have used. It works especially well with formatted (table) data.
Upvotes: 0
Reputation: 2894
Yes, you can use an OCR library. There are additional details at https://stackoverflow.com/questions/1085/free-ocr-library.
In brief, you may wish to consider using tessnet (http://www.pixel-technology.com/freeware/tessnet2/).
Upvotes: 2