Prabesh Shrestha

Reputation: 2742

Read text document from scanned image

Is there any way to get the text from a scanned document in JPG, JPEG, or any other image format? I am using Ruby as my programming language, but if I can extract the text with help from another programming language, integrating it should not be much of a problem.

Thanks.

Upvotes: 1

Views: 2550

Answers (3)

dsg

Reputation: 13004

This technology is called optical character recognition (OCR).

For programming, check out this question, which recommends tesseract-ocr.

For OCR in Ruby specifically, check out this question.
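As a rough illustration of the tesseract-ocr route from Ruby, here is a minimal sketch that shells out to the `tesseract` command-line tool. It assumes `tesseract` is installed and on your PATH, and that your version accepts `stdout` as the output base (older releases only write to a file). The helper names `tesseract_command` and `ocr_text` are made up for this example and are not part of any library.

```ruby
require "open3"

# Build the argument list for the tesseract CLI.
# Passing "stdout" as the output base asks tesseract to print the
# recognized text instead of writing a .txt file (assumption: a
# reasonably recent tesseract; older versions may not support this).
def tesseract_command(image_path, lang: "eng")
  ["tesseract", image_path, "stdout", "-l", lang]
end

# Run OCR on an image and return the recognized text.
# `runner` is injectable only so the function can be exercised
# without tesseract installed; by default it is Open3.capture3.
def ocr_text(image_path, lang: "eng", runner: Open3.method(:capture3))
  stdout, stderr, status = runner.call(*tesseract_command(image_path, lang: lang))
  raise "tesseract failed: #{stderr.strip}" unless status.success?
  stdout.strip
end

# Usage (hypothetical file name):
#   puts ocr_text("scan.jpg")
```

Accuracy depends heavily on scan quality, so you may need to clean up the image (deskew, increase contrast) before passing it in.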

If it's just a couple of images, here's a site that supposedly does it for free.

Upvotes: 1

Digitz

Reputation: 333

OCR Terminal (http://www.ocrterminal.com) has been the most accurate free tool out of at least a dozen that I have used. It works especially well with formatted (tabular) data.

Upvotes: 0

Zian Choy

Reputation: 2894

Yes, you can use an OCR library. There are additional details at https://stackoverflow.com/questions/1085/free-ocr-library.

In brief, you may wish to consider using tessnet2 (http://www.pixel-technology.com/freeware/tessnet2/), a .NET wrapper around the Tesseract OCR engine.

Upvotes: 2
