brandon

Reputation: 1230

Simplified OCR with unchanging font

I'm working on a project that needs a simpler and more accurate OCR tool.

My Scenario:

I would use a normal OCR program, but I know I can get more accurate results, if not perfect ones, because the text is always in the same font.

So, what is a good approach? I don't want to do a ton of work writing this from scratch, but I also don't want an overgeneralized OCR tool that comes pretrained. I want to train it on this one font so that it gets very accurate results. I'd also rather not write the segmentation step myself (separating out the words, finding the lines, isolating the letters, etc.).

Upvotes: 2

Views: 1272

Answers (3)

Nikolay

Reputation: 2214

Sounds like you should look into field-level recognition, where you don't perform OCR on the full image but instead specify only a number of fields by their coordinates. If you're planning commercial software and need enterprise-grade accuracy, have a look at www.ocrsdk.com, a cloud-based OCR SDK recently launched by ABBYY. It's now in beta, so it's completely free to use. It has a method suited to extracting text from a document, plus C# sample code.
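To illustrate the field-level idea independently of any particular SDK, here is a minimal sketch in plain Python. The field names, the coordinates, and the image representation (a list of pixel rows) are all assumptions made for the example; a real pipeline would pass each cropped region to whatever recognizer you choose.

```python
# Hypothetical field layout: name -> (left, top, width, height) in pixels.
# These names and boxes are invented for illustration only.
FIELDS = {
    "invoice_no": (0, 0, 4, 1),
    "total": (2, 1, 3, 2),
}


def crop(image, box):
    """Cut a rectangular region out of an image stored as a list of rows."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def extract_fields(image, fields=FIELDS):
    """Return one cropped sub-image per named field."""
    return {name: crop(image, box) for name, box in fields.items()}
```

Only the cropped regions are then recognized, so text and clutter elsewhere on the page never reach the OCR step.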

Upvotes: 0

Andrew Cash

Reputation: 2394

I would employ an economical OCR engine such as TOCR from http://www.transym.com. The license fees are very cheap, and the OCR is fast and very accurate, especially if you define a fixed rectangle to extract from and there is no background noise. You should download a trial version to test the results before committing to a purchase.

By the time you set up a custom OCR engine and train it, you will have spent considerably more than the small license fee, and you may find TOCR's results to be more accurate anyway.

If we were able to see a graphic or two of the text you want to OCR, we would be able to give a more accurate answer.

Upvotes: 0

Mario

Reputation: 36487

I'd probably use OpenCV's machine learning (e.g. Haar cascades), unless the characters' positions are really perfectly static. In that case a simple comparison could do the trick (for example, find the best match using the sum of absolute differences).

Is the font fixed? If not, you could use one of the special OCR fonts to get characters that are hard to confuse, even in poor-quality images.
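As a rough sketch of the simple-comparison route, assuming fixed-width glyphs at known positions and binarized pixels, the matching could look like this in plain Python. The 3x3 templates and the helper names are made up for the example; real templates would be cropped from samples of your actual font.

```python
# Hypothetical 3x3 binary templates for a fixed font (1 = ink, 0 = background).
TEMPLATES = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}


def sad(a, b):
    """Sum of absolute differences between two equally sized pixel grids."""
    return sum(
        abs(pa - pb)
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def classify(cell, templates=TEMPLATES):
    """Return the label of the template with the lowest SAD against the cell."""
    return min(templates, key=lambda label: sad(cell, templates[label]))


def read_line(image, glyph_width, templates=TEMPLATES):
    """Split a one-line image into fixed-width cells and classify each cell."""
    width = len(image[0])
    text = []
    for x in range(0, width - glyph_width + 1, glyph_width):
        cell = [row[x:x + glyph_width] for row in image]
        text.append(classify(cell, templates))
    return "".join(text)
```

Because each cell is assigned to its nearest template, a few noisy pixels usually do not change the result. If the glyphs can shift by even a pixel or two, though, this breaks down, and a sliding search or a trained classifier is the safer choice.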

Although, considering you said you'd like to teach it, you might be best off with machine learning.

Upvotes: 1
