Abdul Wasae

Reputation: 3688

Android, Java - Fix an OCR-ed word to a valid English dictionary word in real time

My application scans through the phone camera and detects text. The only words my application is concerned with are valid English words. I have a list of ~354,000 valid English words that I can compare each scanned word against.

Since my application detects text continuously, I need this correction step to be very fast. I have tried the Levenshtein distance technique. For each word, I:

  1. Store the contents of the text file in an ArrayList<String> using Scanner
  2. Calculate the Levenshtein distance between the word and each of the 354k words
  3. Return the word with the minimum distance
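
For reference, the three steps above can be sketched roughly as follows. This is only a minimal illustration under assumptions, not the exact code from the app: the class name `NearestWord` and the method names `loadWords`, `levenshtein` and `closest` are hypothetical, and the word list is assumed to be a plain text file with one word per line.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; names are placeholders for illustration.
class NearestWord {

    // Step 1: read the word list (assumed one word per line) into an ArrayList.
    static List<String> loadWords(String path) throws FileNotFoundException {
        List<String> words = new ArrayList<>();
        try (Scanner in = new Scanner(new File(path))) {
            while (in.hasNextLine()) {
                words.add(in.nextLine().trim());
            }
        }
        return words;
    }

    // Step 2: Levenshtein distance using two rolling rows of the DP table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Step 3: linear scan that keeps the first word with the smallest distance.
    static String closest(String scanned, List<String> words) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String word : words) {
            int d = levenshtein(scanned, word);
            if (d < bestDistance) {
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }
}
```

In this sketch `loadWords` is a separate method so the file can be read once and the resulting list reused, e.g. `NearestWord.closest(scannedWord, words)` for every detected word. Step 2 still runs one distance computation per dictionary entry, so each lookup remains a full pass over all 354k words.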

The problem is that it is very slow. Without this step, my app manages to OCR more than 20 words in around 70 to 100 milliseconds. When I include the fixing routine, my app takes more than one full minute (60,000 ms) for a single word.

Is this technique even suitable for my case? If not, what other proven approach should I use? Any help would be greatly appreciated. I know this is possible, given how Android keyboards instantly correct mistyped words.

Other Failed endeavors:

Upvotes: 0

Views: 161

Answers (1)

Abdul Wasae

Reputation: 3688

My solution that works: I created a MySQL table and loaded the list of valid English words into it. This solved the problems described in the question.

Here is my Android Application for reference: Optical Dictionary & Vocabulary Teacher

Upvotes: 0
