Reputation: 1794
I have the following code:
public String getName(BufferedImage subc) {
    Tesseract1 instance = new Tesseract1();
    instance.setPageSegMode(8); // PSM 8: treat the image as a single word
    instance.setLanguage("eng");
    instance.setTessVariable("tessedit_char_whitelist", "qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM0123456789_.");
    try {
        String name = instance.doOCR(subc);
        // Keep only the first line of the OCR output, if there is one
        StringTokenizer lines = new StringTokenizer(name, "\n");
        return lines.hasMoreTokens() ? lines.nextToken() : null;
    } catch (TesseractException e) {
        System.err.println(e.getMessage());
        return null; // OCR failed, so there is no name to return
    }
}
where subc is the already cropped and preprocessed image of the word. What I want is either to obtain the confidence of the recognition or to iterate over the, let's say, 30 most likely words. I have found examples like "Tess4J: How to get a Character's confidence value?", but it breaks at the first line,
TessResultIterator ri = TessAPI1.TessBaseAPIGetIterator(api);
when I pass my object "instance" as the parameter "api". After trying getPointer and several different objects, I've had no luck so far. From the class summary at http://tess4j.sourceforge.net/docs/docs-1.0/net/sourceforge/tess4j/package-summary.html I understand that Tesseract or Tesseract1 may not be the most appropriate objects for what I want to do, but I didn't manage to recognize a word from an image with TessAPI or TessAPI1. The ResultIterator in C++ looks pretty concise, but it relies on pointers: https://code.google.com/p/tesseract-ocr/wiki/APIExample Thanks!
Upvotes: 0
Views: 2406
Reputation: 8345
The Tesseract class is a simplified API, exposing only the most commonly used methods from the TessAPI interface. To get the text confidence, you'll need to work with the TessAPI. The library's unit tests include some common use cases. You definitely want to take a look at them.
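As a rough illustration of working with the lower-level API, here is a minimal sketch modelled on the result-iterator pattern from the library's unit tests. It assumes a Tess4J 3.x-or-later package layout (ITessAPI nested types, net.sourceforge.tess4j.util.ImageIOHelper), a byte-backed image such as TYPE_BYTE_GRAY, and a tessdata directory passed in by the caller. The class name WordConfidenceSketch, the helper bytesPerLine, and the parameter tessdataDir are made up for this example; older releases keep some of these classes in other packages.

```java
import com.sun.jna.Pointer;
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;
import net.sourceforge.tess4j.ITessAPI.TessBaseAPI;
import net.sourceforge.tess4j.ITessAPI.TessPageIterator;
import net.sourceforge.tess4j.ITessAPI.TessPageIteratorLevel;
import net.sourceforge.tess4j.ITessAPI.TessResultIterator;
import net.sourceforge.tess4j.TessAPI1;
import net.sourceforge.tess4j.util.ImageIOHelper;

public class WordConfidenceSketch {

    /** Bytes per scanline of a packed image, as passed to TessBaseAPISetImage. */
    static int bytesPerLine(BufferedImage image) {
        int bitsPerPixel = image.getColorModel().getPixelSize();
        return (int) Math.ceil(image.getWidth() * bitsPerPixel / 8.0);
    }

    /** Prints every recognized word together with its confidence (0 to 100). */
    static void printWordConfidences(BufferedImage image, String tessdataDir) {
        // TessAPI1 functions take a native handle, not a Tesseract1 instance
        TessBaseAPI handle = TessAPI1.TessBaseAPICreate();
        try {
            if (TessAPI1.TessBaseAPIInit3(handle, tessdataDir, "eng") != 0) {
                throw new IllegalStateException("Could not initialize Tesseract");
            }
            TessAPI1.TessBaseAPISetPageSegMode(handle, 8); // 8 = single word, as in the question

            // Assumption: the image is byte-backed, so the pixel size is a multiple of 8
            ByteBuffer pixels = ImageIOHelper.convertImageData(image);
            int bytesPerPixel = image.getColorModel().getPixelSize() / 8;
            TessAPI1.TessBaseAPISetImage(handle, pixels, image.getWidth(),
                    image.getHeight(), bytesPerPixel, bytesPerLine(image));

            // Recognition must run before an iterator is available
            TessAPI1.TessBaseAPIRecognize(handle, null);
            TessResultIterator ri = TessAPI1.TessBaseAPIGetIterator(handle);
            if (ri == null) {
                return; // nothing was recognized
            }
            TessPageIterator pi = TessAPI1.TessResultIteratorGetPageIterator(ri);
            TessAPI1.TessPageIteratorBegin(pi);

            int level = TessPageIteratorLevel.RIL_WORD;
            do {
                Pointer text = TessAPI1.TessResultIteratorGetUTF8Text(ri, level);
                if (text != null) {
                    float confidence = TessAPI1.TessResultIteratorConfidence(ri, level);
                    System.out.printf("%s (%.1f%%)%n", text.getString(0), confidence);
                    TessAPI1.TessDeleteText(text); // free the native string
                }
            } while (TessAPI1.TessPageIteratorNext(pi, level) == 1); // 1 is TRUE in the C API
        } finally {
            TessAPI1.TessBaseAPIDelete(handle);
        }
    }
}
```

The point relevant to the question is that TessBaseAPIGetIterator expects the handle returned by TessBaseAPICreate, which would explain why passing a Tesseract1 object fails. Switching the level to RIL_SYMBOL should give per-character confidences instead of per-word ones.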
Upvotes: 1