user5747838

Reputation: 91

Tess4J does not extract the correct text from an image

I am using Tess4J to extract text from an image, but the result is not correct. My code is below.

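    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.ImageIO;
    import net.sourceforge.tess4j.ITesseract;
    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;
    import net.sourceforge.tess4j.util.ImageHelper; // package name in Tess4J 3.x and later

    public static String crackImage(String filePath) {
        BufferedImage img;
        try {
            // ImageIO.read returns null when no registered reader can decode the file
            img = ImageIO.read(new File(filePath));
        } catch (IOException e) {
            // Report the failure instead of swallowing it and hitting a null image later
            System.err.println(e.getMessage());
            return "Error while reading image";
        }
        if (img == null) {
            return "Error while reading image";
        }

        ITesseract instance = new Tesseract();
        instance.setDatapath("C:\\tessdata"); // folder containing eng.traineddata
        instance.setLanguage("eng");
        // instance.setPageSegMode(3);
        img = ImageHelper.convertImageToGrayscale(img);

        try {
            return instance.doOCR(img);
        } catch (TesseractException e) {
            System.err.println(e.getMessage());
            return "Error while reading image";
        }
    }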

I have attached a sample image.

Sample image

My output is:

arm m manner: a; man

mfl/Vemmnh 1951 mm 8221 11m 3521|\|\|II\IIIIIIHIIIIIHIIIH

scum—WWW

%‘

Please suggest how I can get the correct result.

Upvotes: 0

Views: 1259

Answers (1)

Himeshgiri gosvami

Reputation: 2884

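Tesseract is sensitive to input quality, so a good practice is to preprocess the image (OpenCV is a good choice) before running it through Tess4J. The Tesseract wiki describes the steps that usually help, such as rescaling, binarisation, noise removal, and deskewing: https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality

Alternatively, you can use Google ML Kit: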

https://firebase.google.com/docs/ml-kit/recognize-text
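
As a rough illustration of the preprocessing idea above, here is a minimal sketch that upscales the image, converts it to grayscale, and binarises it with Otsu's threshold. It uses only the JDK rather than OpenCV so that it has no extra dependencies. The class name `OcrPreprocess`, its method names, and the scale factor are my own assumptions, not part of Tess4J, and this is not a complete cleanup pipeline (no deskewing or noise removal).

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper (not part of Tess4J): plain-JDK preprocessing before OCR.
public class OcrPreprocess {

    /** Upscales by an integer factor and converts to 8-bit grayscale in one pass. */
    public static BufferedImage scaleToGray(BufferedImage src, int factor) {
        int w = src.getWidth() * factor;
        int h = src.getHeight() * factor;
        BufferedImage gray = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return gray;
    }

    /** Otsu's method: the threshold that maximises the between-class variance. */
    public static int otsuThreshold(BufferedImage gray) {
        int w = gray.getWidth();
        int h = gray.getHeight();
        int[] hist = new int[256];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                hist[gray.getRaster().getSample(x, y, 0)]++;
            }
        }
        long total = (long) w * h;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (double) i * hist[i];
        }
        double sumDark = 0;
        long countDark = 0;
        double best = -1;
        int threshold = 0;
        for (int t = 0; t < 256; t++) {
            countDark += hist[t];
            if (countDark == 0) continue;      // no pixels at or below t yet
            long countLight = total - countDark;
            if (countLight == 0) break;        // every pixel is at or below t
            sumDark += (double) t * hist[t];
            double meanDark = sumDark / countDark;
            double meanLight = (sumAll - sumDark) / countLight;
            double diff = meanDark - meanLight;
            double between = (double) countDark * countLight * diff * diff;
            if (between > best) {
                best = between;
                threshold = t;
            }
        }
        return threshold;
    }

    /** Maps pixels above the Otsu threshold to white (255) and the rest to black (0). */
    public static BufferedImage binarize(BufferedImage gray) {
        int t = otsuThreshold(gray);
        int w = gray.getWidth();
        int h = gray.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int v = gray.getRaster().getSample(x, y, 0);
                out.getRaster().setSample(x, y, 0, v > t ? 255 : 0);
            }
        }
        return out;
    }

    /** Full sketch: upscale, grayscale, then binarise. */
    public static BufferedImage prepare(BufferedImage src, int factor) {
        return binarize(scaleToGray(src, factor));
    }
}
```

In the question's code, this would replace the grayscale step: call `img = OcrPreprocess.prepare(img, 2);` before `instance.doOCR(img)`. The factor of 2 is only a starting point; small or low-resolution text may need more, and a single global threshold may not suit images with uneven lighting.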

Upvotes: 1
