Reputation: 928
I want to check whether a particular string is present in an image. Is that possible? Can pngj do that?
My file will contain a graph and some legends. I want to check whether the legends are correct.
Upvotes: 3
Views: 3590
Reputation: 19
You can use Tesseract through the Tess4J wrapper. Sample code below:
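// Requires Tess4J: import net.sourceforge.tess4j.Tesseract; import java.io.File;
String src = "path of File";
Tesseract instance = new Tesseract();
instance.setDatapath("path of tessdata\\Tess4J\\tessdata"); // folder containing the *.traineddata files
// doOCR throws TesseractException, so call it inside try/catch or declare the exception
String ocrString = instance.doOCR(new File(src));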
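Once `ocrString` is available, checking for a legend is essentially a string search. OCR output often differs from the original in letter case and whitespace, so a minimal sketch might normalize both sides before comparing. The class and method names here (`LegendTextCheck`, `normalize`, `missingLegends`) and the sample text are hypothetical and not part of Tess4J:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LegendTextCheck {
    // Lower-case the text and collapse whitespace runs, so line breaks
    // and double spaces in the OCR output do not affect the comparison.
    public static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ").trim();
    }

    // Returns the expected legends that do NOT occur in the OCR text.
    public static List<String> missingLegends(String ocrText, List<String> expected) {
        String haystack = normalize(ocrText);
        List<String> missing = new ArrayList<>();
        for (String legend : expected) {
            if (!haystack.contains(normalize(legend))) {
                missing.add(legend);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by the OCR call (made-up sample).
        String ocrText = "Sales  2019\nRevenue (USD)\n";
        List<String> expected = Arrays.asList("sales 2019", "Profit");
        System.out.println(missingLegends(ocrText, expected)); // prints [Profit]
    }
}
```

An empty result means every expected legend was found. Exact substring matching is a deliberate simplification; if the OCR engine misreads characters, a fuzzy comparison may be needed instead.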
Upvotes: 0
Reputation: 13015
My solution here is in Scala. If you are a Java developer, it should be easy to convert the Scala code to Java.
Step 1: add one more line to build.sbt
libraryDependencies += "com.asprise.ocr" % "java-ocr-api" % "[15,)"
Step 2: import the library
import com.asprise.ocr.Ocr
Step 3: Scala code. Please note that <your_file> must be of type File. If you only have a file name or file path, use new File() to convert it.
try {
  // Run OCR on the image
  Ocr.setUp()
  val ocr = new Ocr
  ocr.startEngine("eng", Ocr.SPEED_FASTEST)
  val files = List(<your_file>)
  val outputString = ocr.recognize(files.toArray, Ocr.RECOGNIZE_TYPE_ALL, Ocr.OUTPUT_FORMAT_PLAINTEXT)
  ocr.stopEngine()
  Some(outputString)
} catch {
  case e: Exception => None // todo: to support multiple file types
}
I also wrote a blog post with more details on how to extract text/content from other file types (PDF, HTML, image, etc.).
If you want to read more about java-ocr-api, see its official website.
Upvotes: 1
Reputation: 11745
You can try Asprise OCR out. It's a good OCR API available in Java.
Upvotes: 0
Reputation: 839154
No, you can't do that with pngj. The text that is visible in the PNG image is not internally stored as text. You will need OCR software if you wish to identify the text.
However, it would be much better if you could get the data in another format that is easier for a computer to parse.
Upvotes: 4
Reputation: 7662
Yes, it seems to be possible, but you will need a good OCR library. Then, assuming the OCR library returns correct results, you still need to verify somehow that your legends are placed in the proper positions.
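If the OCR library can report a bounding box for each recognized word, the position check can be reduced to a rectangle comparison. A minimal sketch, assuming the words and their boxes have already been extracted: `RecognizedWord`, `isLegendInRegion`, and all coordinates are hypothetical and do not belong to any particular OCR API.

```java
import java.awt.Rectangle;
import java.util.Arrays;
import java.util.List;

public class LegendPositionCheck {
    // Hypothetical holder for one OCR result: the text and where it was found (in pixels).
    public static final class RecognizedWord {
        final String text;
        final Rectangle box;

        public RecognizedWord(String text, Rectangle box) {
            this.text = text;
            this.box = box;
        }
    }

    // True if some recognized word equals the legend (ignoring case)
    // and its box lies entirely inside the expected region.
    public static boolean isLegendInRegion(List<RecognizedWord> words, String legend, Rectangle expectedRegion) {
        for (RecognizedWord w : words) {
            if (w.text.equalsIgnoreCase(legend) && expectedRegion.contains(w.box)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up OCR results for illustration.
        List<RecognizedWord> words = Arrays.asList(
            new RecognizedWord("Revenue", new Rectangle(520, 40, 60, 12)),
            new RecognizedWord("Cost", new Rectangle(10, 300, 30, 12)));
        // Assumed legend area in the top-right corner of the graph.
        Rectangle legendArea = new Rectangle(500, 20, 140, 80);
        System.out.println(isLegendInRegion(words, "revenue", legendArea)); // prints true
        System.out.println(isLegendInRegion(words, "cost", legendArea));    // prints false
    }
}
```

Requiring the whole box to sit inside the region is a strict choice; `Rectangle.intersects` would be a looser alternative if the OCR boxes are imprecise.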
Upvotes: 1