Reputation: 13
I've been following this tutorial for trying to create an OCR and I've copy and pasted all of the necessary code and followed the steps but I keep receiving this error when I run OCRDemo.java:
Error opening data file ./eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language 'eng' Tesseract couldn't load any languages!
So I'm assuming the issue is that TESSDATA_PREFIX has the wrong directory. Currently it is "C:\CodeRepository\OCR\tessdata" and I got that directory and confirmed that directory by literally going into file explorer and copying and pasting it. But I keep getting this error message. I've also tried "OCR\tessdata", "tessdata" but none of them work. Help?
Here's my pom.xml code that has the TESSDATA_PREFIX:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>OCR</groupId>
<artifactId>OCR</artifactId>
<version>0.0.1-SNAPSHOT</version>
<properties>
<TESSDATA_PREFIX>C:\CodeRepository\OCR\tessdata</TESSDATA_PREFIX>
</properties>
<dependencies>
<dependency>
<groupId>net.sourceforge.tess4j</groupId>
<artifactId>tess4j</artifactId>
<version>4.3.1</version>
</dependency>
</dependencies>
</project>
Upvotes: 0
Views: 1701
Reputation: 11
ITesseract instance = new Tesseract();
instance.setDatapath("C:\\Users\\Tux\\Documents\\tessdata");
this worked for me without the need for setting environment variables. I just put the language file in the 'tessdata' folder
Upvotes: 1
Reputation: 8365
From the given link, it looks like it points the readers to incompatible language data files. Try https://github.com/tesseract-ocr/tessdata_fast.
Upvotes: 0