Reputation: 31
I am currently trying to use tesseract.js in Angular to run recognition on images that have previously been modified with opencv.js.
Image manipulation via opencv.js is working really well now, but I can't figure out what's wrong with my various attempts with tesseract.js...
When I follow tutorials from the web, it works and I can perform OCR on the default example image, for example (only the relevant part):
const exampleImage = 'https://tesseract.projectnaptha.com/img/eng_bw.png';
const worker = Tesseract.createWorker({
  logger: m => console.log(m)
});
Tesseract.setLogging(true);
work();

async function work() {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  let result = await worker.detect(exampleImage);
  console.log(result.data);
  await worker.terminate();
}
But when I try to do the same with an image previously processed via opencv.js, whether as a cv.Mat or via the resulting HTML canvas, I always get the same error:
tesseract.js error : TypeError: Cannot read property 'SetImage' of null
I also get this error: Error in pixReadMem: size < 12
I don't really understand what I'm doing wrong, and I suspect the problem is in the way I pass the image to tesseract... But nothing I've tried has worked, so I'm here to ask for your help.
Example of code that does not work:
const worker = Tesseract.createWorker({
  logger: m => console.log(m)
});
Tesseract.setLogging(true);
work(onlyDocument);

async function work(d) {
  await worker.load();
  const ctx = document.getElementById('result').getContext('2d');
  const buffer = ctx.getImageData(0, 0, ctx.canvas.width, ctx.canvas.height).data.buffer;
  const result2 = await worker.detect(buffer);
  console.log(result2.data);
  await worker.terminate();
}
I should point out that I tried every format I could think of for passing the image to tesseract.js (buffer, the canvas, array, ...).
Upvotes: 1
Views: 1707
Reputation: 14
You need to initialize the Tesseract API before performing any OCR task. Your failing code calls worker.load() but skips loadLanguage() and initialize(). Adding those two calls should resolve the following error:
tesseract.js error : TypeError: Cannot read property 'SetImage' of null
Solution:
// Your async function
async function work(d) {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  // Language choice (e.g. 'eng') depends on the trained data available.
  // Image-like input can now be given to the recognize() and detect() methods.
  ...
  await worker.terminate();
}
After initialization, as long as the input to the API is image-like, it should work whether or not the image has been pre-processed. Hope this solves your problem.
P.S.: The tutorial sample had the API initialized and hence no errors were thrown.
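As a possible follow-up on the second error (pixReadMem: size < 12): getImageData().data.buffer holds raw RGBA pixels rather than an encoded image file, so it is likely not what tesseract.js expects as "image-like" input. A minimal sketch, assuming the opencv.js result has already been drawn to a canvas, is to encode the canvas as a PNG data URL and pass that string instead. The helper name recognizeCanvas is hypothetical; the worker calls are the same ones used above.

```javascript
// Sketch only: `recognizeCanvas` is a hypothetical helper, not part of tesseract.js.
// `worker` is assumed to be a worker created with Tesseract.createWorker(...),
// and `canvas` an HTML canvas that already contains the processed image.
async function recognizeCanvas(worker, canvas) {
  // Initialization sequence from the answer above.
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  try {
    // Encode the canvas contents as a base64 PNG data URL
    // instead of handing over the raw pixel buffer.
    const dataUrl = canvas.toDataURL('image/png');
    const result = await worker.recognize(dataUrl);
    return result.data;
  } finally {
    // Release the worker even if recognition throws.
    await worker.terminate();
  }
}
```

In the page from the question this might be called as recognizeCanvas(worker, document.getElementById('result')).then(data => console.log(data.text)). Passing the canvas element itself to recognize() may also work once the worker is initialized; the data URL just makes the encoding step explicit.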
Upvotes: 0