LLL RRR

Reputation: 189

tesseract.js throws ENAMETOOLONG when given a base64 string

I would like to test the tesseract.js library on a Node.js server, but when I run the following code:

var TESSERACT = require('tesseract.js');
var base64String = 'data:image/png;base64,' + pngInBase64;
var job1 = TESSERACT.recognize(base64String, {
    progress: show_progress,
    lang: 'ang'
});

function show_progress(p) {
    console.log(p);
}

I get an error of the form:

Error: ENAMETOOLONG: name too long, open 'data:image/png;base64,iVBORw0KGgoAAAA ...

Is it possible to increase the maximum length of the base64 string in some way?

Upvotes: 0

Views: 2671

Answers (1)

tomshacham

Reputation: 41

Using "tesseract.js": "1.0.10":

When you pass a string to recognize, Tesseract treats it as a file path and tries to open a file named data:image/png;base64,{bytes...}. That is what throws the error you see: ENAMETOOLONG means the file name is too long.

To recognize a base64 string, turn it into a Buffer that holds the base64-decoded bytes:

Tesseract.recognize(Buffer.from(base64String, 'base64'));
// have a cup of tea

Note: Tesseract.recognize doesn't work on a Buffer that still holds the base64 text; the Buffer must contain the decoded image bytes. You also need to strip the data URI prefix (data:image/png;base64,) before decoding.

So this won't work, because the Buffer holds the base64 characters rather than the decoded bytes:

Tesseract.recognize(Buffer.from(base64string));

And this won't work either, because the prefix is still there and nothing is decoded:

const base64string = 'data:image/png;base64,{bytes...}';
Tesseract.recognize(Buffer.from(base64string));

You need to strip the prefix and decode the remaining base64 payload:

const base64string = 'data:image/png;base64,{bytes...}'.split(',')[1];
Tesseract.recognize(Buffer.from(base64string, 'base64'));
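
The prefix stripping and decoding above can be wrapped in a small helper. The following is only a sketch: dataUriToBuffer is a hypothetical name, not part of tesseract.js, and the sample string is just the base64-encoded 8-byte PNG signature standing in for a real image. It also shows why the un-decoded Buffer fails: its bytes are the ASCII text of the data URI, not image data.

```javascript
// Hypothetical helper (not part of tesseract.js): accepts either a base64
// data URI or a bare base64 string and returns the decoded bytes as a Buffer.
function dataUriToBuffer(input) {
  if (!input.startsWith('data:')) {
    // No data URI prefix: assume the whole string is base64.
    return Buffer.from(input, 'base64');
  }
  const marker = ';base64,';
  const idx = input.indexOf(marker);
  if (idx === -1) {
    throw new Error('not a base64 data URI');
  }
  // Keep only the payload after "data:<mime>;base64,".
  return Buffer.from(input.slice(idx + marker.length), 'base64');
}

// Stand-in input: the PNG file signature, base64-encoded.
const sample = 'data:image/png;base64,iVBORw0KGgo=';

// Without decoding, the Buffer holds the characters of the string itself.
const raw = Buffer.from(sample);
console.log(raw.slice(0, 5).toString()); // data:

// Decoded, the Buffer starts with the real PNG signature bytes.
const decoded = dataUriToBuffer(sample);
console.log(decoded.toString('hex')); // 89504e470d0a1a0a
```

With such a helper, the call from the answer would become Tesseract.recognize(dataUriToBuffer(base64String)), assuming the 1.0.10 behaviour described above, where a Buffer is accepted as image data.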

Upvotes: 1
