user9803071

Reputation: 75

OCRmyPDF - Weird error message from tesseract

I get a strange error message when running an OCRmyPDF command.

My setup:

The command is triggered by NoodleSoft Hazel, and as far as I understand, Hazel executes shell commands in a dedicated environment. My setup worked fine for a few weeks, but while it was processing a batch of PDF files, the following error started to occur. Since then I have not been able to get it working again.

The debug file debugOCR.txt shows the following error:

1 [tesseract] Error in fopenReadStream: failed to open locally with tail 000001_ocr.png for filename /tmp/ocrmypdf.io.81a_o2mw/000001_ocr.png
1 [tesseract] Leptonica Error in findFileFormat: image file not found: /tmp/ocrmypdf.io.81a_o2mw/000001_ocr.png
1 [tesseract] Error in fopenReadStream: failed to open locally with tail PNG for filename PNG
1 [tesseract] Leptonica Error in pixRead: image file not found: PNG
1 [tesseract] Image file PNG cannot be read!
1 [tesseract] Error during processing.
SubprocessOutputError

In the folder /tmp I can't find any subfolder like /tmp/ocrmypdf.io.81a_o2mw/.

When I run the following commands directly in Apple Terminal, they work fine:

ocrmypdf -l deu+fra+eng --clean --force-ocr test.pdf test-out.pdf 2>> debugOCR.txt
tesseract test.tiff output --oem 1 -l eng pdf 

Any hints on where I should dig deeper? Is ocrmypdf or tesseract missing some environment variables in the Hazel environment?

Thanks a lot

AJ

Upvotes: 0

Views: 50

Answers (1)

Aman Rusia

Reputation: 16

https://github.com/tesseract-ocr/tesseract/issues/4333

This is likely the issue.

I faced the same problem while using wcgw mcp, which also has a separate terminal environment.

Setting TMPDIR to //tmp helped me.
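
As a rough sketch, the workaround could look like this inside the script that Hazel runs. It assumes the script is allowed to export variables before calling ocrmypdf; the file names are the ones from the question, and the two echo lines are only there so you can compare the values with what you see in Terminal.

```shell
#!/bin/sh
# Assumption: this runs as the embedded shell script of the Hazel rule.
# The value "//tmp" is the one that worked for me; adjust if needed.
export TMPDIR="//tmp"

# Print what this environment actually sees, for comparison with Terminal.
echo "TMPDIR=$TMPDIR"
echo "PATH=$PATH"

# Only run OCR if the tool and the input file are visible from here.
if command -v ocrmypdf >/dev/null 2>&1 && [ -f test.pdf ]; then
    ocrmypdf -l deu+fra+eng --clean --force-ocr test.pdf test-out.pdf 2>> debugOCR.txt
else
    echo "ocrmypdf or test.pdf not found in this environment" >&2
fi
```

If the else branch fires under Hazel but not in Terminal, the PATH is probably different there as well, which would be a separate thing to fix.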

Upvotes: 0
