skatz

Reputation: 125

ImageMagick convert out of memory

I have a custom application running on CentOS 6.7 with 64 GB of RAM. It is basically a file crawler that calls the following bash script every time it finds a file matching certain file extensions (mainly TIFFs or multipage TIFFs). I can't tell exactly the frequency or how many files are being considered, but it's in the order of thousands.

#!/bin/bash

IMAGE_INPUT="$1"
OUTPUT="$2"
TMP_FOLDER=/data/tesseract-tmp

# generate a unique random file name
TFN=$(tr -cd 'a-f0-9' < /dev/urandom | head -c 32)
# convert the image and write the result to the temp file
/usr/bin/convert -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append "jpeg:$TMP_FOLDER/$TFN"
# extract the text with tesseract and put it into a result file
/usr/local/bin/tesseract "$TMP_FOLDER/$TFN" "$TMP_FOLDER/$TFN.out"
cp "$TMP_FOLDER/$TFN.out.txt" "$OUTPUT"
# return the file content on standard output
cat "$OUTPUT"

The temp files are being cleaned by a cronjob.

I have noticed that after some time and a lot of calls to the script, the top command shows that the gs and convert processes spawned by ImageMagick take up all the available memory and then start consuming all the available swap space. If I don't kill those processes, the system runs out of memory and freezes.

How can I solve this situation? Is there a way to limit the amount of memory for a particular program (convert), or is there a way to queue the executions of the script?

N.B. I have seen that there is the -limit option for the convert command, but if I understand it correctly, it applies to a single instance of the running process, whereas I would like to limit the memory usage across all running instances.
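For example (if I'm reading the documentation correctly, and with placeholder sizes), a per-process cap would look something like:

/usr/bin/convert -limit memory 1GiB -limit map 2GiB -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append "jpeg:$TMP_FOLDER/$TFN"

but with many concurrent calls, each process would still get its own separate budget.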

Thanks

Upvotes: 1

Views: 3855

Answers (2)

skatz

Reputation: 125

I've solved it by using the following command:

nice -20 /usr/bin/convert -limit memory 32 -limit map 32 -density 288 "$IMAGE_INPUT" -resize 75% -quality 100 -append jpeg:$TMP_FOLDER/$TFN;

This way the memory is fully used, but the system never starts swapping and never freezes.
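As a side note, if you prefer to set such limits globally instead of per command, ImageMagick also reads them from environment variables, so every convert invocation picks them up (the values below are just examples, tune them for your machine):

# example global limits via ImageMagick environment variables
export MAGICK_MEMORY_LIMIT=256MiB
export MAGICK_MAP_LIMIT=512MiB
export MAGICK_DISK_LIMIT=4GiB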

Thanks anyway to Mark Setchell for his answer; it's been useful and appropriate for my purpose.

Upvotes: 4

Mark Setchell

Reputation: 208052

You could try using GNU Parallel to limit the memory use and improve the speed by running jobs in parallel. Basically, it won't start another parallel job till the specified amount of memory is free.

So, assuming your script is called OCR and it takes an input filename as a parameter:

parallel --memfree 1G OCR {} ::: *.tif
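If the files are fed in by your crawler rather than sitting in one directory, you can pipe the filenames in instead, and optionally cap the number of simultaneous jobs too (the path and the -j value here are just examples):

find /data -name '*.tif' | parallel -j 4 --memfree 1G OCR {}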

Upvotes: 2
