Reputation: 35826
I would like to resize a large number (about 5200) of image files (PPM format, each 5 MB in size) and save them to PNG format using convert
.
Short version:
convert
blows up 24 GB of memory although I use the syntax that tells convert
to process image files consecutively.
Long version:
Regarding more than 25 GB of image data, I figure I should not process all files simultaneously. I searched the ImageMagick documentation about how to process image files consecutively and I found:
It is faster and less resource intensive to resize each image it is read:
$ convert '*.jpg[120x120]' thumbnail%03d.png
Also, the tutorial states:
For example instead of...
montage '*.tiff' -geometry 100x100+5+5 -frame 4 index.jpg
which reads all the tiff files in first, then resizes them. You can instead do...
montage '*.tiff[100x100]' -geometry 100x100+5+5 -frame 4 index.jpg
This will read each image in, and resize them, before proceeding to the next image. Resulting in far less memory usage, and possibly prevent disk swapping (thrashing), when memory limits are reached.
Hence, this is what I am doing:
$ convert '*.ppm[1280x1280]' pngs/%05d.png
According to the docs, it should treat each image file one by one: read, resize, write. I am doing this on a machine with 12 real cores and 24 GB of RAM. However, during the first two minutes, the memory usage of the convert
process grows to about 96 %. It stays there a while. CPU usage is at maximum. A bit longer and the process dies, just saying:
Killed
At this point, no output files have been produced. I am on Ubuntu 10.04 and convert --version
says:
Version: ImageMagick 6.5.7-8 2012-08-17 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2009 ImageMagick Studio LLC
Features: OpenMP
It looks like convert
tries to read all data before starting the conversion. So either there is a bug in convert
, an issue with the documentation or I did not read the documentation properly.
What is wrong? How can I achieve low memory usage while resizing this large number of image files?
BTW: a quick solution would be to just loop over the files using the shell and invoke convert
for each file independently. But I'd like to understand how to achieve the same with pure ImageMagick.
Thanks!
Upvotes: 10
Views: 5434
Reputation: 208077
I would go with GNU Parallel if you have 12 cores - something like this, which works very well. As it does only 12 images at a time, whilst still preserving your output file numbering, it only uses minimal RAM.
scene=0
for f in *.ppm; do
echo "$f" $scene
((scene++))
done | parallel -j 12 --colsep ' ' --eta convert {1}[1280x1280] -scene {2} pngs/%05d.png
Notes
-scene
lets you set the scene counter, which comes out in your %05d
part.
--eta
predicts when your job will be done (Estimated Arrival Time).
-j 12
runs 12 jobs in parallel at a time.
Upvotes: 2
Reputation: 1765
Got the same issue, it seems it's because ImageMagick create temporary files into the /tmp directory, which is often mounted as a tmpfs.
Just move your tmp somewhere else.
For example:
create a "tmp" directory on a big external drive
mkdir -m777 /media/huge_device/tmp
make sure the permissions are set to 777
chmod 777 /media/huge_device/tmp
as root, mount it in replacement to your /tmp
mount -o bind /media/huge_device/tmp /tmp
Note: It should be possible to use with the TMP environment variable to do the same trick.
Upvotes: 3
Reputation: 90335
Without having direct access to your system it's really hard to help you debugging this.
But you can do three things to help yourself narrowing down this problem:
Add -monitor
as the first commandline argument to see more details about what's going on.
(Optionally) add -debug all -log "domain: %d +++ event: %e +++ function: %f +++ line: %l +++ module: %m +++ processID: %p +++ realCPUtime: %r +++ wallclocktime: %t +++ userCPUtime: %u \n\r"
Temporarily, don't use '*.ppm[1280x1280]' as an argument, but use 'a*.ppm[1280x1280]' instead. The purpose is to limit your wildcard expansion (or some other suitable way to achieve the same) to only a few matches, instead of all possible matches.
If you do '2.' you'll need to do '3.' as well otherwise you'll be overwhelmed by the mass of output. (Also your system does seem to not be able to process the full wildcard anyway without having to kill the process...)
If you do not find a solution, then...
Upvotes: 6