Antonin GAVREL

Reputation: 11219

How to convert images with parallel from one directory to another

I am trying to use the following command:

ls -1d a/*.jpg | parallel convert -resize 300x300  {}'{=s/\..*//=}'.png

However, one problem I have not managed to solve is getting the files written to folder b rather than to the same folder as the input.

I spent quite some time looking for an answer but did not find one where the files are piped through the ls command (there are thousands of pictures). I would like to keep the same tools (ls pipe, parallel and convert, or mogrify if that is better).

Upvotes: 1

Views: 630

Answers (1)

Mark Setchell

Reputation: 207355

First, with mogrify:

mkdir -p b     # ensure output directory exists
magick mogrify -path b -resize 300x300 a/*.jpg

This creates a single mogrify process that does all the files without the overhead of creating a new process for each image. It is likely to be faster if you have a smallish number of images. The advantage of this method is that it doesn't require you to install GNU Parallel. The disadvantage is that there is no parallelism.


Second, with GNU Parallel:

mkdir -p b     # ensure output directory exists
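parallel --dry-run magick {} -resize 300x300 b/{/} ::: a/*.jpg

Here {/} means "the filename with the directory part removed" and GNU Parallel does it all nicely and simply for you. The --dry-run option only prints the commands that would be run; remove it once the output looks right.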
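The question writes PNG files, while b/{/} keeps each file's .jpg name. A minimal sketch, assuming you want a/cat.jpg to end up as b/cat.png: GNU Parallel's {/.} replacement string is the basename with its extension removed. The scratch directory and file names are hypothetical, and --dry-run only prints the commands rather than running magick.

```shell
# Hypothetical demo files in a scratch directory; nothing is converted.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/cat.jpg" "$demo/a/dog.jpg"
cd "$demo" || exit 1

# {/.} = basename without extension, so a/cat.jpg maps to b/cat.png.
# -k keeps the printed lines in input order; drop --dry-run to convert.
preview=$(parallel -k --dry-run magick {} -resize 300x300 b/{/.}.png ::: a/*.jpg)
printf '%s\n' "$preview"
```

With mogrify, the rough equivalent would be adding -format png next to -path b.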

If your images are large, say 8-100 megapixels, it is well worth using the JPEG "shrink-on-load" feature to reduce disk I/O and memory pressure. Add a size hint to the magick command above, like this:

magick -define jpeg:size=512x512 ...
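Spelled out in full, the hinted command might look like the sketch below. It assumes the same 300x300 target and b/{/} output naming as before; the scratch directory and file names are hypothetical, and --dry-run only prints the commands.

```shell
# Hypothetical demo files in a scratch directory; nothing is converted.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/big1.jpg" "$demo/a/big2.jpg"
cd "$demo" || exit 1

# The -define is placed before the input file so that the hint is
# already set when the JPEG is read.
preview=$(parallel -k --dry-run magick -define jpeg:size=512x512 {} -resize 300x300 b/{/} ::: a/*.jpg)
printf '%s\n' "$preview"
```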

The GNU Parallel approach creates a new process for each image, and is likely to be faster if you have lots of CPU cores and lots of images. If you have 12 CPU cores it will keep all 12 busy until all your images are done. You can change the number or percentage of cores used with the -j option. The slight performance cost is that a new magick process is created for each image.
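As a rough sketch of the -j option, the job count can be given either as an absolute number or as a percentage of the detected cores. The scratch directory and file names are hypothetical, and --dry-run only prints the commands instead of running magick.

```shell
# Hypothetical demo files in a scratch directory; nothing is converted.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/one.jpg" "$demo/a/two.jpg"
cd "$demo" || exit 1

# Run at most 4 jobs at a time.
fixed=$(parallel -j 4 --dry-run magick {} -resize 300x300 b/{/} ::: a/*.jpg)

# Run as many jobs as half of the detected CPU cores.
half=$(parallel -j 50% --dry-run magick {} -resize 300x300 b/{/} ::: a/*.jpg)

printf '%s\n' "$fixed" "$half"
```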


Probably the most performant option is to use GNU Parallel for parallelism along with mogrify to amortize process creation across more images, say 32, like this:

mkdir -p b
parallel -n 32 magick mogrify -path b -resize 300x300 ::: a/*.jpg

Note: You should try to avoid parsing the output of ls, as it is error-prone. That is, avoid this:

ls file*.jpg | parallel

You should prefer feeding in filenames like this:

parallel ... ::: file*.jpg
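If there are so many files that the expanded glob risks exceeding the shell's command-line length limit, one possible alternative is to stream NUL-delimited names from find into parallel -0. This is a sketch rather than part of the commands above: the scratch directory and file names are hypothetical, -maxdepth is assumed to be supported by your find, and --dry-run only prints the commands.

```shell
# Hypothetical demo files, one with a space in its name.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/plain.jpg" "$demo/a/with space.jpg"
cd "$demo" || exit 1

# NUL-delimited names survive spaces and newlines, and the file list
# is read from stdin instead of being expanded onto the command line.
# find does not guarantee any order, so sort the preview for display.
preview=$(find a -maxdepth 1 -name '*.jpg' -print0 |
  parallel -0 --dry-run magick {} -resize 300x300 b/{/} | sort)
printf '%s\n' "$preview"
```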

Note: There is a -X option for GNU Parallel which is a bit esoteric and likely to only come into its own with thousands or millions of images. It passes as many filenames as possible (in view of command-line length limitations) to each mogrify process, which amortizes the process startup costs across more files. For 99% of use cases the answers I have given should be performant enough.
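A minimal sketch of what that might look like, reusing the mogrify command from earlier. The scratch directory and file names are hypothetical, and --dry-run only prints the commands. How the filenames are grouped depends on the number of job slots, so the number of printed lines can differ between machines.

```shell
# Hypothetical demo files in a scratch directory; nothing is converted.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b"
touch "$demo/a/img1.jpg" "$demo/a/img2.jpg" "$demo/a/img3.jpg" "$demo/a/img4.jpg"
cd "$demo" || exit 1

# -X puts several filenames on each mogrify command line instead of
# starting one process per file.
preview=$(parallel -X --dry-run magick mogrify -path b -resize 300x300 ::: a/*.jpg)
printf '%s\n' "$preview"
```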

Note: If your machine doesn't have multiple cores, or your images are very large compared to the installed RAM, or your disk subsystem is slow, your mileage will vary and it may not be worth parallelising your code. Measure and see!

Upvotes: 1
