Joe DiGiovanna
Joe DiGiovanna

Reputation: 31

Batch Convert Sony raw ".ARW" image files to .jpg with raw image settings on the command line

I am looking to convert 15 Million 12.8 mb Sony .ARW files to .jpg

I have figured out how to do it using sips on the Command line BUT what I need is to make adjustments to the raw image settings: Contrast, Highlights, Blacks, Saturation, Vibrance, and most importantly Dehaze. I would be applying the same settings to every single photo.

It seems like ImageMagick should work if I can make adjustments for how to incorporate Dehaze but I can't seem to get ImageMagick to work.

I have done benchmark testing comparing Lightroom Classic / Photoshop / Bridge / RAW Power / and a few other programs. Raw Power is fastest by far (on a M1 Mac Mini 16GB Ram) but Raw Power doesn't allow me to process multiple folders at once.

I do a lot of scripting / actions with photoshop - but in this case photoshop is by far the slowest option. I believe this is because it opens each photo.

Upvotes: 1

Views: 2780

Answers (1)

Mark Setchell
Mark Setchell

Reputation: 208052

That's 200TB of input images, without even allowing any storage space for output images. It's also 173 solid days of 24 hr/day processing, assuming you can do 1 image per second - which I doubt.

You may want to speak to Fred Weinhaus @fmw42 about his Retinex script (search for "hazy" on that page), which does a rather wonderful job of haze removal. Your project sounds distinctly commercial.

enter image description here

© Fred Weinhaus - Fred's ImageMagick scripts

If/when you get a script that does what you want, I would suggest using GNU Parallel to get decent performance. I would also think you may want to consider porting, or having ported, Fred's algorithm to C++ or Python to run with OpenCV rather than ImageMagick.

So, say you have a 24-core MacPro, and a bash script called ProcessOne that takes the name of a Sony ARW image as parameter, you could run:

find . -iname \*.arw -print0 | parallel --progress -0 ProcessOne {}

and that will recurse in the current directory finding all Sony ARW files and passing them into GNU Parallel, which will then keep all 24-cores busy until the whole lot are done. You can specify fewer, or more jobs in parallel with, say, parallel -j 8 ...

Note 1: You could also list the names of additional servers in your network and it will spread the load across them too. GNU Parallel is capable of transferring the images to remote servers along with the jobs, but I'd have to question whether it makes sense to do that for this task - you'd probably want to put a subset of the images on each server with its own local disk I/O and run the servers independently yourself rather than distributing from a single point globally.

Note 2: You will want your disks well configured to handle multiple, parallel I/O streams.

Note 3: If you do write a script to process an image, write it so that it accepts multiple filenames as parameters, then you can run parallel -X and it will pass as many filenames as your sysctl parameter kern.argmax allows. That way you won't need a whole bash or OpenCV C/C++ process per image.

Upvotes: 2

Related Questions