Saikiran

Reputation: 776

How to optimize ImageMagick CPU usage on server

I'm trying to overlay one image on top of another using ImageMagick. I set up an AWS Elastic Beanstalk machine with a 16-core CPU and 32 GB of RAM (c5.4xlarge) and am running the code in a Go environment. Whenever a GET request hits the server, the following shell command gets executed:

cmd := "convert "+ img1 + " -page +"+fmt.Sprintf("%.1f", offsetX)+"+"+fmt.Sprintf("%.1f", offsetY) + " " + img2 + " -background none -flatten "+outputFilePath
cmdout,err := exec.Command("sh","-c",cmd).CombinedOutput()
//convert img1.png -page +10+10 img2.png -background none -flatten  output.png
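
For what it's worth, I know I could also skip the intermediate shell and pass the argument list to convert directly. Something like this (untested) should be a drop-in replacement for the two lines above:

page := fmt.Sprintf("+%.1f+%.1f", offsetX, offsetY)
cmdout, err := exec.Command("convert", img1, "-page", page,
    img2, "-background", "none", "-flatten", outputFilePath).CombinedOutput()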

img1 is around 500x500 pixels and img2 is around 200x200.

I performed a load test and found that the current setup can only handle 15 requests/second at 51% CPU usage; at 25 req/sec, CPU usage hits 95%. I strongly believe I'm doing something wrong. I'm using ImageMagick v6.7.8. Would upgrading to the latest version, or compiling ImageMagick from source (instead of installing via yum), help?

What should I be doing in order to reach 100 req/sec and make sure all vCPUs are optimally utilized?

Upvotes: 1

Views: 1652

Answers (1)

jcupitt

Reputation: 11210

I tried this on my 2015 i5 laptop (two cores, four threads). I made some test data like this:

$ mkdir sample
$ cd sample
$ vipsheader ../fg.png ../bg.png 
../fg.png: 200x200 uchar, 4 bands, srgb, pngload
../bg.png: 500x500 uchar, 4 bands, srgb, pngload
$ for i in {0..1000}; do cp ../fg.png fg$i.png; done
$ for i in {0..1000}; do cp ../bg.png bg$i.png; done

So that's roughly 1,000 copies each of the 500x500 and 200x200 PNG images.

First, the base case (IM 6.9.10):

$ time for i in {0..1000}; do convert bg$i.png -page +10+10 fg$i.png -background none -flatten out$i.png; done
real    0m49.461s
user    1m4.875s
sys 0m6.690s

49s is about 20 ops/second.

Next, I tried with GNU parallel. This is a simple way to run enough of them in parallel to keep all cores loaded:

$ time parallel convert bg{}.png -page +10+10 fg{}.png -background none -flatten out{}.png ::: {0..1000}
real    0m32.278s
user    1m46.428s
sys 0m11.897s

32s is about 31 ops/second. This is on a two-core laptop; you'd see a better speedup on a larger machine like yours.
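
In your Go server you can get the same effect by capping how many convert processes run at once, for example with a buffered channel used as a semaphore. Here's a minimal sketch; the pool size of 16 and the overlay helper are just illustrative:

package main

import (
	"fmt"
	"log"
	"os/exec"
)

// sem caps the number of convert processes running at once; sizing it
// to the machine's vCPU count is a reasonable starting point.
var sem = make(chan struct{}, 16)

// overlay runs convert directly (no intermediate shell) while holding
// a slot in the semaphore.
func overlay(img1, img2, out string, offsetX, offsetY float64) error {
	sem <- struct{}{}        // acquire a slot
	defer func() { <-sem }() // release it when convert exits

	page := fmt.Sprintf("+%.1f+%.1f", offsetX, offsetY)
	b, err := exec.Command("convert", img1, "-page", page, img2,
		"-background", "none", "-flatten", out).CombinedOutput()
	if err != nil {
		return fmt.Errorf("convert failed: %v: %s", err, b)
	}
	return nil
}

func main() {
	if err := overlay("bg0.png", "fg0.png", "out0.png", 10, 10); err != nil {
		log.Fatal(err)
	}
}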

Finally, I wrote a tiny pyvips program to do your task. pyvips is the Python binding for libvips, but there are Go bindings too.

import pyvips

for i in range(0, 1000):
    bg_name = "bg" + str(i) + ".png"
    fg_name = "fg" + str(i) + ".png"
    out_name = "out" + str(i) + ".png"

    # sequential access lets libvips stream the images rather than
    # decoding them fully into memory
    bg = pyvips.Image.new_from_file(bg_name, access="sequential")
    fg = pyvips.Image.new_from_file(fg_name, access="sequential")

    # composite fg over bg with its top-left corner at (10, 10)
    result = bg.composite2(fg, "over", x=10, y=10)

    result.write_to_file(out_name)

I see:

$ time ~/try/try289.py 
real    0m25.887s
user    0m36.625s
sys 0m1.442s

26s is about 40 ops/second. You'd be able to get it a bit quicker if you ran several in parallel.
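
Since your server is in Go, the same operation through a Go binding would look something like the sketch below. I haven't run this; it assumes the govips v2 API (github.com/davidbyttow/govips), so check the names against the version you install:

package main

import (
	"log"
	"os"

	"github.com/davidbyttow/govips/v2/vips"
)

func main() {
	vips.Startup(nil) // initialise libvips once per process
	defer vips.Shutdown()

	bg, err := vips.NewImageFromFile("bg.png")
	if err != nil {
		log.Fatal(err)
	}
	fg, err := vips.NewImageFromFile("fg.png")
	if err != nil {
		log.Fatal(err)
	}

	// composite fg over bg at (10, 10), matching the pyvips composite2 call
	if err := bg.Composite(fg, vips.BlendModeOver, 10, 10); err != nil {
		log.Fatal(err)
	}

	buf, _, err := bg.ExportPng(vips.NewPngExportParams())
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("out.png", buf, 0644); err != nil {
		log.Fatal(err)
	}
}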

One of the limits you are hitting is the PNG format: the PNG library is single-threaded, and rather slow. If you are willing to try TIFF, you can get quite a bit more speed.

TIFF with deflate compression is functionally similar to PNG. If I try:

$ vips copy fg.png fg.tif[compression=deflate]
$ vips copy bg.png bg.tif[compression=deflate]
$ ls -l bg.*
-rw-r--r-- 1 john john 19391 Dec 27 20:48 bg.png
-rw-r--r-- 1 john john 16208 Jan  2 18:36 bg.tif

So the TIFF is actually slightly smaller, in this case. If I change the pyvips program to:

bg_name = "bg" + str(i) + ".tif"
fg_name = "fg" + str(i) + ".tif"
out_name = "out" + str(i) + ".tif[compression=deflate]"

And run it, I see:

$ time ~/try/try289.py 
real    0m17.618s
user    0m23.234s
sys 0m1.823s

About 55 ops/second.

Upvotes: 3
