Reputation: 776
I'm trying to overlay one image on top of another using ImageMagick. I set up an AWS Elastic Beanstalk instance with a 16-core CPU and 32 GB of RAM (c5.4xlarge) and am running the code in a Go environment. Whenever a GET request hits the server, the following shell command is executed:
cmd := "convert "+ img1 + " -page +"+fmt.Sprintf("%.1f", offsetX)+"+"+fmt.Sprintf("%.1f", offsetY) + " " + img2 + " -background none -flatten "+outputFilePath
cmdout,err := exec.Command("sh","-c",cmd).CombinedOutput()
//convert img1.png -page +10+10 img2.png -background none -flatten output.png
img1 is around 500x500 pixels and img2 is around 200x200.
I ran a load test and found that the current setup handles only 15 requests/second at 51% CPU usage. At 25 requests/second, CPU usage reaches 95%. I strongly suspect I'm doing something wrong. I'm using ImageMagick v6.7.8. Would upgrading to the latest version, or compiling ImageMagick from source (instead of installing it with yum), help?
What should I do in order to reach 100 requests/second and make sure all vCPUs are fully utilized?
Upvotes: 1
Views: 1652
Reputation: 11210
I tried on my 2015 i5 laptop (two core, four thread). I made some test data like this:
$ mkdir sample
$ cd sample
$ vipsheader ../fg.png ../bg.png
../fg.png: 200x200 uchar, 4 bands, srgb, pngload
../bg.png: 500x500 uchar, 4 bands, srgb, pngload
$ for i in {0..1000}; do cp ../fg.png fg$i.png; done
$ for i in {0..1000}; do cp ../bg.png bg$i.png; done
So about 1,000 each of the 500x500 and 200x200 PNG images (1,001 to be exact, since {0..1000} is inclusive at both ends).
First, the base case (IM 6.9.10):
$ time for i in {0..1000}; do convert bg$i.png -page +10+10 fg$i.png -background none -flatten out$i.png; done
real 0m49.461s
user 1m4.875s
sys 0m6.690s
49s is about 20 ops/second.
Next, I tried with GNU parallel. This is a simple way to run enough of them in parallel to keep all cores loaded:
$ time parallel convert bg{}.png -page +10+10 fg{}.png -background none -flatten out{}.png ::: {0..1000}
real 0m32.278s
user 1m46.428s
sys 0m11.897s
32s is 31 ops/second. This is on a two-core laptop -- you'd see a better speedup with a larger desktop machine.
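If you'd rather drive the same fan-out from code than from GNU parallel, something like the following should work. This is only a rough sketch: it assumes ImageMagick's `convert` is on the PATH and that the sample files above are in the current directory. `build_cmd` and `run_all` are names I made up for illustration, and I have not timed this version.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_cmd(i, x=10, y=10):
    """Return the argv that composites fg<i>.png over bg<i>.png at +x+y."""
    return [
        "convert", f"bg{i}.png",
        "-page", f"+{x}+{y}", f"fg{i}.png",
        "-background", "none", "-flatten", f"out{i}.png",
    ]


def run_all(indices, workers=None):
    """Run one convert process per index, at most `workers` at a time."""
    workers = workers or os.cpu_count() or 1
    # Threads are enough here: each one just waits on a child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda i: subprocess.run(build_cmd(i), check=True), indices
        )
        # Consume the iterator so a failed convert raises here.
        return len(list(results))
```

Calling `run_all(range(1001))` would cover the same files as the shell loop. Passing the arguments as a list also means no `sh -c` is involved, so there is one less process per image and no quoting to worry about.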
Finally, I wrote a tiny pyvips program to do your task. pyvips is the Python binding for libvips, but there are Go bindings too.
import pyvips

for i in range(0, 1000):
    bg_name = "bg" + str(i) + ".png"
    fg_name = "fg" + str(i) + ".png"
    out_name = "out" + str(i) + ".png"
    bg = pyvips.Image.new_from_file(bg_name, access="sequential")
    fg = pyvips.Image.new_from_file(fg_name, access="sequential")
    result = bg.composite2(fg, "over", x=10, y=10)
    result.write_to_file(out_name)
I see:
$ time ~/try/try289.py
real 0m25.887s
user 0m36.625s
sys 0m1.442s
26s is about 40 ops/second. You'd be able to get it a bit quicker if you ran several in parallel.
One of the limits you are hitting is the PNG format -- the PNG library is single-threaded, and rather slow. If you are willing to try TIFF, you can get quite a bit more speed.
TIFF with deflate compression is functionally similar to PNG. If I try:
$ vips copy fg.png fg.tif[compression=deflate]
$ vips copy bg.png bg.tif[compression=deflate]
$ ls -l bg.*
-rw-r--r-- 1 john john 19391 Dec 27 20:48 bg.png
-rw-r--r-- 1 john john 16208 Jan 2 18:36 bg.tif
So it's actually slightly smaller, in this case. If I change the pyvips program to be:
bg_name = "bg" + str(i) + ".tif"
fg_name = "fg" + str(i) + ".tif"
out_name = "out" + str(i) + ".tif[compression=deflate]"
And run it, I see:
$ time ~/try/try289.py
real 0m17.618s
user 0m23.234s
sys 0m1.823s
About 55 ops/second.
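To put the four runs side by side, here is the arithmetic behind the ops/second figures, as a small sketch. The wall-clock times are the "real" values reported above. The item counts are my reading of the commands: the shell loops use {0..1000}, which is 1,001 items, and the pyvips loop uses range(0, 1000), which is 1,000. The `runs` table and `ops_per_second` helper are just for illustration.

```python
# (number of composites, wall-clock seconds) for each run above
runs = {
    "convert, serial": (1001, 49.461),
    "convert, GNU parallel": (1001, 32.278),
    "pyvips, PNG": (1000, 25.887),
    "pyvips, TIFF deflate": (1000, 17.618),
}


def ops_per_second(count, seconds):
    """Throughput as completed composites per wall-clock second."""
    return count / seconds


throughput = {name: ops_per_second(n, t) for name, (n, t) in runs.items()}
baseline = throughput["convert, serial"]

for name, rate in throughput.items():
    # Speedup is relative to the plain serial convert loop.
    print(f"{name:24s} {rate:5.1f} ops/s  {rate / baseline:4.2f}x")
```

These are numbers from a two-core laptop, so they say more about the relative cost of each approach than about what a 16-core instance would reach.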
Upvotes: 3