Reputation: 373
I'm using Pillow and working with pretty large images (at least 10500 x 10500 px), which use up quite a lot of memory. I was wondering if there was a way to lower that, so I tried loading a compressed image (~400 KB on disk instead of 420 MB) rather than creating a new one directly, but the memory usage is the same:
Line #    Mem usage     Increment    Line Contents
==================================================
   151    35.969 MiB     0.742 MiB   base = Image.open("C:/Users/Nick/Desktop/transparent.png")
   152   456.992 MiB   421.023 MiB   base.load()
   155   877.641 MiB   420.648 MiB   base_hallway = Image.new("RGBA", (map_width_px, map_height_px))
I also tried using a JPEG or Image.new() with RGB only for the second image, but dropping the alpha channel didn't help either.
Line #    Mem usage     Increment    Line Contents
==================================================
   151    36.309 MiB     0.766 MiB   base = Image.open("C:/Users/Nick/Desktop/transparent.png")
   152   457.359 MiB   421.051 MiB   base.load()
   156   457.367 MiB     0.008 MiB   base_hallway = Image.open("C:/Users/Nick/Desktop/blackjpg.jpg")
   157   878.312 MiB   420.945 MiB   base_hallway.load()
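The increments in both listings line up with the size of the decoded pixel buffer rather than the size of the file on disk. Here is a minimal sketch of that arithmetic, assuming Pillow stores both RGB and RGBA pixels in 4-byte slots once an image is loaded. The BYTES_PER_PIXEL table and the decoded_size_mib helper are illustrative names, not part of Pillow.

```python
# Rough estimate of how much memory a decoded image occupies.
# Assumption: "RGB" and "RGBA" both take 4 bytes per pixel in memory,
# while 8-bit grayscale ("L") takes 1 byte per pixel.
BYTES_PER_PIXEL = {"L": 1, "RGB": 4, "RGBA": 4}


def decoded_size_mib(width, height, mode):
    """Return the approximate in-memory size of a decoded image in MiB."""
    return width * height * BYTES_PER_PIXEL[mode] / 2**20


# The on-disk format (PNG, JPEG) does not appear anywhere in the estimate.
for mode in ("RGBA", "RGB", "L"):
    print(mode, round(decoded_size_mib(10500, 10500, mode), 2), "MiB")
```

Under that assumption, a 10500 x 10500 image comes out at roughly 420 MiB in either RGBA or RGB, which is close to the ~421 MiB increments measured above and would explain why switching to a JPEG or dropping the alpha channel made no difference.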
The main operation on the base images is pasting other images on top of them at different positions. The rooms and hallways also have operations applied to them (picking the proper paste position depending on the previous room or hallway, rotating if necessary, etc.), but those use almost no memory in comparison. Since dozens or even hundreds of items get pasted on top, I can't close the base images after every iteration so that only base OR base_hallway is open at any one time. I tried opening base and base_hallway only when needed, but that requires a lot of save and close operations and ended up increasing the run time tenfold. Simplified:
room = Image.open(open_room)
if next_tile == "room":
    base.paste(room, box=(rand_width_position, rand_height_position), mask=room)
elif next_tile == "hallway" or next_tile == "junction":
    base_hallway.paste(room, box=(rand_width_position, rand_height_position), mask=room)
Is there any way to optimize the memory usage?
Thanks!
Upvotes: 4
Views: 4288
Reputation: 11190
I had a go with pyvips. I don't know if that's a possibility for you.
pyvips is a streaming image processing library, so rather than keeping everything in memory, it builds a network of operations and then streams pixels from your source images through the network and straight back to disc.
This program will load an image, paste a lot more images on top at random positions, then write the result back.
# usage: insert.py OUTPUT BASE TILE [TILE ...]
import sys
import random
import pyvips

# the access hint means we want to stream this image
base = pyvips.Image.new_from_file(sys.argv[2], access='sequential')

for filename in sys.argv[3:]:
    tile = pyvips.Image.new_from_file(filename, access='sequential')
    x = random.randint(0, base.width - tile.width)
    y = random.randint(0, base.height - tile.height)
    base = base.insert(tile, x, y)

# all the processing happens on the final save as the pipeline executes
base.write_to_file(sys.argv[1])
For test data, I made 100 images of 1,500 x 2,000 pixels plus a 10,000 x 10,000 pixel background image. I can run it like this:
$ /usr/bin/time -f %M:%e python3 ../insert.py x.jpg ../background.jpg *.jpg
775200:0.75
So that's 0.75 s and about 780 MB of memory for the whole process.
This is a big desktop machine with 32 threads. If I tell vips to run with fewer threads, memory use drops quite a bit:
$ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e python3 ../insert.py x.jpg ~/pics/huge.jpg *.jpg
199020:1.38
Under 200 MB now, though it's slower (1.38 s instead of 0.75 s).
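To compare the two runs side by side, the timing lines can be parsed with a few lines of Python. This is only a sketch: it assumes the first field is GNU time's %M value (peak resident set size in kilobytes) and the second is elapsed seconds, and parse_time_output is a hypothetical helper name. Dividing by 1024 gives MiB, so the figures come out slightly lower than the rounded numbers quoted in the text.

```python
def parse_time_output(line):
    """Split a 'max_rss_kb:elapsed_seconds' line into (MiB, seconds)."""
    rss_kb, elapsed = line.strip().split(":")
    return int(rss_kb) / 1024, float(elapsed)


# The two measurements reported above, keyed by how the run was configured.
runs = {
    "default concurrency": "775200:0.75",
    "VIPS_CONCURRENCY=1": "199020:1.38",
}

for label, line in runs.items():
    mib, seconds = parse_time_output(line)
    print(f"{label}: {mib:.0f} MiB peak, {seconds:.2f} s")
```

On these two measurements, dropping to a single thread cuts peak memory to about a quarter while not quite doubling the run time.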
Upvotes: 5