How to vectorize tasks in python?

Question

I (will) have a list of coordinates; using python's pillow module, I want to save a series of (cropped) smaller images to disk. Currently, I am using a for loop to act to determine one coordinate at a time then crop/save the image before proceeding to the next coordinate.

Is there a way to divide this job up such that multiple images can be cropped/saved simultaneously? I understand that this would take up more RAM but would be decrease performance time.

I'm sure this is possible but I'm not sure if this is simple. I've heard terms like 'vectorization' and 'multi-threading' that sound vaguely appropriate to this situation. But these topics extend beyond my experience.

I've attached the code for reference. However, I'm simply trying to solicit recommended strategies. (i.e. what techniques should I learn about to better tailor my approach, take multiple crops at once, etc?)


def parse_image(source, square_size, count, captures, offset=0, offset_type=0, print_coords=False):
    """
    Starts at top left corner of image. Iterates through image by square_size (width = height)
    across x values and after exhausting the row, begins next row lower by function of 
    square_size. Offset parameter is available such that, with multiple function calls, 
    overlapping images could be generated.
    """
    src = Image.open(source)
    dimensions = src.size
    max_down = int(src.height/square_size) * square_size + square_size
    max_right = int(src.width/square_size) * square_size + square_size

    if offset_type == 1:
        tl_x = 0 + offset
        tl_y = 0
        br_x = square_size + offset 
        br_y = square_size

        for y in range(square_size,max_down,square_size):
            for x in range(square_size + offset,max_right - offset,square_size):
                if (tl_x,tl_y) not in captures:
                    sample = src.crop((tl_x,tl_y,br_x,br_y))
                    sample.save(f"{source[:-4]}_sample_{count}_x{tl_x}_y{tl_y}.jpg")
                    captures.append((tl_x,tl_y))

                    if print_coords == True: 
                        print(f"image {count}: top-left (x,y): {(tl_x,tl_y)}, bottom-right (x,y): {(br_x,br_y)}")
                    tl_x = x
                    br_x = x + square_size
                    count +=1                
                else:
                    continue
            tl_x = 0 + offset
            br_x = square_size + offset
            tl_y = y
            br_y = y + square_size
    else:
        tl_x = 0
        tl_y = 0 + offset
        br_x = square_size 
        br_y = square_size + offset

        for y in range(square_size + offset,max_down - offset,square_size):
            for x in range(square_size,max_right,square_size):
                if (tl_x,tl_y) not in captures:
                    sample = src.crop((tl_x,tl_y,br_x,br_y))
                    sample.save(f"{source[:-4]}_sample_{count}_x{tl_x}_y{tl_y}.jpg")
                    captures.append((tl_x,tl_y))

                    if print_coords == True: 
                        print(f"image {count}: top-left (x,y): {(tl_x,tl_y)}, bottom-right (x,y): {(br_x,br_y)}")
                    tl_x = x
                    br_x = x + square_size
                    count +=1
                else:
                    continue
            tl_x = 0
            br_x = square_size 
            tl_y = y + offset
            br_y = y + square_size + offset
    return count

H_DANILO · Accepted Answer

What you want to achieve here is to have a higher degree of parallelism, the first thing to do is to understand what is the minimum task that you need to do here, and from that, think in a way to better distribute it.

First thing to notice here is that there is two behaviour, first if you have offset_type 0, and another if you have offset_type 1, split that off into two different functions.

Second thing is: given an image, you're taking crops of a given size, at a given offset(x,y) for the whole image. You could for instance, simplify this function to take one crop of the image, given the image offset(x,y). Then, you could call this function for all the x and y of the image in parallel. That's pretty much what most image processing frameworks tries to achieve, even more the one's that run code inside the GPU, small blocks of code, that operates locally in the image.

So lets say your image has width=100, height=100, and you're trying to make crops of w=10,h=10. Given the simplistic function that I described, I will call it crop(img, x, y, crop_size_x, crop_size_y) All you have to do is create the image:

img = Image.open(source)
crop_size_x = 10
crop_size_y = 10
crops = [crop(img, x, y, crop_size_x, crop_size_y) for x, y in zip(range(img.width), range(img.height))]

later on, you can then replace the list comprehension for a multi_processing library that can actually spawn many processes, do real parallelism, or even write such code inside a GPU kernel/shader, and use the GPU parallelism to achieve high performance.

How to vectorize tasks in python?

Answers (1)

Related Questions