Matthew Cassell
Matthew Cassell

Reputation: 267

Vectorising numpys roll

I'm doing some sampling from an array at the moment, and I'm using a boolean mask to count the number of values that lay within a certain radius from a given centroid. I was wondering if someone could show me how to vectorise my approach instead of using a python loop to loop over elements of a list.

I have a 1d numpy array called centres, where each element of centres is a centroid given as a 1d numpy array. I have also created a boolean mask which is centered in the middle of the array ('centre_Y', 'centre_X' in the code below), which I am rolling from the middle of the array to the centroid position. I am then making the mask be sparse because my actual array that I am sampling from ('item' in the code below) is sparse. I then count the number of nonzero elements under the mask. Below is my code

for item in data:

    ### Do stuff

    for radius in radii:

        ### Do stuff

        # roll mask to centroid and count the number of elements within radius
        for centroid in centres:

            # roll in the vertical direction to centroid y coordinate
            mask_roll_y = np.roll(mask,centroid[0]-centre_Y,axis=0)

            # roll in the horizontal direction to centroid x coordinate and make sparse
            roll_mask = sparse.csr_matrix(np.roll(mask_roll_y,centroid[1]-centre_X,axis=1))

            # apply sparse mask to sparse data and count number of nonzero elements below
            number_count.append(np.count_nonzero(item[roll_mask]))

Now, the code above works perfectly fine and gives the results I want. My issue is, after doing some timings, the 'for centroid in centres' loop takes about .4 seconds to compute (for one array in data while using 50 radii and taking 100 samples for each radii), which is far and away the most time consuming part of my code. I need to do this for about 100 data sets with roughly 1000 arrays in each data set and I would like to take far more radii and samples than I did previously. So, I was wondering how I could eliminate the 'for centroid in centres' loop? I tried the following horrible piece of code

number_count.append(np.count_nonzero(item[sparse.csr_matrix(np.roll(np.roll(mask,centres[:][0]-centre_Y,axis=0),centres[:][1]-centre_X,axis=1))]))

where I tried to vectorise centres, but this just gave me a list of zeros in number count of length len(centres). Could anyone help please?

EDIT:

I forgot to specify that I need to roll my mask from the center to certain positions to maintain periodicity of the mask. Applying the mask straight away near the edges of the boolean array doesn't wrap the mask around to the parallel edge, and I don't want the mask to have a cutoff. This is because the data I am applying the mask to comes from a simulation with periodic boundary conditions.

UPDATE:

So I managed to write a piece of code using Numba that eliminated the need for masks and is quite fast. It takes in a 2d numpy array, a list of positions (centroids) to sample from and a radius to sample out to. The code iterates through the positions and searches for the nonzero elements within radius distance from each position. The boundaries are handled periodically. Some aspects of the code are redundant and I'm sure it can be made better and more general, but it works for what I need. Thanks to those who gave their input into the problem.

@nb.njit(fastmath=True)
def nonzero_count(array,positions,radius):

    stored_values = list()
    y,x = array.shape

    for k in range(len(positions)):

        pos_y,pos_x = positions[k][0],positions[k][1]
        nonzero = 0

        # this section handles the periodic boundary conditions
        if pos_y+radius+1>y:
            # recenter pos_y so it is given a negative value
            aa = y-(pos_y+radius+1)
            # iterate around new pos_y, from -'ve to +'ve
            yy = (aa-radius,aa+radius+1)
        else:
            aa = pos_y
            yy = (pos_y-radius,pos_y+radius+1)

        if pos_x+radius+1>x:
            # recenter pos_x so it is given a negative value
            bb = x-(pos_x+radius+1)
            # iterate around new pos_x, from -'ve to +'ve
            xx = (bb-radius,bb+radius+1)
        else:
            bb = pos_x
            xx = (pos_x-radius,pos_x+radius+1)

        # this section handles the count
        for i in range(yy[0],yy[1]):
            for j in range(xx[0],xx[1]):
                # check nonzero elements lie within radius
                if array[i,j] != 0 and (bb-j)**2+(aa-i)**2<=radius**2:
                    nonzero += 1

        stored_values.append(nonzero)

    return stored_values

Upvotes: 0

Views: 310

Answers (2)

Mike
Mike

Reputation: 1827

If your mask is sparse, it is better to first convert it to sparse and then roll it; this will give you significant speedups. As you are doing it now, you are converting the matrix to sparse in the loop, which is both expensive and negates the advantages.

Observe:

import numpy as np
import scipy.sparse as sparse

zero_matrix = np.zeros((1024, 1024))
%timeit np.roll(zero_matrix, (10, 10))
>>1000 loops, best of 3: 1.36 ms per loop

sparse_matrix = sparse.random(1024, 1024, density = 0.001)
%timeit np.roll(sparse_matrix, (10, 10))
>>100000 loops, best of 3: 15.4 µs per loop

Upvotes: 0

pythonweb
pythonweb

Reputation: 1073

I would suggest using the numba package to speed up these loops, it seems it would be perfect for your use case. Just do something like the following, and some refactoring will most likely be required:

import numba as nb

@nb.njit(nopython=True)
def slow_code():
    # roll in the vertical direction to centroid y coordinate
    mask_roll_y = np.roll(mask,centroid[0]-centre_Y,axis=0)

    # roll in the horizontal direction to centroid x coordinate and make sparse
    roll_mask = sparse.csr_matrix(np.roll(mask_roll_y,centroid[1]-centre_X,axis=1))

    # apply sparse mask to sparse data and count number of nonzero elements below
    number_count.append(np.count_nonzero(item[roll_mask]))

for item in data:

    ### Do stuff

    for radius in radii:

        ### Do stuff

        # roll mask to centroid and count the number of elements within radius

        for centroid in centres:
            slow_code()

Upvotes: 1

Related Questions