blaufer
blaufer

Reputation: 119

Vectorizing a function with numpy arrays

I am attempting to speed up some code that I wrote but am having a large amount of trouble doing so. I know that being able to remove for loops and using numpy can help to do this so that is what I have been attempting with little success.

The working function without any speedups is

def acf(x, y, z, cutoff=0):
    steps = x.shape[1]
    natoms = x.shape[0]

    z_x = np.zeros((steps,natoms))
    z_y, z_z = np.zeros_like(z_x), np.zeros_like(z_x)

    xmean = np.mean(x, axis=1)
    ymean = np.mean(y, axis=1)
    zmean = np.mean(z, axis=1)

    for k in range(steps-cutoff): # x.shape[1]
        xtemp, ytemp, ztemp = [], [], []
        for i in range(x.shape[0]): # natoms
            xtop, ytop, ztop = 0.0, 0.0, 0.0
            xbot, ybot, zbot = 0.0, 0.0, 0.0
            for j in range(steps-k): # x.shape[1]-k
                xtop += (x[i][j] - xmean[i]) * (x[i][j+k] - xmean[i])
                ytop += (y[i][j] - ymean[i]) * (y[i][j+k] - ymean[i])
                ztop += (z[i][j] - zmean[i]) * (z[i][j+k] - zmean[i])
                xbot += (x[i][j] - xmean[i])**2
                ybot += (y[i][j] - ymean[i])**2
                zbot += (z[i][j] - zmean[i])**2
            xtemp.append(xtop/xbot)
            ytemp.append(ytop/ybot)
            ztemp.append(ztop/zbot)
        z_x[k] = xtemp
        z_y[k] = ytemp
        z_z[k] = ztemp

    z_x = np.mean(np.array(z_x), axis=1)
    z_y = np.mean(np.array(z_y), axis=1)
    z_z = np.mean(np.array(z_z), axis=1)

    return z_x, z_y, z_z

The inputs x, y, and z for this function are numpy arrays of the same dimensions. An example of x (or y or z for that matter) is:

x = np.array([[1,2,3],[4,5,6]])

So far what I have been able to do is

def acf_quick(x, y, z, cutoff=0):
    steps = x.shape[1]
    natoms = x.shape[0]

    z_x = np.zeros((steps,natoms))
    z_y, z_z = np.zeros_like(z_x), np.zeros_like(z_x)

    x -= np.mean(x, axis=1, keepdims=True)
    y -= np.mean(y, axis=1, keepdims=True)
    z -= np.mean(z, axis=1, keepdims=True)

    for k in range(steps-cutoff): # x.shape[1]
        for i in range(natoms):
            xtop, ytop, ztop = 0.0, 0.0, 0.0
            xbot, ybot, zbot = 0.0, 0.0, 0.0
            for j in range(steps-k): # x.shape[1]-k
                xtop += (x[i][j]) * (x[i][j+k])
                ytop += (y[i][j]) * (y[i][j+k])
                ztop += (z[i][j]) * (z[i][j+k])
                xbot += (x[i][j])**2
                ybot += (y[i][j])**2
                zbot += (z[i][j])**2
            z_x[k][i] = xtop/xbot
            z_y[k][i] = ytop/xbot
            z_z[k][i] = ztop/xbot

    z_x = np.mean(np.array(z_x), axis=1)
    z_y = np.mean(np.array(z_y), axis=1)
    z_z = np.mean(np.array(z_z), axis=1)

    return z_x, z_y, z_z

This speeds it up by about 33% but I believe there is a way to remove the for i in range(natoms) using something along the lines of x[:][j]. So far I have been unsuccessful and any help would be greatly appreciated.

Before anyone asks, I know that this is an autocorrelation function and there are several built into numpy, scipy, etc but I need to write my own.

Upvotes: 1

Views: 214

Answers (1)

sltzgs
sltzgs

Reputation: 166

Here is a vectorized form of your loop:

def acf_quick_new(x, y, z, cutoff=0):
    steps = x.shape[1]
    natoms = x.shape[0]

    lst_inputs = [x.copy(),y.copy(),z.copy()]
    lst_outputs = []
    for x_ in lst_inputs:

        z_x_ = np.zeros((steps,natoms))

        x_ -= np.mean(x_, axis=1, keepdims=True)

        x_top = np.diag(np.dot(x_,x_.T))
        x_bot = np.sum(x_**2, axis=1)

        z_x_[0,:] = np.divide(x_top, x_bot)


        for k in range(1,steps-cutoff): # x.shape[1]

            x_top = np.diag(np.dot(x_[:,:-k],x_.T[k:,:]))
            x_bot = np.sum(x_[:,:-k]**2, axis=1)

            z_x_[k,:] = np.divide(x_top, x_bot)


        z_x_ = np.mean(np.array(z_x_), axis=1)
        lst_outputs.append(z_x_)    

    return lst_outputs

Note, that in your _quick-function there is a little bug: you always divide by xbot instead of xbot,ybot, and zbot. Moreover, my suggestion can be written a little nicer, but it should do the trick for your problem and speed up the calculations a lot :)

Upvotes: 1

Related Questions