Reputation: 2651
I have a function which I would like to vectorize away the remaining loop. I believe it is correct and I am happy with performance, but would just like to know more about vectorizing code. The function is:
def f(x, A, c):
# A : d*d numpy array
# c : length d numpy array
# x : N x d or length d numpy array
x = np.atleast_2d(x)
b = np.zeros(x.shape[0], dtype=np.bool)
for row in range(x.shape[0]):
xmc = x[row, :] - c
b[row] = xmc.dot(A).dot(xmc) <= 1
return b
Is it possible to vectorize the function and remove the remaining loop while keeping it reasonably simple? Are there any guidelines for when independent calculations in a loop cannot be vectorized well? Typical values for N and d are 10000 and 4 respectively. Thank you.
Upvotes: 1
Views: 608
Reputation: 221684
You could vectorize like so -
xc = x-c
b_out = ((xc.dot(A))*xc).sum(1) <= 1
You can also use np.einsum
-
xc = x-c
b_out = np.einsum('ij,jk,ik->i',xc,A,xc) <= 1
Runtime tests -
Define functions:
def org_app(x, A, c):
x = np.atleast_2d(x)
b = np.zeros(x.shape[0], dtype=np.bool)
for row in range(x.shape[0]):
xmc = x[row, :] - c
b[row] = xmc.dot(A).dot(xmc) <= 1
return b
def vectorized_app1(x,A,c):
xc = x-c
return ((xc.dot(A))*xc).sum(1) <= 1
def vectorized_app2(x,A,c):
xc = x-c
return np.einsum('ij,jk,ik->i',xc,A,xc) <= 1
Timings:
In [266]: N = 20
...: d = 20
...: A = np.random.rand(d,d)
...: c = np.random.rand(d)
...: x = np.random.rand(N,d)
...:
In [267]: %timeit org_app(x,A,c)
1000 loops, best of 3: 274 µs per loop
In [268]: %timeit vectorized_app1(x,A,c)
10000 loops, best of 3: 46 µs per loop
In [269]: %timeit vectorized_app2(x,A,c)
10000 loops, best of 3: 63.7 µs per loop
In [270]: N = 100
...: d = 100
...: A = np.random.rand(d,d)
...: c = np.random.rand(d)
...: x = np.random.rand(N,d)
...:
In [271]: %timeit org_app(x,A,c)
100 loops, best of 3: 2.74 ms per loop
In [272]: %timeit vectorized_app1(x,A,c)
1000 loops, best of 3: 1.46 ms per loop
In [273]: %timeit vectorized_app2(x,A,c)
100 loops, best of 3: 4.72 ms per loop
Upvotes: 1