Reputation: 801
I am trying to make array assignment in python, but it is very slow, is there any way to accelerate?
simi_matrix_img = np.zeros((len(annot), len(annot)), dtype='float16')
for i in range(len(annot)):
for j in range(i + 1):
score = 0
times = 0
if i != j:
x_idx = [p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5]
y_idx = [p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5]
for idx in itertools.product(x_idx, y_idx):
score += simi_matrix_word[idx]
times += 1
simi_matrix_img[i, j] = score/times
else:
simi_matrix_img[i, j] = 1.0
"annot" is a numpy array. Is there any way to accelerate it?
Upvotes: 1
Views: 142
Reputation: 231425
I think the indent for this line is wrong:
simi_matrix_img[i, j] = score/times
you want to perform that assignment after all the product
iterations. But since it's the last assignment that takes, the results will be the same.
Here's a partial reworking of your code
def foo1(annot, simi_matrix_word):
N = annot.shape[0]
simi_matrix_img = np.zeros((N,N))
for i in range(N):
for j in range(i + 1):
if i != j:
x_idx = np.nonzero(annot[i])[0]
y_idx = np.nonzero(annot[j])[0]
idx = np.ix_(x_idx, y_idx)
# print(idx, simi_matrix_word[idx])
score = simi_matrix_word[idx].mean()
simi_matrix_img[i, j] = score
else:
simi_matrix_img[i, j] = 1.0
return simi_matrix_img
For a small test case, it returns the same thing:
annot=np.array([[1,0,1],[0,1,1]])
simi_matrix_word = np.arange(12, dtype=float).reshape(3,4)
[[ 1. 0.]
[ 7. 1.]]
That gets rid of all the inner iterations. Next step would be reduce the outer iterations. For example start with np.eye(N)
, and just iterate on the lower tri indices:
In [169]: np.eye(2)
Out[169]:
array([[ 1., 0.],
[ 0., 1.]])
In [170]: np.tril_indices(2,-1)
Out[170]: (array([1]), array([0]))
Note that for a 2 row annot
, we are only calculating one score
, at [1,0]
.
Replacing nonzero
with boolean indexing:
def foo3(annot, simi_matrix_word):
N = annot.shape[0]
A = annot.astype(bool)
simi_matrix_img = np.eye(N,dtype=float)
for i,j in zip(*np.tril_indices(N,-1)):
score = simi_matrix_word[A[i],:][:,A[j]]
simi_matrix_img[i, j] = score.mean()
return simi_matrix_img
or this might speed up the indexing a bit:
def foo4(annot, simi_matrix_word):
N = annot.shape[0]
A = annot.astype(bool)
simi_matrix_img = np.eye(N,dtype=float)
for i in range(1,N):
x = simi_matrix_word[A[i],:]
for j in range(i):
score = x[:,A[j]]
simi_matrix_img[i, j] = score.mean()
return simi_matrix_img
Since the number of nonzero values for each row of annot
can differ, the number of terms that are summed for each score
also differs. That strongly suggests that further vectorization is impossible.
Upvotes: 1
Reputation: 1200
(1) You could use generators instead of list comprehension where possible. For example:
x_idx = (p1 for (p1, q1) in enumerate(annot[i]) if np.abs(q1 - 1) < 1e-5)
y_idx = (p2 for (p2, q2) in enumerate(annot[j]) if np.abs(q2 - 1) < 1e-5)
With this, you iterate only once over those items (in for idx in itertools.product(x_idx, y_idx)
), as opposed to twice (once for constructing the list then again in said for
loop).
(2) What Python are you using? If <3, I have a hunch that a significant part of the problem is you're using range()
, which can be expensive in connection with really large ranges (as I'm assuming you're using here). In Python 2.7, range()
actually constructs lists (not so in Python 3), which can be an expensive operation. Try achieving the same result using a simple while
loop. For example, instead of for i in range(len(annot))
, do:
i=0
while i < len(annot):
... do stuff with i ...
i += 1
(3) Why call len(annot)
so many times? It doesn't seem like you're mutating annot
. Although len(annot)
is a fast O you could store the length in a var, e.g., annot_len = len(annot)
, and then just reference that. Wouldn't scrape much off though, I'm afraid.
Upvotes: 1