Reputation: 3624
I have 3 numpy arrays dm_w, dm_s and dm_p. I need to iterate through these arrays in parallel and do some computation based on a check condition, as shown in the code below.
My code works well for smaller arrays, but takes too long with larger ones. I need a more efficient, faster way to achieve this, and would appreciate some expert opinion.
My code:
import numpy as np
from math import log10

prox_mat = []
for w_dist, s_dist, PI in zip(np.nditer(dm_w), np.nditer(dm_s), np.nditer(dm_p)):
    if PI == 0.0:
        proximity_score = ((w_dist + len(np.unique(dm_s) * s_dist)) /
                           (dm_w.shape[0] * len(np.unique(dm_s))))
        prox_mat.append(proximity_score)
    else:
        proximity_score = ((w_dist + len(np.unique(dm_s) * s_dist)) /
                           (dm_w.shape[0] * len(np.unique(dm_s)))) * log10(10 * PI)
        prox_mat.append(proximity_score)
ps = np.array(prox_mat)
ps = np.reshape(ps, dm_w.shape)
Upvotes: 1
Views: 1374
Reputation: 13261
Several things. First, the computation of np.unique(dm_s)
should be pulled outside of the loop. Further, it looks like:
len(np.unique(dm_s) * s_dist) == len(np.unique(dm_s))
since multiplying an array by a scalar doesn't change its length. That expression should either be hoisted out of the loop, or it is a mistake. In any case:
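To see why that identity holds, here is a minimal sketch (the sample values are made up for the demo): scalar multiplication broadcasts over the array and leaves its length unchanged, so the `s_dist` factor has no effect on `len(...)`.

```python
import numpy as np

# Multiplying an array by a scalar broadcasts element-wise and
# preserves the array's length.
dm_s = np.array([[0.0, 1.5], [1.5, 3.0]])
uniques = np.unique(dm_s)        # array([0. , 1.5, 3. ])
s_dist = 2.0                     # any scalar from the iteration

print(len(uniques * s_dist))     # 3
print(len(uniques))              # 3
```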
We should just vectorize the for-loop/append construct:
dm_s_uniques = len(np.unique(dm_s))
logs = np.log10(10 * dm_p)
logs[logs == -np.inf] = 1  # entries where PI == 0 fall back to a multiplier of 1
prox_mat = ((dm_w + dm_s_uniques) / (dm_w.shape[0] * dm_s_uniques)) * logs
ps = np.reshape(prox_mat, dm_w.shape)
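As a sanity check, the vectorized version can be compared against the original loop on small random matrices (the shapes and values below are assumptions for the demo, not taken from the question):

```python
import numpy as np

# Small random inputs standing in for the real dm_w, dm_s, dm_p.
rng = np.random.default_rng(0)
dm_w = rng.random((4, 4))
dm_s = rng.integers(0, 3, size=(4, 4)).astype(float)
dm_p = rng.integers(0, 2, size=(4, 4)).astype(float)  # includes zeros

n_uniq = len(np.unique(dm_s))

# Original loop (with np.unique hoisted out).
loop_out = []
for w_dist, s_dist, PI in zip(np.nditer(dm_w), np.nditer(dm_s), np.nditer(dm_p)):
    base = (w_dist + n_uniq) / (dm_w.shape[0] * n_uniq)
    loop_out.append(base if PI == 0.0 else base * np.log10(10 * PI))
loop_out = np.reshape(np.array(loop_out), dm_w.shape)

# Vectorized version; errstate silences the log10(0) warning.
with np.errstate(divide="ignore"):
    logs = np.log10(10 * dm_p)
logs[logs == -np.inf] = 1
vec_out = ((dm_w + n_uniq) / (dm_w.shape[0] * n_uniq)) * logs

print(np.allclose(loop_out, vec_out))  # True
```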
Upvotes: 4