pandagrammer

Reputation: 871

How to eliminate for-loop and use list comprehension when using numpy array?

I am trying to avoid using a for loop with NumPy arrays.

Suppose I have code like this:

psimaps = [np.zeros((10,10)) for i in range(len(features)-1)]

for k in range(len(features)-1):
    if k != len(features)-2:
        psimaps[k] = np.reshape(np.sum(featureParams*features[k], axis=1), (10,1)) + transitiveParams
    else:
        psimaps[k] = np.reshape(np.sum(featureParams*features[k], axis=1), (10,1)) + (np.sum(featureParams * features[k+1], axis=1)) + transitiveParams
return psimaps

How do I rewrite this as a list comprehension so the operation runs without a for loop? Thanks.


I added the original code. Basically, I'm generating a new array computed from two arrays.

Upvotes: 2

Views: 658

Answers (2)

askewchan

Reputation: 46530

Basically, all you need to do is broadcast your features array against your Params arrays. This can be done by inserting two new axes at the end of features (or more, if the Params arrays are not 2-D), so that a features array of shape (n,) becomes (n, 1, 1). Note that I used keepdims instead of reshaping after the sum.

psimaps = np.sum(featureParams*features[..., None, None], axis=2, keepdims=True) + transitiveParams

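After that, you still have to deal with that strange end-of-loop thing: add the extra term for the last feature to the second-to-last entry, then remove the last entry. The extra term has to be the plain 1-D sum (not the keepdims column), so that it broadcasts along the last axis exactly as in your loop:

psimaps[-2] += np.sum(featureParams*features[-1], axis=1)
psimaps = psimaps[:-1]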
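To make the broadcast in the one-liner above easier to follow, here is a minimal sketch that prints the intermediate shapes. The small sizes and the `arange` data are assumptions chosen only for illustration; the names `expanded`, `product`, `row_sums` and `stacked` are hypothetical and do not appear in the question.

```python
import numpy as np

# Assumed shapes: a 1-D features vector and square parameter matrices.
n, size = 5, 3
features = np.arange(1.0, n + 1)                     # shape (5,)
featureParams = np.arange(9.0).reshape(size, size)   # shape (3, 3)
transitiveParams = np.ones((size, size))             # shape (3, 3)

expanded = features[..., None, None]                 # shape (5, 1, 1)
product = featureParams * expanded                   # shape (5, 3, 3)
row_sums = product.sum(axis=2, keepdims=True)        # shape (5, 3, 1)
stacked = row_sums + transitiveParams                # shape (5, 3, 3)

print(expanded.shape, product.shape, row_sums.shape, stacked.shape)
```

Each `stacked[k]` should match what one pass of the loop computes for `features[k]` before the special case for the last index is applied.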

By the way, I first had to simplify your original loop before I could understand it. I'll leave my simplified version here in case it's of interest:

Fake data (and my assumption of shapes)

import numpy as np

size = 30
features = np.random.rand(50)
transitiveParams = np.random.rand(size, size)
featureParams = np.random.rand(size, size)

Original code by OP

psimaps_OP = [np.zeros((size,size)) for i in range(len(features)-1)]
for k in range(len(features)-1):
    if k != len(features)-2:
        psimaps_OP[k] = np.reshape(np.sum(featureParams*features[k], axis=1), (size,1)) + transitiveParams
    else:
        psimaps_OP[k] = np.reshape(np.sum(featureParams*features[k], axis=1), (size,1)) + (np.sum(featureParams * features[k+1], axis=1)) + transitiveParams

simplified:

psimaps_simp = np.zeros((len(features)-1, size, size))
for k in range(len(features)-1):
    psimaps_simp[k] = np.sum(featureParams*features[k], axis=1, keepdims=True) + transitiveParams
psimaps_simp[-1] += np.sum(featureParams*features[-1], axis=1)

list comp:

psimaps_comp = [np.sum(featureParams*features[k], axis=1, keepdims=True) + transitiveParams for k in range(len(features)-1)]
psimaps_comp[-1] += np.sum(featureParams*features[-1], axis=1)

vectorised:

psimaps_vec = np.sum(featureParams*features[..., None, None], axis=2, keepdims=True) + transitiveParams
# add the plain 1-D sum so it broadcasts along the last axis, as in the loop
psimaps_vec[-2] += np.sum(featureParams*features[-1], axis=1)
psimaps_vec = psimaps_vec[:-1]

Next, check to make sure they all give the same result:

assert np.allclose(psimaps_simp, psimaps_OP), "simplification failed"
assert np.allclose(psimaps_simp, psimaps_vec), "vectorization failed"
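For anyone who wants to run the comparison end to end, the following is a sketch that wraps the loop and the vectorised version in two functions and checks them against each other. The function names `psimaps_loop` and `psimaps_vectorised` are hypothetical, the shapes are the same assumptions as in the fake data, and the sketch assumes features has at least two elements.

```python
import numpy as np

def psimaps_loop(features, featureParams, transitiveParams):
    """Reference loop, following the structure of the question's code."""
    n = len(features) - 1
    out = []
    for k in range(n):
        m = np.sum(featureParams * features[k], axis=1, keepdims=True) + transitiveParams
        if k == n - 1:
            # The last map also gets the 1-D sum for the final feature,
            # which broadcasts along the last axis.
            m = m + np.sum(featureParams * features[k + 1], axis=1)
        out.append(m)
    return np.array(out)

def psimaps_vectorised(features, featureParams, transitiveParams):
    """Same result without a Python-level loop."""
    features = np.asarray(features)
    # Row sums for every feature at once: shape (len(features), size, 1).
    row_sums = np.sum(featureParams * features[:, None, None], axis=2, keepdims=True)
    out = row_sums[:-1] + transitiveParams
    # Special case: fold the final feature's 1-D sum into the last map.
    out[-1] += row_sums[-1, :, 0]
    return out

# Assumed test data, seeded so the check is repeatable.
rng = np.random.default_rng(0)
size = 30
features = rng.random(50)
featureParams = rng.random((size, size))
transitiveParams = rng.random((size, size))

loop_result = psimaps_loop(features, featureParams, transitiveParams)
vec_result = psimaps_vectorised(features, featureParams, transitiveParams)
assert np.allclose(loop_result, vec_result), "vectorization failed"
```

Slicing `row_sums[:-1]` before adding `transitiveParams` avoids building the extra last entry only to throw it away.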

Finally, timings:

#OP
100 loops, best of 3: 1.99 ms per loop

#simplified:
1000 loops, best of 3: 1.94 ms per loop

#list comp:
1000 loops, best of 3: 1.63 ms per loop

#vectorised:
1000 loops, best of 3: 407 µs per loop
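The numbers above will differ from machine to machine. A rough way to reproduce the comparison yourself is sketched below using the standard `timeit` module; the helper names `list_comp` and `vectorised` and the seeded data are assumptions, and no particular timings are implied.

```python
import timeit
import numpy as np

# Assumed data, same shapes as the fake data in this answer.
rng = np.random.default_rng(0)
size = 30
features = rng.random(50)
featureParams = rng.random((size, size))
transitiveParams = rng.random((size, size))

def list_comp():
    maps = [np.sum(featureParams * f, axis=1, keepdims=True) + transitiveParams
            for f in features[:-1]]
    maps[-1] = maps[-1] + np.sum(featureParams * features[-1], axis=1)
    return maps

def vectorised():
    row_sums = np.sum(featureParams * features[:, None, None], axis=2, keepdims=True)
    out = row_sums[:-1] + transitiveParams
    out[-1] += row_sums[-1, :, 0]
    return out

# Best of 3 runs of 100 calls each, reported per call.
for name, fn in [("list comp", list_comp), ("vectorised", vectorised)]:
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{name}: {best * 1e6:.0f} us per call")
```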

Upvotes: 3

Evelin Amorim

Reputation: 1078

If the initialization is not important, maybe you can do it like this:

psimaps = [featureParams + transitiveParams for k in range(1, 10)]

For each k, the sum featureParams + transitiveParams will be evaluated.

Upvotes: 0
