Potato
Potato

Reputation: 561

Better alternative to nested for loops through arrays in numpy?

Often I need to traverse an array and perform some operation on each entry, where the operation may depend on the indices and the value of the entry. Here is a simple example.

import numpy as np

N=10
M = np.zeros((N,N))

for i in range(N):
    for j in range(N):
        M[i,j] = 1/((i+2*j+1)**2)

Is there a shorter, cleaner, or more pythonic way to perform such tasks?

Upvotes: 6

Views: 8597

Answers (4)

scavi
scavi

Reputation: 57

Yes, you can do this in pure NumPy without using any loops:

import numpy as np

N = 10

i = np.arange(N)[:, np.newaxis]
j = np.arange(N)
M = 1/((i+2*j+1)**2)

The reason why this works is because NumPy automatically performs outer products whenever you mix row- and column vectors within an expression.

Moreover, since this is pure NumPy, the code will also run a lot faster. For example, for N=10**4, the double for loop version takes 48.3 seconds on my computer, whereas this code is already finished after only 1.2 seconds.

Upvotes: 1

B. M.
B. M.

Reputation: 18628

np.fromfunction is intended for that :

def f(i,j) : return 1/((i+2*j+1)**2) 
M = np.fromfunction(f,(N,N))

it's slighty slower that the 'hand made' vectorised way , but easy to understand.

Upvotes: 3

hpaulj
hpaulj

Reputation: 231355

What you show is 'pythonic' in the sense that it uses a Python list and iteration approach. The only use of numpy is in assigning the values, M{i,j] =. Lists don't take that kind of index.

To make most use of numpy, make index grids or arrays, and calculate all values at once, without explicit loop. For example, in your case:

In [333]: N=10
In [334]: I,J = np.ogrid[0:10,0:10]
In [335]: I
Out[335]: 
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
In [336]: J
Out[336]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
In [337]: M = 1/((I + 2*J + 1)**2)
In [338]: M
Out[338]: 
array([[ 1.        ,  0.11111111,  0.04      ,  0.02040816,  0.01234568,
         0.00826446,  0.00591716,  0.00444444,  0.00346021,  0.00277008],
 ...
       [ 0.01      ,  0.00694444,  0.00510204,  0.00390625,  0.00308642,
         0.0025    ,  0.00206612,  0.00173611,  0.00147929,  0.00127551]])

ogrid is one of several ways of construction sets of arrays that can be 'broadcast' together. meshgrid is another common function.

In your case, the equation is one that works well with 2 arrays like this. It depends very much on broadcasting rules, which you should study.

If the function only takes scalar inputs, we will have to use some form of iteration. That has been a frequent SO question; search for [numpy] vectorize.

Upvotes: 5

alainh
alainh

Reputation: 83

I would say that's the most straight forward and universally understood way of performing that iteration.

An alternative would be to iterate over over the values and call a function for a given (i, j) pair

import itertools

N = 10
M = np.zeros((N,N))

def do_work(i, j):
       M[i,j] = 1/((i+2*j+1)**2)

[do_work(i, j) for (i, j) in itertools.product(xrange(N), xrange(N))]

Here I just used itertools.product to create a generator for an possible (i, j) values, you can just as well use a for loop.

for (i, j) in itertools.product(xrange(N), xrange(N)):
    M[i,j] = 1/((i+2*j+1)**2)

Upvotes: 1

Related Questions