Reputation: 5646
I want to the accelerate following double for
loop:
n = 10000
A = np.zeros((n, n))
twopi = 2.0 * np.pi
for i in range(0, n):
for j in range(0, n):
A[i,j] = twopi * i*j/n
How could I do it? My idea: compute a vector v
containing all the elements from 0
to n-1
:
v = np.arange(n)
then construct two matrices I
and J
, one having all rows equal to v
and the other one having all columns equal to v
. If I'm not mistaken, then I could substitute the double for
loop with
A = twopi * I*J/n
Is this correct? How do I build I
and J
? Are there better ways?
Upvotes: 0
Views: 94
Reputation: 114015
In [118]: matrix = np.array([np.arange(n) for _ in range(n)])
In [119]: matrix
Out[119]:
array([[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
...,
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999],
[ 0, 1, 2, ..., 9997, 9998, 9999]])
In [120]: matrix [0,1]
Out[120]: 1
In [121]: for j in range(n): matrix[:,j] *= j
In [122]: matrix
Out[122]:
array([[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
...,
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001],
[ 0, 1, 4, ..., 99940009, 99960004, 99980001]])
In [123]: matrix = matrix/n
In [124]: matrix = matrix * np.pi*2
In [125]: matrix
Out[125]:
array([[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
...,
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04]])
Same thing as a script:
matrix = np.array([np.arange(n) for _ in range(n)])
for j in range(n): matrix[:,j] *= j
matrix = matrix/n
matrix = matrix * np.pi*2
Upvotes: 1
Reputation: 2167
You can indeed use numpy
's broadcasting ability.
import numpy as np
n = 10000
twopi = 2.0 * np.pi / n
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi
The reshape of the second np.arange()
drives the product to broadcast.
It's very similar to calling np.outer()
as Dr.V did, even from a time performance perspective.
Config:
import numpy as np
n = 10000
twopi = 2 * np.pi / n
Times:
# List comprehension
A = np.array([
[twopi*i*j for j in range(n)]
for i in range(n)])
# timeit > 10.2 s ± 129 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# Outer
i = np.arange(n)
A = twopi * np.outer(i, i)
# timeit > 234 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Product
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi
# timeit > 208 ms ± 41.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
As said, the outer and product methods are very similar. But the list comprehension is not because it defeats the purpose of numpy which is to vectorize the array computations with distributed sub-tasks (and, very often, low-level optimizations).
Upvotes: 1
Reputation: 1914
Something like this:
from numpy import outer, pi, arange
n = 10
i = arange(n)
A = 2 * pi / n * outer(i, i)
The outer
product does what you ask for.
Edit: A simple way to test performance is the following:
from time import time
n = 10000
t = time()
for _ in range(10):
i = arange(n)
A = 2 * pi / n * outer(i, i)
t = time() - t
print("Time used [s]", t)
For me, it uses 5 seconds for these 10 repetitions, and there is no significant difference between range
and arange
.
Upvotes: 3
Reputation: 3591
You can use list comprehensions:
A = np.array([
[twopi*i*j/n for j in range(n)]
for i in range(n)])
The default is for range
to start at zero, so range(0,n)
rather than range(n)
is redundant.
Upvotes: 1