Reputation: 42329
I'm working with a nested array. I need to apply a rather simple but costly arithmetic operation to each element of this array.
Below is the MWE, where the "second block" is the one that takes up most of the time (it will run thousands of times).
The first and second blocks need to be kept separate: the first one is processed only once, because the real way to obtain a, b, c is very costly time-wise.
I'm not sure how to improve the performance of this operation applied to each element of the nested array. Surely numpy would do this much faster, but I'm not very familiar with broadcasting operations on arrays.
import numpy as np
import time
# Generate some random data.
N = 100
x, y, z = [np.random.uniform(0., 10., N) for _ in range(3)]
# Grid of values in 2 dimensions.
M = 200
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)
# Define empty nested list to be filled below.
# The shape is (len(p_lst), len(q_lst)), matching the [i][j] indexing below.
abc_lst = [[[] for _ in q_lst] for _ in p_lst]
# First block. This needs to be separated from the block below.
# Fill nested list with values.
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):
        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a, b, c = 1.*p, 1.*q, p+q
        # Store in nested list.
        abc_lst[i][j] = [a, b, c]
# Second block <-- THIS IS THE BOTTLENECK
tik = time.time()
# Apply operation on nested list.
lst = []
for i in range(len(p_lst)):
    for j in range(len(q_lst)):
        # Extract a,b,c values from nested list.
        a, b, c = abc_lst[i][j]
        # Apply operation. This is the *actual* operation
        # I need to apply.
        d = sum(abs(a*x + y*b + c*z))
        # Store value.
        lst.append(d)
print(time.time() - tik)
Upvotes: 1
Views: 128
Reputation: 42329
I've found an answer in this question, using the np.outer() function.
It only takes a little rearranging of the first block, and the second block then runs many times faster.
# First block. Store a,b,c separately.
a_lst, b_lst, c_lst = [], [], []
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):
        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a_lst.append(1.*p)
        b_lst.append(1.*q)
        c_lst.append(p+q)
# As arrays.
a_lst, b_lst, c_lst = np.asarray(a_lst), np.asarray(b_lst), np.asarray(c_lst)
# Second block.
# Apply the operation to the flat a, b, c arrays using np.outer.
lst = np.sum(abs(np.outer(a_lst, x) + np.outer(b_lst, y) + np.outer(c_lst, z)), axis=1)
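
As a sanity check on the np.outer() line above, here is a minimal sketch that compares it against the original element-by-element loop on small inputs. It also shows one possible variant, not part of the original answer: since a*x + b*y + c*z is a sum over three terms, the three outer products can be written as a single matrix product with np.dot. The names abc, xyz, d_dot, d_outer and d_loop are only illustrative, and the sizes are reduced so the check runs instantly.

```python
import numpy as np

# Small sizes for a quick check; the question uses N = 100, M = 200.
N, M = 5, 4
rng = np.random.RandomState(0)
x, y, z = [rng.uniform(0., 10., N) for _ in range(3)]
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)

# First block: one (a, b, c) row per (p, q) pair, in the same order as the loops.
abc = np.array([[1.*p, 1.*q, p + q] for p in p_lst for q in q_lst])  # (M*M, 3)

# Second block as one matrix product: (M*M, 3) dot (3, N) -> (M*M, N).
xyz = np.vstack([x, y, z])
d_dot = np.abs(np.dot(abc, xyz)).sum(axis=1)

# Reference 1: the np.outer form used in the answer.
a, b, c = abc.T
d_outer = np.sum(abs(np.outer(a, x) + np.outer(b, y) + np.outer(c, z)), axis=1)

# Reference 2: the original loop from the question.
d_loop = np.array([sum(abs(ai*x + bi*y + ci*z)) for ai, bi, ci in abc])

# All three should agree up to floating-point rounding.
print(np.allclose(d_dot, d_outer) and np.allclose(d_dot, d_loop))
```

Whether the np.dot variant is actually faster than the three np.outer calls at the real sizes would need to be timed; it mainly avoids building three intermediate (M*M, N) arrays.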
Upvotes: 1
Reputation: 3674
I don't think two sets of loops are necessary. Just collapse into one:
## Always pre-allocate with zeros if possible...not just empty lists
lst = np.zeros(M*M)
# First block. This one runs fast.
tik = time.time()
# Loop over the grid once and compute each value directly.
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):
        # a,b,c are obtained via some complicated function of p,q.
        # This is just for the purpose of this example.
        a, b, c = 1.*p, 1.*q, p+q
        # Don't store in nested list, just calculate
        ##abc_lst[i][j] = [a, b, c]
        lst[i*M+j] = sum(abs(a*x + y*b + c*z))
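
If the two blocks do have to stay separate, as the question states, the pre-allocation idea can still be applied to the stored coefficients instead of the result. The following is only a sketch under that assumption: abc and xyz are illustrative names, the sizes are reduced, and the a, b, c formulas are the placeholder ones from the question.

```python
import numpy as np

# Small illustrative sizes; the question uses N = 100, M = 200.
N, M = 5, 4
rng = np.random.RandomState(1)
x, y, z = [rng.uniform(0., 10., N) for _ in range(3)]
p_lst, q_lst = np.linspace(0., 50., M), np.linspace(0., 25., M)

# First block (runs once): pre-allocate an (M, M, 3) array, then fill it.
abc = np.zeros((M, M, 3))
for i, p in enumerate(p_lst):
    for j, q in enumerate(q_lst):
        # Placeholder for the costly function of p, q.
        abc[i, j] = 1.*p, 1.*q, p + q

# Second block (runs many times): no Python-level loop.
xyz = np.vstack([x, y, z])  # shape (3, N)
# np.dot contracts the last axis of abc with the first axis of xyz -> (M, M, N).
lst = np.abs(np.dot(abc, xyz)).sum(axis=2).ravel()  # same order as lst[i*M + j]
```

The ravel() call flattens in row-major order, so the entry for (i, j) ends up at index i*M + j, the same layout as in the loop above.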
Upvotes: 0