mrtaste

Reputation: 13

How to speed up sensitivity analysis?

I have a sensitivity analysis that needs to be sped up. The data is given in a numpy array, let's call it A. A has shape (M, N), where M is the number of data points and N is the number of attributes each data point consists of and on which the analysis is computed. For simplicity, let's assume M=2 and N=4, but keep something like M=1e9 in mind. Let a_{mn} be an element of A. The analysis should be done for the function f(a_{m1}, a_{m2}, a_{m3}, a_{m4}) = a_{m1} - a_{m2} - (a_{m3} * a_{m4}), computed for each row, so that f(A) yields an array B of shape (M, 1). So b_m is an element of B.

I want to create an array E of shape (M, N) containing the sensitivity of B with respect to every element of A. For example, for the element with m=1 and n=2: e_{mn} = e_{12} = f(a_{11}, a_{12}*(1-i), a_{13}, a_{14}) - b_1

Now I am searching for each element's sensitivity on B. Let the sensitivity i be i=0.05. First, I compute an array of shape (M, N) that contains the change of every element. Let's call it C = A * i, where * is an element-wise multiplication. After that, to create D, I loop over every single element of the array. Finally, I subtract B to get E. That is too expensive and very cheesy, I guess, which is why it doesn't work with a huge amount of data. Here is what I have:

import numpy as np

A = np.array([
    [2., 2., 100., 0.02],
    [4., 2., 100., 0.02]
])

def f_from_array(data):
    att_1 = data[:, 0]
    att_2 = data[:, 1]
    att_3 = data[:, 2]
    att_4 = data[:, 3]
    return ((att_1 - att_2) - (att_3 * att_4)).reshape(-1, 1)

def f_from_list(data):
    att_1 = data[0]
    att_2 = data[1]
    att_3 = data[2]
    att_4 = data[3]
    # Return a plain scalar so it can be stored directly in D[m][n].
    return (att_1 - att_2) - (att_3 * att_4)

B = f_from_array(A)

# B = np.array([
#     [-2.],
#     [0.]
# ])

i = 0.05
C = A * i
A_copy = A * 1
D = np.zeros(A.shape)
for m in range(A.shape[0]):
    for n in range(A.shape[1]):
        A_copy[m][n] -= C[m][n]
        D[m][n] = f_from_list(A_copy[m])
        A_copy = A * 1

E = D - B
E = np.abs(E)  # absolute change; same result as np.sqrt(E**2)

Output:

E = np.array([
    [0.1, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.1, 0.1]
])

Upvotes: 0

Views: 54

Answers (1)

blubberdiblub

Reputation: 4135

Obviously, the problematic part of your code is the nested for loop. There's a lot that can be done here and it's probably possible to eliminate the loop completely.

But without thinking too much about what the code does, the most obvious time killer is probably that you create a copy of the whole array during every loop iteration. Eliminate that by just restoring the one element instead of the whole array.

Instead of

A_copy = A * 1

inside the loop, do this:

A_copy[m, n] = A[m, n]
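
Put together, the loop from the question might then look like the sketch below. It assumes the same A, i and formula as in the question; f_row is a hypothetical helper (not part of the original code) that returns a plain scalar for one row, and the perturbation is written as A[m, n] * (1 - i), which should be equivalent to subtracting C[m, n].

```python
import numpy as np

A = np.array([
    [2., 2., 100., 0.02],
    [4., 2., 100., 0.02],
])
i = 0.05

def f_row(row):
    # Hypothetical helper: the question's formula for a single row.
    return row[0] - row[1] - row[2] * row[3]

# Unperturbed results, shape (M, 1) as in the question.
B = np.array([f_row(row) for row in A]).reshape(-1, 1)

A_copy = A.copy()      # one copy up front instead of one per iteration
D = np.zeros(A.shape)
for m in range(A.shape[0]):
    for n in range(A.shape[1]):
        A_copy[m, n] = A[m, n] * (1 - i)   # perturb a single element
        D[m, n] = f_row(A_copy[m])
        A_copy[m, n] = A[m, n]             # restore only that element

E = np.abs(D - B)
```

This still visits every element in Python, so it only removes the per-iteration copy; it does not remove the loop itself.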

(As an aside: indexing with a comma, as in A_copy[m, n], is slightly faster than multi-step indexing with more than one pair of brackets, as in A_copy[m][n], but the difference will probably be insignificant in your case.)
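
As for eliminating the loop over the rows completely, one possible direction is to perturb a whole column at a time and evaluate the formula for all rows at once, so that only a loop over the N attributes remains. The following is a minimal sketch under that assumption, not a tuned solution; sensitivity is a hypothetical helper name, and f_from_array is simplified here to return a 1-D array.

```python
import numpy as np

def f_from_array(data):
    # The question's formula, applied to all rows at once (1-D result).
    return data[:, 0] - data[:, 1] - data[:, 2] * data[:, 3]

def sensitivity(A, i=0.05):
    # Hypothetical helper: perturb one column at a time for all rows.
    B = f_from_array(A)
    work = A.copy()                  # single working copy
    E = np.empty(A.shape)
    for n in range(A.shape[1]):      # loops over N attributes, not M rows
        work[:, n] = A[:, n] * (1 - i)
        E[:, n] = np.abs(f_from_array(work) - B)
        work[:, n] = A[:, n]         # restore only the changed column
    return E

A = np.array([
    [2., 2., 100., 0.02],
    [4., 2., 100., 0.02],
])
E = sensitivity(A)
```

Each pass allocates temporary arrays of length M, so for M in the range of 1e9 the rows would probably have to be processed in chunks to fit into memory.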

Upvotes: 1
