mrav
mrav

Reputation: 33

Pythonic way to vectorize double summation

I'm attempting to convert a double summation formula into code, but can't figure out the correct matrix/vector representation of it.

The first summation is i to n, and the second is over j > i to n.

I'm guessing there is a much more efficient & pythonic way of writing this?

I resorted to nested for loops to just get it working but, as expected, it runs very slowly with a large dataset:

def wapc_denom(weights, vols):
    x = []
    y = []

    for i, wi in enumerate(weights):
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))

    return np.sum(y)

Edit:

Using guidance from smci's answer I think I have a potential solution:

def wapc_denom2(weights, vols):
    return np.sum(np.tril(np.outer(weights, vols.T)**2, k=-1))

Upvotes: 3

Views: 667

Answers (3)

smci
smci

Reputation: 33970

  • wi * wj * vols[i] * vols[j] is a telltale. vols is another vector, so first you want to compute the vector wv = w * vols
  • then (wj * vols[j]) * (wi * vols[i]) = wv^T * wv is your (matrix outer product) expression; that's a column vector * a row vector. But actually you only want the sum. So I don't see a need to construct a vector y.append(np.sum(x)), you're only going to sum it anyway np.sum(y)
  • also the if j > i part means you only want the sum of the Lower Triangular part, and exclude the diagonal.
    • EDIT: the result is fully determined just from wv, I didn't think we needed the matrix to get the sum, and we didn't need the diagonal; @PaulPanzer found the most compact expression.

Upvotes: 1

Paul Panzer
Paul Panzer

Reputation: 53089

Assuming you want to count every term only once (for that you have to move the x = [] into the outer loop) one cheap way of computing the sum would be

Create mock data

weights = np.random.random(10)
vols = np.random.random(10)

Do the calculation

wv = weights * vols
result = (wv.sum()**2 - wv@wv) / 2

Check that it's the same

def wapc_denom(weights, vols):
    y = []

    for i, wi in enumerate(weights):
        x = []
        for j, wj in enumerate(weights):
            if j > i:
                x.append(wi * wj * vols[i] * vols[j])
        y.append(np.sum(x))

    return np.sum(y)

assert np.allclose(result, wapc_denom(weights, vols))

Why does it work?

What we are doing is compute the sum of the full matrix, subtract the diagonal and divide by two. This is cheap because it is easy to verify that the sum of an outer product is just the product of the summed factors.

Upvotes: 3

Tarifazo
Tarifazo

Reputation: 4343

You can use triangulations in numpy, check np.triu and np.meshgrid. Do:

np.product(np.triu(np.meshgrid(weights,weights), 1) * np.triu(np.meshgrid(vols,vols), 1),0).sum(1).cumsum().sum()

Example:

w = np.arange(4) +1
v = np.array([1,3,2,2])
print(np.triu(np.meshgrid(w,w), k=1))
>>array([[[0, 2, 3, 4],
        [0, 0, 3, 4],
        [0, 0, 0, 4],
        [0, 0, 0, 0]],

       [[0, 1, 1, 1],
        [0, 0, 2, 2],
        [0, 0, 0, 3],
        [0, 0, 0, 0]]])

# example of product + triu + meshgrid (your x values):
print(np.product(np.triu(np.meshgrid(w,w), 1) * np.triu(np.meshgrid(v,v), 1),0))
>>array([[ 0,  6,  6,  8],
   [ 0,  0, 36, 48],
   [ 0,  0,  0, 48],
   [ 0,  0,  0,  0]])

print(np.product(np.triu(np.meshgrid(w,w), 1) * np.triu(np.meshgrid(v,v), 1),0).sum(1).cumsum().sum())
>> 428
print(wapc_denom(w, v))
>> 428

Upvotes: 0

Related Questions