Reputation: 23811
I'm trying to efficiently sum the elements of separate data arrays by their characteristics. I have three identifying characteristics (age, year, and cause) in a given array, and for each (age, year, cause) combination I have 1000 values. I need to add those values to another data array when the characteristics are the same. For now, I'm doing something like this, where each dataset is ~(80000, 1000):
import numpy as np

# stack the two datasets row-wise; vstack takes a single tuple of arrays
datasets = np.vstack((dataset1, dataset2))

for a in ages:
    for y in years:
        for c in causes:
            # sum the rows whose characteristics match (a, y, c)
            output = np.sum(datasets[(age == a) & (year == y) & (cause == c)], axis=0)
However, with 60,000 iterations, this is incredibly slow. The challenge is that the arrays don't necessarily all have the same shape. Any thoughts?
Upvotes: 1
Views: 689
Reputation: 23811
SEE LINK BELOW
I'm not sure how to properly link another answer to this answer. When I tried one sentence followed by the link, it converted the answer to a comment. I'm now being long-winded to try to make Stack Overflow think that this text is long enough to constitute an answer. Here is the link to a great answer to this question:
Summing Arrays by Characteristics in Python
Upvotes: 0
Reputation: 7046
I'd recommend something like accumarray. Your output should be a 3-dimensional data cube where each dimension corresponds to a variable (age, year, cause). Each index in each dimension corresponds to a unique value in your input lists. You can then use something like this cookbook example to accumulate the datasets variable into the appropriate bins using age, year, and cause.
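Here is a minimal sketch of that accumulation idea, assuming `age`, `year`, and `cause` are 1-D label arrays aligned with the rows of `datasets` (names borrowed from the question); it uses `np.unique` and `np.add.at` rather than the cookbook code itself:

```python
import numpy as np

# map each label to a dense integer index along its own dimension
age_vals, age_idx = np.unique(age, return_inverse=True)
year_vals, year_idx = np.unique(year, return_inverse=True)
cause_vals, cause_idx = np.unique(cause, return_inverse=True)

# output cube: one 1000-value slot per (age, year, cause) combination
cube = np.zeros((len(age_vals), len(year_vals), len(cause_vals),
                 datasets.shape[1]))

# accumulate every row of `datasets` into its bin in a single pass
np.add.at(cube, (age_idx, year_idx, cause_idx), datasets)

# cube[i, j, k] now holds the summed values for
# (age_vals[i], year_vals[j], cause_vals[k])
```

This replaces the 60,000-iteration triple loop with a single vectorized pass over the rows.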
You might also consider using a proper relational database. They're quite fast at these sorts of things. Python ships with sqlite3 as a part of the core. Unfortunately, it's a rather steep learning curve if you've never worked with a relational database before. You'll want to use the GROUP BY and aggregate functionality.
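For illustration, here is a rough sqlite3 sketch under the same assumptions about `age`, `year`, `cause`, and `datasets` from the question; the table name and schema are made up, with each of the 1000 values stored as its own row keyed by its slot position:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# hypothetical schema: one row per (age, year, cause, slot) with a single value
cur.execute("""CREATE TABLE data
               (age INTEGER, year INTEGER, cause INTEGER,
                slot INTEGER, value REAL)""")

# flatten the (rows, 1000) array into individual value rows
rows = (
    (int(a), int(y), int(c), s, float(v))
    for a, y, c, vals in zip(age, year, cause, datasets)
    for s, v in enumerate(vals)
)
cur.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", rows)

# GROUP BY does the per-characteristic aggregation in one query
cur.execute("""SELECT age, year, cause, slot, SUM(value)
               FROM data
               GROUP BY age, year, cause, slot""")
summed = cur.fetchall()
```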
Upvotes: 2