Reputation: 89
I'm trying to find a way to sum an array of values based off of a boolean index, using a modulo function to determine month beginning/end.
import numpy as np
months = np.arange(36) + 1 # +1 to denote months rather than index
vals = np.ones(36)
vals[12:24] = 2
vals[24:36] = 3
# closest try:
vals.cumsum()[months % 12 == 0] # returns array([12., 36., 72.])
# target result = array([12, 24, 36])
The vals.sum() function just sums the whole thing, but cumsum accumulates over the whole thing, which isn't quite what I'm looking for. Target result is included above - this is a common spreadsheet summarization technique that would usually be done using a SUMIF function to sum values according to certain parameters.
Is there an easy way to do this? I'm sure there is, I'm just missing it and I've put a bit of time trying to get this figured - would prefer not to use a for loop.
Thanks.
Upvotes: 1
Views: 353
Reputation: 214957
Seems you need np.add.reduceat:
np.add.reduceat(vals, np.flatnonzero((months - 1) % 12 == 0))
# array([ 12., 24., 36.])
Explanations:
months
# array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
# 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
# 35, 36])
1). Use modulo to find the positions where each sum should start, i.e. where (months - 1) % 12 == 0:
(months - 1) % 12 == 0
# array([ True, False, False, False, False, False, False, False, False,
# False, False, False, True, False, False, False, False, False,
# False, False, False, False, False, False, True, False, False,
# False, False, False, False, False, False, False, False, False], dtype=bool)
2). np.flatnonzero is similar to np.where and gives the indices of the True entries, so here the first sum runs from index 0 up to 12 (exclusive), and so on:
np.flatnonzero((months - 1) % 12 == 0)
# array([ 0, 12, 24])
3). After finding the indices, use np.add.reduceat to sum up the segments:
np.add.reduceat(vals, [0, 12, 24])
# array([ 12., 24., 36.])
Essentially, this is equivalent to [sum(vals[0:12]), sum(vals[12:24]), sum(vals[24:])] and gives the output you need.
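Putting the three steps together, here is a minimal self-contained sketch that rebuilds months and vals from the question and checks the reduceat result against the explicit slice sums. The names starts, yearly and expected are only illustrative, not part of the original code.

```python
import numpy as np

# Rebuild the data from the question: 36 months, values 1, 2, 3 per year.
months = np.arange(36) + 1
vals = np.ones(36)
vals[12:24] = 2
vals[24:36] = 3

# Indices where each 12-month block starts (months 1, 13, 25).
starts = np.flatnonzero((months - 1) % 12 == 0)

# Sum each segment [starts[i], starts[i+1]); the last one runs to the end.
yearly = np.add.reduceat(vals, starts)

# Cross-check against the explicit slice sums.
expected = np.array([vals[0:12].sum(), vals[12:24].sum(), vals[24:].sum()])
assert np.array_equal(yearly, expected)

print(yearly)  # [12. 24. 36.]
```

Because reduceat only needs the start index of each segment, the same pattern should also work for blocks of unequal length, as long as the start indices are increasing.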
Upvotes: 2
Reputation: 855
np.sum(vals[np.where(months % 12 == 0)[0]])
maybe?
np.where is used to select the indices.
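For reference, a small sketch of what this selection does on the question's data, assuming the same months and vals. The mask months % 12 == 0 marks only the month-end positions, so the expression sums just those three values. If the month-end mask is the preferred starting point, one option (an assumption here, not something proposed in the thread) is to take the cumulative sum at those positions and difference it with np.diff. The names month_end, picked, running and per_year are illustrative.

```python
import numpy as np

# Same data as in the question.
months = np.arange(36) + 1
vals = np.ones(36)
vals[12:24] = 2
vals[24:36] = 3

# Positions of months 12, 24 and 36 (indices 11, 23, 35).
month_end = months % 12 == 0

# Selecting with np.where picks only the month-end values: 1, 2, 3.
picked = vals[np.where(month_end)[0]]
total = np.sum(picked)  # 6.0, a single number rather than per-year sums

# Running totals at each month end: 12, 36, 72.
running = vals.cumsum()[month_end]

# Differencing the running totals recovers the per-year sums.
per_year = np.diff(running, prepend=0)

print(per_year)  # [12. 24. 36.]
```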
Upvotes: 1